Course Description

This crash course serves as a practical introduction to the tools and methods used to create a dataset of microbiome data and visualize environmental diversity. The first part of the course will consist of applying the DADA2 algorithm to a set of paired-end fastq files, split by sample and with the barcodes/adapters removed.

From this dataset, participants are expected to obtain an amplicon sequence variant (ASV) table, which is a higher-resolution equivalent of the traditional OTU table.

After assigning taxonomy to the output sequences, the dataset will be converted to the BIOM format, and, using the Biome-Shiny tool, its composition and diversity will be visualized through a series of interactive plots. By the end of the course, participants should be able to apply the learned methods to their own data.

Target Audience

Scientists who want to learn the basics of microbiome analysis in Bioinformatics, and have little knowledge of programming.

Course Documentation

The dataset provided for this course is the same as those used in the mothur MiSeq SOP. The data consists of fastq files, which were generated by 2x250 Illumina MiSeq amplicon sequencing of the V4 region of the 16S rRNA gene collected longitudinally from a mouse post-weaning.

Note - The dataset used for this crash course is available by clicking the link below. You need to unzip this file and follow the instructions throughout the documentation.

Download Course Dataset

Morning Period

Creating an analysis-ready dataset with the DADA2 pipeline

During the morning period, you will be introduced to next-generation sequencing (NGS) of 16S rRNA gene amplicon processing techniques, that allow you to convert large amounts of raw sequence data into clusters of similar sequences, known as Operational Taxonomic Units (OTUs) that can be annotated against a sequence database. You will apply the DADA2 algorithm to eliminate sequencing errors from reads (denoising) and obtain a table of Amplicon Sequence Variants (ASVs), which have less genetic divergence than OTUs.


Afternoon Period

Visualizing microbiome diversity and composition with Biome-Shiny

The afternoon period is a hands-on demonstration of Biome-Shiny, a user-friendly web application to visualize the composition and diversity of microbial communities. Participants will upload the dataset they generated during the morning period and visualize the microbial composition, primary microbial communities and alpha- and beta-diversity of the dataset.


Learning Objectives

By the end of this course, participants are expected to be able to apply the bioinformatics pipeline included in the “dada2” library to their own data, to process sets of paired-end sequences into an ASV table, going through the process of filtering and trimming reads, estimating error rates, dereplicating and denoising sequences, creating the ASV table with the merged sequences and assigning taxonomy to the ASVs after removing chimeras. Participants should understand the basics of exploring and analyzing a microbial community, and how it can be done through the Biome-Shiny application.

Course Pre-requisites

This course has no pre-requisites, although basic knowledge of R scripting is recommended. If you intend to replicate the DADA2 pipeline with your own data, some knowledge of R may be needed.



About is the Portuguese distributed e-infrastructure for biological data and the Portuguese ELIXIR node.

It supports the national scientific system through best practices in data management and state of the art data analysis, and interfaces with both academia and industry, making research available for innovation, namely in sectors such as agro-food and forestry, sea, and health. services include ELIXIR services such as our training programme and computing facilities, as well as consulting services in data analysis and management, and a number of community services.

The source for this course webpage is on github.

Creative Commons License Crash Course: Microbiome Visualization with Biome-Shiny by is licensed under a Creative Commons Attribution 4.0 International License.