AI and Data Science


DA2122-M14 EN
Tags: Postgraduate programme


High-throughput sequencing technologies make it easy to characterise the microbiome, but the data analysis poses many specific difficulties. The analysis starts with processing the raw reads into an OTU (operational taxonomic unit) count table; quality control, filtering and clustering into OTUs are essential steps in this process. Once the OTU count table is ready, the choice of data analysis method depends on the research objectives, but very often a first visual data exploration is performed.

Ordination methods, many of which originate from ecology, are well suited for this purpose, but newer methods tailored to microbiome data behave better for overdispersed, zero-inflated sequencing data. Formal statistical methods are required to identify species that are differentially abundant between conditions; here, too, special methods are needed that can deal with overdispersion, zero-inflation, library size variability and potentially the compositional nature of microbiome data.
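As a rough illustration of the library size and compositionality issues mentioned above, the sketch below converts raw counts into relative abundances and applies a centred log-ratio (CLR) transform. It is written in plain Python for brevity and is not taken from the course material (the practical exercises use R). The function names, the toy counts and the pseudocount of 0.5 used to handle zeros are all assumptions made for this example.

```python
import math

def relative_abundance(counts):
    """Divide each count by the sample's library size (total count)."""
    total = sum(counts)
    return [c / total for c in counts]

def clr(counts, pseudocount=0.5):
    """Centred log-ratio transform of one sample.

    The pseudocount (an assumption of this sketch) avoids log(0)
    for taxa that were not observed in the sample.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical toy samples: same composition, tenfold different sequencing depth.
shallow = [10, 0, 30]
deep = [100, 0, 300]

print(relative_abundance(shallow))
print(relative_abundance(deep))
print(clr(shallow))
```

The two toy samples differ only in sequencing depth, so their relative abundances coincide even though the raw counts do not. The CLR values of a sample sum to zero by construction, which reflects that only ratios between taxa carry information in compositional data.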

The data analysis becomes even more elaborate for longitudinal data, where the evolution of the microbiome is studied over time. These analyses may focus either on individual taxa or on the diversity of the microbial community (richness, alpha and beta diversity, ...). The course focuses on 16S rRNA amplicon sequencing data.
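To make the diversity measures named above more concrete, here is a minimal sketch in plain Python, not taken from the course material (the practical exercises use R). It computes richness, the Shannon index and Pielou's evenness as within-sample (alpha) diversity measures, and the Bray-Curtis dissimilarity as one common between-sample (beta) diversity measure. The function names and the toy counts are assumptions for illustration only.

```python
import math

def richness(counts):
    """Number of taxa observed at least once in the sample."""
    return sum(1 for c in counts if c > 0)

def shannon(counts):
    """Shannon index (natural log) computed from relative abundances."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def pielou_evenness(counts):
    """Shannon index divided by its maximum, log(richness)."""
    s = richness(counts)
    return shannon(counts) / math.log(s) if s > 1 else 0.0

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples over the same taxa."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    return numerator / (sum(a) + sum(b))

# Hypothetical toy counts for four taxa in two samples of equal library size.
sample_a = [10, 10, 10, 0]
sample_b = [25, 5, 0, 0]

print(richness(sample_a), shannon(sample_a), pielou_evenness(sample_a))
print(bray_curtis(sample_a, sample_b))
```

Note that Bray-Curtis computed on raw counts is sensitive to differences in library size, which is one reason normalization matters; the toy samples here were chosen with equal totals to sidestep that issue.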

This course is part of a larger course series in Data Analysis consisting of 19 individual modules. Find more information and enroll for this module via


The course starts with a brief overview of the processing of raw read data into an OTU table (including filtering, trimming and clustering into OTUs). We continue with summarizing, exploring and plotting the high-dimensional data with ordination and clustering methods. Next we focus on the estimation of diversity (including evenness, richness and beta diversity) and relative abundances, while paying attention to normalization issues. We will discuss several methods for testing for differential abundance and diversity, including methods for longitudinal data analysis.

During the practical exercises we will use R and several packages that will be provided later on.

Course number:
Short- and long-term programmes
Area of interest:
AI and Data Science, Sciences
Academic year:
2021 - 2022
Starting date:
Olivier Thas
Contact person:
