AI and Data Science


DA2122-M14, English


High-throughput sequencing technologies make it easy to characterise the microbiome, but the resulting data pose particular analysis challenges. The analysis starts with processing the raw reads into a table of operational taxonomic unit (OTU) counts. In this process, quality control, filtering and clustering into OTUs are essential steps. Once the OTU count table is ready, the choice of analysis method depends on the research objectives, but a visual exploration of the data is very often the first step.

Ordination methods, many of which originate from ecology, are well suited for this purpose, but newer methods tailored to microbiome data perform better on overdispersed, zero-inflated sequencing data. Formal statistical methods are required to identify species that are differentially abundant between conditions; here, too, special methods are needed that can deal with overdispersion, zero-inflation, library size variability and potentially the compositional nature of microbiome data.

The data analysis becomes even more elaborate for longitudinal data, when studying how the microbiome evolves over time. These analyses may focus either on individual taxa or on the diversity of the microbial community (richness, alpha and beta diversity, ...). We focus on 16S rRNA amplicon sequencing data.

This course is part of a larger course series in Data Analysis consisting of 19 individual modules. Find more information and enroll for this module via


The course starts with a brief overview of the processing of raw read data into an OTU table (including filtering, trimming and clustering into OTUs). We continue with summarizing, exploring and plotting the high-dimensional data with ordination and clustering methods. Next we focus on the estimation of diversity (including evenness, richness and beta diversity) and relative abundances, while paying attention to normalization issues. We will discuss several methods for testing for differential abundance and diversity, including methods for longitudinal data analysis.

During the practical exercises we will use R and several packages that will be provided later on.

Course type:
Short- and long-term programmes
AI and Data Science, Sciences
2021 - 2022
Olivier Thas
