The Zeller team develops computational tools for metagenomics data analysis to elucidate the microbiome’s role in human health and disease and its responses to chemical perturbations, such as drug treatments.

Previous and current research

The human microbiome, the complex ecosystem of microorganisms colonising our body, has increasingly been recognised as an important determinant of human physiology. Detailed investigations of microbes in situ (without culturing) have become possible through advances in sequencing technology and computational analysis methodology. These have now started to be applied in large clinical studies to associate changes in microbiome composition and function with human diseases. However, analysis and interpretation of such data remain challenging:

  • Quantifying microbial (sub-)species and functions in an accurate manner consistently across various sequencing readouts (16S, shotgun metagenomics and metatranscriptomics) is still difficult for complex communities consisting of many uncultured organisms.
  • Microbiome data interpretation is often complicated by many factors that vary in addition to the phenomenon of interest; typical confounders include differences in life-style, co-morbidities or treatments. Comparisons across studies (meta-analyses) are hampered by batch effects arising from technical variation in sample preservation and preparation.
  • Perturbations of the microbiome are poorly understood to date. Systematic data and predictive models on the specific effects of environmental exposures (such as host-targeted drugs) on the microbiome are lacking despite this being a key aspect of personalised health and a potential entry point for designing intervention strategies targeted at the microbiome.

To address these challenges, we are actively contributing to the development of software tools for accurate profiling of both previously sequenced and uncharacterised microbial species, as well as the functions encoded in their genomes and transcriptomes. To associate changes in these profiles with host phenotypes of interest, we have investigated a range of statistical and machine learning tools and evaluated their applicability to microbiome sequencing data. We are currently making software pipelines publicly available that automate such analyses. Using these, we have recently demonstrated that gastro-intestinal diseases can be accurately detected from faecal microbiome readouts. For colorectal cancer in particular, this has potential for developing novel non-invasive screening methods (see Figure 1).

Figure 1: Colorectal cancer (CRC) can be detected using a classification approach based on microbial markers (top panel) quantified in faecal samples by metagenomic sequencing; its accuracy was evaluated in cross-validation and independent external validation (bottom panels) in comparison to the standard non-invasive screening test (FOBT Hemoccult).

Future projects and goals

  • Develop better statistical models specifically for the analysis of microbiome data, which are characterised by much larger dispersion than many other types of count data; these models should also handle confounding in a principled way.
  • Develop meta-analysis tools for microbiome research that assess batch effects and try to correct for them where possible.
  • Perform integrative analysis of 16S, metagenomics and metatranscriptomics data with the goal of associating these microbiome readouts with molecular profiles of host health states.
  • Contribute analysis methodology to the collaborative efforts involving the Typas, Bork and Patil groups aiming to systematically investigate the effect of chemical (human drugs) and dietary perturbations on the gut microbiome.