Located at EMBL-EBI, Hinxton, United Kingdom and within the Genome Biology Unit, EMBL Heidelberg.

The Stegle group develops advanced statistical approaches for unravelling molecular variation at a genome-wide scale, accounting for both genetic and environmental factors.

Previous and current research

Our interest lies in computational approaches for unravelling the genotype-phenotype map on a genome-wide scale. How do genetic background and environment jointly shape phenotypic traits or cause diseases? How are genetic and external factors integrated at different molecular levels, and how variable are these molecular readouts between individual cells?

We use statistics as our main tool to answer these questions. To make accurate inferences from high-dimensional omics datasets, it is essential to account for biological and technical noise and to propagate evidence strength between different steps in the analysis. To address these needs, we develop statistical analysis methods in the areas of gene regulation, genome-wide association studies (GWAS) and causal reasoning in molecular systems.

Our methodological work ties in with experimental collaborations, and we actively develop methods to fully exploit large-scale datasets obtained using the most recent technologies. In doing so, we derive computational methods to dissect phenotypic variability at the level of the transcriptome and the proteome, as well as new tools for single-cell biology.

Future projects and goals

We will continue to develop innovative statistical approaches to analyze data from high-throughput genetic and molecular profiling studies. We are particularly interested in following up our recent efforts to model single-cell variation data. A major challenge in this area will be the integration of multiple modalities in single-cell genomics, for example linking single-cell epigenome variation with single-cell RNA-seq. We also aim to apply these methods to data from the Human Induced Pluripotent Stem Cell Initiative (HipSci), in which we are a partner.

Figure 1: Illustration of statistical methodology for dissecting transcriptional heterogeneity in single-cell RNA-seq datasets (adapted from Buettner et al., 2015). Left: Underlying sources of variation in single-cell transcriptome data. Right: Illustration of our scLVM approach to identify and account for such factors.