Figure 1: Automated multivariate phenotyping of cells by combinatorial RNAi and automated image analysis.

The Huber group develops large-scale statistical models that integrate genomic, molecular and phenotypic data to understand the variations between individuals in health and disease.

Previous and current research

A central challenge of biomedicine is to understand how the biological systems that underlie healthy life and disease react to variations in their make-up (genetic variation, for example) or their environment (drugs, for example). Our group brings together researchers from quantitative disciplines – mathematics, statistics, physics and computer science – and from different fields of biology and medicine.

We employ statistics and machine learning to discover patterns in large datasets, understand mechanisms, and act upon predictive and causal relationships, ultimately to address questions in personal genomics and molecular medicine. More specifically, we use large-scale data acquisition and quantitative modelling of phenotypes and molecular profiles, systematic perturbations (such as drugs or high-throughput genetics) and computational analysis of non-linear, epistatic interaction networks.

Genomics and other molecular profiling technologies have resulted in an increasingly detailed biology-based understanding of human disease. The next challenge is using this knowledge to engineer treatments and cures. We integrate observational data (such as from large-scale sequencing and molecular profiling) with interventional data (from systematic genetic or chemical screens) to reconstruct a fuller picture of the underlying causal relationships and actionable intervention points. A fascinating example is our work on genotype-specific vulnerability and resistance of tumours to targeted drugs in our precision oncology project.

As we engage with new data types, our aim is to develop high-quality computational and statistical methods of wide applicability. We consider the release and maintenance of scientific software an integral part of scientific publishing, and we contribute to the Bioconductor Project, an open-source software collaboration that provides tools for the analysis and study of high-throughput genomic data. An example is our DESeq2 package for analysing count data from high-throughput sequencing.

Future projects and goals

We aim to develop the computational techniques needed to analyse exciting biological data types:

  • Clinical multi-omics: we work with clinical researchers to develop predictive assays and algorithms.
  • Translational statistics: many powerful mathematical ideas exist but are difficult to access. We translate them into practical methods and software that make a real difference to biomedical researchers.
  • Quantitative proteomics and in vivo drug-target mapping.
  • Single-cell and single-molecule -omics.
  • High-throughput multidimensional phenotyping: mapping gene-gene and gene-drug interactions through computational image analysis of cell and tissue microscopy, machine learning and mathematical modelling.
Figure 2: Ternary plots of relative sensitivities to targeted kinase inhibitors for a cohort of primary tumour samples of chronic lymphocytic leukaemia (CLL).
