Causal inference

Genetic variation is abundant across individuals within a population, and it can explain a substantial proportion of disease risk and of the variation in quantitative phenotypes. Genetic variation can also be treated as a natural experiment, in which observing the genotype at a relevant locus is akin to observing a randomly allocated treatment. This is the basis of Mendelian randomisation, a method for inferring causal relationships between phenotypes. Much of my current work focuses on developing statistical and computational methods to improve the scope and reliability of Mendelian randomisation.
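As a rough illustration of the idea, the sketch below computes two standard Mendelian randomisation estimates from GWAS summary statistics: the Wald ratio for a single variant and the inverse-variance weighted (IVW) estimate across several variants. It is a minimal sketch rather than the methods developed in this group. The function names and the numbers are made up for illustration, the variants are assumed to be independent, and the standard errors use the common first-order approximation that ignores uncertainty in the variant-exposure effects.

```python
from math import sqrt


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Causal estimate from one variant: outcome effect / exposure effect."""
    estimate = beta_outcome / beta_exposure
    # First-order approximation: treats the exposure effect as known exactly.
    se = se_outcome / abs(beta_exposure)
    return estimate, se


def ivw(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted average of Wald ratios."""
    # Each weight is the inverse variance of that variant's Wald ratio.
    weights = [bx ** 2 / s ** 2 for bx, s in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    se = sqrt(1.0 / sum(weights))
    return estimate, se


# Hypothetical summary statistics for three independent variants.
bx = [0.10, 0.20, 0.15]    # variant effects on the exposure
by = [0.05, 0.10, 0.075]   # variant effects on the outcome
se_y = [0.01, 0.02, 0.015] # standard errors of the outcome effects

single_estimate, single_se = wald_ratio(bx[0], by[0], se_y[0])
ivw_estimate, ivw_se = ivw(bx, by, se_y)
print(f"Wald ratio: {single_estimate:.3f} (SE {single_se:.3f})")
print(f"IVW:        {ivw_estimate:.3f} (SE {ivw_se:.3f})")
```

In this toy example every variant implies the same ratio, so combining them leaves the estimate unchanged and only narrows the standard error. In real data the per-variant ratios differ, and that heterogeneity is one of the signals used to detect violations of the method's assumptions.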


The world’s largest collection of genome-wide association study (GWAS) summary data. The goal of this project is to systematically harmonise complete GWAS summary datasets against the human genome reference panel in order to enable rapid and robust programmatic analysis of thousands of molecular traits, complex traits and diseases.

Main website:

Genetics of DNA methylation

DNA methylation levels vary across tissues and between individuals. The goal of GoDMC is to map the genetic factors that drive that variability. We can use these factors as natural experiments to understand the consequences of perturbing DNA methylation for human trait and disease variation.

Main website:

Genetics of Huntington’s disease

Huntington’s disease is caused by a well-known expansion of repetitive DNA sequence in the HTT gene. However, individuals with the pathogenic allele vary substantially in age of onset and in the symptoms they experience. I work with the CHDI Foundation to develop a better understanding of the complex genetic influences on the progression of Huntington’s disease.

Statistical methods for Covid-19 research

The world is desperate to learn the epidemiological factors that influence the risk of infection by the SARS-CoV-2 virus and the severity of ensuing Covid-19 disease. As a consequence, data are being gathered rapidly and in many different contexts to answer these questions, but the non-random manner in which the data are collected creates problems for causal inference and prediction models. I am leading an effort to understand these biases and to develop tools to counteract them.



  • Ildar Sadreev (Post-doc)
  • Yoonsu Cho (Post-doc)
  • Sam Neaves (Post-doc, data manager)
  • Hannah Wilson (PhD student)
  • Tom Battram (PhD student)
  • Lily Andrews (PhD student)
  • Chris Moreno-Stokoe (PhD student)
  • Stoil Ganev (PhD student)


  • Laurence Howe (PhD student)
  • Charles Laurin (Post-doc)
  • Aidan Ball (MSc student)

Available positions

I am always interested in working with talented individuals in any of the areas of research described above. If you are interested in working with me, please email me with questions about available positions and current projects.