Our research aims to develop computational approaches to gain biological or clinical insights into human disease. Using genetic variation as a foundation, in particular “high impact” variants from large-scale sequencing studies, we hope to shed light on genes, disease processes, and experimental manipulations that could one day lead to improved treatments or patient care.

Rare variant association studies

Our lab grew out of rare coding variant association studies, which began in earnest around 2009 as large-scale sequencing became cost-effective. Rare coding variants provide direct links between disease and a gene – arguably the most important goal of a genetic study – and therefore offer an extremely valuable complement to genome-wide association studies (GWAS), which do not immediately implicate genes in disease.

You can read about our prior work on type 2 diabetes (T2D) rare variant associations on our publications page, or see slides from a presentation on the history of T2D rare variant discovery if you would prefer a digest. We most recently quantified the role of coding variants in the genetic architecture of T2D and established their potential utility in genetic risk prediction.

Lab projects in this area are ideal starting points for prospective lab members with minimal experience in programming or statistics. Many interesting analyses of large-scale sequencing data can help clarify the role that rare coding variation plays in disease and what this implies for future study designs.

Statistical methods for genetic support

Drug targets supported by human genetics are two to eight times as likely to succeed as targets without such support. But what does it mean for human genetics to “support” a gene? We build statistical models that integrate rare variants, common variants, gene annotations, and various types of ’omics data to model the probability that a gene is involved in disease.

We initially advanced the idea of using rare variant associations for genetic “decision support” in our 2019 study, and we most recently outlined simple principles for using common and rare variants to evaluate a quantitative measure of genetic support.
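To make the idea of a probability of gene involvement concrete, here is a minimal sketch of one generic way to combine evidence: start from a prior probability and update it with one Bayes factor per evidence type. This is an illustration only, not the model used in our studies. The function name `posterior_probability`, the assumption that the evidence types are conditionally independent, and all numbers are hypothetical.

```python
import math


def posterior_probability(prior, bayes_factors):
    """Update a prior probability of gene involvement with Bayes factors.

    Hypothetical helper for illustration. It assumes each Bayes factor
    summarizes one evidence type (for example rare variants or common
    variants) and that the evidence types are conditionally independent,
    so their Bayes factors multiply.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    if any(bf <= 0.0 for bf in bayes_factors):
        raise ValueError("Bayes factors must be positive")

    # Work on the log-odds scale so that many factors do not overflow.
    log_odds = math.log(prior / (1.0 - prior))
    log_odds += sum(math.log(bf) for bf in bayes_factors)

    # Convert log-odds back to a probability with a stable logistic transform.
    if log_odds >= 0.0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)


# Made-up example: a 5% prior, a rare variant Bayes factor of 10,
# and a common variant Bayes factor of 2.
support = posterior_probability(0.05, [10.0, 2.0])
print(f"posterior probability of involvement: {support:.3f}")
```

In practice the independence assumption rarely holds exactly, which is one reason more careful models are needed.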

Lab projects in this area are ideal starting points for prospective lab members with strong computational or statistical experience, even if their biological knowledge is limited. We work extensively with probabilistic models and emphasize statistically rigorous and efficient implementations of them.

Disease subtyping, precision medicine, and the merger of rare and common diseases

There is a growing realization that common, complex diseases like diabetes are better described as collections of more genetically and clinically homogeneous “subtypes”. Conversely, we hypothesize that many diseases assumed to be monogenic are actually caused by incompletely penetrant mutations across multiple or even many genes. A long-term aim of the lab is to understand the genetic mechanisms shared by rare and common diseases with similar clinical presentations and to use this insight to better diagnose rare diseases and develop precision medicine approaches for common diseases.

We wrote a review article several years ago outlining this model for type 2 diabetes, and we have a recent study that demonstrates its application to youth-onset T2D. We also collaborate with the Udler Lab on more clinical approaches to this topic.

Lab projects in this area are great for prospective lab members with clinical or biological expertise, particularly if they would like to leverage this to gain a foothold in the world of statistical genetics or precision medicine.

Democratization of genetic and genomic data

Ultimately, genetic association studies and computational methods can only produce insights into biology and disease if experimentalists can make use of their outputs. This is a harder problem than it first appears because the producers of these data (computational biologists and statistical geneticists) have different philosophies and interests from the consumers of these data (biologists).

We build a series of knowledge portals that implement cutting-edge statistical genetics methods at scale and present their results in a way that makes the insights embedded within genetic and genomic data accessible to any biologist. Our work has been described in several publications, including, most recently, one on our flagship portal, the Type 2 Diabetes Knowledge Portal.

Lab projects in this area are great for prospective lab members who are interested in software engineering projects, and also for those with no programming or statistical experience whatsoever – we generate a ton of results that we deposit into these portals, all of which need to be investigated for quality control and potential new biological findings.