The demand for low-cost sequencing has driven the development of high-throughput sequencing technologies that parallelize the sequencing process, producing millions of sequences at once. The first of the “next-generation” sequencing technologies was developed in the 1990s. It was a bead-based method that relied on a complex cycle of adapter ligation followed by adapter decoding, reading the sequence in increments of four nucleotides. The technology was sold to Solexa (later purchased by Illumina), and this later led to the development of sequencing-by-synthesis, which produces hundreds of thousands of short DNA sequences. High-throughput (massively parallel) sequencing generates these short fragments of DNA sequence in very large numbers, which requires novel methods to align small pieces of sequence to a reference genome. To find variants in the genome, sequencing reads are aligned to the reference to determine whether each site is conserved or polymorphic. These tasks demand large data storage, high-speed processing and novel software methods.
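As a rough illustration of the align-then-compare idea described above, the following Python sketch places short reads on a tiny reference and reports sites where the reads disagree with it. Everything in it is an assumption made for illustration: the function names `best_offset` and `call_sites`, the toy sequences, and the ungapped mismatch-counting placement. Production aligners use indexed, gapped alignment and statistical variant callers, so this should be read as a teaching sketch rather than a description of any pipeline used in the CPHG.

```python
from collections import Counter


def best_offset(read, reference):
    """Return (offset, mismatches) for the ungapped placement with the fewest mismatches.

    Hypothetical helper: a brute-force scan, not how real aligners work.
    """
    best = None
    for start in range(len(reference) - len(read) + 1):
        window = reference[start:start + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if best is None or mismatches < best[1]:
            best = (start, mismatches)
    return best


def call_sites(reference, reads, max_mismatches=1):
    """Pile up placed reads and return {position: (reference_base, observed_base)}.

    A site is reported when the most common base among the reads covering it
    differs from the reference base. Reads that cannot be placed within
    max_mismatches are skipped.
    """
    pileup = [Counter() for _ in reference]
    for read in reads:
        placement = best_offset(read, reference)
        if placement is None or placement[1] > max_mismatches:
            continue
        start = placement[0]
        for i, base in enumerate(read):
            pileup[start + i][base] += 1

    variants = {}
    for pos, counts in enumerate(pileup):
        if not counts:
            continue  # no read covers this site
        base, _ = counts.most_common(1)[0]
        if base != reference[pos]:
            variants[pos] = (reference[pos], base)
    return variants


# Toy data (invented): three of the four reads carry an A where the reference has T.
reference = "ACGTACGTTAGC"
reads = ["ACGTAC", "GTACGA", "CGATAG", "GATAGC"]
print(call_sites(reference, reads))
```

In this toy case the three reads that overlap position 7 (0-based) all carry an A where the reference has a T, so that site is reported as polymorphic while every other covered site is reported as conserved.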
Within the CPHG, several projects make use of next-gen sequencing to examine variation in the genome and how it affects complex human phenotypes and risk of disease. This work can be partitioned into three major areas. The first is the study of DNA variation (mutation) as it relates to human disease. Diseases can be divided into rare and common, but most are determined by both genes and environment. Diseases studied in the CPHG include type 1 diabetes, type 2 diabetes, atherosclerosis, cancer and metabolic disorders. The second is population genetic analysis of the sequence data. This includes enumerating the variants in a group of subjects, describing the frequencies and distributions of the variants obtained from sequence analysis, and examining how those variants reflect the evolutionary forces that act on a population. The third is functional genomics: how the action of the variants in genes is translated into protein and phenotype. A major focus of faculty in the CPHG is the development of new analytic tools to mine next-gen sequence data, from variant detection to structural change to the statistical association of multiple rare or infrequent variants in genes with disease risk.
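The enumeration of variant frequencies mentioned above can be sketched in a few lines of Python. The genotype coding (0, 1 or 2 copies of the alternate allele, with `None` for a missing call), the function names and the cohort data are all assumptions chosen for illustration, not a description of the CPHG's own tools.

```python
def allele_frequency(genotypes):
    """Alternate-allele frequency for one variant in diploid subjects.

    genotypes: list of 0, 1 or 2 (copies of the alternate allele),
    with None marking a missing call. Returns None if nothing was called.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    # Each called subject contributes two chromosomes.
    return sum(called) / (2 * len(called))


def summarize(variants):
    """Return per-variant alternate-allele and minor-allele frequencies."""
    summary = {}
    for name, genotypes in variants.items():
        freq = allele_frequency(genotypes)
        if freq is None:
            continue
        summary[name] = {"alt_freq": freq, "maf": min(freq, 1 - freq)}
    return summary


# Invented cohort of six subjects and two variants.
cohort = {
    "variant_a": [0, 1, 2, 0, None, 1],
    "variant_b": [2, 2, 1, 2, 2, 2],
}
for name, stats in summarize(cohort).items():
    print(name, stats)
```

The minor-allele frequency is the usual quantity used to separate common variants from the rare or infrequent ones whose combined association with disease risk is described above.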
Historically, geneticists have dissected the genetics of disease by directly identifying changes in DNA that alter disease-related phenotypes. This approach has successfully identified many genetic variants, and in some cases the underlying genes, that increase an individual’s risk of developing diseases such as obesity, atherosclerosis, diabetes and osteoporosis. However, this approach fails to provide insight into the cellular mechanisms altered by these variants or the larger gene networks within which they function. In recent years, new technologies have paved the way to investigate the genetic basis of other cellular components in a massively parallel fashion and at unprecedented resolution. These advances have enabled the quantification of molecular phenotypes such as gene expression (the “transcriptome”), metabolites (the metabolome) and proteins (the proteome) on a genome-wide scale. These new tools give geneticists the opportunity to uncover networks of interacting genetic variants, transcripts and proteins and to begin to understand how their perturbation leads to disease.
Disease-associated genetic variation exerts its influence by perturbing biological components and their interactions. Systems genetics can be used to identify these perturbations, the genes and networks involved, and the mechanisms through which they lead to disease.
This new area of research is referred to as “Systems Genetics”. It seeks to understand complexity by combining the principles of systems biology and genetics to uncover connections between genotype and complex disease. Importantly, systems genetics attempts to explain the role of genetic variation in cellular function and disease from the perspective of the entire system, not simply from the level of individual genes. The Farber lab in the CPHG is using systems genetics to identify genes and biological processes that affect bone development. This research promises to lead to the identification of novel therapeutic targets that can be used to combat diseases such as osteoporosis.
Type 1 Diabetes (T1D)
Type 1 diabetes (T1D), also known as ‘juvenile diabetes’, results from a self-destructive immune response against the insulin-producing pancreatic β cells. As a result of this autoimmune response, patients lose the ability to secrete insulin and become dependent on insulin replacement therapy. Unfortunately, this form of treatment often cannot reproduce the natural regulation of insulin secretion in response to glucose, so a majority of T1D patients still develop debilitating complications, including blindness, kidney disease, amputation and heart disease.
Genomic approaches capture significant, but previously unrecognized, variability in a complex disease like T1D. The advent of genome-wide association studies (GWAS) has revealed an unexpected level of diversity in the loci that contribute to risk for most complex disorders, i.e., those having substantial contributions to risk from both genetic and environmental factors. T1D, which is a research focus within the CPHG, provides a good example. Our GWAS have identified more than 40 chromosomal regions (loci) that contribute to disease risk. The specific genes within these regions that are affected by causative variants are being fine-mapped and defined, primarily through the development of a custom 200,000-SNP genotyping array (ImmunoChip). The CPHG is currently genotyping over 75,000 samples on the ImmunoChip to identify candidate genes in the T1D loci for targeted sequencing and functional studies. Similar approaches are being used to discover genes contributing to the complications of T1D.
These results create both an opportunity and a challenge for diabetes investigators. For many of the mapped chromosomal regions, the specific risk variants and genes have yet to be identified, creating an opportunity for further insights into disease pathogenesis through detailed genetic analyses. It is also clear that the mapped loci do not account for all of the genetic risk implied by familial aggregation of T1D. Thus, there are new opportunities for discovery of novel risk loci. At the same time, functional studies of T1D and other complex genetic disorders must now take into account variation at the recently mapped risk loci. Interpretation of results obtained in functional studies will require knowledge of the background genetics of cell lines utilized, and comparisons between biospecimens derived from different individuals will require genetic matching.