High demand for low-cost sequencing has driven the development of high-throughput sequencing technologies that parallelize the sequencing process, producing millions of sequences at once. The first of the “next-generation” sequencing technologies was developed in the 1990s. It was a bead-based method that relied on a complex cycle of adapter ligation followed by adapter decoding, reading the sequence in increments of four nucleotides. The technology passed to Solexa (later purchased by Illumina), which went on to develop sequencing-by-synthesis, a method that produces hundreds of thousands of short DNA sequences. High-throughput (massively parallel) sequencing generates these short reads in very large numbers, and novel methods are required to align them to a reference genome. To find variants in the genome, the aligned reads are compared with the reference at each site to determine whether the sequence is conserved or polymorphic. These tasks demand large data storage, high-speed processing and novel software methods.
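The site-by-site comparison of aligned reads with a reference can be illustrated with a minimal sketch. It is a toy example, not the pipeline used by any particular tool: the reads are assumed to be already aligned without gaps, and the function names (`pileup`, `classify_sites`), the thresholds (`min_depth`, `min_alt_fraction`) and the sequences are all hypothetical choices made for illustration.

```python
from collections import Counter

def pileup(reference, aligned_reads):
    """Count the bases observed at each reference position.

    aligned_reads is a list of (start, sequence) pairs, where start is the
    0-based reference position of the read's first base. Alignments are
    assumed to be ungapped (no insertions or deletions).
    """
    counts = [Counter() for _ in reference]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                counts[pos][base] += 1
    return counts

def classify_sites(reference, counts, min_depth=2, min_alt_fraction=0.2):
    """Label each site as conserved, polymorphic, or low coverage.

    The thresholds are illustrative assumptions, not recommended values.
    """
    calls = []
    for pos, (ref_base, site) in enumerate(zip(reference, counts)):
        depth = sum(site.values())
        if depth < min_depth:
            calls.append((pos, ref_base, "low coverage"))
            continue
        # Reads whose base at this site differs from the reference base.
        alt = depth - site[ref_base]
        status = "polymorphic" if alt / depth >= min_alt_fraction else "conserved"
        calls.append((pos, ref_base, status))
    return calls

# Made-up reference and reads, purely for demonstration.
reference = "ACGTACGT"
aligned_reads = [(0, "ACGTA"), (2, "GTTCG"), (3, "TTCGT"), (1, "CGTAC")]
calls = classify_sites(reference, pileup(reference, aligned_reads))
for pos, ref_base, status in calls:
    print(pos, ref_base, status)
```

In this toy data, two of the four reads covering position 4 carry a T where the reference has an A, so that site is labeled polymorphic. Real variant callers also account for base quality, mapping quality, gapped alignments and diploid genotype models, none of which is attempted here.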
Within the CPHG, several projects use next-gen sequencing to examine variation in the genome and how it affects complex human phenotypes and risk of disease. This work falls into three major areas. The first is the study of DNA variation (mutation) as it relates to human disease. Diseases can be divided into rare and common, but most are determined by both genes and environment. Diseases studied in the CPHG include type 1 diabetes, type 2 diabetes, atherosclerosis, cancer and metabolic disorders. The second is population genetic analysis of the sequence data: enumerating the variants in a group of subjects, describing their frequencies and distributions, and asking how those variants reflect the evolutionary forces acting on a population. The third is functional genomics, the study of how the effects of variants in genes are carried through to protein and phenotype. CPHG faculty also place a major focus on developing new analytic tools to mine next-gen sequence data, from variant detection and structural change to the statistical association of multiple rare or infrequent variants in genes with disease risk.
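Two of the analyses mentioned above, estimating variant frequencies in a group of subjects and grouping rare variants within a gene for comparison between cases and controls, can be sketched in a few lines. This is only an illustration under stated assumptions: genotypes are coded as the number of alternate alleles (0, 1 or 2) per subject per site, the data are invented, the function names and the `max_freq` cutoff are hypothetical, and no statistical test is performed.

```python
def allele_frequencies(genotypes):
    """Alternate-allele frequency at each site.

    genotypes is a list of subjects; each subject is a list giving the
    number of alternate alleles (0, 1 or 2) carried at each variant site.
    """
    n_subjects = len(genotypes)
    n_sites = len(genotypes[0])
    return [
        sum(subject[site] for subject in genotypes) / (2 * n_subjects)
        for site in range(n_sites)
    ]

def rare_sites(frequencies, max_freq):
    """Indices of sites whose alternate allele is present but infrequent."""
    return [i for i, f in enumerate(frequencies) if 0 < f <= max_freq]

def count_carriers(genotypes, sites):
    """Number of subjects carrying at least one alternate allele at any listed site."""
    return sum(1 for subject in genotypes if any(subject[i] > 0 for i in sites))

# Invented genotypes for three sites in one hypothetical gene.
cases = [[1, 0, 2], [0, 1, 1], [0, 0, 2]]
controls = [[0, 0, 1], [0, 0, 2], [0, 0, 1]]

freqs = allele_frequencies(cases + controls)
# A cutoff of 0.1 is used only because the toy sample is tiny.
rare = rare_sites(freqs, max_freq=0.1)
case_carriers = count_carriers(cases, rare)
control_carriers = count_carriers(controls, rare)
print(freqs, rare, case_carriers, control_carriers)
```

Collapsing several rare variants into a single carrier indicator per subject is one common way to gain power when each variant is too infrequent to test alone. A real analysis would follow the counts with a formal test and adjust for covariates and population structure.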