For the past 30 years, the Sanger method has been the dominant approach and gold standard for DNA sequencing. The commercial launch of the first massively parallel pyrosequencing platform in 2005 ushered in the new era of high-throughput genomic analysis now referred to as next-generation sequencing (NGS).
This review describes fundamental principles of commercially available NGS platforms. Although the platforms differ in their engineering configurations and sequencing chemistries, they share a technical paradigm in that sequencing of spatially separated, clonally amplified DNA templates or single DNA molecules is performed in a flow cell in a massively parallel manner. Through iterative cycles of polymerase-mediated nucleotide extensions or, in one approach, through successive oligonucleotide ligations, sequence outputs in the range of hundreds of megabases to gigabases are now obtained routinely. Highlighted in this review are the impact of NGS on basic research, bioinformatics considerations, and translation of this technology into clinical diagnostics. Also presented is a view into future technologies, including real-time single-molecule DNA sequencing and nanopore-based sequencing.
In the relatively short time frame since 2005, NGS has fundamentally altered genomics research and allowed investigators to conduct experiments that were previously not technically feasible or affordable. The various technologies that constitute this new paradigm continue to evolve, and further improvements in technology robustness and process streamlining will pave the path for translation into clinical diagnostics.
The higher throughput and lower per-base cost of next-generation sequencing (NGS) as compared to Sanger sequencing have led to its rapid adoption in clinical testing. The number of laboratories offering NGS-based tests has also grown considerably in the past few years, despite the fact that specific Clinical Laboratory Improvement Amendments of 1988/College of American Pathologists (CAP) laboratory standards had not yet been developed to regulate this technology.
To develop a checklist for clinical testing using NGS technology that sets standards for the analytic wet bench process and for bioinformatics or "dry bench" analyses. As NGS-based clinical tests are new to diagnostic testing and are of much greater complexity than traditional Sanger sequencing-based tests, there is an urgent need to develop new regulatory standards for laboratories offering these tests.
To develop the necessary regulatory framework for NGS and to facilitate appropriate adoption of this technology for clinical testing, CAP formed a committee in 2011, the NGS Work Group, to deliberate upon the contents to be included in the checklist.
A total of 18 laboratory accreditation checklist requirements for the analytic wet bench process and bioinformatics analysis processes have been included within CAP's molecular pathology checklist (MOL).
This report describes the important issues considered by the CAP committee during the development of the new checklist requirements, which address documentation, validation, quality assurance, confirmatory testing, exception logs, monitoring of upgrades, variant interpretation and reporting, incidental findings, data storage, version traceability, and data transfer confidentiality.
High-throughput sequencing enables unbiased profiling of microbial communities, universal pathogen detection, and host response to infectious diseases. However, computation times and algorithmic inaccuracies have hindered adoption.
We present Taxonomer, an ultrafast web tool for comprehensive metagenomics data analysis and interactive results visualization. Taxonomer is unique in providing integrated nucleotide and protein-based classification and simultaneous host messenger RNA (mRNA) transcript profiling. Using real-world case studies, we show that Taxonomer detects previously unrecognized infections and reveals antiviral host mRNA expression profiles. To facilitate data sharing across geographic distances in outbreak settings, Taxonomer is publicly available through a web-based user interface.
Taxonomer enables rapid, accurate, and interactive analyses of metagenomics data on personal computers and mobile devices.
Phevor integrates phenotype, gene function, and disease information with personal genomic data for improved power to identify disease-causing alleles. Phevor works by combining knowledge resident in multiple biomedical ontologies with the outputs of variant-prioritization tools. It does so by using an algorithm that propagates information across and between ontologies. This process enables Phevor to accurately reprioritize potentially damaging alleles identified by variant-prioritization tools in light of gene function, disease, and phenotype knowledge. Phevor is especially useful for single-exome and family-trio-based diagnostic analyses, the most commonly occurring clinical scenarios and ones for which existing personal genome diagnostic tools are most inaccurate and underpowered. Here, we present a series of benchmark analyses illustrating Phevor’s performance characteristics. Also presented are three recent Utah Genome Project case studies in which Phevor was used to identify disease-causing alleles. Collectively, these results show that Phevor improves diagnostic accuracy not only for individuals presenting with established disease phenotypes but also for those with previously undescribed and atypical disease presentations. Importantly, Phevor is not limited to known diseases or known disease-causing alleles. As we demonstrate, Phevor can also use latent information in ontologies to discover genes and disease-causing alleles not previously associated with disease.
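The idea of propagating information across ontology terms and then reprioritizing candidates can be illustrated with a toy sketch. This is not Phevor's published algorithm: the graph layout, the `decay` factor, the breadth-first spreading rule, and the multiplication used to combine scores are all assumptions made for illustration, and the names `propagate` and `reprioritize` are hypothetical.

```python
from collections import deque


def propagate(graph, seeds, decay=0.5):
    """Spread a score of 1.0 from seed nodes along edges, shrinking by `decay` per step.

    `graph` maps each node (ontology term or gene) to a list of neighbours.
    A node keeps the highest score that reaches it. Illustrative assumption only.
    """
    scores = {node: 0.0 for node in graph}
    queue = deque()
    for seed in seeds:
        scores[seed] = 1.0
        queue.append(seed)
    while queue:
        node = queue.popleft()
        passed_on = scores[node] * decay
        for neighbour in graph[node]:
            if passed_on > scores[neighbour]:
                scores[neighbour] = passed_on
                queue.append(neighbour)
    return scores


def reprioritize(tool_scores, ontology_scores):
    """Rank genes by tool score multiplied by propagated ontology score (assumed rule)."""
    combined = {g: tool_scores[g] * ontology_scores.get(g, 0.0) for g in tool_scores}
    return sorted(combined, key=combined.get, reverse=True)


# Hypothetical toy graph: a phenotype term linked to geneA, which shares a
# function term with geneB.
graph = {
    "phenotype": ["geneA"],
    "geneA": ["phenotype", "function"],
    "function": ["geneA", "geneB"],
    "geneB": ["function"],
}
ontology_scores = propagate(graph, ["phenotype"])
ranking = reprioritize({"geneA": 0.6, "geneB": 0.9}, ontology_scores)
```

In this toy case geneB has the higher variant-tool score, but geneA ranks first after reprioritization because it sits closer to the seeded phenotype term.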
Common variable immunodeficiency (CVID) is a heterogeneous disorder characterized by antibody deficiency, poor humoral response to antigens, and recurrent infections. To investigate the molecular cause of CVID, we carried out exome sequence analysis of a family diagnosed with CVID and identified a heterozygous frameshift mutation, c.2564delA (p.Lys855Serfs∗7), in NFKB2 affecting the C terminus of NF-κB2 (also known as p100/p52 or p100/p49). Subsequent screening of NFKB2 in 33 unrelated CVID-affected individuals uncovered a second heterozygous nonsense mutation, c.2557C>T (p.Arg853∗), in one simplex case. Affected individuals in both families presented with an unusual combination of childhood-onset hypogammaglobulinemia with recurrent infections, autoimmune features, and adrenal insufficiency. NF-κB2 is the principal protein involved in the noncanonical NF-κB pathway, is evolutionarily conserved, and functions in peripheral lymphoid organ development, B cell development, and antibody production. In addition, Nfkb2 mouse models demonstrate a CVID-like phenotype with hypogammaglobulinemia and poor humoral response to antigens. Immunoblot analysis and immunofluorescence microscopy of transformed B cells from affected individuals show that the NFKB2 mutations affect phosphorylation and proteasomal processing of p100 and, ultimately, p52 nuclear translocation. These findings describe germline mutations in NFKB2 and establish the noncanonical NF-κB signaling pathway as a genetic etiology for this primary immunodeficiency syndrome.
Ikaros family zinc finger 1 (IKZF1) is a haematopoietic transcription factor required for mammalian B-cell development. IKZF1 deficiency also reduces plasmacytoid dendritic cell (pDC) numbers in mice, but its effects on human DC development are unknown. Here we show that heterozygous mutation of IKZF1 in humans decreases pDC numbers and expands conventional DC1 (cDC1). Lenalidomide, a drug that induces proteasomal degradation of IKZF1, also decreases pDC numbers in vivo, and reduces the ratio of pDC/cDC1 differentiated from progenitor cells in vitro in a dose-dependent manner. In addition, non-classical monocytes are reduced by IKZF1 deficiency in vivo. DC and monocytes from patients with IKZF1 deficiency or lenalidomide-treated cultures secrete less IFN-α, TNF and IL-12. These results indicate that human DC development and function are regulated by IKZF1, providing further insights into the consequences of IKZF1 mutation on immune function and the mechanism of immunomodulation by lenalidomide.
DNA melting analysis for genotyping and mutation scanning of PCR products by use of high-resolution instruments with special "saturation" dyes has recently been reported. The comparative performance of other instruments and dyes has not been evaluated.
A 110-bp fragment of the beta-globin gene including the sickle cell anemia locus (A17T) was amplified by PCR in the presence of either the saturating DNA dye, LCGreen Plus, or SYBR Green I. Amplicons of 3 different genotypes (wild-type, heterozygous, and homozygous mutants) were melted on 9 different instruments (ABI 7000 and 7900HT, Bio-Rad iCycler, Cepheid SmartCycler, Corbett Rotor-Gene 3000, Idaho Technology HR-1 and LightScanner, and the Roche LightCycler 1.2 and LightCycler 2.0) at a rate of 0.1 °C/s or as recommended by the manufacturer. The ability of each instrument/dye combination to genotype by melting temperature (Tm) and to scan for heterozygotes by curve shape was evaluated.
Resolution varied greatly among instruments with a 15-fold difference in Tm SD (0.018 to 0.274 °C) and a 19-fold (LCGreen Plus) or 33-fold (SYBR Green I) difference in the signal-to-noise ratio. These factors limit the ability of most instruments to accurately genotype single-nucleotide polymorphisms by amplicon melting. Plate instruments (96-well) showed the greatest variance with spatial differences across the plates. Either SYBR Green I or LCGreen Plus could be used for genotyping by Tm, but only LCGreen Plus was useful for heterozygote scanning. However, LCGreen Plus could not be used on instruments with an argon laser because of spectral mismatch. All instruments compatible with LCGreen Plus were able to detect heterozygotes by altered melting curve shape. However, instruments specifically designed for high-resolution melting displayed the least variation, suggesting better scanning sensitivity and specificity.
Different instruments and dyes vary widely in their ability to genotype homozygous variants and scan for heterozygotes by whole-amplicon melting analysis.
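The quantities behind genotyping by Tm can be sketched in a few lines. The sketch below assumes Tm is taken as the temperature at which fluorescence falls fastest (the peak of -dF/dT), which is a common convention but is not stated in the abstract; the function names and the sample melting curve are hypothetical. The fold difference uses the two Tm SD values reported above.

```python
import statistics


def melting_temperature(temps, fluorescence):
    """Estimate Tm as the midpoint of the interval with the steepest fluorescence drop."""
    best_tm, best_slope = None, float("-inf")
    for i in range(len(temps) - 1):
        # Negative derivative of fluorescence with respect to temperature.
        slope = -(fluorescence[i + 1] - fluorescence[i]) / (temps[i + 1] - temps[i])
        if slope > best_slope:
            best_slope = slope
            best_tm = (temps[i] + temps[i + 1]) / 2
    return best_tm


def tm_sd(replicate_tms):
    """Sample standard deviation of Tm across replicate wells or runs."""
    return statistics.stdev(replicate_tms)


def fold_difference(larger_sd, smaller_sd):
    """Ratio between the least and most precise instrument."""
    return larger_sd / smaller_sd


# Hypothetical coarse melting curve: steepest drop lies between 82 and 83.
tm = melting_temperature([80, 81, 82, 83, 84], [100, 98, 60, 5, 4])

# Reported extremes of Tm SD in degrees C.
fold = fold_difference(0.274, 0.018)
```

With the reported extremes, `fold` is roughly 15.2, consistent with the 15-fold difference quoted in the results.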
High-throughput sequencing of related individuals has become an important tool for studying human disease. However, owing to technical complexity and lack of available tools, most pedigree-based sequencing studies rely on an ad hoc combination of suboptimal analyses. Here we present pedigree-VAAST (pVAAST), a disease-gene identification tool designed for high-throughput sequence data in pedigrees. pVAAST uses a sequence-based model to perform variant and gene-based linkage analysis. Linkage information is then combined with functional prediction and rare variant case-control association information in a unified statistical framework. pVAAST outperformed linkage and rare-variant association tests in simulations and identified disease-causing genes from whole-genome sequence data in three human pedigrees with dominant, recessive and de novo inheritance patterns. The approach is robust to incomplete penetrance and locus heterogeneity and is applicable to a wide variety of genetic traits. pVAAST maintains high power across studies ranging from monogenic, high-penetrance phenotypes in a single pedigree to highly polygenic, common phenotypes involving hundreds of pedigrees.
Additional instruments have become available since instruments for DNA melting analysis of PCR products for genotyping and mutation scanning were compared. We assessed the performance of these new instruments for genotyping and scanning for mutations.
A 110-bp fragment of the beta-globin gene including the sickle cell anemia locus (HBB c.20A>T) was amplified by PCR in the presence of LCGreen Plus or SYBR Green I. Amplicons of 4 different genotypes [wild-type, homozygous, and heterozygous HBB c.20A>T and double-heterozygote HBB c.(9C>T; 20A>T)] were melted on 7 different instruments [Applied Biosystems 7300, Corbett Life Sciences Rotor-Gene 6500HRM, Eppendorf Mastercycler RealPlex4S, Idaho Technology LightScanner (384 well), Roche LightCycler 480 (96 and 384 well), and Stratagene Mx3005p] at a rate of 0.61 °C/s or, when this was not possible, in 0.50 °C steps. We evaluated the ability of each instrument to genotype by melting temperature (Tm) and to scan for heterozygotes by curve shape.
The ability of most instruments to accurately genotype single-base changes by amplicon melting was limited by spatial temperature variation across the plate (SD of Tm = 0.020 to 0.264 °C). Other variables such as data density, signal-to-noise ratio, and melting rate also affected heterozygote scanning.
Different instruments vary widely in their ability to genotype homozygous variants and scan for heterozygotes by whole-amplicon melting analysis. Instruments specifically designed for high-resolution melting, however, displayed the least variation, suggesting better genotyping accuracy and scanning sensitivity and specificity.
Most current proficiency testing challenges for next-generation sequencing assays are methods-based proficiency testing surveys that use DNA from characterized reference samples to test both the wet-bench and bioinformatics/dry-bench aspects of the tests. Methods-based proficiency testing surveys are limited by the number and types of mutations that either are naturally present or can be introduced into a single DNA sample.
To address these limitations by exploring a model of in silico proficiency testing in which sequence data from a single well-characterized specimen are manipulated electronically.
DNA from the College of American Pathologists reference genome was enriched using the Illumina TruSeq and Life Technologies AmpliSeq panels and sequenced on the MiSeq and Ion Torrent platforms, respectively. The resulting data were mutagenized in silico and 26 variants, including single-nucleotide variants, deletions, and dinucleotide substitutions, were added at variant allele fractions (VAFs) from 10% to 50%. Participating clinical laboratories downloaded these files and analyzed them using their clinical bioinformatics pipelines.
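The principle of adding a variant at a chosen VAF can be shown with a deliberately simplified sketch. It is not the survey's mutagenesis procedure: it assumes every read covers the target position at the same offset, ignores base qualities and alignment records, and handles only single-nucleotide substitutions. The function name `spike_snv` and its parameters are hypothetical.

```python
import random


def spike_snv(reads, offset, alt_base, vaf, seed=0):
    """Return a copy of `reads` with `alt_base` written at `offset` in a `vaf` fraction of them.

    Simplifying assumption: all reads cover the variant position at the same offset.
    """
    rng = random.Random(seed)  # fixed seed so the edited file is reproducible
    n_alt = round(len(reads) * vaf)
    chosen = set(rng.sample(range(len(reads)), n_alt))
    return [
        read[:offset] + alt_base + read[offset + 1:] if i in chosen else read
        for i, read in enumerate(reads)
    ]


# 100 identical hypothetical reads; introduce a T>A change at offset 3 at 10% VAF.
reads = ["ACGTACGT"] * 100
mutated = spike_snv(reads, offset=3, alt_base="A", vaf=0.10)
observed_vaf = sum(r[3] == "A" for r in mutated) / len(mutated)
```

Because the number of edited reads is fixed rather than sampled per read, the observed VAF matches the target up to rounding.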
Laboratories using the AmpliSeq/Ion Torrent and/or the TruSeq/MiSeq participated in the 2 surveys. On average, laboratories identified 24.6 of 26 variants (95%) overall and 21.4 of 22 variants (97%) with VAFs greater than 15%. No false-positive calls were reported. The most frequently missed variants were single-nucleotide variants with VAFs less than 15%. Across both challenges, reported VAF concordance was excellent, with less than 1% median absolute difference between the simulated VAF and mean reported VAF.
The results indicate that in silico proficiency testing is a feasible approach for methods-based proficiency testing, and demonstrate that the sensitivity and specificity of current next-generation sequencing bioinformatics across clinical laboratories are high.
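The two summary measures reported above, the fraction of simulated variants detected and the median absolute difference between simulated and mean reported VAF, can be computed as in the following sketch. The variant identifiers and VAF values are invented for illustration, and the function names are hypothetical; only the definitions of the two measures follow the text.

```python
import statistics


def sensitivity(expected, reported):
    """Fraction of expected variants that appear among the reported variants."""
    expected = list(expected)
    found = sum(1 for variant in expected if variant in reported)
    return found / len(expected)


def median_abs_vaf_difference(simulated, mean_reported):
    """Median of |simulated VAF - mean reported VAF| over variants present in both."""
    shared = simulated.keys() & mean_reported.keys()
    return statistics.median(abs(simulated[v] - mean_reported[v]) for v in shared)


# Hypothetical example: three simulated variants, VAFs expressed as fractions.
simulated = {"v1": 0.10, "v2": 0.25, "v3": 0.50}
mean_reported = {"v1": 0.11, "v2": 0.25, "v3": 0.48}

detected = sensitivity(simulated, {"v1", "v2"})
vaf_gap = median_abs_vaf_difference(simulated, mean_reported)
```

For comparison, the survey average of 24.6 of 26 variants corresponds to a detection fraction of about 0.946.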