Copy-number analysis to detect disease-causing losses and gains across the genome is recommended for the evaluation of individuals with neurodevelopmental disorders and/or multiple congenital anomalies, as well as for fetuses with ultrasound abnormalities. In the decade that this analysis has been in widespread clinical use, tremendous strides have been made in understanding the effects of copy-number variants (CNVs) in both affected individuals and the general population. However, continued broad implementation of array and next-generation sequencing-based technologies will expand the types of CNVs encountered in the clinical setting, as well as our understanding of their impact on human health.
To assist clinical laboratories in the classification and reporting of CNVs, irrespective of the technology used to identify them, the American College of Medical Genetics and Genomics has developed the following professional standards in collaboration with the National Institutes of Health (NIH)-funded Clinical Genome Resource (ClinGen) project.
This update introduces a quantitative, evidence-based scoring framework; encourages the implementation of the five-tier classification system widely used in sequence variant classification; and recommends "uncoupling" the evidence-based classification of a variant from its potential implications for a particular individual.
These professional standards will guide the evaluation of constitutional CNVs and encourage consistency and transparency across clinical laboratories.
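As an illustration of how a quantitative scoring framework can feed a five-tier classification, the following minimal sketch sums evidence points and maps the total to a tier. The function name `classify_cnv` and the numeric cutoffs are assumptions for illustration: the cutoffs are not stated in the text above and should be checked against the published standard before any use.

```python
from typing import Iterable


def classify_cnv(evidence_points: Iterable[float]) -> str:
    """Map summed evidence points to one of five classification tiers.

    The cutoffs below are assumed for illustration (0.99, 0.90, -0.90, -0.99);
    they are not taken from the text this sketch accompanies.
    """
    # Round to two decimals so floating-point noise does not shift a tier.
    total = round(sum(evidence_points), 2)
    if total >= 0.99:
        return "Pathogenic"
    if total >= 0.90:
        return "Likely pathogenic"
    if total > -0.90:
        return "Uncertain significance"
    if total > -0.99:
        return "Likely benign"
    return "Benign"


if __name__ == "__main__":
    # Hypothetical evidence points for a single CNV.
    print(classify_cnv([0.45, 0.30, 0.15]))
```

Keeping the classification a pure function of the evidence total reflects the "uncoupling" idea: the tier depends only on the evidence, and any interpretation for a particular individual would be handled separately.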
Genetics researchers and clinical professionals rely on diversity measures such as race, ethnicity, and ancestry (REA) to stratify study participants and patients for a variety of applications in research and precision medicine. However, there are no comprehensive, widely accepted standards or guidelines for collecting and using such data in clinical genetics practice. Two NIH-funded research consortia, the Clinical Genome Resource (ClinGen) and Clinical Sequencing Evidence-generating Research (CSER), have partnered to address this issue and report how REA are currently collected, conceptualized, and used. Surveying clinical genetics professionals and researchers (n = 448), we found heterogeneity in the way REA are perceived, defined, and measured, with variation in the perceived importance of REA in both clinical and research settings. The majority of respondents (>55%) felt that REA are at least somewhat important for clinical variant interpretation, ordering genetic tests, and communicating results to patients. However, there was no consensus on the relevance of REA, including how each of these measures should be used in different scenarios and what information they can convey in the context of human genetics. A lack of common definitions and applications of REA across the precision medicine pipeline may contribute to inconsistencies in data collection, missing or inaccurate classifications, and misleading or inconclusive results. Thus, our findings support the need for standardization and harmonization of REA data collection and use in clinical genetics and precision health research.
Precision oncology relies on accurate discovery and interpretation of genomic variants, enabling individualized diagnosis, prognosis and therapy selection. We found that six prominent somatic cancer variant knowledgebases were highly disparate in content, structure and supporting primary literature, impeding consensus when evaluating variants and their relevance in a clinical setting. We developed a framework for harmonizing variant interpretations to produce a meta-knowledgebase of 12,856 aggregate interpretations. We demonstrated large gains in overlap between resources across variants, diseases and drugs as a result of this harmonization. We subsequently demonstrated improved matching between a patient cohort and harmonized interpretations of potential clinical significance, observing an increase from an average of 33% per individual knowledgebase to 57% in aggregate. Our analyses illuminate the need for open, interoperable sharing of variant interpretation data. We also provide a freely available web interface (search.cancervariants.org) for exploring the harmonized interpretations from these six knowledgebases.
The Clinical Genome Resource (ClinGen) Ancestry and Diversity Working Group highlights the need to develop guidance on race, ethnicity, and ancestry (REA) data collection and use in clinical genomics. We present quantitative and qualitative evidence to characterize: (1) acquisition of REA data via clinical laboratory requisition forms, and (2) information disparity across populations in the Genome Aggregation Database (gnomAD) at clinically relevant sites ascertained from annotations in ClinVar. Our requisition form analysis showed substantial heterogeneity in clinical laboratory ascertainment of REA, as well as marked incongruity among terms used to define REA categories. There was also striking disparity across REA populations in the amount of information available about clinically relevant variants in gnomAD. European ancestral populations constituted the majority of observations (55.8%), allele counts (59.7%), and private alleles (56.1%) in gnomAD at 550 loci with “pathogenic” and “likely pathogenic” expert‐reviewed variants in ClinVar. Our findings highlight the importance of implementing and supporting programs to increase diversity in genome sequencing and clinical genomics, as well as measuring uncertainty around population‐level datasets that are used in variant interpretation. Finally, we suggest the need for a standardized REA data collection framework to be developed through partnerships and collaborations and adopted across clinical genomics.
The Ancestry and Diversity Working Group of the Clinical Genome Resource (ClinGen) presents the results of quantitative and qualitative analyses about race, ethnicity, and ancestry (REA) in clinical genomics. Our findings show great heterogeneity across clinical laboratories in the way race and ethnicity are reported on requisition forms and recommend that standard methods be developed and put into practice through future collaborations. We also demonstrate disparities in the amount of information available for variants at clinically relevant sites across populations.
Characterizing large genomic variants is essential to expanding the research and clinical applications of genome sequencing. While multiple data types and methods are available to detect these structural variants (SVs), they remain less characterized than smaller variants because of SV diversity, complexity, and size. These challenges are exacerbated by the experimental and computational demands of SV analysis. Here, we characterize the SV content of a personal genome with Parliament, a publicly available consensus SV-calling infrastructure that merges multiple data types and SV detection methods.
We demonstrate Parliament's efficacy via integrated analyses of whole-genome array comparative genomic hybridization, short-read next-generation sequencing, long-read (Pacific BioSciences RSII), long-insert (Illumina Nextera), and whole-genome architecture (BioNano Irys) data from the personal genome of a single subject (HS1011). From this genome, Parliament identified 31,007 genomic loci between 100 bp and 1 Mbp that are inconsistent with the hg19 reference assembly. Of these loci, 9,777 are supported as putative SVs by hybrid local assembly, long-read PacBio data, or multi-source heuristics. These SVs span 59 Mbp of the reference genome (1.8%) and include 3,801 events identified only with long-read data. The HS1011 data and complete Parliament infrastructure, including a BAM-to-SV workflow, are available on the cloud-based service DNAnexus.
HS1011 SV analysis reveals the limits and advantages of multiple sequencing technologies, specifically the impact of long-read SV discovery. With the full Parliament infrastructure, the HS1011 data constitute a public resource for novel SV discovery, software calibration, and personal genome structural variation analysis.
Several genes on hereditary breast and ovarian cancer susceptibility test panels have not been systematically examined for strength of association with disease. We employed the Clinical Genome Resource (ClinGen) clinical validity framework to assess the strength of evidence between selected genes and breast or ovarian cancer.
Thirty-one genes offered on cancer panel testing were selected for evaluation. The strength of gene–disease relationship was systematically evaluated and a clinical validity classification of either Definitive, Strong, Moderate, Limited, Refuted, Disputed, or No Reported Evidence was assigned.
Definitive clinical validity classifications were made for 10/31 and 10/32 gene–disease pairs for breast and ovarian cancer, respectively. Two genes had a Moderate classification, whereas 6/31 and 6/32 genes had Limited classifications for breast and ovarian cancer, respectively. Contradictory evidence resulted in Disputed or Refuted assertions for 9/31 genes for breast and 4/32 genes for ovarian cancer. No Reported Evidence of disease association was asserted for 5/31 genes for breast and 11/32 for ovarian cancer.
Evaluation of gene–disease association using the ClinGen clinical validity framework revealed a wide range of classifications. This information should aid laboratories in tailoring appropriate gene panels and assist health-care providers in interpreting results from panel testing.
Many conserved noncoding sequences function as transcriptional enhancers that regulate gene expression. Here, we report that protein-coding DNA also frequently contains enhancers functioning at the transcriptional level. We tested the enhancer activity of 31 protein-coding exons, which we chose based on strong sequence conservation between zebrafish and human, and occurrence in developmental genes, using a Tol2 transposable GFP reporter assay in zebrafish. For each exon we measured GFP expression in hundreds of embryos in 10 anatomies via a novel system that implements the voice-recognition capabilities of a cellular phone. We find that 24/31 (77%) exons drive GFP expression compared to a minimal promoter control, and 14/24 are anatomy-specific (expression in four or fewer anatomies). GFP expression driven by these coding enhancers frequently overlaps the anatomies where the host gene is expressed (60%), suggesting self-regulation. Highly conserved coding sequences and highly conserved noncoding sequences do not significantly differ in enhancer activity (coding: 24/31 vs. noncoding: 105/147) or tissue-specificity (coding: 14/24 vs. noncoding: 50/105). Furthermore, coding and noncoding enhancers display similar levels of the enhancer-related histone modification H3K4me1 (coding: 9/24 vs. noncoding: 34/81). Meanwhile, coding enhancers are over three times as likely to contain an H3K4me1 mark as other exons of the host gene. Our work suggests that developmental transcriptional enhancers do not discriminate between coding and noncoding DNA and reveals widespread dual functions in protein-coding DNA.
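The coding versus noncoding comparisons above are 2x2 count comparisons. The abstract does not say which statistical test was used; the sketch below assumes a two-sided Fisher's exact test, one common choice for such tables, implemented with the standard library only. The function name `fisher_exact_two_sided` is hypothetical, and only the counts come from the text.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Counts from the abstract, arranged as (positive, negative) per group.
# Enhancer activity: coding 24/31, noncoding 105/147.
p_activity = fisher_exact_two_sided(24, 31 - 24, 105, 147 - 105)
# Tissue specificity: coding 14/24, noncoding 50/105.
p_specificity = fisher_exact_two_sided(14, 24 - 14, 50, 105 - 50)

if __name__ == "__main__":
    print(f"enhancer activity p = {p_activity:.3f}")
    print(f"tissue specificity p = {p_specificity:.3f}")
```

Under this assumed test, a p-value above a conventional 0.05 threshold would be consistent with the reported absence of a significant difference; a different test could give somewhat different values.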
In its landmark paper about Standards and Guidelines for the Interpretation of Sequence Variants, the American College of Medical Genetics and Genomics (ACMG) and Association for Molecular Pathology (AMP) did not address how to use tumor data when assessing the pathogenicity of germline variants. The Clinical Genome Resource (ClinGen) established a multidisciplinary working group, the Germline/Somatic Variant Subcommittee (GSVS), with this focus. The GSVS implemented a survey to determine current practices of integrating somatic data when classifying germline variants in cancer predisposition genes. The GSVS then reviewed and analyzed available resources of relevant somatic data, and performed integrative germline variant curation exercises. The committee determined that somatic hotspots could be systematically integrated into moderate evidence of pathogenicity (PM1). Tumor RNA sequencing data showing altered splicing may be considered as strong evidence in support of germline pathogenicity (PVS1), and tumor phenotypic features such as mutational signatures may be considered supporting evidence of pathogenicity (PP4). However, at present, somatic data such as focal loss of heterozygosity and mutations occurring on the alternative allele are not recommended to be systematically integrated; instead, incorporation of this type of data should take place under the advisement of multidisciplinary cancer center tumor‐normal sequencing boards.
CIViC (Clinical Interpretation of Variants in Cancer; civicdb.org) is a crowd-sourced, public domain knowledgebase composed of literature-derived evidence characterizing the clinical utility of cancer variants. As clinical sequencing becomes more prevalent in cancer management, the need for cancer variant interpretation has grown beyond the capability of any single institution. CIViC contains peer-reviewed, published literature curated and expertly moderated into structured data units (Evidence Items) that can be accessed globally and in real time, reducing barriers to clinical variant knowledge sharing. We have extended CIViC's functionality to support emergent variant interpretation guidelines, increase interoperability with other variant resources, and promote widespread dissemination of structured curated data. To support the full breadth of variant interpretation from basic to translational, including integration of somatic and germline variant knowledge and inference of drug response, we have enabled curation of three new Evidence Types (Predisposing, Oncogenic and Functional). The growing CIViC knowledgebase has over 300 contributors and distributes clinically relevant cancer variant data currently representing >3200 variants in >470 genes from >3100 publications.