Whole-exome sequencing can provide insight into the relationship between observed clinical phenotypes and underlying genotypes.
We conducted a retrospective analysis of data from a series of 7374 consecutive unrelated patients who had been referred to a clinical diagnostic laboratory for whole-exome sequencing; our goal was to determine the frequency and clinical characteristics of patients for whom more than one molecular diagnosis was reported. The phenotypic similarity between molecularly diagnosed pairs of diseases was calculated with the use of terms from the Human Phenotype Ontology.
A molecular diagnosis was rendered for 2076 of 7374 patients (28.2%); among these patients, 101 (4.9%) had diagnoses that involved two or more disease loci. We also analyzed parental samples, when available, and found that de novo variants accounted for 67.8% (61 of 90) of pathogenic variants in autosomal dominant disease genes and 51.7% (15 of 29) of pathogenic variants in X-linked disease genes; both variants were de novo in 44.7% (17 of 38) of patients with two monoallelic variants. Causal copy-number variants were found in 12 patients (11.9%) with multiple diagnoses. Phenotypic similarity scores were significantly lower among patients in whom the phenotype resulted from two distinct mendelian disorders that affected different organ systems (50 patients) than among patients with disorders that had overlapping phenotypic features (30 patients) (median score, 0.21 vs. 0.36; P=1.77×10⁻⁷).
In our study, we found multiple molecular diagnoses in 4.9% of cases in which whole-exome sequencing was informative. Our results show that structured clinical ontologies can be used to determine the degree of overlap between two mendelian diseases in the same patient; the diseases can be distinct or overlapping. Distinct disease phenotypes affect different organ systems, whereas overlapping disease phenotypes are more likely to be caused by two genes encoding proteins that interact within the same pathway. (Funded by the National Institutes of Health and the Ting Tsung and Wei Fong Chao Foundation.)
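The abstract above does not say how the phenotypic similarity score was computed from Human Phenotype Ontology terms. As a rough illustration of the general idea only, the sketch below scores two diseases by the Jaccard overlap of their term sets after each term is expanded to include its ancestors. The Jaccard measure is a simple stand-in, not the study's method, and the tiny ontology and its term labels (`ROOT`, `NEURO`, and so on) are hypothetical placeholders rather than real HPO identifiers.

```python
# Toy ontology: child term -> list of parent terms (hypothetical labels, not real HPO IDs).
PARENTS = {
    "NEURO": ["ROOT"],
    "CARDIO": ["ROOT"],
    "SEIZURE": ["NEURO"],
    "ATAXIA": ["NEURO"],
    "ARRHYTHMIA": ["CARDIO"],
}


def ancestors(term, parents):
    """Return the term together with all of its ancestors in the ontology."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def expand(terms, parents):
    """Expand a set of terms to include every ancestor of every term."""
    expanded = set()
    for term in terms:
        expanded |= ancestors(term, parents)
    return expanded


def jaccard_similarity(terms_a, terms_b, parents):
    """Jaccard overlap of two ancestor-expanded term sets (an assumed, simplified score)."""
    a, b = expand(terms_a, parents), expand(terms_b, parents)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Two diseases in the same organ system share more ancestors than two in different systems.
overlapping = jaccard_similarity({"SEIZURE"}, {"ATAXIA"}, PARENTS)
distinct = jaccard_similarity({"SEIZURE"}, {"ARRHYTHMIA"}, PARENTS)
```

With this toy ontology, the pair of neurological phenotypes scores higher than the neurological and cardiac pair, which mirrors the direction of the reported difference between overlapping and distinct disease pairs. Published semantic-similarity measures typically also weight terms by information content, which this sketch omits.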
Discovering the genetic basis of a Mendelian phenotype establishes a causal link between genotype and phenotype, making possible carrier and population screening and direct diagnosis. Such discoveries also contribute to our knowledge of gene function, gene regulation, development, and biological mechanisms that can be used for developing new therapeutics. As of February 2015, 2,937 genes underlying 4,163 Mendelian phenotypes have been discovered, but the genes underlying ∼50% (i.e., 3,152) of all known Mendelian phenotypes are still unknown, and many more Mendelian conditions have yet to be recognized. This is a formidable gap in biomedical knowledge. Accordingly, in December 2011, the NIH established the Centers for Mendelian Genomics (CMGs) to provide the collaborative framework and infrastructure necessary for undertaking large-scale whole-exome sequencing and discovery of the genetic variants responsible for Mendelian phenotypes. In partnership with 529 investigators from 261 institutions in 36 countries, the CMGs assessed 18,863 samples from 8,838 families representing 579 known and 470 novel Mendelian phenotypes as of January 2015. This collaborative effort has identified 956 genes, including 375 not previously associated with human health, that underlie a Mendelian phenotype. These results provide insight into study design and analytical strategies, identify novel mechanisms of disease, and reveal the extensive clinical variability of Mendelian phenotypes. Discovering the gene underlying every Mendelian phenotype will require tackling challenges such as worldwide ascertainment and phenotypic characterization of families affected by Mendelian conditions, improvement in sequencing and analytical techniques, and pervasive sharing of phenotypic and genomic data among researchers, clinicians, and families.
Identifying genes and variants contributing to rare disease phenotypes and Mendelian conditions informs biology and medicine, yet potential phenotypic consequences for variation of >75% of the ~20,000 annotated genes in the human genome are lacking. Technical advances to assess rare variation genome-wide, particularly exome sequencing (ES), enabled establishment in the United States of the National Institutes of Health (NIH)-supported Centers for Mendelian Genomics (CMGs) and have facilitated collaborative studies resulting in novel "disease gene" discoveries. Pedigree-based genomic studies and rare variant analyses in families with suspected Mendelian conditions have led to the elucidation of hundreds of novel disease genes and highlighted the impact of de novo mutational events, somatic variation underlying nononcologic traits, incompletely penetrant alleles, phenotypes with high locus heterogeneity, and multilocus pathogenic variation. Herein, we highlight CMG collaborative discoveries that have contributed to understanding both rare and common diseases and discuss opportunities for future discovery in single-locus Mendelian disorder genomics. Phenotypic annotation of all human genes; development of bioinformatic tools and analytic methods; exploration of non-Mendelian modes of inheritance including reduced penetrance, multilocus variation, and oligogenic inheritance; construction of allelic series at a locus; enhanced data sharing worldwide; and integration with clinical genomics are explored. Realizing the full contribution of rare disease research to functional annotation of the human genome, and further illuminating human biology and health, will lay the foundation for the Precision Medicine Initiative.
Abstract
Context
Primary ovarian insufficiency (POI) encompasses a spectrum of premature menopause, including both primary and secondary amenorrhea. For 75% to 90% of individuals with hypergonadotropic hypogonadism presenting as POI, the molecular etiology is unknown. Common etiologies include chromosomal abnormalities, environmental factors, and congenital disorders affecting ovarian development and function, as well as syndromic and nonsyndromic single gene disorders, suggesting POI represents a complex trait.
Objective
To characterize the contribution of known disease genes to POI and identify molecular etiologies and biological underpinnings of POI.
Design, Setting, and Participants
We applied exome sequencing (ES) and family-based genomics to 42 affected female individuals from 36 unrelated Turkish families, including 31 with reported parental consanguinity.
Results
This analysis identified likely damaging, potentially contributing variants and molecular diagnoses in 16 families (44%), including 11 families with likely damaging variants in known genes and five families with predicted deleterious variants in disease genes (IGSF10, MND1, MRPS22, and SOHLH1) not previously associated with POI. Of the 16 families, 2 (13%) had evidence for potentially pathogenic variants at more than one locus. Absence of heterozygosity consistent with identity-by-descent mediated recessive disease burden contributes to molecular diagnosis in 15 of 16 (94%) families. GeneMatcher allowed identification of additional families from diverse genetic backgrounds.
Conclusions
ES analysis of a POI cohort further characterized locus heterogeneity, reaffirmed the association of genes integral to meiotic recombination, demonstrated the likely contribution of genes involved in hypothalamic development, and documented multilocus pathogenic variation suggesting the potential for oligogenic inheritance contributing to the development of POI.
Exome sequencing was used to investigate genes and mutational mechanisms contributing to primary ovarian insufficiency, and to gain insights into disease biology and underlying pathophysiology.
Genetic ataxias are associated with mutations in hundreds of genes with high phenotypic overlap complicating the clinical diagnosis. Whole‐exome sequencing (WES) has increased the overall diagnostic rate considerably. However, the upper limit of this method remains ill‐defined, hindering efforts to address the remaining diagnostic gap. To further assess the role of rare coding variation in ataxic disorders, we reanalyzed our previously published exome cohort of 76 predominantly adult and sporadic‐onset patients, expanded the total number of cases to 260, and introduced analyses for copy number variation and repeat expansion in a representative subset. For new cases (n = 184), our resulting clinically relevant detection rate remained stable at 47% with 24% classified as pathogenic. Reanalysis of the previously sequenced 76 patients modestly improved the pathogenic rate by 7%. For the combined cohort (n = 260), the total observed clinical detection rate was 52% with 25% classified as pathogenic. Published studies of similar neurological phenotypes report comparable rates. This consistency across multiple cohorts suggests that, despite continued technical and analytical advancements, an approximately 50% diagnostic rate marks a relative ceiling for current WES‐based methods and a more comprehensive genome‐wide assessment is needed to identify the missing causative genetic etiologies for cerebellar ataxia and related neurodegenerative diseases.
Genetic ataxias are associated with mutations in hundreds of genes with high phenotypic overlap complicating the clinical diagnosis. To assess rare coding variation in ataxic disorders, we performed whole‐exome sequencing of patients (n = 260) including analysis for copy number variation and repeat expansion in a representative subset. We suggest that, despite continued technical and analytical advancements, an approximately 50% diagnostic rate marks a relative ceiling for current WES‐based methods and a more comprehensive genome‐wide assessment is needed to further improve diagnosis.
Multilocus variation (pathogenic variants in two or more disease genes) can potentially explain the underlying genetic basis for apparent phenotypic expansion in cases for which the observed clinical features extend beyond those reported in association with a “known” disease gene.
Analyses focused on 106 patients, including 19 for whom apparent phenotypic expansion was previously attributed to variation at known disease genes. We performed a retrospective computational reanalysis of whole-exome sequencing data using stringent Variant Call File filtering criteria to determine whether molecular diagnoses involving additional disease loci might explain the observed expanded phenotypes.
Multilocus variation was identified in 31.6% (6/19) of families with phenotypic expansion and 2.3% (2/87) without phenotypic expansion. Intrafamilial clinical variability within two families was explained by multilocus variation identified in the more severely affected sibling.
Our findings underscore the role of multiple rare variants at different loci in the etiology of genetically and clinically heterogeneous cohorts. Intrafamilial phenotypic and genotypic variability allowed a dissection of genotype–phenotype relationships in two families. Our data emphasize the critical role of the clinician in diagnostic genomic analyses and demonstrate that apparent phenotypic expansion may represent blended phenotypes resulting from pathogenic variation at more than one locus.
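The abstract above reports the two proportions (6 of 19 with phenotypic expansion vs. 2 of 87 without) but no statistical test. As an illustrative check only, the sketch below recomputes the percentages and a one-sided Fisher exact p-value for the corresponding 2×2 table. The choice of test is an assumption made here for illustration, not something the authors describe, and the function name is hypothetical.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value, P(X >= a), for the 2x2 table [[a, b], [c, d]].

    X counts the "positive" column within the first row under the
    hypergeometric null with all margins held fixed.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    total = comb(n, row1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1))
    return tail / total


# Counts taken from the abstract: multilocus variation present / absent.
with_expansion = (6, 19 - 6)        # 6 of 19 families with phenotypic expansion
without_expansion = (2, 87 - 2)     # 2 of 87 families without phenotypic expansion

pct_with = 100 * with_expansion[0] / sum(with_expansion)
pct_without = 100 * without_expansion[0] / sum(without_expansion)
p_value = fisher_one_sided(*with_expansion, *without_expansion)

print(f"{pct_with:.1f}% vs {pct_without:.1f}%, one-sided Fisher p = {p_value:.2e}")
```

The percentages reproduce the reported 31.6% and 2.3%. The p-value is included only to show how the contrast could be quantified and should not be read as a result of the original study.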
The goal of this study was to assess the scale of low-level parental mosaicism in exome sequencing (ES) databases.
We analyzed approximately 2000 family trio ES data sets from the Baylor-Hopkins Center for Mendelian Genomics (BHCMG) and Baylor Genetics (BG). Among apparent de novo single-nucleotide variants identified in the affected probands, we selected rare unique variants with variant allele fraction (VAF) between 30% and 70% in the probands and lower than 10% in one of the parents.
Of 102 candidate mosaic variants validated using amplicon-based next-generation sequencing, droplet digital polymerase chain reaction, or blocker displacement amplification, 27 (26.4%) were confirmed to be low- (VAF between 1% and 10%) or very low (VAF <1%) level mosaic. Detection precision in parental samples with two or more alternate reads was 63.6% (BHCMG) and 43.6% (BG). In nine investigated individuals, we observed variability of mosaic ratios among blood, saliva, fibroblast, buccal, hair, and urine samples.
Our computational pipeline enables robust discrimination between true and false positive candidate mosaic variants and efficient detection of low-level mosaicism in ES samples. We confirm that the presence of two or more alternate reads in the parental sample is a reliable predictor of low-level parental somatic mosaicism.
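The selection criteria described above can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `Candidate` record, the function names, and the inclusive handling of the 30%, 70%, and 10% boundaries are all hypothetical, and the rarity and uniqueness filters mentioned in the abstract are not modeled. The requirement of at least two alternate reads in the parent follows the predictor named in the abstract and is exposed as a parameter.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One apparent de novo SNV in a trio (hypothetical record layout)."""
    variant: str
    proband_vaf: float        # variant allele fraction in the proband, 0..1
    parent_alt_reads: int     # alternate-allele reads in one parent
    parent_total_reads: int   # total reads at the site in that parent


def parent_vaf(c):
    """Parental VAF; zero coverage is treated as VAF 0."""
    return c.parent_alt_reads / c.parent_total_reads if c.parent_total_reads else 0.0


def is_candidate_mosaic(c, min_parent_alt_reads=2):
    """Apply the VAF windows from the abstract (boundary handling is an assumption)."""
    return (
        0.30 <= c.proband_vaf <= 0.70
        and c.parent_alt_reads >= min_parent_alt_reads
        and parent_vaf(c) < 0.10
    )


def mosaic_level(vaf):
    """Classify a confirmed parental VAF as 'very low' (<1%) or 'low' (1% to 10%)."""
    if vaf < 0.01:
        return "very low"
    if vaf <= 0.10:
        return "low"
    return "not low-level"


# Made-up example records, for illustration only.
examples = [
    Candidate("var1", 0.48, 3, 120),   # parent VAF 2.5%: candidate
    Candidate("var2", 0.51, 0, 150),   # no parental alt reads: not a candidate
    Candidate("var3", 0.22, 4, 100),   # proband VAF outside the 30-70% window
]
selected = [c.variant for c in examples if is_candidate_mosaic(c)]
```

In practice, candidates passing such a filter would still need orthogonal validation, as the abstract describes with amplicon-based sequencing, droplet digital PCR, or blocker displacement amplification.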
Here we describe MyGene2, Geno2MP, VariantMatcher, and Franklin, databases that provide variant‐level information and phenotypic features to researchers, clinicians, healthcare providers and patients. Following in the footsteps of the Matchmaker Exchange project, which connects exome, genome, and phenotype databases at the gene level, these databases share the goal of connecting to one another using Data Connect, a standard for discovery and search of biomedical data from the Global Alliance for Genomics and Health (GA4GH).
Abstract
Alzheimer’s disease (AD) is one of the most challenging neurodegenerative diseases because of its complicated and progressive mechanisms, and multiple risk factors. Increasing research evidence demonstrates that genetics may be a key factor responsible for the occurrence of the disease. Although previous reports identified quite a few AD-associated genes, they were mostly limited owing to patient sample size and selection bias. There is a lack of comprehensive research aimed at identifying AD-associated risk mutations systematically. To address this challenge, we construct a large-scale AD mutation and co-mutation framework (‘AD-Syn-Net’), and propose deep learning models named Deep-SMCI and Deep-CMCI, configured with fully connected layers, that are capable of predicting cognitive impairment of subjects effectively based on genetic mutation and co-mutation profiles. Next, we apply the customized frameworks to data sets to evaluate the importance scores of the mutations and identify mutation effectors and co-mutation combination vulnerabilities contributing to cognitive impairment. Furthermore, we evaluate the influence of mutation pairs on the network architecture to dissect the genetic organization of AD and identify novel co-mutations that could be responsible for dementia, laying a solid foundation for proposing future targeted therapy for AD precision medicine. Our deep learning model code is openly available at https://github.com/Pan-Bio/AD-mutation-effectors.
Kinesin proteins are critical for various cellular functions such as intracellular transport and cell division, and many members of the family have been linked to monogenic disorders and cancer. We report eight individuals with intellectual disability and microcephaly from four unrelated families with parental consanguinity. In the affected individuals of each family, homozygosity for a likely pathogenic variant in KIF14 was detected: two loss-of-function variants (p.Asn83Ilefs*3 and p.Ser1478fs) and two missense substitutions (p.Ser841Phe and p.Gly459Arg). KIF14 is a mitotic motor protein that is required for spindle localization of the mitotic citron rho-interacting kinase, CIT, also mutated in microcephaly. Our results demonstrate the involvement of KIF14 in development and reveal a wide phenotypic variability ranging from fetal lethality to moderate developmental delay and microcephaly.