Advances in high-throughput DNA sequencing have revolutionized the discovery of variants in the human genome; however, interpreting the phenotypic effects of those variants is still a challenge. While several computational approaches to predict variant impact are available, their accuracy is limited and further improvement is needed. Here, we introduce ClinPred, an efficient tool for identifying disease-relevant nonsynonymous variants. Our predictor incorporates two machine learning algorithms that use existing pathogenicity scores and, notably, benefits from inclusion of normal population allele frequency from the gnomAD database as an input feature. Another major strength of our approach is the use of ClinVar, a rapidly growing database that allows selection of confidently annotated disease-causing variants, as a training set. Compared to other methods, ClinPred showed superior accuracy for predicting pathogenicity, achieving the highest area under the curve (AUC) score and increasing both the specificity and sensitivity in different test datasets. It also obtained the best performance according to various other metrics. Moreover, ClinPred performance remained robust with respect to disease type (cancer or rare disease) and mechanism (gain or loss of function). Importantly, we observed that adding allele frequency as a predictive feature, as opposed to setting fixed allele frequency cutoffs, boosts the performance of prediction. We provide pre-computed ClinPred scores for all possible human missense variants in the exome to facilitate its use by the community.
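The contrast between allele frequency as a predictive feature and a fixed allele frequency cutoff can be illustrated with a minimal sketch. This is not ClinPred itself: the abstract does not name its two algorithms, so a small hand-written logistic regression is swapped in here, and the toy variants, the `rarity` rescaling, and the 0.001 cutoff are all assumptions made for illustration only.

```python
import math

# Invented toy variants: (existing pathogenicity score, population allele
# frequency, label). Label 1 = pathogenic, 0 = benign. Not real ClinVar/gnomAD data.
TOY_VARIANTS = [
    (0.9, 0.0, 1), (0.8, 0.00001, 1), (0.6, 0.0, 1), (0.7, 0.0001, 1),
    (0.7, 0.05, 0), (0.3, 0.2, 0), (0.5, 0.01, 0), (0.2, 0.001, 0),
]


def features(score, allele_freq):
    """Return [score, rarity]; rarity rescales -log10(AF) to roughly 0..1 (assumed scaling)."""
    rarity = -math.log10(allele_freq + 1e-6) / 6.0
    return [score, rarity]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(variants, steps=5000, lr=0.5):
    """Fit a two-feature logistic regression by batch gradient descent."""
    weights, bias = [0.0, 0.0], 0.0
    n = len(variants)
    for _ in range(steps):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for score, allele_freq, label in variants:
            x = features(score, allele_freq)
            err = sigmoid(weights[0] * x[0] + weights[1] * x[1] + bias) - label
            grad_w[0] += err * x[0]
            grad_w[1] += err * x[1]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(model, score, allele_freq):
    """Predicted probability of pathogenicity for one variant."""
    weights, bias = model
    x = features(score, allele_freq)
    return sigmoid(weights[0] * x[0] + weights[1] * x[1] + bias)


def passes_fixed_cutoff(allele_freq, cutoff=0.001):
    """Hard filter: keep a variant only if it is rarer than the cutoff."""
    return allele_freq < cutoff


model = train(TOY_VARIANTS)
p_rare = predict(model, 0.7, 0.0)      # same score, never observed in the population
p_common = predict(model, 0.7, 0.05)   # same score, common in the population
print(f"rare: {p_rare:.3f}  common: {p_common:.3f}")
```

With a fixed cutoff, a variant at frequency 0.0011 is discarded outright however damaging its score looks, while one at 0.0009 is kept with no penalty. The learned model instead lowers the predicted probability gradually as frequency rises and weighs it against the pathogenicity score.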
Genomic technologies, such as targeted, exome and short-read genome sequencing approaches, have revolutionized the care of patients with rare genetic diseases. However, more than half of patients remain without a diagnosis. Emerging approaches from research-based settings such as long-read genome sequencing and optical genome mapping hold promise for improving the identification of disease-causal genetic variants. In addition, new omic technologies that measure the transcriptome, epigenome, proteome or metabolome are showing great potential for variant interpretation. As genetic testing options rapidly expand, the clinical community needs to be mindful of their individual strengths and limitations, as well as remaining challenges, to select the appropriate diagnostic test, correctly interpret results and drive innovation to address insufficiencies. If used effectively, through truly integrative multi-omics approaches and data sharing, the resulting large quantities of data from these established and emerging technologies will greatly improve the interpretative power of genetic and genomic diagnostics for rare diseases.
Accurate diagnosis is the cornerstone of medicine; it is essential for informed care and promoting patient and family well-being. However, families with a rare genetic disease (RGD) often spend more than five years on a diagnostic odyssey of specialist visits and invasive testing that is lengthy, costly, and often futile, as 50% of patients do not receive a molecular diagnosis. The current diagnostic paradigm is not well designed for RGDs, especially for patients who remain undiagnosed after the initial set of investigations, and thus requires an expansion of approaches in the clinic. Leveraging opportunities to participate in research programs that utilize new technologies to understand RGDs is an important path forward for patients seeking a diagnosis. Given recent advancements in such technologies and international initiatives, the prospect of identifying a molecular diagnosis for all patients with RGDs has never been so attainable, but achieving this goal will require global cooperation at an unprecedented scale.
Pediatric developmental syndromes present with systemic, complex, and often overlapping clinical features that are not infrequently a consequence of Mendelian inheritance of mutations in genes involved in DNA methylation, establishment of histone modifications, and chromatin remodeling (the “epigenetic machinery”). The mechanistic cross-talk between histone modification and DNA methylation suggests that these syndromes might be expected to display specific DNA methylation signatures that are a reflection of those primary errors associated with chromatin dysregulation. Given the interrelated functions of these chromatin regulatory proteins, we sought to identify DNA methylation epi-signatures that could provide syndrome-specific biomarkers to complement standard clinical diagnostics. In the present study, we examined peripheral blood samples from a large cohort of individuals encompassing 14 Mendelian disorders displaying mutations in the genes encoding proteins of the epigenetic machinery. We demonstrated that specific but partially overlapping DNA methylation signatures are associated with many of these conditions. The degree of overlap among these epi-signatures is minimal, further suggesting that, consistent with the initial event, the downstream changes are unique to every syndrome. In addition, by combining these epi-signatures, we have demonstrated that a machine learning tool can be built to concurrently screen for multiple syndromes with high sensitivity and specificity, and we highlight the utility of this tool in solving ambiguous case subjects presenting with variants of unknown significance, along with its ability to generate accurate predictions for subjects presenting with the overlapping clinical and molecular features associated with the disruption of the epigenetic machinery.
Manganese (Mn) and zinc (Zn) are essential divalent cations used by cells as protein cofactors; various human studies and animal models have demonstrated the importance of Mn and Zn for development. Here we describe an autosomal-recessive disorder in six individuals from the Hutterite community and in an unrelated Egyptian sibpair; the disorder is characterized by intellectual disability, developmental delay, hypotonia, strabismus, cerebellar atrophy, and variable short stature. Exome sequencing in one affected Hutterite individual and the Egyptian family identified the same homozygous variant, c.112G>C (p.Gly38Arg), affecting a conserved residue of SLC39A8. The affected Hutterite and Egyptian individuals did not share an extended common haplotype, suggesting that the mutation arose independently. SLC39A8 is a member of the solute carrier gene family known to import Mn, Zn, and other divalent cations across the plasma membrane. Evaluation of these two metal ions in the affected individuals revealed variably low levels of Mn and Zn in blood and elevated levels in urine, indicating renal wasting. Our findings identify a human Mn and Zn transporter deficiency syndrome linked to SLC39A8, providing insight into the roles of Mn and Zn homeostasis in human health and development.
Receptor tyrosine kinases (RTKs) are a family of ligand-binding cell surface receptors that regulate a wide range of essential cellular activities, including proliferation, differentiation, cell-cycle progression, survival and apoptosis. As such, these proteins play an important role during development and throughout life; germline mutations in genes encoding RTKs cause several developmental syndromes, while somatic alterations contribute to the pathogenesis of many aggressive cancers. This creates an interesting paradigm in which mutation timing, type and location in a gene lead to different cell signaling and biological responses, and ultimately phenotypic outcomes. In this review, we highlight the roles of RTKs in developmental disorders and cancer. The multifaceted roles of these receptors, their genetic signatures and their signaling during developmental morphogenesis and oncogenesis are discussed. Additionally, we propose that comparative analysis of RTK mutations responsible for developmental syndromes may shed light on those driving tumorigenesis.
ATRX is a chromatin remodeling protein involved in deposition of the histone variant H3.3 at telomeres and pericentromeric heterochromatin. It also influences the expression level of specific genes; however, deposition of H3.3 at transcribed genes is currently thought to occur independently of ATRX. We focused on a set of genes, including the autism susceptibility gene Neuroligin 4 (Nlgn4), that exhibit decreased expression in ATRX-null cells to investigate the mechanisms used by ATRX to promote gene transcription. Overall TERRA levels, as well as DNA methylation and histone modifications at ATRX target genes, are not altered and thus cannot explain transcriptional dysregulation. We found that ATRX does not associate with the promoter of these genes, but rather binds within regions of the gene body corresponding to high H3.3 occupancy. These intragenic regions consist of guanine-rich DNA sequences predicted to form non-B DNA structures called G-quadruplexes during transcriptional elongation. We demonstrate that ATRX deficiency corresponds to reduced H3.3 incorporation and stalling of RNA polymerase II at these G-rich intragenic sites. These findings suggest that ATRX promotes the incorporation of histone H3.3 at particular transcribed genes and facilitates transcriptional elongation through G-rich sequences. The inability to transcribe genes such as Nlgn4 could cause deficits in neuronal connectivity and cognition associated with ATRX mutations in humans.
Primary defects in motile cilia result in dysfunction of the apparatus responsible for generating fluid flows. Defects in these mechanisms underlie disorders characterized by poor mucus clearance, resulting in susceptibility to chronic recurrent respiratory infections, often associated with infertility; laterality defects occur in about 50% of such individuals. Here we report biallelic variants in LRRC56 (known as oda8 in Chlamydomonas) identified in three unrelated families. The phenotype comprises laterality defects and chronic pulmonary infections. High-speed video microscopy of cultured epithelial cells from an affected individual showed severely dyskinetic cilia but no obvious ultra-structural abnormalities on routine transmission electron microscopy (TEM). Further investigation revealed that LRRC56 interacts with the intraflagellar transport (IFT) protein IFT88. The link with IFT was interrogated in Trypanosoma brucei. In this protist, LRRC56 is recruited to the cilium during axoneme construction, where it co-localizes with IFT trains and is required for the addition of dynein arms to the distal end of the flagellum. In T. brucei carrying LRRC56-null mutations, or a variant resulting in the p.Leu259Pro substitution corresponding to the p.Leu140Pro variant seen in one of the affected families, we observed abnormal ciliary beat patterns and an absence of outer dynein arms restricted to the distal portion of the axoneme. Together, our findings confirm that deleterious variants in LRRC56 result in a human disease and suggest that this protein has a likely role in dynein transport during cilia assembly that is evolutionarily important for cilia motility.
The past decade has witnessed a rapid evolution in rare disease (RD) research, fueled by the availability of genome-wide (exome and genome) sequencing. In 2011, as this transformative technology was introduced to the research community, the Care4Rare Canada Consortium was launched: initially as FORGE, followed by Care4Rare, and Care4Rare SOLVE. Over what amounted to three eras of diagnosis and discovery, the Care4Rare Consortium used exome sequencing and, more recently, genome and other 'omic technologies to identify the molecular cause of unsolved RDs. We achieved a diagnostic yield of 34% (623/1,806 of participating families), including the discovery of deleterious variants in 121 genes not previously associated with disease, and we continue to study candidate variants in novel genes for 145 families. The Consortium has made significant contributions to RD research, including development of platforms for data collection and sharing and instigating a Canadian network to catalyze functional characterization research of novel genes. The Consortium was instrumental to implementing genome-wide sequencing as a publicly funded test for RD diagnosis in Canada. Despite the successes of the past decade, the challenge of solving all RDs remains enormous, and the work is far from over. We must leverage clinical and 'omic data for secondary use, develop tools and policies to support safe data sharing, continue to explore the utility of new and emerging technologies, and optimize research protocols to delineate complex disease mechanisms. Successful approaches in each of these realms are required to offer diagnostic clarity to all families with RDs.
After a decade of collaborative network science in Canada and three eras of RD gene discovery, the Care4Rare Canada Consortium recognized it was time to reflect on the lessons learned from our successes to best meet the challenges of RD diagnosis and discovery in the decade to come.
Matchmaking has emerged as a useful strategy for building evidence toward causality of novel disease genes in patients with undiagnosed rare diseases. The Matchmaker Exchange (MME) is a collaborative initiative that facilitates international data sharing for matchmaking purposes; however, data on user experience are limited.
Patients enrolled as part of the Finding of Rare Disease Genes in Canada (FORGE) and Care4Rare Canada research programs had their exome sequencing data reanalyzed by a multidisciplinary research team over a 2-year period. Compelling variants in genes not previously associated with a human phenotype were submitted through the MME node PhenomeCentral, and outcomes were collected.
In this study, 194 novel candidate genes were submitted to the MME, resulting in 1514 matches, and 15% of the genes submitted resulted in collaborations. Most submissions resulted in at least 1 match, and most matches were with GeneMatcher (82%), where additional email exchange was required to evaluate the match because of the lack of phenotypic or inheritance information.
Matchmaking through the MME is an effective way to investigate novel candidate genes; however, it is a labor-intensive process. Engagement from the community to contribute phenotypic, genotypic, and inheritance data will ensure that matchmaking continues to be a useful approach in the future.
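To give a sense of scale for the matchmaking workload, the reported figures (194 genes, 1514 matches, 15% leading to collaborations, 82% of matches from GeneMatcher) can be turned into approximate counts. The sketch below only rounds the published percentages; the derived counts are estimates, not numbers reported in the study, and the variable names are illustrative.

```python
# Figures as stated in the abstract above.
genes_submitted = 194
total_matches = 1514
collaboration_rate = 0.15    # share of submitted genes that led to collaborations
genematcher_share = 0.82     # share of matches that came from GeneMatcher

# Derived, rounded estimates (assumption: the percentages are exact enough to round).
genes_with_collaboration = round(genes_submitted * collaboration_rate)
genematcher_matches = round(total_matches * genematcher_share)
mean_matches_per_gene = round(total_matches / genes_submitted, 1)

print(f"~{genes_with_collaboration} genes led to collaborations")
print(f"~{genematcher_matches} matches came from GeneMatcher")
print(f"~{mean_matches_per_gene} matches per submitted gene on average")
```

On these estimates, each submitted gene produced several matches to evaluate, most of them needing follow-up email, which is consistent with the description of the process as labor-intensive.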