Natural language computer applications are becoming increasingly sophisticated and, with the recent release of Generative Pre-trained Transformer 3 (GPT-3), they could be deployed in healthcare-related contexts that have historically comprised human-to-human interaction. However, for GPT-3 and similar applications to be considered for use in health-related contexts, possibilities and pitfalls need thoughtful exploration. In this article, we briefly introduce some opportunities and cautions that would accompany advanced Natural Language Processing applications deployed in eHealth.
Identifying pathogenic variants and underlying functional alterations is challenging. To this end, we introduce MutPred2, a tool that improves the prioritization of pathogenic amino acid substitutions over existing methods, generates molecular mechanisms potentially causative of disease, and returns interpretable pathogenicity score distributions on individual genomes. Whilst its prioritization performance is state-of-the-art, a distinguishing feature of MutPred2 is the probabilistic modeling of variant impact on specific aspects of protein structure and function that can serve to guide experimental studies of phenotype-altering variants. We demonstrate the utility of MutPred2 in the identification of the structural and functional mutational signatures relevant to Mendelian disorders and the prioritization of de novo mutations associated with complex neurodevelopmental disorders. We then experimentally validate the functional impact of several variants identified in patients with such disorders. We argue that mechanism-driven studies of human inherited disease have the potential to significantly accelerate the discovery of clinically actionable variants.
Understanding the role genes and genetic variants play in clinical treatment response continues to be an active area of research with the goal of common clinical use. This goal has developed into today’s industry of pharmacogenomics, where new drug-gene relationships are discovered and further characterized, published and then curated into national and international resources for use by researchers and clinicians. These efforts have given us insight into what a pharmacogenomic variant is, and how it differs from human disease variants and common polymorphisms. While publications continue to reveal pharmacogenomic relationships between genes and specific classes of drugs, many challenges remain toward the goal of widespread clinical use. First, the clinical guidelines for pharmacogenomic testing are still in their infancy. Second, sequencing technologies are changing rapidly, making it somewhat unclear what genetic data will be available to the clinician at the time of care. Finally, what data to return to a patient, and when, is an area under constant debate. New innovations such as PheWAS approaches and whole genome sequencing (WGS) studies are enabling a tsunami of new findings. This review covers pharmacogenomic variants, pharmacogenomic resources, clinical interpretation guidelines, challenges such as WGS approaches, and the impact of pharmacogenomics on drug development and regulatory approval.
Recommendations from the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG/AMP) for interpreting sequence variants specify the use of computational predictors as “supporting” level of evidence for pathogenicity or benignity using criteria PP3 and BP4, respectively. However, score intervals defined by tool developers, and ACMG/AMP recommendations that require the consensus of multiple predictors, lack quantitative support. Previously, we described a probabilistic framework that quantified the strengths of evidence (supporting, moderate, strong, very strong) within ACMG/AMP recommendations. We have extended this framework to computational predictors and introduce a new standard that converts a tool’s scores to PP3 and BP4 evidence strengths. Our approach is based on estimating the local positive predictive value and can calibrate any computational tool or other continuous-scale evidence on any variant type. We estimate thresholds (score intervals) corresponding to each strength of evidence for pathogenicity and benignity for thirteen missense variant interpretation tools, using carefully assembled independent data sets. Most tools achieved supporting evidence level for both pathogenic and benign classification using newly established thresholds. Multiple tools reached score thresholds justifying moderate and several reached strong evidence levels. One tool reached very strong evidence level for benign classification on some variants. Based on these findings, we provide recommendations for evidence-based revisions of the PP3 and BP4 ACMG/AMP criteria using individual tools and future assessment of computational methods for clinical interpretation.
We developed an approach to calibrate computational predictors to the American College of Medical Genetics and Genomics and Association for Molecular Pathology guidelines for clinical variant classification. We observed that predictors can provide much stronger evidence for variant pathogenicity/benignity than previously thought and propose updated recommendations for their clinical use.
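The calibration idea described above (estimating a local positive predictive value around a score and mapping it to an evidence strength) can be illustrated with a minimal sketch. This is not the authors' implementation. The window-based estimator, the function names, the prior of 0.10, and the "very strong" odds of 350 are all assumptions chosen for illustration; the published calibration may use different values and a more careful estimator.

```python
from bisect import bisect_left, bisect_right

STRENGTH_EXPONENTS = {  # assumed scaling: each level is a root of the very-strong odds
    "supporting": 1 / 8,
    "moderate": 1 / 4,
    "strong": 1 / 2,
    "very_strong": 1.0,
}


def posterior_from_odds(odds, prior):
    """Convert odds of pathogenicity into a posterior probability for a given prior."""
    return odds * prior / ((odds - 1.0) * prior + 1.0)


def strength_thresholds(odds_very_strong=350.0, prior=0.10):
    """Posterior cut-offs per evidence strength (both defaults are illustrative assumptions)."""
    return {
        name: posterior_from_odds(odds_very_strong ** exponent, prior)
        for name, exponent in STRENGTH_EXPONENTS.items()
    }


def local_posterior(score, pathogenic_scores, benign_scores, prior, half_width):
    """Prior-corrected fraction of pathogenic variants scoring within +/- half_width.

    Both score lists must be sorted in ascending order.
    Returns None when no labelled variant falls inside the window.
    """
    def count_in_window(sorted_scores):
        lo = bisect_left(sorted_scores, score - half_width)
        hi = bisect_right(sorted_scores, score + half_width)
        return hi - lo

    n_path = count_in_window(pathogenic_scores)
    n_benign = count_in_window(benign_scores)
    if n_path + n_benign == 0:
        return None
    # Reweight by class size so the data set's class ratio does not act as the prior.
    weight_path = prior * n_path / len(pathogenic_scores)
    weight_benign = (1.0 - prior) * n_benign / len(benign_scores)
    return weight_path / (weight_path + weight_benign)


def evidence_strength(posterior, thresholds):
    """Return the strongest evidence level whose cut-off the posterior meets."""
    best = "none"
    for name in ("supporting", "moderate", "strong", "very_strong"):
        if posterior >= thresholds[name]:
            best = name
    return best


# Toy, made-up scores purely to exercise the functions.
toy_pathogenic = [0.80, 0.90, 0.95]
toy_benign = [0.10, 0.20, 0.85]
cutoffs = strength_thresholds()
p = local_posterior(0.90, toy_pathogenic, toy_benign, prior=0.10, half_width=0.12)
print(p, evidence_strength(p, cutoffs))
```

Scanning such a local estimate across a tool's score range gives, for each evidence strength, the score interval in which the estimate stays above the corresponding cut-off. The benign direction would be handled symmetrically with the labels swapped.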
Differentiation between phenotypically neutral and disease-causing genetic variation remains an open and relevant problem. Among different types of variation, non-frameshifting insertions and deletions (indels) represent an understudied group with widespread phenotypic consequences. To address this challenge, we present a machine learning method, MutPred-Indel, that predicts pathogenicity and identifies types of functional residues impacted by non-frameshifting insertion/deletion variation. The model shows good predictive performance as well as the ability to identify impacted structural and functional residues including secondary structure, intrinsic disorder, metal and macromolecular binding, post-translational modifications, allosteric sites, and catalytic residues. We identify structural and functional mechanisms impacted preferentially by germline variation from the Human Gene Mutation Database, recurrent somatic variation from COSMIC in the context of different cancers, as well as de novo variants from families with autism spectrum disorder. Further, the distributions of pathogenicity prediction scores generated by MutPred-Indel are shown to differentiate highly recurrent from non-recurrent somatic variation. Collectively, we present a framework to facilitate the interrogation of both pathogenicity and the functional effects of non-frameshifting insertion/deletion variants. The MutPred-Indel webserver is available at http://mutpred.mutdb.org/.
The steady advances in machine learning and accumulation of biomedical data have contributed to the development of numerous computational models that assess the impact of missense variants. Different methods, however, operationalize impact differently. Two common tasks in this context are the prediction of the pathogenicity of variants and the prediction of their effects on a protein's function. These are related but distinct problems, and it is unclear whether methods developed for one are optimized for the other. The Critical Assessment of Genome Interpretation (CAGI) experiment provides a means to address this question empirically. To this end, we participated in various protein‐specific challenges in CAGI with two objectives in mind. First, to compare the performance of methods in the MutPred family with the state‐of‐the‐art. Second and more importantly, to investigate the applicability of general‐purpose pathogenicity predictors to the classification of specific function‐altering variants without additional training or calibration. We find that our pathogenicity predictors performed competitively with other methods, outputting score distributions in agreement with experimental outcomes. Overall, we conclude that binary classifiers learned from disease‐causing mutations are capable of modeling important aspects of the underlying biology and the alteration of protein function resulting from mutations.
By participating in the Critical Assessment of Genome Interpretation, we demonstrate the direct applicability of missense variant pathogenicity predictors to the task of predicting real‐valued impact on biochemical, molecular and cellular function, as measured in in vitro experiments. Our work suggests that when a large number of structural and functional features are integrated into a learning algorithm that outputs smooth score distributions, pathogenicity predictors can model the biology shared by both of these prediction tasks.
Preterm infants often spend a significant amount of time in the neonatal intensive care unit (NICU) where they are exposed to many stressors including pain and reduced maternal care. These early-life stressful experiences can have negative consequences on brain maturation during the neonatal period; however, less is known about the long-term cognitive and affective outcomes. Thus, this study was conducted to investigate the impact of neonatal pain and reduced maternal care on adult behavior and HPA axis reactivity in an animal model. Male and female rats underwent a series of repetitive needle pokes and/or reduced maternal care (through a novel tea ball infuser encapsulation method) during the first 4 days of life and were then assessed in a battery of behavioral tests as adults. We found that early-life pain enhanced spatial learning independent of the animal's sex, but altered HPA recovery from an acute stressor in females only. Moreover, reduced maternal care altered long-term spatial memory and reversal learning in males. These findings indicate that neonatal stressors have unique sex-dependent long-term biobehavioral effects in rodents. Continued examination of the behavioral consequences of these stressors may help explain varying vulnerability and resiliency in preterm infants who experienced early stress in the NICU.
Motivation: Advances in high-throughput genotyping and next generation sequencing have generated a vast amount of human genetic variation data. Single nucleotide substitutions within protein coding regions are of particular importance owing to their potential to give rise to amino acid substitutions that affect protein structure and function which may ultimately lead to a disease state. Over the last decade, a number of computational methods have been developed to predict whether such amino acid substitutions result in an altered phenotype. Although these methods are useful in practice, and accurate for their intended purpose, they are not well suited for providing probabilistic estimates of the underlying disease mechanism. Results: We have developed a new computational model, MutPred, that is based upon protein sequence, and which models changes of structural features and functional sites between wild-type and mutant sequences. These changes, expressed as probabilities of gain or loss of structure and function, can provide insight into the specific molecular mechanism responsible for the disease state. MutPred also builds on the established SIFT method but offers improved classification accuracy with respect to human disease mutations. Given conservative thresholds on the predicted disruption of molecular function, we propose that MutPred can generate accurate and reliable hypotheses on the molecular basis of disease for ∼11% of known inherited disease-causing mutations. We also note that the proportion of changes of functionally relevant residues in the sets of cancer-associated somatic mutations is higher than for the inherited lesions in the Human Gene Mutation Database which are instead predicted to be characterized by disruptions of protein structure. Availability: http://mutdb.org/mutpred Contact: predrag@indiana.edu; smooney@buckinstitute.org
Large-scale proteomic approaches have identified numerous mitochondrial acetylated proteins; however, in most cases, their regulation by acetyltransferases and deacetylases remains unclear. Sirtuin 3 (SIRT3) is an NAD⁺-dependent mitochondrial protein deacetylase that has been shown to regulate a limited number of enzymes in key metabolic pathways. Here, we use a rigorous label-free quantitative MS approach (called MS1 Filtering) to analyze changes in lysine acetylation from mouse liver mitochondria in the absence of SIRT3. Among 483 proteins, a total of 2,187 unique sites of lysine acetylation were identified after affinity enrichment. MS1 Filtering revealed that lysine acetylation of 283 sites in 136 proteins was significantly increased in the absence of SIRT3 (at least twofold). A subset of these sites was independently validated using selected reaction monitoring MS. These data show that SIRT3 regulates acetylation on multiple proteins, often at multiple sites, across several metabolic pathways including fatty acid oxidation, ketogenesis, amino acid catabolism, and the urea and tricarboxylic acid cycles, as well as mitochondrial regulatory proteins. The widespread modification of key metabolic pathways greatly expands the number of known substrates and sites that are targeted by SIRT3 and establishes SIRT3 as a global regulator of mitochondrial protein acetylation with the capability of coordinating cellular responses to nutrient status and energy homeostasis.
Public health newborn screening (NBS) programs provide population-scale ascertainment of rare, treatable conditions that require urgent intervention. Tandem mass spectrometry (MS/MS) is currently used to screen newborns for a panel of rare inborn errors of metabolism (IEMs). The NBSeq project evaluated whole-exome sequencing (WES) as an innovative methodology for NBS. We obtained archived residual dried blood spots and data for nearly all IEM cases from the 4.5 million infants born in California between mid-2005 and 2013 and from some infants who screened positive by MS/MS, but were unaffected upon follow-up testing. WES had an overall sensitivity of 88% and specificity of 98.4%, compared to 99.0% and 99.8%, respectively, for MS/MS, although effectiveness varied among individual IEMs. Thus, WES alone was insufficiently sensitive or specific to be a primary screen for most NBS IEMs. However, as a secondary test for infants with abnormal MS/MS screens, WES could reduce false-positive results, facilitate timely case resolution and in some instances even suggest a more appropriate or specific diagnosis than that initially obtained. This study represents the largest sequencing effort to date of an entire population of IEM-affected cases, allowing unbiased assessment of current capabilities of WES as a tool for population screening.
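To make the sensitivity and specificity figures above concrete, the following sketch converts them into expected screening outcomes for a cohort. The sensitivity and specificity values come from the text. The prevalence of 1 in 2,000, the cohort size of 100,000, and the function name are hypothetical, chosen only to show the arithmetic.

```python
def screen_outcomes(sensitivity, specificity, prevalence, n_screened):
    """Expected counts for a screening test applied to n_screened infants."""
    affected = prevalence * n_screened
    unaffected = n_screened - affected
    true_positives = sensitivity * affected
    missed_cases = affected - true_positives              # false negatives
    false_positives = (1.0 - specificity) * unaffected
    flagged = true_positives + false_positives
    ppv = true_positives / flagged if flagged > 0 else float("nan")
    return {
        "true_positives": true_positives,
        "missed_cases": missed_cases,
        "false_positives": false_positives,
        "ppv": ppv,
    }


ASSUMED_PREVALENCE = 1 / 2000   # hypothetical, not a figure from the study
COHORT_SIZE = 100_000           # hypothetical cohort

# Sensitivity and specificity as reported above for each method.
wes = screen_outcomes(0.88, 0.984, ASSUMED_PREVALENCE, COHORT_SIZE)
msms = screen_outcomes(0.990, 0.998, ASSUMED_PREVALENCE, COHORT_SIZE)

for name, result in (("WES", wes), ("MS/MS", msms)):
    print(name, {key: round(value, 3) for key, value in result.items()})
```

Because the conditions are rare, even a small loss of specificity multiplies the number of false positives, which is consistent with the conclusion that WES alone is not suitable as a primary screen.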