IMPORTANCE: Population screening for medically relevant genomic variants that cause diseases such as hereditary cancer and cardiovascular disorders is increasing to facilitate early disease detection or prevention. Neuropsychiatric disorders (NPDs) are common, complex disorders with clear genetic causes; yet, access to genetic diagnosis is limited. We explored whether inclusion of NPD in population-based genomic screening programs is warranted by assessing 3 key factors: prevalence, penetrance, and personal utility. OBJECTIVE: To evaluate the suitability of including pathogenic copy number variants (CNVs) associated with NPD in population screening by determining their prevalence and penetrance and exploring the personal utility of disclosing results. DESIGN, SETTING, AND PARTICIPANTS: In this cohort study, the frequency of 31 NPD CNVs was determined in patient-participants via exome data. Associated clinical phenotypes were assessed using linked electronic health records. Nine CNVs were selected for disclosure by licensed genetic counselors, and participants’ psychosocial reactions were evaluated using a mixed-methods approach. A primarily adult population receiving medical care at Geisinger, a large integrated health care system in the United States with the only population-based genomic screening program approved for medically relevant results disclosure, was included. The cohort was identified from the Geisinger MyCode Community Health Initiative. Exome and linked electronic health record data were available for this cohort, which was recruited from February 2007 to April 2017. Data were collected for the qualitative analysis April 2017 through February 2018. Analysis began February 2018 and ended December 2019.
MAIN OUTCOMES AND MEASURES: The planned outcomes of this study include (1) prevalence estimate of NPD-associated CNVs in an unselected health care system population; (2) penetrance estimate of NPD diagnoses in CNV-positive individuals; and (3) qualitative themes that describe participants’ responses to receiving NPD-associated genomic results. RESULTS: Of 90 595 participants with CNV data, a pathogenic CNV was identified in 708 (0.8%; 436 women [61.6%]; mean [SD] age, 50.04 [18.74] years). Seventy percent (n = 494) had at least 1 associated clinical symptom. Of these, 28.8% (204) of CNV-positive individuals had an NPD code in their electronic health record, compared with 13.3% (11 835 of 89 887) of CNV-negative individuals (odds ratio, 2.21; 95% CI, 1.86-2.61; P < .001); 66.4% (470) of CNV-positive individuals had a history of depression and anxiety compared with 54.6% (49 118 of 89 887) of CNV-negative individuals (odds ratio, 1.53; 95% CI, 1.31-1.80; P < .001). 16p13.11 (71 [0.078%]) and 22q11.2 (108 [0.119%]) were the most prevalent deletions and duplications, respectively. Only 5.8% of individuals (41 of 708) had a previously known genetic diagnosis. Results disclosure was completed for 141 individuals. Positive participant responses included poignant reactions to learning a medical reason for lifelong cognitive and psychiatric disabilities. CONCLUSIONS AND RELEVANCE: This study informs critical factors central to the development of population-based genomic screening programs and supports the inclusion of NPD in future designs to promote equitable access to clinically useful genomic information.
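The prevalence figures in the RESULTS above are simple proportions of the cohort with CNV data. As a minimal sketch (the function and variable names are illustrative, not from the study), they can be recomputed from the reported counts:

```python
def prevalence_pct(carriers: int, cohort: int) -> float:
    """Return carriers as a percentage of the cohort."""
    return 100.0 * carriers / cohort


COHORT = 90_595  # participants with CNV data, as reported in the abstract

# Carrier counts as reported in the abstract.
counts = {
    "any pathogenic CNV": 708,
    "16p13.11 deletion": 71,
    "22q11.2 duplication": 108,
}

prevalence = {name: prevalence_pct(n, COHORT) for name, n in counts.items()}

for name, pct in prevalence.items():
    print(f"{name}: {pct:.3f}%")
```

Rounded to the precision used in the abstract, these give 0.8%, 0.078%, and 0.119%. The odds ratios are not recomputed here because the abstract does not state whether they were adjusted for covariates.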
Several algorithms exist for detecting copy number variants (CNVs) from human exome sequencing read depth, but previous tools have not been well suited for large population studies on the order of tens or hundreds of thousands of exomes. Their limitations include being difficult to integrate into automated variant-calling pipelines and being ill-suited for detecting common variants. To address these issues, we developed a new algorithm--Copy number estimation using Lattice-Aligned Mixture Models (CLAMMS)--which is highly scalable and suitable for detecting CNVs across the whole allele frequency spectrum.
In this note, we summarize the methods and intended use-case of CLAMMS, compare it to previous algorithms and briefly describe results of validation experiments. We evaluate the adherence of CNV calls from CLAMMS and four other algorithms to Mendelian inheritance patterns on a pedigree; we compare calls from CLAMMS and other algorithms to calls from SNP genotyping arrays for a set of 3164 samples; and we use TaqMan quantitative polymerase chain reaction to validate CNVs predicted by CLAMMS at 39 loci (95% of rare variants validate; across 19 common variant loci, the mean precision and recall are 99% and 94%, respectively). In the Supplementary Materials (available at the CLAMMS Github repository), we present our methods and validation results in greater detail.
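One common way to measure adherence to Mendelian inheritance is to ask what fraction of a child's CNV calls overlap a call of the same type in either parent. The sketch below illustrates that idea only. It is not the CLAMMS evaluation procedure, and the `Cnv` type, the 50% overlap threshold, and the half-open coordinate convention are all assumptions made for illustration.

```python
from typing import List, NamedTuple


class Cnv(NamedTuple):
    chrom: str
    start: int  # half-open interval [start, end), an assumed convention
    end: int
    kind: str   # "DEL" or "DUP"


def overlaps(child: Cnv, parent: Cnv, min_frac: float = 0.5) -> bool:
    """True if the parent call covers at least min_frac of the child call."""
    if child.chrom != parent.chrom or child.kind != parent.kind:
        return False
    shared = min(child.end, parent.end) - max(child.start, parent.start)
    return shared > 0 and shared / (child.end - child.start) >= min_frac


def inherited_fraction(child: List[Cnv], mother: List[Cnv], father: List[Cnv]) -> float:
    """Fraction of the child's calls that are supported by either parent."""
    if not child:
        return 0.0
    parents = mother + father
    hits = sum(any(overlaps(c, p) for p in parents) for c in child)
    return hits / len(child)


# Hypothetical calls for one trio.
child_calls = [Cnv("chr1", 1000, 2000, "DEL"), Cnv("chr2", 5000, 6000, "DUP")]
mother_calls = [Cnv("chr1", 900, 2100, "DEL")]
father_calls = [Cnv("chr2", 5000, 6000, "DEL")]  # same region, different type

trio_fraction = inherited_fraction(child_calls, mother_calls, father_calls)
```

Under this metric a true de novo CNV counts as unsupported, so a low fraction suggests false positives in the child or false negatives in the parents rather than proving either.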
https://github.com/rgcgithub/clamms (implemented in C).
jeffrey.reid@regeneron.com
Supplementary data are available at Bioinformatics online.
MicroRNAs play a vital role in the regulation of gene expression and have been identified in every animal with a sequenced genome examined thus far, except for the placozoan Trichoplax. The genomic repertoires of metazoan microRNAs have become increasingly endorsed as phylogenetic characters and drivers of biological complexity.
In this study, we report the first investigation of microRNAs in a species from the phylum Ctenophora. We use short RNA sequencing and the assembled genome of the lobate ctenophore Mnemiopsis leidyi to show that this species appears to lack any recognizable microRNAs, as well as the nuclear proteins Drosha and Pasha, which are critical to canonical microRNA biogenesis. This finding represents the first reported case of a metazoan lacking a Drosha protein.
Recent phylogenomic analyses suggest that Mnemiopsis may be the earliest branching metazoan lineage. If this is true, then the origins of canonical microRNA biogenesis and microRNA-mediated gene regulation may postdate the last common metazoan ancestor. Alternatively, canonical microRNA functionality may have been lost independently in the lineages leading to both Mnemiopsis and the placozoan Trichoplax, suggesting that microRNA functionality was not critical until much later in metazoan evolution.
Large-scale human genetics studies are ascertaining increasing proportions of populations as they continue growing in both number and scale. As a result, the amount of cryptic relatedness within these study cohorts is growing rapidly and has significant implications on downstream analyses. We demonstrate this growth empirically among the first 92,455 exomes from the DiscovEHR cohort and, via a custom simulation framework we developed called SimProgeny, show that these measures are in line with expectations given the underlying population and ascertainment approach. For example, within DiscovEHR we identified ∼66,000 close (first- and second-degree) relationships, involving 55.6% of study participants. Our simulation results project that >70% of the cohort will be involved in these close relationships, given that DiscovEHR scales to 250,000 recruited individuals. We reconstructed 12,574 pedigrees by using these relationships (including 2,192 nuclear families) and leveraged them for multiple applications. The pedigrees substantially improved the phasing accuracy of 20,947 rare, deleterious compound heterozygous mutations. Reconstructed nuclear families were critical for identifying 3,415 de novo mutations in ∼1,783 genes. Finally, we demonstrate the segregation of known and suspected disease-causing mutations, including a tandem duplication that occurs in LDLR and causes familial hypercholesterolemia, through reconstructed pedigrees. In summary, this work highlights the prevalence of cryptic relatedness expected among large healthcare population-genomic studies and demonstrates several analyses that are uniquely enabled by large amounts of cryptic relatedness.
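A first step toward turning pairwise relationships into pedigrees is to group related individuals into connected family networks. The sketch below does this with a union-find structure. It is an illustration under assumed inputs (the participant IDs are hypothetical) and not the pedigree reconstruction method used in the study, which also has to assign specific roles within each family.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def family_networks(pairs: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Group individuals linked by close relationships into connected networks."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two networks

    groups: Dict[str, Set[str]] = defaultdict(set)
    for person in list(parent):
        groups[find(person)].add(person)
    return list(groups.values())


# Hypothetical first- and second-degree relationship pairs.
pairs = [("P1", "P2"), ("P2", "P3"), ("P4", "P5")]
networks = family_networks(pairs)

cohort_size = 8  # hypothetical cohort, including unrelated participants
involved = sum(len(n) for n in networks) / cohort_size
```

Here `involved` plays the role of the "55.6% of study participants" statistic: the share of the cohort that appears in at least one close relationship.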
MicroRNAs (miRNAs) regulate gene expression by binding to partially complementary sequences on target mRNA transcripts, thereby causing their degradation, deadenylation, or inhibiting their translation. Genomic variants can alter miRNA regulation by modifying miRNA target sites, and multiple human disease phenotypes have been linked to such miRNA target site variants (miR-TSVs). However, systematic genome-wide identification of functional miR-TSVs is difficult due to high false positive rates; functional miRNA recognition sequences can be as short as six nucleotides, with the human genome encoding thousands of miRNAs. Furthermore, while large-scale clinical genomic data sets are becoming increasingly commonplace, existing miR-TSV prediction methods are not designed to analyze these data. Here, we present an open-source tool called SubmiRine that is designed to perform efficient miR-TSV prediction systematically on variants identified in novel clinical genomic data sets. Most importantly, SubmiRine allows for the prioritization of predicted miR-TSVs according to their relative probability of being functional. We present the results of SubmiRine using integrated clinical genomic data from a large-scale cohort study on chronic obstructive pulmonary disease (COPD), making a number of high-scoring, novel miR-TSV predictions. We also demonstrate SubmiRine's ability to predict and prioritize known miR-TSVs that have undergone experimental validation in previous studies.
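To make the six-nucleotide recognition sequence concrete, the sketch below checks whether a variant creates or disrupts a seed match, taking the seed as miRNA nucleotides 2 to 7. This is a deliberately naive illustration and not SubmiRine's algorithm or scoring. The function names and the short 3' UTR fragments are hypothetical, and the let-7a sequence is included only as a familiar example.

```python
# Complement table for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna: str) -> str:
    """DNA-alphabet 6-mer on the mRNA that pairs with miRNA nucleotides 2-7."""
    seed = mirna.upper().replace("T", "U")[1:7]
    # The target site is the reverse complement of the seed.
    return seed.translate(COMPLEMENT)[::-1].replace("U", "T")


def site_effect(mirna: str, ref_utr: str, alt_utr: str) -> str:
    """Classify a variant by whether any seed match is present before and after it."""
    site = seed_site(mirna)
    in_ref = site in ref_utr.upper()
    in_alt = site in alt_utr.upper()
    if in_ref and not in_alt:
        return "disrupted"
    if in_alt and not in_ref:
        return "created"
    return "unchanged"


LET7A = "UGAGGUAGUAGGUUGUAUAGUU"  # mature let-7a sequence, used as an example

# Hypothetical 3' UTR fragments differing by a single C>G substitution.
ref_utr = "AAGCTACCTCAGTT"
alt_utr = "AAGCTACGTCAGTT"

effect = site_effect(LET7A, ref_utr, alt_utr)
```

Because a 6-mer occurs by chance roughly once every few thousand bases, a presence/absence test like this flags many non-functional sites, which is the false-positive problem that motivates prioritizing predictions by their probability of being functional.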
Recurrent pathogenic copy number variants (pCNVs) have large-effect impacts on brain function and represent important etiologies of neurodevelopmental psychiatric disorders (NPDs), including autism and schizophrenia. Patterns of health care utilization in adults with pCNVs have gone largely unstudied and are likely to differ in significant ways from those of children.
We compared the prevalence of NPDs and electronic health record–based medical conditions in 928 adults with 26 pCNVs to a demographically-matched cohort of pCNV-negative controls from >135,000 patient-participants in Geisinger’s MyCode Community Health Initiative. We also evaluated 3 quantitative health care utilization measures (outpatient, inpatient, and emergency department visits) in both groups.
Adults with pCNVs (24.9%) were more likely than controls (16.0%) to have a documented NPD. They had significantly higher rates of several chronic diseases, including diabetes (29.3% in participants with pCNVs vs 20.4% in participants without pCNVs) and dementia (2.2% in participants with pCNVs vs 1.0% in participants without pCNVs), and twice as many annual emergency department visits.
These findings highlight the potential for genetic information—specifically, pCNVs—to inform the study of health care outcomes and utilization in adults. If, as our findings suggest, adults with pCNVs have poorer health and require disproportionate health care resources, early genetic diagnosis paired with patient-centered interventions may help to anticipate problems, improve outcomes, and reduce the associated economic burden.
The recent expansion of whole-genome sequence data available from diverse animal lineages provides an opportunity to investigate the evolutionary origins of specific classes of human disease genes. Previous studies have observed that human disease genes are of particularly ancient origin. While this suggests that many animal species have the potential to serve as feasible models for research on genes responsible for human disease, it is unclear whether this pattern has meaningful implications and whether it prevails for every class of human disease.
We used a comparative genomics approach encompassing a broad phylogenetic range of animals with sequenced genomes to determine the evolutionary patterns exhibited by human genes associated with different classes of disease. Our results support previous claims that most human disease genes are of ancient origin but, more importantly, we also demonstrate that several specific disease classes have a significantly large proportion of genes that emerged relatively recently within the metazoans and/or vertebrates. An independent assessment of the synonymous to non-synonymous substitution rates of human disease genes found in mammals reveals that disease classes that arose more recently also display unexpected rates of purifying selection between their mammalian and human counterparts.
Our results reveal the heterogeneity underlying the evolutionary origins of (and selective pressures on) different classes of human disease genes. For example, some disease gene classes appear to be of uncommonly recent (i.e., vertebrate-specific) origin and, as a whole, have been evolving at a faster rate within mammals than the majority of disease classes having more ancient origins. The novel patterns that we have identified may provide new insight into cases where studies using traditional animal models were unable to produce results that translated to humans. Conversely, we note that the larger set of disease classes do have ancient origins, suggesting that many non-traditional animal models have the potential to be useful for studying many human disease genes. Taken together, these findings emphasize why model organism selection should be done on a disease-by-disease basis, with evolutionary profiles in mind.
Type 2 diabetes has been reproducibly clustered into five subtypes with different disease progression and risk of complications; however, etiological differences are unknown. We used genome-wide association and genetic risk score (GRS) analysis to compare the underlying genetic drivers. Individuals from the Swedish ANDIS (All New Diabetics In Scania) study were compared to individuals without diabetes; the Finnish DIREVA (Diabetes register in Vasa) and Botnia studies were used for replication. We show that subtypes differ with regard to family history of diabetes and association with GRS for diabetes-related traits. The severe insulin-resistant subtype was uniquely associated with GRS for fasting insulin but not with variants in the TCF7L2 locus or GRS reflecting insulin secretion. Further, an SNP (rs10824307) near LRMDA was uniquely associated with mild obesity-related diabetes. Therefore, we conclude that the subtypes have partially distinct genetic backgrounds indicating etiological differences.
Cardiometabolic diseases are the leading cause of death worldwide. Despite a known genetic component, our understanding of these diseases remains incomplete. Here, we analyzed the contribution of rare variants to 57 diseases and 26 cardiometabolic traits, using data from 200,337 UK Biobank participants with whole-exome sequencing. We identified 57 gene-based associations, with broad replication of novel signals in Geisinger MyCode. There was a striking risk associated with mutations in known Mendelian disease genes, including MYBPC3, LDLR, GCK, PKD1 and TTN. Many genes showed independent convergence of rare and common variant evidence, including an association between GIGYF1 and type 2 diabetes. We identified several large effect associations for height and 18 unique genes associated with blood lipid or glucose levels. Finally, we found that between 1.0% and 2.4% of participants carried rare potentially pathogenic variants for cardiometabolic disorders. These findings may facilitate studies aimed at therapeutics and screening of these common disorders.
Prediction of disease risk is a key component of precision medicine. Common traits such as psychiatric disorders have a complex polygenic architecture, making the identification of a single risk predictor difficult. Polygenic risk scores (PRSs) denoting the sum of an individual’s genetic liability for a disorder are a promising biomarker for psychiatric disorders, but they require evaluation in a clinical setting.
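In its simplest form, a PRS is a weighted sum of risk-allele dosages, with per-variant weights taken from a prior association study. The sketch below shows only that arithmetic. The variant IDs, weights, and dosages are hypothetical, treating a missing genotype as dosage 0 is an assumption, and the study's actual scoring pipeline is not described in this abstract.

```python
from typing import Dict


def polygenic_risk_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per variant).

    Variants missing from `dosages` contribute 0, which is an assumption of this sketch.
    """
    return sum(weight * dosages.get(variant, 0.0) for variant, weight in weights.items())


# Hypothetical per-variant effect sizes (for example, log odds ratios).
weights = {"snpA": 0.10, "snpB": -0.05, "snpC": 0.20}

# Hypothetical dosages for one individual.
dosages = {"snpA": 2, "snpB": 1, "snpC": 0}

score = polygenic_risk_score(dosages, weights)  # 0.10*2 + (-0.05)*1 + 0.20*0
```

Raw scores are usually standardized within a cohort before testing for association, so that an odds ratio can be read per standard deviation of the score.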
We developed PRSs for 6 psychiatric disorders (schizophrenia, bipolar disorder, major depressive disorder, cross disorder, attention-deficit/hyperactivity disorder, and anorexia nervosa) and 17 nonpsychiatric traits in more than 10,000 individuals from the Penn Medicine Biobank with accompanying electronic health records. We performed phenome-wide association analyses to test their association across disease categories.
Four of the 6 psychiatric PRSs were associated with their primary phenotypes (odds ratios from 1.2 to 1.6). Cross-trait associations were identified both within the psychiatric domain and across trait domains. PRSs for coronary artery disease and years of education were significantly associated with psychiatric disorders, largely driven by an association with tobacco use disorder.
We demonstrated that the genetic architecture of electronic health record–derived psychiatric diagnoses is similar to ascertained research cohorts from large consortia. Psychiatric PRSs are moderately associated with psychiatric diagnoses but are not yet clinically predictive in naïve patients. Cross-trait associations for these PRSs suggest a broader effect of genetic liability beyond traditional diagnostic boundaries. As identification of genetic markers increases, including PRSs alongside other clinical risk factors may enhance prediction of psychiatric disorders and associated conditions in clinical registries.