Alopecia areata is a complex genetic disease that results in hair loss due to the autoimmune-mediated attack of the hair follicle. We previously defined a role for both rare and common variants in ...our earlier GWAS and linkage studies. Here, we identify rare variants contributing to Alopecia Areata using a whole exome sequencing and gene-level burden analyses approach on 849 Alopecia Areata patients compared to 15,640 controls. KRT82 is identified as an Alopecia Areata risk gene with rare damaging variants in 51 heterozygous Alopecia Areata individuals (6.01%), achieving genome-wide significance (p = 2.18E-07). KRT82 encodes a hair-specific type II keratin that is exclusively expressed in the hair shaft cuticle during anagen phase, and its expression is decreased in Alopecia Areata patient skin and hair follicles. Finally, we find that cases with an identified damaging KRT82 variant and reduced KRT82 expression have elevated perifollicular CD8 infiltrates. In this work, we utilize whole exome sequencing to successfully identify a significant Alopecia Areata disease-relevant gene, KRT82, and reveal a proposed mechanism for rare variant predisposition leading to disrupted hair shaft integrity.
The analysis of whole-genome sequencing studies is challenging due to the large number of rare variants in noncoding regions and the lack of natural units for testing. We propose a statistical method ...to detect and localize rare and common risk variants in whole-genome sequencing studies based on a recently developed knockoff framework. It can (1) prioritize causal variants over associations due to linkage disequilibrium thereby improving interpretability; (2) help distinguish the signal due to rare variants from shadow effects of significant common variants nearby; (3) integrate multiple knockoffs for improved power, stability, and reproducibility; and (4) flexibly incorporate state-of-the-art and future association tests to achieve the benefits proposed here. In applications to whole-genome sequencing data from the Alzheimer's Disease Sequencing Project (ADSP) and COPDGene samples from NHLBI Trans-Omics for Precision Medicine (TOPMed) Program we show that our method compared with conventional association tests can lead to substantially more discoveries.
Over the past few years, substantial effort has been put into the functional annotation of variation in human genome sequences. Such annotations can have a critical role in identifying putatively ...causal variants for a disease or trait among the abundant natural variation that occurs at a locus of interest. The main challenges in using these various annotations include their large numbers and their diversity. Here we develop an unsupervised approach to integrate these different annotations into one measure of functional importance (Eigen) that, unlike most existing methods, is not based on any labeled training data. We show that the resulting meta-score has better discriminatory ability using disease-associated and putatively benign variants from published studies (in both coding and noncoding regions) than the recently proposed CADD score. Across varied scenarios, the Eigen score performs generally better than any single individual annotation, representing a powerful single functional score that can be incorporated in fine-mapping studies.
Recent developments in sequencing technologies have made it possible to uncover both rare and common genetic variants. Genome-wide association studies (GWASs) can test for the effect of common ...variants, whereas sequence-based association studies can evaluate the cumulative effect of both rare and common variants on disease risk. Many groupwise association tests, including burden tests and variance-component tests, have been proposed for this purpose. Although such tests do not exclude common variants from their evaluation, they focus mostly on testing the effect of rare variants by upweighting rare-variant effects and downweighting common-variant effects and can therefore lose substantial power when both rare and common genetic variants in a region influence trait susceptibility. There is increasing evidence that the allelic spectrum of risk variants at a given locus might include novel, rare, low-frequency, and common genetic variants. Here, we introduce several sequence kernel association tests to evaluate the cumulative effect of rare and common variants. The proposed tests are computationally efficient and are applicable to both binary and continuous traits. Furthermore, they can readily combine GWAS and whole-exome-sequencing data on the same individuals, when available, and are also applicable to deep-resequencing data of GWAS loci. We evaluate these tests on data simulated under comprehensive scenarios and show that compared with the most commonly used tests, including the burden and variance-component tests, they can achieve substantial increases in power. We next show applications to sequencing studies for Crohn disease and autism spectrum disorders. The proposed tests have been incorporated into the software package SKAT.
To evaluate evidence for de novo etiologies in schizophrenia, we sequenced at high coverage the exomes of families recruited from two populations with distinct demographic structures and history. We ...sequenced a total of 795 exomes from 231 parent-proband trios enriched for sporadic schizophrenia cases, as well as 34 unaffected trios. We observed in cases an excess of de novo nonsynonymous single-nucleotide variants as well as a higher prevalence of gene-disruptive de novo mutations relative to controls. We found four genes (LAMA2, DPYD, TRRAP and VPS39) affected by recurrent de novo events within or across the two populations, which is unlikely to have occurred by chance. We show that de novo mutations affect genes with diverse functions and developmental profiles, but we also find a substantial contribution of mutations in genes with higher expression in early fetal life. Our results help define the genomic and neural architecture of schizophrenia.
Celotno besedilo
Dostopno za:
DOBA, IJS, IZUM, KILJ, NUK, PILJ, PNG, SAZU, UILJ, UKNU, UL, UM, UPUK
Introduction
We investigated metabolites in plasma to capture systemic biochemical changes associated with Alzheimer's disease (AD).
Methods
Metabolites in plasma were measured in 59 AD cases and 60 ...healthy participants of African American (AA), Caribbean Hispanic (CH), and non‐Hispanic white (NHW) ancestry using untargeted liquid‐chromatography–based ultra‐high‐resolution mass spectrometry. Metabolite differences between AD and healthy, ethnic groups and apolipoprotein E gene (APOE) ε4 status were analyzed. Untargeted network analysis identified pathways enriched in AD‐associated metabolites.
Results
A total of 5929 annotated metabolites were measured. Partial least squares discriminant analysis (PLS‐DA) inferred that AD clustered separately from healthy controls (area under the curve AUC = 0.9816); discriminating pathways included glycerophospholipid, sphingolipid, and non‐essential amino acid (alanine, aspartate, glutamate) metabolism. Metabolic features in AA clustered differently from CH and NHW (AUC = 0.9275), and differed between APOE ε4 carriers and non‐carriers (AUC = 0.9972).
Discussion
Metabolites, specifically lipids, were associated with AD, APOE ε4, and ethnic group. Metabolite profiling can identify perturbed AD pathways, but genetic and ancestral background need to be considered.
The analysis of whole-genome sequencing studies is challenging due to the large number of noncoding rare variants, our limited understanding of their functional effects, and the lack of natural units ...for testing. Here we propose a scan statistic framework, WGScan, to simultaneously detect the existence, and estimate the locations of association signals at genome-wide scale. WGScan can analytically estimate the significance threshold for a whole-genome scan; utilize summary statistics for a meta-analysis; incorporate functional annotations for enhanced discoveries in noncoding regions; and enable enrichment analyses using genome-wide summary statistics. Based on the analysis of whole genomes of 1,786 phenotypically discordant sibling pairs from the Simons Simplex Collection study for autism spectrum disorders, we derive genome-wide significance thresholds for whole genome sequencing studies and detect significant enrichments of regions showing associations with autism in promoter regions, functional categories related to autism, and enhancers predicted to regulate expression of autism associated genes.
Predicting the functional consequences of genetic variants in non-coding regions is a challenging problem. We propose here a semi-supervised approach, GenoNet, to jointly utilize experimentally ...confirmed regulatory variants (labeled variants), millions of unlabeled variants genome-wide, and more than a thousand cell/tissue type specific epigenetic annotations to predict functional consequences of non-coding variants. Through the application to several experimental datasets, we demonstrate that the proposed method significantly improves prediction accuracy compared to existing functional prediction methods at the tissue/cell type level, but especially so at the organism level. Importantly, we illustrate how the GenoNet scores can help in fine-mapping at GWAS loci, and in the discovery of disease associated genes in sequencing studies. As more comprehensive lists of experimentally validated variants become available over the next few years, semi-supervised methods like GenoNet can be used to provide increasingly accurate functional predictions for variants genome-wide and across a variety of cell/tissue types.
We analyze de novo synonymous mutations identified in autism spectrum disorders (ASDs) and schizophrenia (SCZ) with potential impact on regulatory elements using data from whole-exome sequencing ...(WESs) studies. Focusing on five types of genetic regulatory functions, we found that de novo near-splice site synonymous mutations changing exonic splicing regulators and those within frontal cortex-derived DNase I hypersensitivity sites are significantly enriched in ASD and SCZ, respectively. These results remained significant, albeit less so, after incorporating two additional ASD datasets. Among the genes identified, several are hit by multiple functional de novo mutations, with RAB2A and SETD1A showing the highest statistical significance in ASD and SCZ, respectively. The estimated contribution of these synonymous mutations to disease liability is comparable to de novo protein-truncating mutations. These findings expand the repertoire of functional de novo mutations to include “functional” synonymous ones and strengthen the role of rare variants in neuropsychiatric disease risk.
•De novo synonymous mutations likely affecting splicing regulation are enriched in ASD•De novo synonymous mutations within frontal cortex-derived DHS are enriched in SCZ•“Functional” synonymous mutations significantly contribute to disease liability•“Functional” synonymous mutations support role of SETD1A and RAB2A in neuropsychiatry
The role of de novo non-synonymous mutations in neuropsychiatric disorders is well established. In this study, Takata et al. explore the role of a different class of genetic variants, the synonymous mutations, and discover that they too contribute to autism and schizophrenia by affecting regulatory elements.
Labeling clinical data from electronic health records (EHR) in health systems requires extensive knowledge of human expert, and painstaking review by clinicians. Furthermore, existing phenotyping ...algorithms are not uniformly applied across large datasets and can suffer from inconsistencies in case definitions across different algorithms. We describe here quantitative disease risk scores based on almost unsupervised methods that require minimal input from clinicians, can be applied to large datasets, and alleviate some of the main weaknesses of existing phenotyping algorithms. We show applications to phenotypic data on approximately 100,000 individuals in eMERGE, and focus on several complex diseases, including Chronic Kidney Disease, Coronary Artery Disease, Type 2 Diabetes, Heart Failure, and a few others. We demonstrate that relative to existing approaches, the proposed methods have higher prediction accuracy, can better identify phenotypic features relevant to the disease under consideration, can perform better at clinical risk stratification, and can identify undiagnosed cases based on phenotypic features available in the EHR. Using genetic data from the eMERGE-seq panel that includes sequencing data for 109 genes on 21,363 individuals from multiple ethnicities, we also show how the new quantitative disease risk scores help improve the power of genetic association studies relative to the standard use of disease phenotypes. The results demonstrate the effectiveness of quantitative disease risk scores derived from rich phenotypic EHR databases to provide a more meaningful characterization of clinical risk for diseases of interest beyond the prevalent binary (case-control) classification.