The OAS1/2/3 cluster has been identified as a risk locus for severe COVID-19 among individuals of European ancestry, with a protective haplotype of approximately 75 kilobases (kb) derived from Neanderthals in the chromosomal region 12q24.13. This haplotype contains a splice variant of OAS1, which occurs in people of African ancestry independently of gene flow from Neanderthals. Using trans-ancestry fine-mapping approaches in 20,779 hospitalized cases, we demonstrate that this splice variant is likely to be the SNP responsible for the association at this locus, thus strongly implicating OAS1 as an effector gene influencing COVID-19 severity.
A range of rare and common genetic variants have been found to be potentially associated with mental diseases, but many more remain undiscovered. Powerful integrative methods are needed to systematically prioritize both variants and genes that confer susceptibility to mental diseases in personal genomes of individual patients and to facilitate the development of personalized treatment or therapeutic approaches.
Leveraging a deep neural network built on the TensorFlow framework, we developed a computational tool, integrated Mental-disorder GEnome Score (iMEGES), for analyzing whole genome/exome sequencing data on personal genomes. iMEGES takes as input genetic mutations and phenotypic information from a patient with mental disorders, and outputs a ranking of genome-wide susceptibility variants and the prioritized disease-specific genes for mental disorders by integrating contributions from coding and non-coding variants, structural variants (SVs), known brain expression quantitative trait loci (eQTLs), and epigenetic information from PsychENCODE.
iMEGES was evaluated on multiple datasets of mental disorders, and it outperformed competing approaches when a large training dataset was available.
iMEGES can be used in population studies to help prioritize novel genes or variants that might be associated with susceptibility to mental disorders, and in individual patients to help identify genes or variants related to mental diseases.
We propose BIGKnock (BIobank-scale Gene-based association test via Knockoffs), a computationally efficient gene-based testing approach for biobank-scale data that leverages long-range chromatin interaction data and performs conditional genome-wide testing via knockoffs. BIGKnock can prioritize causal genes over proxy associations at a locus. We apply BIGKnock to the UK Biobank data with 405,296 participants for multiple binary and quantitative traits, and show that relative to conventional gene-based tests, BIGKnock produces smaller sets of significant genes that contain the causal gene(s) with high probability. We further illustrate its ability to pinpoint potential causal genes at [Formula: see text] of the associated loci.
Chronic kidney disease (CKD) is determined by an interplay of monogenic, polygenic, and environmental risks. Autosomal dominant polycystic kidney disease (ADPKD) and COL4A-associated nephropathy (COL4A-AN) represent the most common forms of monogenic kidney diseases. These disorders have incomplete penetrance and variable expressivity, and we hypothesize that polygenic factors explain some of this variability. By combining SNP array, exome/genome sequence, and electronic health record data from the UK Biobank and All-of-Us cohorts, we demonstrate that the genome-wide polygenic score (GPS) significantly predicts CKD among ADPKD monogenic variant carriers. Compared to the middle tertile of the GPS for noncarriers, ADPKD variant carriers in the top tertile have a 54-fold increased risk of CKD, while ADPKD variant carriers in the bottom tertile have only a 3-fold increased risk of CKD. Similarly, the GPS significantly predicts CKD in COL4A-AN carriers. The carriers in the top tertile of the GPS have a 2.5-fold higher risk of CKD, while the risk for carriers in the bottom tertile is not different from the average population risk. These results suggest that accounting for polygenic risk improves risk stratification in monogenic kidney disease.
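The tertile comparison described above can be illustrated with a minimal sketch. It assumes tertile cut points are taken from a reference distribution of scores and that a plain risk ratio is used as the effect measure; the abstract does not state the statistical model, so both choices, the function names, and the toy numbers are hypothetical.

```python
import numpy as np

def tertile_labels(scores, reference):
    """Label scores 0/1/2 (bottom/middle/top tertile).

    Cut points come from `reference` (assumed here to be the scores of a
    comparison group); this is an illustrative choice, not the study's method.
    """
    lo, hi = np.quantile(reference, [1 / 3, 2 / 3])
    return np.digitize(scores, [lo, hi])

def relative_risk(cases_group, n_group, cases_ref, n_ref):
    """Risk in a group divided by risk in the reference group."""
    return (cases_group * n_ref) / (cases_ref * n_group)

# Toy, made-up numbers purely to exercise the functions.
reference_scores = np.arange(1, 10)          # 1..9
labels = tertile_labels(np.array([1, 5, 9]), reference_scores)
rr = relative_risk(cases_group=30, n_group=100, cases_ref=10, n_ref=100)
```

In a real analysis the fold-changes would typically come from a regression model adjusted for covariates rather than from raw proportions.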
IgA nephropathy is thought to be an autoimmune disease wherein galactose-deficient IgA1 (Gd-IgA1) is recognized by IgG autoantibodies, resulting in formation and renal accumulation of nephritogenic immune complexes. Although this hypothesis is supported by recent findings that, in renal immunodeposits of IgA nephropathy patients, IgG is enriched for Gd-IgA1-specific autoantibodies, experimental proof is still lacking.
IgG isolated from sera of IgA nephropathy patients or produced as a recombinant IgG (rIgG) was mixed with human Gd-IgA1 to form immune complexes. IgG from healthy individuals served as a control. Nude and SCID mice were injected with human IgG and Gd-IgA1, in immune complexes or individually, and their presence in kidneys was ascertained by immunofluorescence. Pathologic changes in the glomeruli were evaluated by quantitative morphometry and exploratory transcriptomic profiling was performed by RNA-Seq.
Immunodeficient mice injected with Gd-IgA1 mixed with IgG autoantibodies from patients with IgA nephropathy, but not Gd-IgA1 mixed with IgG from healthy individuals, displayed IgA, IgG, and mouse complement C3 glomerular deposits and mesangioproliferative glomerular injury with hematuria and proteinuria. Un-complexed Gd-IgA1 or IgG did not induce pathological changes. Moreover, Gd-IgA1-rIgG immune complexes injected into immunodeficient mice induced histopathological changes characteristic of human disease. Exploratory transcriptome profiling of mouse kidney tissues indicated that these immune complexes altered gene expression of multiple pathways, in concordance with the changes observed in kidney biopsies of patients with IgA nephropathy.
This study provides the first in vivo evidence for a pathogenic role of IgG autoantibodies specific for Gd-IgA1 in IgA nephropathy.
• IgG autoantibodies in IgA nephropathy (IgAN) bind galactose-deficient IgA1 (Gd-IgA1).
• IgG autoantibodies from IgAN patients form immune complexes with Gd-IgA1.
• Immune complexes injected into mice induce pathologic glomerular changes.
• Immune complex-altered gene expression in mouse kidneys resembles that in IgAN.
A new genome-wide meta-analysis of longitudinal kidney function decline identified several genetic loci related to kidney disease progression. The study illustrated the complexity of modeling longitudinal traits in genome-wide association studies and highlighted the collider bias that can be introduced when a kidney disease progression phenotype is adjusted for baseline kidney function. Herein, we briefly outline the key findings of this study, their limitations, and implications for future studies.
Genetic variants in complement genes have been associated with a wide range of human disease states, but well-powered genetic association studies of complement activation have not been performed in large multiethnic cohorts.
We performed medical records-based genome-wide and phenome-wide association studies for plasma C3 and C4 levels among participants of the Electronic Medical Records and Genomics (eMERGE) network.
In a GWAS for C3 levels in 3949 individuals, we detected two genome-wide significant loci: chr.1q31.3 (CFH locus; rs3753396-A; β=0.20; 95% CI, 0.14 to 0.25; P=1.52×10^[…]) and chr.19p13.3 (C3 locus; rs11569470-G; β=0.19; 95% CI, 0.13 to 0.24; P=1.29×10^[…]). These two loci explained approximately 2% of variance in C3 levels. GWAS for C4 levels involved 3998 individuals and revealed a genome-wide significant locus at chr.6p21.32 (C4 locus; rs3135353-C; β=0.40; 95% CI, 0.34 to 0.45; P=4.58×10^[…]). This locus explained approximately 13% of variance in C4 levels. The multiallelic copy number variant analysis defined two structural genomic C4 variants with large effect on blood C4 levels: C4-BS (β=-0.36; 95% CI, -0.42 to -0.30; P=2.98×10^[…]) and C4-AL-BS (β=0.25; 95% CI, 0.21 to 0.29; P=8.11×10^[…]). Overall, C4 levels were strongly correlated with copy numbers of C4A and C4B genes. In comprehensive phenome-wide association studies involving 102,138 eMERGE participants, we cataloged a full spectrum of autoimmune, cardiometabolic, and kidney diseases genetically related to systemic complement activation.
We discovered genetic determinants of plasma C3 and C4 levels using eMERGE genomic data linked to electronic medical records. Genetic variants regulating C3 and C4 levels have large effects and multiple clinical correlations across the spectrum of complement-related diseases in humans.
A learning scheme based on Extreme Learning Machine (ELM) and L1/2 regularization is proposed for a double parallel feedforward neural network (DPFNN). ELM has been widely used as a fast learning method for feedforward networks with a single hidden layer. A key problem for ELM is the choice of the (minimum) number of hidden nodes. To resolve this problem, we propose to combine ELM with the L1/2 regularization method, which has become popular in informatics in recent years. Our experiments show that adding the L1/2 regularizer to DPFNN with ELM results in fewer hidden nodes but equally good performance.
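As a rough illustration of the ELM idea in a double parallel layout, the sketch below draws random, untrained hidden weights and solves only the output weights by least squares, with the output layer seeing both the hidden activations and the raw inputs. It is a minimal sketch under stated assumptions: the L1/2 penalty is not implemented (plain least squares is swapped in), and the function names, the tanh activation, and the toy data are hypothetical rather than taken from the paper.

```python
import numpy as np

def fit_elm(X, y, n_hidden=20, seed=0):
    """Fit a basic ELM with direct input-to-output connections."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random hidden weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases, never trained
    H = np.tanh(X @ W + b)                       # hidden-layer activations
    # Double parallel structure: hidden activations, raw inputs, and a bias column.
    Z = np.hstack([H, X, np.ones((X.shape[0], 1))])
    # Output weights by ordinary least squares (no L1/2 penalty in this sketch).
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return W, b, beta

def predict_elm(model, X):
    """Apply a fitted model returned by fit_elm to new inputs."""
    W, b, beta = model
    H = np.tanh(X @ W + b)
    Z = np.hstack([H, X, np.ones((X.shape[0], 1))])
    return Z @ beta

# Toy regression problem, made up for illustration.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1]
model = fit_elm(X, y, n_hidden=20)
mse = float(np.mean((predict_elm(model, X) - y) ** 2))
```

A sparsity-inducing penalty on the output weights, such as the L1/2 regularizer discussed above, would replace the least-squares step and allow hidden nodes with near-zero weights to be pruned.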
Labeling clinical data from electronic health records (EHR) in health systems requires extensive expert knowledge and painstaking review by clinicians. Furthermore, existing phenotyping algorithms are not uniformly applied across large datasets and can suffer from inconsistencies in case definitions across different algorithms. We describe here quantitative disease risk scores based on almost unsupervised methods that require minimal input from clinicians, can be applied to large datasets, and alleviate some of the main weaknesses of existing phenotyping algorithms. We show applications to phenotypic data on approximately 100,000 individuals in eMERGE, and focus on several complex diseases, including Chronic Kidney Disease, Coronary Artery Disease, Type 2 Diabetes, Heart Failure, and a few others. We demonstrate that relative to existing approaches, the proposed methods have higher prediction accuracy, can better identify phenotypic features relevant to the disease under consideration, can perform better at clinical risk stratification, and can identify undiagnosed cases based on phenotypic features available in the EHR. Using genetic data from the eMERGE-seq panel that includes sequencing data for 109 genes on 21,363 individuals from multiple ethnicities, we also show how the new quantitative disease risk scores help improve the power of genetic association studies relative to the standard use of disease phenotypes. The results demonstrate the effectiveness of quantitative disease risk scores derived from rich phenotypic EHR databases to provide a more meaningful characterization of clinical risk for diseases of interest beyond the prevalent binary (case-control) classification.