The analysis of whole-genome sequencing studies is challenging due to the large number of noncoding rare variants, our limited understanding of their functional effects, and the lack of natural units for testing. Here we propose a scan statistic framework, WGScan, to simultaneously detect the existence and estimate the locations of association signals at genome-wide scale. WGScan can analytically estimate the significance threshold for a whole-genome scan; utilize summary statistics for a meta-analysis; incorporate functional annotations for enhanced discoveries in noncoding regions; and enable enrichment analyses using genome-wide summary statistics. Based on the analysis of whole genomes of 1,786 phenotypically discordant sibling pairs from the Simons Simplex Collection study for autism spectrum disorders, we derive genome-wide significance thresholds for whole-genome sequencing studies and detect significant enrichments of regions showing associations with autism in promoter regions, functional categories related to autism, and enhancers predicted to regulate expression of autism-associated genes.
Predicting the functional consequences of genetic variants in non-coding regions is a challenging problem. We propose here a semi-supervised approach, GenoNet, to jointly utilize experimentally confirmed regulatory variants (labeled variants), millions of unlabeled variants genome-wide, and more than a thousand cell/tissue type specific epigenetic annotations to predict functional consequences of non-coding variants. Through applications to several experimental datasets, we demonstrate that the proposed method significantly improves prediction accuracy compared to existing functional prediction methods at the tissue/cell type level, but especially so at the organism level. Importantly, we illustrate how the GenoNet scores can help in fine-mapping at GWAS loci and in the discovery of disease-associated genes in sequencing studies. As more comprehensive lists of experimentally validated variants become available over the next few years, semi-supervised methods like GenoNet can be used to provide increasingly accurate functional predictions for variants genome-wide and across a variety of cell/tissue types.
The analysis of whole-genome sequencing studies is challenging due to the large number of rare variants in noncoding regions and the lack of natural units for testing. We propose a statistical method to detect and localize rare and common risk variants in whole-genome sequencing studies based on a recently developed knockoff framework. It can (1) prioritize causal variants over associations due to linkage disequilibrium, thereby improving interpretability; (2) help distinguish the signal due to rare variants from shadow effects of significant common variants nearby; (3) integrate multiple knockoffs for improved power, stability, and reproducibility; and (4) flexibly incorporate state-of-the-art and future association tests to achieve the benefits proposed here. In applications to whole-genome sequencing data from the Alzheimer's Disease Sequencing Project (ADSP) and COPDGene samples from the NHLBI Trans-Omics for Precision Medicine (TOPMed) Program, we show that our method can lead to substantially more discoveries than conventional association tests.
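The selection step of a knockoff procedure can be illustrated with a minimal sketch. It assumes the generic single-knockoff filter with the "knockoff+" threshold, not the multiple-knockoff, window-based procedure of the abstract above. The function names, the feature statistics `w` (large positive values favor the original variant over its knockoff), and the target FDR are hypothetical choices made for illustration.

```python
import numpy as np

def knockoff_threshold(w, fdr=0.1, offset=1):
    """Smallest t whose estimated false discovery proportion is <= fdr.

    offset=1 gives the conservative "knockoff+" threshold.
    Returns np.inf when no threshold satisfies the bound.
    """
    w = np.asarray(w, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes of w.
    candidates = np.sort(np.unique(np.abs(w[w != 0])))
    for t in candidates:
        # Negative statistics estimate the number of false positives.
        fdp_hat = (offset + np.sum(w <= -t)) / max(1, np.sum(w >= t))
        if fdp_hat <= fdr:
            return t
    return np.inf

def knockoff_select(w, fdr=0.1):
    """Indices of features whose statistic reaches the threshold."""
    w = np.asarray(w, dtype=float)
    return np.flatnonzero(w >= knockoff_threshold(w, fdr))

# Hypothetical statistics: ten clear signals and three near-null features.
w = [5, 4, 3.5, 3, 2.5, 2, 1.5, 1.2, 1.1, 1.0, -0.5, 0.4, -0.3]
selected = knockoff_select(w, fdr=0.2)
```

Because the sign of each statistic is symmetric under the null, the count of large negative values serves as an estimate of the number of false positives among the large positive ones. This is what allows the threshold to be chosen without p-values.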
Tau aggregation in neurofibrillary tangles (NFTs) is closely associated with neurodegeneration and cognitive decline in Alzheimer's disease (AD). However, the molecular signatures that distinguish between aggregation-prone and aggregation-resistant cell states are unknown. We developed methods for the high-throughput isolation and transcriptome profiling of single somas with NFTs from the human AD brain, quantified the susceptibility of 20 neocortical subtypes for NFT formation and death, and identified both shared and cell-type-specific signatures. NFT-bearing neurons shared a marked upregulation of synaptic transmission-related genes, including a core set of 63 genes enriched for synaptic vesicle cycling. Oxidative phosphorylation and mitochondrial dysfunction were highly cell-type dependent. Apoptosis was only modestly enriched, and the susceptibilities of NFT-bearing and NFT-free neurons for death were highly similar. Our analysis suggests that NFTs represent cell-type-specific responses to stress and synaptic dysfunction. We provide a resource for biomarker discovery and the investigation of tau-dependent and tau-independent mechanisms of neurodegeneration.
We describe a method based on a latent Dirichlet allocation model for predicting functional effects of noncoding genetic variants in a cell-type- and/or tissue-specific way (FUN-LDA). Using this unsupervised approach, we predict tissue-specific functional effects for every position in the human genome in 127 different tissues and cell types. We demonstrate the usefulness of our predictions by using several validation experiments. Using eQTL data from several sources, including the GTEx project, Geuvadis project, and TwinsUK cohort, we show that eQTLs in specific tissues tend to be most enriched among the predicted functional variants in relevant tissues in Roadmap. We further show how these integrated functional scores can be used for (1) deriving the most likely cell or tissue type causally implicated for a complex trait by using summary statistics from genome-wide association studies and (2) estimating a tissue-based correlation matrix of various complex traits. We found large enrichment of heritability in functional components of relevant tissues for various complex traits, and FUN-LDA yielded higher enrichment estimates than existing methods. Finally, using experimentally validated functional variants from the literature and variants possibly implicated in disease by previous studies, we rigorously compare FUN-LDA with state-of-the-art functional annotation methods and show that FUN-LDA has better prediction accuracy and higher resolution than these methods. In particular, our results suggest that tissue- and cell-type-specific functional prediction methods tend to have substantially better prediction accuracy than organism-level prediction methods. Scores for each position in the human genome and for each ENCODE and Roadmap tissue are available online (see Web Resources).
We propose BIGKnock (BIobank-scale Gene-based association test via Knockoffs), a computationally efficient gene-based testing approach for biobank-scale data that leverages long-range chromatin interaction data and performs conditional genome-wide testing via knockoffs. BIGKnock can prioritize causal genes over proxy associations at a locus. We apply BIGKnock to the UK Biobank data with 405,296 participants for multiple binary and quantitative traits, and show that relative to conventional gene-based tests, BIGKnock produces smaller sets of significant genes that contain the causal gene(s) with high probability. We further illustrate its ability to pinpoint potential causal genes at Formula: see text of the associated loci.
Spinal cord injury (SCI) is a devastating condition with no current neurorestorative treatments. Clinical trials have been hampered by a lack of meaningful diagnostic and prognostic markers of injury severity and neurologic recovery. Objective biomarkers and novel therapies for SCI represent urgent unmet clinical needs. Biomarkers of SCI that objectively stratify the severity of cord damage could expand the depth and scope of clinical trials and represent targets for the development of novel therapies for acute SCI. MicroRNAs (miRNAs) represent promising candidates both as informative molecules of injury severity and recovery, and as therapeutic targets. miRNAs are small, regulatory RNA molecules that are tissue-specific and evolutionarily conserved across species. miRNAs have been shown to represent powerful predictors of pathology, particularly with respect to neurologic disorders.
Studies investigating miRNA alterations in animal models of any species and in human studies of acute, traumatic SCI will be identified from PubMed, Embase, and Scopus. We aim to identify whether SCI is associated with a specific pattern of miRNA expression that is conserved across species, and whether SCI is associated with a tissue- or cell type-specific pattern of miRNA expression. Studies will be included if they (1) were published at any time, (2) include any species and either sex with acute, traumatic SCI, (3) report alterations of miRNA after SCI using molecular-based detection platforms including qRT-PCR, microarray, and RNA-sequencing, (4) report statistically significant miRNA alterations in tissues such as spinal cord, serum/plasma, and/or CSF, and (5) include a SHAM surgery group. Articles included in the review will have their titles, abstracts, and full texts reviewed by two independent authors. Random effects meta-regression, which allows for within-study and between-study variability, will be performed on the miRNA expression after SCI or SHAM surgery. We will analyze both the cumulative pooled dataset and datasets stratified by species, tissue type, and timepoint to identify miRNA alterations that are specifically related to the injured spinal cord. We aim to identify SCI-related miRNAs that are specifically altered within a species and those that are evolutionarily conserved across species, including humans. The analyses will provide a description of the evolutionarily conserved miRNA signature of the pathophysiological response to SCI.
Here, we present a protocol to perform a systematic review and meta-analysis to investigate the conserved inter- and intra-species miRNA changes that occur due to acute, traumatic SCI. This review seeks to serve as a valuable resource for the SCI community by establishing a rigorous and unbiased description of miRNA changes after SCI for the next generation of SCI biomarkers and therapeutic interventions.
The protocol for the systematic review and meta-analysis has been registered through PROSPERO: CRD42021222552.
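The random-effects model named in the protocol can be illustrated with a minimal sketch. It assumes the simplest intercept-only case (pooling without moderators such as species or tissue) and the DerSimonian-Laird moment estimator of between-study variance; the protocol does not state which estimator will be used. The function name and the input values are hypothetical.

```python
import numpy as np

def dersimonian_laird(effects, variances):
    """Pool per-study effects under a random-effects model.

    effects:   per-study effect estimates (e.g. log fold change of a miRNA)
    variances: per-study within-study sampling variances
    Returns (pooled effect, standard error, between-study variance tau^2).
    """
    y = np.asarray(effects, dtype=float)
    v = np.asarray(variances, dtype=float)
    w = 1.0 / v                                   # fixed-effect weights
    fixed = np.sum(w * y) / np.sum(w)             # fixed-effect pooled mean
    q = np.sum(w * (y - fixed) ** 2)              # Cochran's Q heterogeneity
    df = len(y) - 1
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    # Moment estimate of between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_star = 1.0 / (v + tau2)                     # random-effects weights
    pooled = np.sum(w_star * y) / np.sum(w_star)
    se = np.sqrt(1.0 / np.sum(w_star))
    return pooled, se, tau2

# Hypothetical example: two studies that disagree strongly.
pooled, se, tau2 = dersimonian_laird([0.0, 2.0], [1.0, 1.0])
```

Adding the estimated between-study variance to each study's within-study variance flattens the weights, so heterogeneous studies contribute more equally than they would under a fixed-effect model. A meta-regression extends this by replacing the single pooled mean with a linear predictor in study-level covariates.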
Transcranial magnetic stimulation paired with electroencephalography (TMS-EEG) can measure local excitability and functional connectivity. To address trial-to-trial variability, responses to multiple TMS pulses are recorded to obtain an average TMS evoked potential (TEP). Balancing adequate data acquisition to establish stable TEPs with feasible experimental duration is critical when applying TMS-EEG to clinical populations. Here we aim to investigate the minimum number of pulses (MNP) required to achieve stable TEPs in children with epilepsy. Eighteen children with Self-Limited Epilepsy with Centrotemporal Spikes, a common epilepsy arising from the motor cortices, underwent multiple 100-pulse blocks of TMS to both motor cortices over two days. TMS was applied at 120% of resting motor threshold (rMT) up to a maximum of 100% maximum stimulator output. The average of all 100 pulses was used as a "gold-standard" TEP to which we compared "candidate" TEPs obtained by averaging subsets of pulses. We defined TEP stability as the MNP needed to achieve a concordance correlation coefficient of 80% between the candidate and "gold-standard" TEP. We additionally assessed whether experimental or clinical factors affected TEP stability. Results show that stable TEPs can be derived from fewer than 100 pulses, a number typically used for designing TMS-EEG experiments. The early segment (15-80 ms) of the TEP was less stable than the later segment (80-350 ms). Global mean field amplitude derived from all channels was less stable than local TEP derived from channels overlying the stimulated site. TEP stability did not differ depending on stimulated hemisphere, block order, or antiseizure medication use, but was greater in older children. Stimulation administered with an intensity above the rMT yielded more stable local TEPs. Studies of TMS-EEG in pediatrics have been limited by the complexity of experimental set-up and time course.
This study serves as a critical starting point, demonstrating the feasibility of designing efficient TMS-EEG studies that use a relatively small number of pulses to study pediatric epilepsy and potentially other pediatric groups.
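The stability criterion described above can be illustrated with a minimal sketch. It assumes Lin's concordance correlation coefficient computed over time points, and candidate TEPs formed from the first n pulses of a block in steps of five; the abstract does not specify how pulse subsets were drawn. The function names, the step size, and the simulated data are hypothetical.

```python
import numpy as np

def concordance_cc(x, y):
    """Lin's concordance correlation coefficient between two waveforms.

    Undefined (division by zero) if both inputs are constant and equal.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    cov = np.mean((x - mx) * (y - my))
    # Population variances; the mean-difference term penalizes offsets.
    return 2.0 * cov / (x.var() + y.var() + (mx - my) ** 2)

def minimum_pulses(trials, threshold=0.8, step=5):
    """Smallest n (multiple of step) whose average reaches the threshold.

    trials: array of shape (n_pulses, n_timepoints), one row per pulse.
    The average of all rows plays the role of the "gold-standard" TEP.
    Returns None if no tested n reaches the threshold.
    """
    trials = np.asarray(trials, dtype=float)
    gold = trials.mean(axis=0)
    for n in range(step, trials.shape[0] + 1, step):
        candidate = trials[:n].mean(axis=0)
        if concordance_cc(candidate, gold) >= threshold:
            return n
    return None

# Simulated block: a fixed waveform plus independent noise on each pulse.
rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0.0, 4.0 * np.pi, 200))
noisy_trials = signal + rng.normal(scale=2.0, size=(100, 200))
mnp = minimum_pulses(noisy_trials)
```

Unlike the Pearson correlation, the concordance coefficient drops when the candidate differs from the reference in amplitude or offset, not only in shape, which suits a comparison of averaged evoked potentials.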
Recent advances in genome sequencing and imputation technologies provide an exciting opportunity to comprehensively study the contribution of genetic variants to complex phenotypes. However, our ability to translate genetic discoveries into mechanistic insights remains limited at this point. In this paper, we propose an efficient knockoff-based method, GhostKnockoff, for genome-wide association studies (GWAS) that leads to improved power and ability to prioritize putative causal variants relative to conventional GWAS approaches. The method requires only Z-scores from conventional GWAS and hence can be easily applied to enhance existing and future studies. The method can also be applied to meta-analysis of multiple GWAS allowing for arbitrary sample overlap. We demonstrate its performance using empirical simulations and two applications: (1) a meta-analysis for Alzheimer's disease comprising nine overlapping large-scale GWAS, whole-exome and whole-genome sequencing studies and (2) analysis of 1403 binary phenotypes from the UK Biobank data in 408,961 samples of European ancestry. Our results demonstrate that GhostKnockoff can identify putatively functional variants with weaker statistical effects that are missed by conventional association tests.
Genome-wide association studies (GWASs) have uncovered a wealth of associations between common variants and human phenotypes. Here, we present an integrative analysis of GWAS summary statistics from 36 phenotypes to decipher multitrait genetic architecture and its link with biological mechanisms. Our framework incorporates multitrait association mapping along with an investigation of the breakdown of genetic associations into clusters of variants harboring similar multitrait association profiles. Focusing on two subsets of immunity and metabolism phenotypes, we then demonstrate how genetic variants within clusters can be mapped to biological pathways and disease mechanisms. Finally, for the metabolism set, we investigate the link between gene cluster assignment and the success of drug targets in randomized controlled trials.