Many human proteins contain domains that vary in size or copy number because of variable numbers of tandem repeats (VNTRs) in protein-coding exons. However, the relationships of VNTRs to most phenotypes are unknown because of difficulties in measuring such repetitive elements. We developed methods to estimate VNTR lengths from whole-exome sequencing data and impute VNTR alleles into single-nucleotide polymorphism haplotypes. Analyzing 118 protein-altering VNTRs in 415,280 UK Biobank participants for association with 786 phenotypes identified some of the strongest associations of common variants with human phenotypes, including height, hair morphology, and biomarkers of health. Accounting for large-effect VNTRs further enabled fine-mapping of associations to many more protein-coding mutations in the same genes. These results point to cryptic effects of highly polymorphic common structural variants that have eluded molecular analyses to date.
Exome association studies to date have generally been underpowered to systematically evaluate the phenotypic impact of very rare coding variants. We leveraged extensive haplotype sharing between 49,960 exome-sequenced UK Biobank participants and the remainder of the cohort (total n ≈ 500,000) to impute exome-wide variants with accuracy R² > 0.5 down to minor allele frequency (MAF) ~0.00005. Association and fine-mapping analyses of 54 quantitative traits identified 1,189 significant associations (P < 5 × 10⁻⁸) involving 675 distinct rare protein-altering variants (MAF < 0.01) that passed stringent filters for likely causality. Across all traits, 49% of associations (578/1,189) occurred in genes with two or more hits; follow-up analyses of these genes identified allelic series containing up to 45 distinct 'likely-causal' variants. Our results demonstrate the utility of within-cohort imputation in population-scale genome-wide association studies, provide a catalog of likely-causal, large-effect coding variant associations and foreshadow the insights that will be revealed as genetic biobank studies continue to grow.
The maturation of the mammalian brain occurs after birth, and this stage of neuronal development is frequently impaired in neurological disorders, such as autism and schizophrenia. However, the mechanisms that regulate postnatal brain maturation are poorly defined. By purifying neuronal subpopulations across brain development in mice, we identify a postnatal switch in the transcriptional regulatory circuits that operates in the maturing mammalian brain. We show that this developmental transition includes the formation of hundreds of cell-type-specific neuronal enhancers that appear to be modulated by neuronal activity. Once selected, these enhancers are active throughout adulthood, suggesting that their formation in early life shapes neuronal identity and regulates mature brain function.
• Changes in gene-regulatory elements (GREs) mediate early postnatal neuronal maturation
• DNA methylation contributes to postnatal repression of GREs
• Activity-dependent transcription factors mediate postnatal activation of GREs
The molecular mechanisms controlling brain development during early life are poorly understood. Stroud et al. characterize a postnatal switch in the transcriptional regulatory circuits that operate in maturing neurons and identify mechanisms by which neuronal activity and DNA methylation mediate this process.
Human neocortical 15–29-Hz beta oscillations are strong predictors of perceptual and motor performance. However, the mechanistic origin of beta in vivo is unknown, hindering understanding of its functional role. Combining human magnetoencephalography (MEG), computational modeling, and laminar recordings in animals, we present a new theory that accounts for the origin of spontaneous neocortical beta. In our MEG data, spontaneous beta activity from somatosensory and frontal cortex emerged as noncontinuous beta events typically lasting <150 ms with a stereotypical waveform. Computational modeling uniquely designed to infer the electrical currents underlying these signals showed that beta events could emerge from the integration of nearly synchronous bursts of excitatory synaptic drive targeting proximal and distal dendrites of pyramidal neurons, where the defining feature of a beta event was a strong distal drive that lasted one beta period (∼50 ms). This beta mechanism rigorously accounted for the beta event profiles; several other mechanisms did not. The spatial location of synaptic drive in the model to supragranular and infragranular layers was critical to the emergence of beta events and led to the prediction that beta events should be associated with a specific laminar current profile. Laminar recordings in somatosensory neocortex from anesthetized mice and awake monkeys supported these predictions, suggesting this beta mechanism is conserved across species and recording modalities. These findings make several predictions about optimal states for perceptual and motor performance and guide causal interventions to modulate beta for optimal function.
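The timing figures quoted above (a 15–29-Hz band, a beta period of about 50 ms, and events lasting under 150 ms) are linked by the relation period = 1 / frequency. The short Python sketch below works through that arithmetic only; the function names `period_ms` and `cycles_in` are hypothetical helpers introduced for illustration and are not part of the study's modeling software.

```python
def period_ms(frequency_hz: float) -> float:
    """Duration of one oscillation cycle, in milliseconds."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1000.0 / frequency_hz


def cycles_in(duration_ms: float, frequency_hz: float) -> float:
    """Number of cycles at the given frequency that fit in duration_ms."""
    return duration_ms / period_ms(frequency_hz)


# Band edges taken from the abstract (15-29 Hz).
BETA_BAND_HZ = (15.0, 29.0)

longest_period = period_ms(BETA_BAND_HZ[0])   # slowest beta -> longest cycle
shortest_period = period_ms(BETA_BAND_HZ[1])  # fastest beta -> shortest cycle

# A 50 ms period corresponds to a 20 Hz oscillation, inside the band.
mid_band_period = period_ms(20.0)

# An event shorter than 150 ms spans fewer than three 20 Hz cycles.
max_cycles_at_20hz = cycles_in(150.0, 20.0)

if __name__ == "__main__":
    print(f"period range: {shortest_period:.1f}-{longest_period:.1f} ms")
    print(f"period at 20 Hz: {mid_band_period:.1f} ms")
    print(f"cycles in 150 ms at 20 Hz: {max_cycles_at_20hz:.1f}")
```

Under these numbers, the approximately 50 ms distal drive matches one cycle near the middle of the band, and a sub-150 ms event covers at most about three such cycles, consistent with the description of beta as brief, noncontinuous events.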
It has long been hypothesized that aging and neurodegeneration are associated with somatic mutation in neurons; however, methodological hurdles have prevented testing this hypothesis directly. We used single-cell whole-genome sequencing to perform genome-wide somatic single-nucleotide variant (sSNV) identification on DNA from 161 single neurons from the prefrontal cortex and hippocampus of 15 normal individuals (aged 4 months to 82 years), as well as 9 individuals affected by early-onset neurodegeneration due to genetic disorders of DNA repair (Cockayne syndrome and xeroderma pigmentosum). sSNVs increased approximately linearly with age in both areas (with a higher rate in hippocampus) and were more abundant in neurodegenerative disease. The accumulation of somatic mutations with age, which we term genosenium, shows age-related, region-related, and disease-related molecular signatures and may be important in other human age-associated conditions.
Identification of cancer driver mutations that confer a proliferative advantage is central to understanding cancer; however, searches have often been limited to protein-coding sequences and specific non-coding elements (for example, promoters) because of the challenge of modeling the highly variable somatic mutation rates observed across tumor genomes. Here we present Dig, a method to search for driver elements and mutations anywhere in the genome. We use deep neural networks to map cancer-specific mutation rates genome-wide at kilobase-scale resolution. These estimates are then refined to search for evidence of driver mutations under positive selection throughout the genome by comparing observed to expected mutation counts. We mapped mutation rates for 37 cancer types and applied these maps to identify putative drivers within intronic cryptic splice regions, 5' untranslated regions and infrequently mutated genes. Our high-resolution mutation rate maps, available for web-based exploration, are a resource to enable driver discovery genome-wide.
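The idea of comparing observed to expected mutation counts can be illustrated with a generic excess-count test. The Python sketch below is not Dig's model: it assumes, purely for illustration, that the count in a genomic element follows a Poisson distribution whose mean is the expected count from a mutation rate map. The function names `poisson_upper_tail` and `enrichment` and the example numbers are hypothetical.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """P(X >= observed) for X ~ Poisson(expected).

    Illustrative assumption: element-level counts are Poisson. The value is
    computed as 1 - P(X <= observed - 1), so P values below roughly 1e-16
    lose precision. Intended only for modest expected counts.
    """
    if observed < 0 or expected < 0:
        raise ValueError("counts must be non-negative")
    if expected > 700:
        raise ValueError("sketch only supports modest expected counts")
    if observed == 0:
        return 1.0
    term = math.exp(-expected)      # P(X = 0)
    cdf = term
    for k in range(1, observed):    # accumulate P(X = 1) .. P(X = observed - 1)
        term *= expected / k
        cdf += term
    return max(0.0, 1.0 - cdf)


def enrichment(observed: int, expected: float) -> float:
    """Ratio of observed to expected mutation counts."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return observed / expected


if __name__ == "__main__":
    # Hypothetical element: 12 mutations observed where the map predicts 3.5.
    obs, exp_count = 12, 3.5
    print(f"enrichment = {enrichment(obs, exp_count):.2f}")
    print(f"P(X >= {obs}) = {poisson_upper_tail(obs, exp_count):.3g}")
```

A small upper-tail probability flags an element that carries more mutations than its local background rate predicts, which is the kind of signal a search for positive selection looks for. Real tumor data are typically overdispersed relative to a Poisson model, so this sketch would tend to overstate significance.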
Recent work has found increasing evidence of mitigated, incompletely penetrant phenotypes in heterozygous carriers of recessive Mendelian disease variants. We leveraged whole-exome imputation within the full UK Biobank cohort (n ∼ 500K) to extend such analyses to 3,475 rare variants curated from ClinVar and OMIM. Testing these variants for association with 58 quantitative traits yielded 102 significant associations involving variants previously implicated in 34 different diseases. Notable examples included a POR missense variant implicated in Antley-Bixler syndrome that associated with a 1.76 (SE 0.27) cm increase in height and an ABCA3 missense variant implicated in interstitial lung disease that associated with reduced FEV1/FVC ratio. Association analyses with 1,134 disease traits yielded five additional variant-disease associations. We also observed contrasting levels of recessiveness between two more-common, classical Mendelian diseases. Carriers of cystic fibrosis variants exhibited increased risk of several mitigated disease phenotypes, whereas carriers of spinal muscular atrophy alleles showed no evidence of altered phenotypes. Incomplete penetrance of cystic fibrosis carrier phenotypes did not appear to be mediated by common allelic variation on the functional haplotype. Our results show that many disease-associated recessive variants can produce mitigated phenotypes in heterozygous carriers and motivate further work exploring penetrance mechanisms.
Although germline de novo copy number variants (CNVs) are known causes of autism spectrum disorder (ASD), the contribution of mosaic (early-developmental) copy number variants (mCNVs) has not been explored. In this study, we assessed the contribution of mCNVs to ASD by ascertaining mCNVs in genotype array intensity data from 12,077 probands with ASD and 5,500 unaffected siblings. We detected 46 mCNVs in probands and 19 mCNVs in siblings, affecting 2.8-73.8% of cells. Probands carried a significant burden of large (>4-Mb) mCNVs, which were detected in 25 probands but only one sibling (odds ratio = 11.4, 95% confidence interval = 1.5-84.2, P = 7.4 × 10⁻⁴). Event size positively correlated with severity of ASD symptoms (P = 0.016). Surprisingly, we did not observe mosaic analogues of the short de novo CNVs recurrently observed in ASD (e.g., 16p11.2). We further experimentally validated two mCNVs in postmortem brain tissue from 59 additional probands. These results indicate that mCNVs contribute a previously unexplained component of ASD risk.
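The burden statistics for large mCNVs follow from a 2×2 table built from the counts in the abstract: 25 of 12,077 probands and 1 of 5,500 siblings. The Python sketch below recomputes the odds ratio with a Woolf (log-scale) confidence interval and a one-sided Fisher's exact test. The abstract does not say which interval or test was used, so both choices are assumptions, and the function names are hypothetical.

```python
import math


def odds_ratio_with_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio for the table [[a, b], [c, d]] with a Woolf log-scale CI.

    a, b: carriers / non-carriers in group 1; c, d: same for group 2.
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


def fisher_upper_tail(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact P: chance of >= a carriers falling in group 1,
    with all table margins held fixed (hypergeometric upper tail)."""
    carriers = a + c
    group1, group2 = a + b, c + d
    denom = math.comb(group1 + group2, carriers)
    numer = sum(
        math.comb(group1, k) * math.comb(group2, carriers - k)
        for k in range(a, min(carriers, group1) + 1)
    )
    return numer / denom


# Counts from the abstract.
PROBANDS, SIBLINGS = 12_077, 5_500
table = (25, PROBANDS - 25, 1, SIBLINGS - 1)

or_est, ci_low, ci_high = odds_ratio_with_ci(*table)
p_one_sided = fisher_upper_tail(*table)

if __name__ == "__main__":
    print(f"odds ratio = {or_est:.1f} (95% CI {ci_low:.1f}-{ci_high:.1f})")
    print(f"one-sided Fisher P = {p_one_sided:.2g}")
```

Under these assumptions the recomputed odds ratio and interval come out close to the reported 11.4 and 1.5-84.2. The wide interval reflects the single sibling carrier, which dominates the standard error of the log odds ratio.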
The human genome contains hundreds of thousands of regions harboring copy-number variants (CNV). However, the phenotypic effects of most such polymorphisms are unknown because only larger CNVs have been ascertainable from SNP-array data generated by large biobanks. We developed a computational approach leveraging haplotype sharing in biobank cohorts to more sensitively detect CNVs. Applied to UK Biobank, this approach accounted for approximately half of all rare gene inactivation events produced by genomic structural variation. This CNV call set enabled a detailed analysis of associations between CNVs and 56 quantitative traits, identifying 269 independent associations (p < 5 × 10⁻⁸) likely to be causally driven by CNVs. Putative target genes were identifiable for nearly half of the loci, enabling insights into dosage sensitivity of these genes and uncovering several gene-trait relationships. These results demonstrate the ability of haplotype-informed analysis to provide insights into the genetic basis of human complex traits.
Many regions in the human genome vary in length among individuals due to variable numbers of tandem repeats (VNTRs). To assess the phenotypic impact of VNTRs genome-wide, we applied a statistical imputation approach to estimate the lengths of 9,561 autosomal VNTR loci in 418,136 unrelated UK Biobank participants and 838 GTEx participants. Association and statistical fine-mapping analyses identified 58 VNTRs that appeared to influence a complex trait in UK Biobank, 18 of which also appeared to modulate expression or splicing of a nearby gene. Non-coding VNTRs at TMCO1 and EIF3H appeared to generate the largest known contributions of common human genetic variation to risk of glaucoma and colorectal cancer, respectively. Each of these two VNTRs associated with a >2-fold range of risk across individuals. These results reveal a substantial and previously unappreciated role of non-coding VNTRs in human health and gene regulation.