Chromatin loops form a basic unit of interphase nuclear organization, with chromatin loop anchor points providing contacts between regulatory regions and promoters. However, the mutational landscape at these anchor points remains under-studied. Here, we describe the unusual patterns of somatic mutations and germline variation associated with loop anchor points and explore the underlying features influencing these patterns.
Analyses of whole genome sequencing datasets reveal that anchor points are strongly depleted for single nucleotide variants (SNVs) in tumours. Despite low SNV rates in their genomic neighbourhood, anchor points emerge as sites of evolutionary innovation, showing enrichment for structural variant (SV) breakpoints and a peak of SNVs at focal CTCF sites within the anchor points. Both CTCF-bound and non-CTCF anchor points harbour an excess of SV breakpoints in multiple tumour types and are prone to double-strand breaks in cell lines. Common fragile sites, which are hotspots for genome instability, also show elevated numbers of intersecting loop anchor points. Recurrently disrupted anchor points are enriched for genes with functions in cell cycle transitions and regions associated with predisposition to cancer. We also discover a novel class of CTCF-bound anchor points which overlap meiotic recombination hotspots and are enriched for the core PRDM9 binding motif, suggesting that the anchor points have been foci for diversity generated during recent human evolution.
We suggest that the unusual chromatin environment at loop anchor points underlies the elevated rates of variation observed, marking them as sites of regulatory importance but also genomic fragility.
We have surveyed the evolutionary trends of mammalian promoters and upstream sequences, utilising large sets of experimentally supported transcription start sites (TSSs). With 30,969 well-defined TSSs from mouse and 26,341 from human, there are sufficient numbers to draw statistically meaningful conclusions and to consider differences between promoter types. Unlike previous smaller studies, we have considered the effects of insertions, deletions, and transposable elements as well as nucleotide substitutions. The rate of promoter evolution relative to that of control sequences has not been consistent between lineages nor within lineages over time. The most pronounced manifestation of this heterotachy is the increased rate of evolution in primate promoters. This increase is seen across different classes of mutation, including substitutions and micro-indel events. We investigated the relationship between promoter and coding sequence selective constraint and suggest that they are generally uncorrelated. This analysis also identified a small number of mouse promoters associated with the immune response that are under positive selection in rodents. We demonstrate significant differences in divergence between functional promoter categories and identify a category of promoters, not associated with conventional protein-coding genes, that has the highest rates of divergence across mammals. We find that evolutionary rates vary both on a fine scale within mammalian promoters and also between different functional classes of promoters. The discovery of heterotachy in promoter evolution, in particular the accelerated evolution of primate promoters, has important implications for our understanding of human evolution and for strategies to detect primate-specific regulatory elements.
Disruption of gene regulation is known to play major roles in carcinogenesis and tumour progression. Here, we comprehensively characterize the mutational profiles of diverse transcription factor binding sites (TFBSs) across 1,574 completely sequenced cancer genomes encompassing 11 tumour types. We assess the relative rates and impact of the mutational burden at the binding sites of 81 transcription factors (TFs), by comparing the abundance and patterns of single base substitutions within putatively functional binding sites to control sites with matched sequence composition. There is a strong (1.43-fold) and significant excess of mutations at functional binding sites across TFs, and the mutations that accumulate in cancers are typically more disruptive than variants tolerated in extant human populations at the same sites. CTCF binding sites suffer an exceptionally high mutational load in cancer (3.31-fold excess) relative to control sites, and we demonstrate for the first time that this effect is seen in essentially all cancer types with sufficient data. The subset of CTCF sites involved in higher order chromatin structures has the highest mutational burden, suggesting a widespread breakdown of chromatin organization. However, we find no evidence for selection driving these distinctive patterns of mutation. The mutational load at CTCF-binding sites is substantially determined by replication timing and the mutational signature of the tumour in question, suggesting that selectively neutral processes underlie the unusual mutation patterns. Pervasive hyper-mutation within transcription factor binding sites rewires the regulatory landscape of the cancer genome, but it is dominated by mutational processes rather than selection.
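The fold-excess figures quoted above (1.43-fold, 3.31-fold) compare mutation rates at binding sites with rates at matched control sites. A minimal sketch of that ratio follows, assuming simple per-base rates; the function names and the counts in the usage lines are hypothetical illustrations, not the study's data or pipeline.

```python
def mutation_rate(mutations, bases):
    """Mutations per base over a set of genomic sites."""
    if bases <= 0:
        raise ValueError("bases must be positive")
    return mutations / bases


def fold_excess(site_mutations, site_bases, control_mutations, control_bases):
    """Ratio of the mutation rate at binding sites to the rate at control sites.

    Values above 1 indicate an excess at binding sites. The control sites are
    assumed to be matched for sequence composition before counting.
    """
    control_rate = mutation_rate(control_mutations, control_bases)
    if control_rate == 0:
        raise ValueError("control rate is zero; fold excess is undefined")
    return mutation_rate(site_mutations, site_bases) / control_rate


# Hypothetical counts chosen only to illustrate the calculation.
example = fold_excess(site_mutations=143, site_bases=100_000,
                      control_mutations=100, control_bases=100_000)
print(round(example, 2))
```

A ratio alone says nothing about significance; a real analysis would pair it with a statistical test and, as the abstract notes, account for covariates such as replication timing.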
Mammalian chromosomes fold into arrays of megabase‐sized topologically associating domains (TADs), which are arranged into compartments spanning multiple megabases of genomic DNA. TADs have internal substructures that are often cell type specific, but their higher‐order organization remains elusive. Here, we investigate TAD higher‐order interactions with Hi‐C through neuronal differentiation and show that they form a hierarchy of domains‐within‐domains (metaTADs) extending across genomic scales up to the range of entire chromosomes. We find that TAD interactions are well captured by tree‐like, hierarchical structures irrespective of cell type. metaTAD tree structures correlate with genetic, epigenomic and expression features, and structural tree rearrangements during differentiation are linked to transcriptional state changes. Using polymer modelling, we demonstrate that hierarchical folding promotes efficient chromatin packaging without the loss of contact specificity, highlighting a role far beyond the simple need for packing efficiency.
Synopsis
Genome‐wide mapping of chromatin architecture reveals a hierarchical folding of chromatin that involves higher‐order domain interactions across whole chromosomes, reflects epigenomic features and reorganizes upon differentiation‐induced gene expression changes.
Chromatin architecture is mapped genome‐wide using Hi‐C and a neuronal differentiation model from mESC to post‐mitotic neurons.
Mammalian chromosomes fold hierarchically in a manner that reflects epigenomic features and involves higher‐order domains (metaTADs) up to the chromosome scale.
metaTAD topologies are relatively conserved through differentiation, and their reorganization is related to gene expression changes.
Polymer modelling shows that hierarchical chromatin folding promotes efficient packaging without the loss of contact specificity.
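One way to picture a tree of domains-within-domains is to merge neighbouring TADs step by step according to how strongly they contact each other. The sketch below is an assumption for illustration only: it uses greedy merging of adjacent clusters by mean inter-cluster contact, and the function name `build_domain_tree` and the toy matrix are hypothetical rather than taken from the study.

```python
def build_domain_tree(contacts):
    """Build a nested-tuple tree over linearly ordered TADs.

    contacts[i][j] is an (assumed symmetric) contact score between TAD i and
    TAD j. At each step the two *adjacent* clusters with the highest mean
    inter-cluster contact are merged, so every internal node covers a
    contiguous run of TADs.
    """
    if not contacts:
        raise ValueError("need at least one TAD")
    # Each cluster is (subtree, member TAD indices).
    clusters = [(i, [i]) for i in range(len(contacts))]
    while len(clusters) > 1:
        best_k, best_score = 0, float("-inf")
        for k in range(len(clusters) - 1):
            left, right = clusters[k][1], clusters[k + 1][1]
            total = sum(contacts[i][j] for i in left for j in right)
            score = total / (len(left) * len(right))
            if score > best_score:  # ties keep the leftmost pair
                best_k, best_score = k, score
        (left_tree, left_ids), (right_tree, right_ids) = clusters[best_k:best_k + 2]
        clusters[best_k:best_k + 2] = [((left_tree, right_tree), left_ids + right_ids)]
    return clusters[0][0]


# Toy contact matrix for four TADs (hypothetical values).
toy = [
    [0, 9, 1, 1],
    [9, 0, 2, 1],
    [1, 2, 0, 8],
    [1, 1, 8, 0],
]
print(build_domain_tree(toy))
```

Restricting merges to adjacent clusters keeps each higher-order domain contiguous along the chromosome, which matches the nested picture described above; other linkage rules would be equally valid choices for a sketch.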
Endometrioid ovarian carcinoma (EnOC) demonstrates substantial clinical and molecular heterogeneity. Here, we report whole exome sequencing of 112 EnOC cases following rigorous pathological assessment. We detect a high frequency of mutation in CTNNB1 (43%), PIK3CA (43%), ARID1A (36%), PTEN (29%), KRAS (26%), TP53 (26%) and SOX8 (19%), a recurrently-mutated gene previously unreported in EnOC. POLE and mismatch repair protein-encoding genes were mutated at lower frequency (6% and 18%, respectively) with significant co-occurrence. A molecular taxonomy is constructed, identifying clinically distinct EnOC subtypes: cases with TP53 mutation demonstrate greater genomic complexity, are commonly FIGO stage III/IV at diagnosis (48%), are frequently incompletely debulked (44%) and demonstrate inferior survival; conversely, cases with CTNNB1 mutation, which is mutually exclusive with TP53 mutation, demonstrate low genomic complexity and excellent clinical outcome, and are predominantly stage I/II at diagnosis (89%) and completely resected (87%). Moreover, we identify the WNT, MAPK/RAS and PI3K pathways as good candidate targets for molecular therapeutics in EnOC.
We recently found that hnRNP A1, a protein implicated in many aspects of RNA processing, acts as an auxiliary factor for the Drosha-mediated processing of a microRNA precursor, pri-miR-18a. Here, we provide the mechanism by which hnRNP A1 regulates this event. We show that hnRNP A1 binds to the loop of pri-miR-18a and induces a relaxation at the stem, creating a more favorable cleavage site for Drosha. We found that approximately 14% of all pri-miRNAs have highly conserved loops, which we predict act as landing pads for trans-acting factors influencing miRNA processing. In agreement, we show that 2′O-methyl oligonucleotides targeting conserved loops (LooptomiRs) abolish miRNA processing in vitro. Furthermore, we present evidence to support an essential role of conserved loops for pri-miRNA processing. Altogether, these data suggest the existence of auxiliary factors for the processing of specific miRNAs, revealing an additional level of complexity for the regulation of miRNA biogenesis.
In this study we investigated the strengths and modes of selection associated with nucleosome positioning in the human lineage through the comparison of interspecies and intraspecies rates of divergence. We identify significant evidence for both positive and negative selection linked to human nucleosome positioning for the first time, implicating a widespread and important role for DNA sequence in the location of well-positioned nucleosomes. Selection appears to be acting on particular base substitutions to maintain optimum GC compositions in core and linker regions, with, e.g., unexpectedly elevated rates of C→T substitutions during recent human evolution at linker regions 60-90 bp from the nucleosome dyad but significant depletion of the same substitutions within nucleosome core regions. These patterns are strikingly consistent with the known relationships between genomic sequence composition and nucleosome assembly. By stratifying nucleosomes according to the GC content of their genomic neighborhood, we also show that the strength and direction of selection detected is dictated by local GC content. Intriguingly, these signatures of selection are not restricted to nucleosomes in close proximity to exons, suggesting the correct positioning of nucleosomes is not only important in and around coding regions. This analysis provides strong evidence that the genomic sequences associated with nucleosomes are not evolving neutrally, and suggests that underlying DNA sequence is an important factor in nucleosome positioning. Recent signatures of selection linked to genomic features as ubiquitous as the nucleosome have important implications for human genome evolution and disease.
Evolutionary change in gene expression is generally considered to be a major driver of phenotypic differences between species. We investigated innate immune diversification by analyzing interspecies differences in the transcriptional responses of primary human and mouse macrophages to the Toll-like receptor (TLR)–4 agonist lipopolysaccharide (LPS). By using a custom platform permitting cross-species interrogation coupled with deep sequencing of mRNA 5′ ends, we identified extensive divergence in LPS-regulated orthologous gene expression between humans and mice (24% of orthologues were identified as “divergently regulated”). We further demonstrate concordant regulation of human-specific LPS target genes in primary pig macrophages. Divergently regulated orthologues were enriched for genes encoding cellular “inputs” such as cell surface receptors (e.g., TLR6, IL-7Rα) and functional “outputs” such as inflammatory cytokines/chemokines (e.g., CCL20, CXCL13). Conversely, intracellular signaling components linking inputs to outputs were typically concordantly regulated. Functional consequences of divergent gene regulation were confirmed by showing LPS pretreatment boosts subsequent TLR6 responses in mouse but not human macrophages, in keeping with mouse-specific TLR6 induction. Divergently regulated genes were associated with a large dynamic range of gene expression, and specific promoter architectural features (TATA box enrichment, CpG island depletion). Surprisingly, regulatory divergence was also associated with enhanced interspecies promoter conservation. Thus, the genes controlled by complex, highly conserved promoters that facilitate dynamic regulation are also the most susceptible to evolutionary change.
Gene-gene interactions (epistasis) are thought to be important in shaping complex traits, but they have been under-explored in genome-wide association studies (GWAS) due to the computational challenge of enumerating billions of single nucleotide polymorphism (SNP) combinations. Fast screening tools are needed to make epistasis analysis routinely available in GWAS.
We present BiForce to support high-throughput analysis of epistasis in GWAS for either quantitative or binary disease (case-control) traits. BiForce achieves great computational efficiency by using memory efficient data structures, Boolean bitwise operations and multithreaded parallelization. It performs a full pair-wise genome scan to detect interactions involving SNPs with or without significant marginal effects using appropriate Bonferroni-corrected significance thresholds. We show that BiForce is more powerful and significantly faster than published tools for both binary and quantitative traits in a series of performance tests on simulated and real datasets. We demonstrate BiForce in analysing eight metabolic traits in a GWAS cohort (323 697 SNPs, >4500 individuals) and two disease traits in another (>340 000 SNPs, >1750 cases and 1500 controls) on a 32-node computing cluster. BiForce completed analyses of the eight metabolic traits within 1 day, identified nine epistatic pairs of SNPs in five metabolic traits and 18 SNP pairs in two disease traits. BiForce can make the analysis of epistasis a routine exercise in GWAS and thus improve our understanding of the role of epistasis in the genetic regulation of complex traits.
The software is free and can be downloaded from http://bioinfo.utu.fi/BiForce/.
Contact: wenhua.wei@igmm.ed.ac.uk
Supplementary data are available at Bioinformatics online.
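To illustrate why Boolean bitwise operations help in a full pair-wise scan, the sketch below encodes each SNP as three bitmasks (one per genotype class) and fills the 3x3 genotype table for a SNP pair with AND plus a bit count. This is a minimal illustration under stated assumptions, not BiForce's actual implementation: the function names are hypothetical, genotypes are assumed to be coded 0/1/2 with no missing values, and the Bonferroni threshold simply divides an assumed alpha of 0.05 by the number of SNP pairs.

```python
def genotype_masks(genotypes):
    """Encode one SNP as three integer bitmasks, one per genotype (0, 1, 2).

    Bit i of masks[g] is set when individual i carries genotype g.
    """
    masks = [0, 0, 0]
    for i, g in enumerate(genotypes):
        if g not in (0, 1, 2):
            raise ValueError("genotypes must be coded 0, 1 or 2")
        masks[g] |= 1 << i
    return masks


def pair_table(masks_a, masks_b):
    """3x3 table of joint genotype counts for two SNPs.

    Each cell is the number of set bits in the AND of two masks, so a whole
    cohort is processed with a handful of bitwise operations per SNP pair.
    """
    return [[bin(ma & mb).count("1") for mb in masks_b] for ma in masks_a]


def pair_count(n_snps):
    """Number of unordered SNP pairs in a full pair-wise scan."""
    return n_snps * (n_snps - 1) // 2


def bonferroni_threshold(n_snps, alpha=0.05):
    """Per-test threshold when correcting for all SNP pairs (alpha is assumed)."""
    return alpha / pair_count(n_snps)


snp_a = genotype_masks([0, 1, 2, 0, 1, 2])
snp_b = genotype_masks([0, 0, 1, 1, 2, 2])
print(pair_table(snp_a, snp_b))
print(pair_count(323_697), bonferroni_threshold(323_697))
```

For the 323 697 SNPs mentioned above, the pair count runs into the tens of billions, which is why the per-pair work has to be reduced to a few machine-level operations.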
Local interactions between neighbouring SNPs are hypothesized to be able to capture variants missing from genome-wide association studies (GWAS) via haplotype effects but have not been thoroughly explored. We have used a new high-throughput analysis tool to probe this underexplored area through full pair-wise genome scans and conventional GWAS in diastolic and systolic blood pressure and six metabolic traits in the Northern Finland Birth Cohort 1966 (NFBC1966) and the Atherosclerosis Risk in Communities study cohort (ARIC). Genome-wide significant interactions were detected in ARIC for systolic blood pressure between PLEKHA7 (a known GWAS locus for blood pressure) and GPR180 (which plays a role in vascular remodelling), and also for triglycerides as local interactions within the 11q23.3 region (replicated significantly in NFBC1966), which notably harbours several loci (BUD13, ZNF259 and APOA5) contributing to triglyceride levels. Tests of the local interactions within the 11q23.3 region conditional on the top GWAS signal suggested the presence of two independent functional variants, each with supportive evidence for their roles in gene regulation. Local interactions captured 9 additional GWAS loci identified in this study (3 significantly replicated) and 73 from previous GWAS (24 in the eight traits and 49 in related traits). We conclude that the detection of local interactions requires adequate SNP coverage of the genome and that such interactions are only likely to be detectable between SNPs in low linkage disequilibrium. Analysing local interactions is a potentially valuable complement to GWAS and can provide new insights into the biology underlying variation in complex traits.
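Because the conclusion above hinges on linkage disequilibrium (LD) between neighbouring SNPs, a small sketch of one common LD proxy may help: the squared Pearson correlation of genotype dosages (0/1/2). This is an illustrative assumption rather than the measure used in the study, and the function name `genotype_r2` is hypothetical.

```python
def genotype_r2(dosages_a, dosages_b):
    """Squared Pearson correlation between two SNPs' genotype dosages.

    Values near 0 suggest low LD; values near 1 suggest the two SNPs carry
    largely redundant information.
    """
    if len(dosages_a) != len(dosages_b) or not dosages_a:
        raise ValueError("need two non-empty dosage lists of equal length")
    n = len(dosages_a)
    mean_a = sum(dosages_a) / n
    mean_b = sum(dosages_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosages_a, dosages_b))
    var_a = sum((a - mean_a) ** 2 for a in dosages_a)
    var_b = sum((b - mean_b) ** 2 for b in dosages_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("monomorphic SNP: correlation is undefined")
    return cov * cov / (var_a * var_b)


# Toy dosage vectors (hypothetical): identical SNPs versus uncorrelated SNPs.
print(genotype_r2([0, 1, 2, 0], [0, 1, 2, 0]))
print(genotype_r2([0, 2, 0, 2], [0, 0, 2, 2]))
```

In a local-interaction scan, a filter of this kind could be applied to neighbouring SNP pairs before testing, keeping only pairs whose r² falls below a chosen cut-off; the cut-off itself would be an analysis decision not specified here.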