Abstract
Chromatin interaction assays, particularly Hi-C, enable detailed studies of genome architecture in multiple organisms and model systems, resulting in a deeper understanding of gene expression regulation mechanisms mediated by epigenetics. However, the analysis and interpretation of Hi-C data remain challenging due to technical biases, limiting direct comparisons of datasets obtained in different experiments and laboratories. As a result, removing biases from Hi-C-generated chromatin contact matrices is a critical data analysis step. Our novel approach, HiConfidence, eliminates biases from Hi-C data by weighting chromatin contacts according to their consistency between replicates so that low-quality replicates do not substantially influence the result. The algorithm is effective for the analysis of global changes in chromatin structures such as compartments and topologically associating domains. We apply the HiConfidence approach to several Hi-C datasets with significant technical biases that could not be analyzed effectively using existing methods, and obtain meaningful biological conclusions. In particular, HiConfidence aids in the study of how changes in histone acetylation pattern affect chromatin organization in Drosophila melanogaster S2 cells. The method is freely available on GitHub: https://github.com/victorykobets/HiConfidence.
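The idea of weighting contacts by their consistency between replicates can be illustrated with a minimal sketch. The weighting formula, the function names, and the parameter `k` below are assumptions made for illustration only; they are not taken from the HiConfidence repository, which should be consulted for the actual method. The sketch assumes two replicate contact matrices of equal shape with non-negative values.

```python
def confidence_weights(rep1, rep2, k=1.0):
    """Per-pixel weight in [0, 1]; 1.0 where the two replicates agree exactly.

    Hypothetical formula, chosen only for illustration:
        w = (1 - |a - b| / (a + b)) ** k
    Pixels that are empty in both replicates get weight 0.
    """
    weights = []
    for row1, row2 in zip(rep1, rep2):
        out = []
        for a, b in zip(row1, row2):
            total = a + b
            out.append((1.0 - abs(a - b) / total) ** k if total > 0 else 0.0)
        weights.append(out)
    return weights


def weighted_mean(matrix, weights):
    """Confidence-weighted mean of matrix values (0.0 if all weights are 0)."""
    num = sum(v * w for rv, rw in zip(matrix, weights) for v, w in zip(rv, rw))
    den = sum(w for rw in weights for w in rw)
    return num / den if den > 0 else 0.0


if __name__ == "__main__":
    rep1 = [[10, 4], [4, 8]]
    rep2 = [[10, 0], [0, 8]]          # off-diagonal pixels disagree completely
    merged = [[10, 2], [2, 8]]        # plain average of the two replicates
    w = confidence_weights(rep1, rep2)
    print(w)                          # inconsistent pixels receive weight 0
    print(weighted_mean(merged, w))   # only the consistent pixels contribute
```

In this toy case the pixels on which the replicates disagree are effectively excluded from the summary statistic, which is the behavior the abstract describes for low-quality replicates.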
Proximity ligation approaches, which are widely used to study the spatial organization of the genome, also make it possible to reveal patterns of RNA-DNA interactions. Here, we use RedC, an RNA-DNA proximity ligation approach, to assess the distribution of major RNA types along the genomes of E. coli, B. subtilis, and the thermophilic archaeon T. adornatum. We find that (i) messenger RNAs preferentially interact with their cognate genes and the genes located downstream in the same operon, which is consistent with polycistronic transcription; (ii) ribosomal RNAs preferentially interact with active protein-coding genes in both bacteria and archaea, indicating co-transcriptional translation; and (iii) 6S noncoding RNA, a negative regulator of bacterial transcription, is depleted from active genes in E. coli and B. subtilis. We conclude that the RedC data provide a rich resource for studying both transcription dynamics and the function of noncoding RNAs in microbial organisms.
In homeotherms, the alpha-globin gene clusters are located within permanently open genome regions enriched in housekeeping genes. Terminal erythroid differentiation results in dramatic upregulation of alpha-globin genes, making their expression comparable to the rRNA transcriptional output. Little is known about the influence of the erythroid-specific alpha-globin gene transcription outburst on adjacent, widely expressed genes and large-scale chromatin organization. Here, we have analyzed the total transcription output, the overall chromatin contact profile, and CTCF binding within the 2.7 Mb segment of chicken chromosome 14 harboring the alpha-globin gene cluster in cultured lymphoid cells and cultured erythroid cells before and after induction of terminal erythroid differentiation.
We found that, similarly to mammalian genomes, the chicken genome is organized into TADs and compartments. Full activation of alpha-globin gene transcription in differentiated erythroid cells is correlated with upregulation of several adjacent housekeeping genes and the emergence of abundant intergenic transcription. An extended chromosome region encompassing the alpha-globin cluster becomes significantly decompacted in differentiated erythroid cells and depleted in CTCF binding and CTCF-anchored chromatin loops, while the sub-TAD harboring the alpha-globin gene cluster and the upstream major regulatory element (MRE) becomes highly enriched with chromatin interactions as compared to lymphoid and proliferating erythroid cells. The alpha-globin gene domain and the neighboring loci reside within the A-like chromatin compartment in both lymphoid and erythroid cells and become further segregated from the upstream gene desert upon terminal erythroid differentiation.
Our findings demonstrate that the effects of tissue-specific transcription activation are not restricted to the host genomic locus but affect the overall chromatin structure and transcriptional output of the encompassing topologically associating domain.
Technological advances have led to the creation of large epigenetic datasets, including information about DNA binding proteins and DNA spatial structure. Hi-C experiments have revealed that chromosomes are subdivided into sets of self-interacting domains called Topologically Associating Domains (TADs). TADs are involved in the regulation of gene expression activity, but the mechanisms of their formation are not yet fully understood. Here, we focus on machine learning methods to characterize DNA folding patterns in Drosophila based on chromatin marks across three cell lines. We present linear regression models with four types of regularization, gradient boosting, and recurrent neural networks (RNN) as tools to study chromatin folding characteristics associated with TADs given epigenetic chromatin immunoprecipitation data. The bidirectional long short-term memory RNN architecture produced the best prediction scores and identified biologically relevant features. The distributions of the protein Chriz (Chromator) and the histone modification H3K4me3 were selected as the most informative features for the prediction of TAD characteristics. This approach may be adapted to any similar biological dataset of chromatin features across various cell lines and species. The code for the implemented pipeline, Hi-ChIP-ML, is publicly available: https://github.com/MichalRozenwald/Hi-ChIP-ML.
Genomes are folded into a complex three-dimensional (3D) structure. Some features of this organization are common to all eukaryotes, but little is known about its evolution. Here, we have studied the 3D organization and regulation of the zebrafish globin gene domain and compared them with those of other vertebrate species. In birds and mammals, the α- and β-globin genes are segregated into separate clusters located on different chromosomes and organized into chromatin domains of different types, whereas in cold-blooded vertebrates, including Danio rerio, α- and β-globin genes are organized into common clusters. The major globin gene locus of Danio rerio is of particular interest as it is located in a genomic area that is syntenic in vertebrates and is controlled by a conserved enhancer. We have found that the major globin gene locus of Danio rerio is structurally and functionally segregated into two spatially distinct subloci harboring either adult or embryo-larval globin genes. These subloci demonstrate different organization at the level of chromatin domains and different modes of spatial organization, which appears to be due to selective interaction of the upstream enhancer with the sublocus harboring globin genes of the adult type. These data are discussed in terms of the evolution of linear and 3D organization of gene clusters in vertebrates.
Construction of 3D chromosome models based on single-cell Hi-C data constitutes an important challenge. We present a reconstruction approach, DPDchrom, that incorporates basic knowledge of whether the reconstructed conformation should be coil-like or globular, as well as spring relaxation at contact sites. In contrast to previously published protocols, DPDchrom can naturally form a globular conformation due to the presence of explicit solvent. Benchmarking of this and several other methods on artificial polymer models reveals similar reconstruction accuracy at high contact density and an advantage of DPDchrom at low contact density. To compare 3D structures irrespective of spatial orientation and scale, we propose the Modified Jaccard Index. We analyzed two sources of contact dropout: contact radius change and random contact sampling. We found that the reconstruction accuracy depends exponentially on the number of contacts per genomic bin, which makes it possible to estimate the reconstruction accuracy in advance. We applied DPDchrom to model chromosome configurations based on single-cell Hi-C data of mouse oocytes and found that these configurations differ significantly from a random one, which is consistent with other studies.
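A comparison of 3D structures that ignores orientation and scale can be sketched as follows. This is not the Modified Jaccard Index as defined by the authors, whose exact formulation is not given in the abstract; it is an ordinary Jaccard index computed on contact sets, with distances normalized by the mean bond length so that uniform rescaling has no effect. The function names and the `cutoff` parameter are hypothetical.

```python
import math


def contact_set(coords, cutoff=1.5):
    """Pairs (i, j), j > i + 1, closer than `cutoff` mean bond lengths.

    Distances are divided by the mean distance between consecutive beads,
    so uniformly rescaling the structure leaves the contact set unchanged.
    Only pairwise distances are used, so rotations and translations of the
    structure have no effect either.
    """
    n = len(coords)
    bond = sum(math.dist(coords[i], coords[i + 1]) for i in range(n - 1)) / (n - 1)
    return {
        (i, j)
        for i in range(n)
        for j in range(i + 2, n)
        if math.dist(coords[i], coords[j]) / bond < cutoff
    }


def jaccard(contacts_a, contacts_b):
    """Jaccard index of two contact sets; two empty sets count as identical."""
    union = contacts_a | contacts_b
    return len(contacts_a & contacts_b) / len(union) if union else 1.0


if __name__ == "__main__":
    square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
    # Same shape, scaled by 3 and rotated 90 degrees about the z axis.
    moved = [(0, 0, 0), (0, 3, 0), (-3, 3, 0), (-3, 0, 0)]
    line = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
    print(jaccard(contact_set(square), contact_set(moved)))
    print(jaccard(contact_set(square), contact_set(line)))
```

Working with contact sets rather than raw coordinates avoids any structural alignment step, at the cost of depending on the chosen cutoff.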
Recent data suggest that the areas of enhancer action are restricted by partitioning of the genome into topologically associating domains (TADs). However, most of the observations on 3D genome organization have been made by conventional methods in cell populations, where only average characteristics of the so-called typical cell can be identified. On the other hand, FISH-based studies demonstrated that distances between various genomic regions vary significantly in individual cells. To gain further insight into the functional significance of 3D genome organization, it is necessary to estimate the plasticity of this organization. The high-throughput chromosome conformation capture protocol (Hi-C) has been modified recently to allow construction of chromatin contact frequency maps for individual cells. Using this modified protocol, we have constructed Hi-C maps for 20 Drosophila cells (line Dm-BG3c2). In the best cell, we have captured ~12% of the theoretically available contacts. To analyze these sparse contact matrices, we have developed software tools that account for noise by comparing maps from individual cells with artificially generated random matrices. The results of our analysis demonstrate that contact chromatin domains and chromatin compartments do exist in individual Drosophila cells. We also observed the hierarchical organization of contact chromatin domains in individual cells. Consequently, this hierarchical organization does not represent a population average but is an intrinsic feature of chromatin folding. The population TAD borders are well reproduced in individual cells and correlate with certain epigenetic signatures. The presence of TADs in individual cells argues that TADs do not represent a population average, as postulated by the chromatin loop extrusion model, but rather originate from condensation of chromatin domains possessing specific epigenetic signatures.
ACKNOWLEDGMENTS This work was supported by a Russian Science Foundation grant #19-14-00016 and RFBR grant 18-29-13013.
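The comparison of a sparse single-cell contact map with artificially generated random matrices, mentioned in the abstract above, can be illustrated with a minimal sketch. The statistic (fraction of contacts within a domain), the distance-preserving shuffle, and all function names are assumptions chosen for illustration; the abstract does not specify how the authors generate their random matrices.

```python
import random


def intra_domain_fraction(contacts, boundaries):
    """Fraction of contacts (i, j) whose two bins lie in the same domain.

    `boundaries` is a sorted list of bin indices at which a new domain starts.
    """
    def domain(bin_index):
        return sum(1 for b in boundaries if b <= bin_index)

    if not contacts:
        return 0.0
    same = sum(1 for i, j in contacts if domain(i) == domain(j))
    return same / len(contacts)


def shuffle_preserving_distance(contacts, n_bins, rng):
    """Move every contact to a random position, keeping its separation |i - j|."""
    shuffled = []
    for i, j in contacts:
        sep = abs(j - i)
        start = rng.randrange(n_bins - sep)
        shuffled.append((start, start + sep))
    return shuffled


def empirical_p_value(contacts, boundaries, n_bins, n_controls=1000, seed=0):
    """Share of random controls whose statistic reaches the observed value."""
    rng = random.Random(seed)
    observed = intra_domain_fraction(contacts, boundaries)
    hits = 0
    for _ in range(n_controls):
        control = shuffle_preserving_distance(contacts, n_bins, rng)
        if intra_domain_fraction(control, boundaries) >= observed:
            hits += 1
    return (hits + 1) / (n_controls + 1)


if __name__ == "__main__":
    contacts = [(0, 3), (1, 4), (10, 13), (11, 14)]   # toy single-cell contacts
    boundaries = [0, 10]                              # two domains: 0-9, 10-19
    print(intra_domain_fraction(contacts, boundaries))
    print(empirical_p_value(contacts, boundaries, n_bins=20))
```

Keeping each contact's genomic separation fixed in the controls means that any enrichment within domains cannot be explained by the distance decay of contact frequency alone.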
In the last decade, experimental methods for determining the spatial organization of chromatin have been actively developed. One well-known class of such methods is the 3C-based methods. Hi-C experiments, which are a genome-wide variation of 3C-based methods, make it possible to obtain contact maps for a population of 10 million cells with a resolution of 500 base pairs. In parallel with the improvement of experiments, theoretical models were created to describe the spatial organization of chromatin. These models include the crumpled (fractal) globule, the SBS model, the loop extrusion model, and others. All these models describe chromatin conformation well on different spatial scales. Another task is the recovery of chromatin conformation from the experimental Hi-C contact map obtained for the cell population. This task is ill-posed because it has many solutions. In addition, the individual features of the spatial organization of chromatin in individual cells are not visible due to averaging.
miR-10b is silenced in normal neuroglial cells of the brain but commonly activated in glioma, where it assumes an essential tumor-promoting role. We demonstrate that the entire miR-10b-hosting HOXD locus is activated in glioma via a cis-acting mechanism involving 3D chromatin reorganization and CTCF-cohesin-mediated looping. This mechanism requires two interacting lncRNAs, HOXD-AS2 and LINC01116, one associated with the HOXD3/HOXD4/miR-10b promoter and the other with the remote enhancer. Knockdown of either lncRNA in glioma cells alters CTCF and cohesin binding, abolishes chromatin looping, inhibits the expression of all genes within the HOXD locus, and leads to glioma cell death. Conversely, in cortical astrocytes, enhancer activation is sufficient for HOXD/miR-10b locus reorganization, gene derepression, and neoplastic cell transformation. LINC01116 RNA is essential for this process. Our results demonstrate the interplay of two lncRNAs in the chromatin folding and concordant regulation of miR-10b and multiple HOXD genes normally silenced in astrocytes and triggering the neoplastic glial transformation.
• Chromatin topology of the HOXD locus is altered between the brain and glioblastoma
• lncRNAs regulate CTCF/cohesin-dependent loop formation and HOXD gene expression
• Activation of LINC01116 enhancer RNA leads to astrocyte transformation
• The transformation depends on the loop formation and miR-10b derepression
Deforzh et al. investigated a common mechanism of HOXD/miR-10b genes’ derepression in glioblastoma and revealed the coordinated activity of two lncRNAs, HOXD-embedded HOXD-AS2 and distant enhancer-associated LINC01116, in CTCF/cohesin binding, chromatin topology, and astrocyte transformation. The work shed light on the molecular mechanisms of gliomagenesis.