Knowing the sequence specificities of DNA- and RNA-binding proteins is essential for developing models of the regulatory processes in biological systems and for identifying causal disease variants. Here we show that sequence specificities can be ascertained from experimental data with 'deep learning' techniques, which offer a scalable, flexible and unified computational approach for pattern discovery. Using a diverse array of experimental data and evaluation metrics, we find that deep learning outperforms other state-of-the-art methods, even when training on in vitro data and testing on in vivo data. We call this approach DeepBind and have built a stand-alone software tool that is fully automatic and handles millions of sequences per experiment. Specificities determined by DeepBind are readily visualized as a weighted ensemble of position weight matrices or as a 'mutation map' that indicates how variations affect binding within a specific sequence.
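The 'mutation map' idea can be illustrated with a minimal sketch: substitute every base at every position and record how the predicted binding score changes. The scoring function below is a toy position weight matrix with made-up values, standing in for a trained model. It is not DeepBind's network or its software interface, and the names `TOY_PWM`, `pwm_score`, and `mutation_map` are hypothetical.

```python
from typing import Callable, Dict, List

BASES = "ACGT"

# Hypothetical log-odds PWM for the 4-bp motif ACGT (illustrative values only).
TOY_PWM: List[Dict[str, float]] = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
]


def pwm_score(seq: str, pwm: List[Dict[str, float]] = TOY_PWM) -> float:
    """Return the best PWM match score over all windows of seq."""
    width = len(pwm)
    if len(seq) < width:
        raise ValueError("sequence is shorter than the motif")
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )


def mutation_map(
    seq: str, score: Callable[[str], float] = pwm_score
) -> List[Dict[str, float]]:
    """For each position, map each base to (mutant score - reference score)."""
    reference = score(seq)
    rows = []
    for pos in range(len(seq)):
        row = {}
        for alt in BASES:
            mutated = seq[:pos] + alt + seq[pos + 1:]
            row[alt] = score(mutated) - reference
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Print one row of score changes per position of a short example sequence.
    for pos, row in enumerate(mutation_map("TTACGTTT")):
        print(pos, row)
```

Any function that maps a sequence to a score can be passed as `score`, so the same loop would apply to a trained model's prediction function.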
Transcription factor (TF) DNA sequence preferences direct their regulatory activity, but are currently known for only ∼1% of eukaryotic TFs. Broadly sampling DNA-binding domain (DBD) types from multiple eukaryotic clades, we determined DNA sequence preferences for >1,000 TFs encompassing 54 different DBD classes from 131 diverse eukaryotes. We find that closely related DBDs almost always have very similar DNA sequence preferences, enabling inference of motifs for ∼34% of the ∼170,000 known or predicted eukaryotic TFs. Sequences matching both measured and inferred motifs are enriched in chromatin immunoprecipitation sequencing (ChIP-seq) peaks and upstream of transcription start sites in diverse eukaryotic lineages. SNPs defining expression quantitative trait loci in Arabidopsis promoters are also enriched for predicted TF binding sites. Importantly, our motif "library" can be used to identify specific TFs whose binding may be altered by human disease risk alleles. These data present a powerful resource for mapping transcriptional networks across eukaryotes.
While the majority of cells contain a single nucleus, cell types such as trophoblasts, osteoclasts, and skeletal myofibers require multinucleation. One advantage of multinucleation can be the assignment of distinct functions to different nuclei, but comprehensive interrogation of transcriptional heterogeneity within multinucleated tissues has been challenging due to the presence of a shared cytoplasm. Here, we utilized single-nucleus RNA-sequencing (snRNA-seq) to determine the extent of transcriptional diversity within multinucleated skeletal myofibers. Nuclei from mouse skeletal muscle were profiled across the lifespan, which revealed the presence of distinct myonuclear populations emerging in postnatal development as well as aging muscle. Our datasets also provided a platform for discovery of genes associated with rare specialized regions of the muscle cell, including markers of the myotendinous junction and functionally validated factors expressed at the neuromuscular junction. These findings reveal that myonuclei within syncytial muscle fibers possess distinct transcriptional profiles that regulate muscle biology.
Over the past forty years, stable isotope analysis of bone (and tooth) collagen and hydroxyapatite has become a mainstay of archaeological and paleoanthropological reconstructions of paleodiet and paleoenvironment. Despite this method's frequent use across anthropological subdisciplines (and beyond), the present work represents the first attempt at gauging the effects of inter-laboratory variability engendered by differences in a) sample preparation, and b) analysis (instrumentation, working standards, and data calibration). Replicate analyses of a 14C-dated ancient human bone by twenty-one archaeological and paleoecological stable isotope laboratories revealed significant inter-laboratory isotopic variation for both collagen and carbonate. For bone collagen, we found a sizeable range of 1.8‰ for δ13Ccol and 1.9‰ for δ15Ncol among laboratories, but an interpretatively insignificant average pairwise difference of 0.2‰ and 0.4‰ for δ13Ccol and δ15Ncol respectively. For bone hydroxyapatite the observed range increased to a troublingly large 3.5‰ for δ13Cap and 6.7‰ for δ18Oap, with average pairwise differences of 0.6‰ for δ13Cap and a disquieting 2.0‰ for δ18Oap. In order to assess the effects of preparation versus analysis on isotopic variability among laboratories, a subset of the samples prepared by the participating laboratories were analyzed a second time on the same instrument. Based on this duplicate analysis, it was determined that roughly half of the isotopic variability among laboratories could be attributed to differences in sample preparation, with the other half resulting from differences in analysis (instrumentation, working standards, and data calibration). These findings have serious implications for choices made in the preparation and extraction of target biomolecules, the comparison of results obtained from different laboratories, and the interpretation of small differences in bone collagen and hydroxyapatite isotope values. To address the issues arising from inter-laboratory comparisons, we devise a novel measure we term the Minimum Meaningful Difference (MMD), and demonstrate its application.
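The two summary statistics quoted in this abstract, the range and the average pairwise difference, can be computed as in the sketch below. It assumes the average pairwise difference is the mean absolute difference over all unordered pairs of laboratory values, which the abstract does not spell out. The function names are hypothetical and the input values are made up for illustration; they are not the study's measurements. The MMD itself is not defined in the abstract, so it is not implemented here.

```python
from itertools import combinations
from statistics import mean
from typing import Sequence


def value_range(values: Sequence[float]) -> float:
    """Difference between the largest and smallest reported value."""
    return max(values) - min(values)


def mean_pairwise_difference(values: Sequence[float]) -> float:
    """Mean absolute difference over all unordered pairs of values."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    return mean(abs(a - b) for a, b in combinations(values, 2))


if __name__ == "__main__":
    # Hypothetical delta values (per mil) reported by five laboratories.
    labs = [-19.0, -19.1, -19.2, -19.1, -17.8]
    print(value_range(labs))
    print(mean_pairwise_difference(labs))
```

The range depends only on the two most extreme laboratories, while the pairwise mean averages over every pair. A single outlying laboratory can therefore produce a large range alongside a much smaller average pairwise difference.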
The Human Transcription Factors
Lambert, Samuel A.; Jolma, Arttu; Campitelli, Laura F. ...
Cell, 02/2018, Volume 172, Issue 4
Journal Article, peer reviewed, open access
Transcription factors (TFs) recognize specific DNA sequences to control chromatin and transcription, forming a complex system that guides expression of the genome. Despite keen interest in understanding how TFs control gene expression, it remains challenging to determine how the precise genomic binding sites of TFs are specified and how TF binding ultimately relates to regulation of transcription. This review considers how TFs are identified and functionally characterized, principally through the lens of a catalog of over 1,600 likely human TFs and binding motifs for two-thirds of them. Major classes of human TFs differ markedly in their evolutionary trajectories and expression patterns, underscoring distinct functions. TFs likewise underlie many different aspects of human physiology, disease, and variation, highlighting the importance of continued effort to understand TF-mediated gene regulation.
Knowing how and where transcription factors bind to the genome is crucial for understanding how they control gene expression. This Review looks at how human TFs are identified and the ways they interact with DNA sequences.
Emerging evidence suggests a key contribution to non-alcoholic fatty liver disease (NAFLD) pathogenesis by Th17 cells. The pathogenic characteristics and mechanisms of hepatic Th17 cells, however, remain unknown. Here, we uncover and characterize a distinct population of inflammatory hepatic CXCR3+Th17 (ihTh17) cells sufficient to exacerbate NAFLD pathogenesis. Hepatic ihTh17 cell accrual was dependent on the liver microenvironment and CXCR3 axis activation. Mechanistically, the pathogenic potential of ihTh17 cells correlated with increased chromatin accessibility, glycolytic output, and concomitant production of IL-17A, IFNγ, and TNFα. Modulation of glycolysis using 2-DG or cell-specific PKM2 deletion was sufficient to reverse ihTh17-centric inflammatory vigor and NAFLD severity. Importantly, ihTh17 cell characteristics, CXCR3 axis activation, and hepatic expression of glycolytic genes were conserved in human NAFLD. Together, our data show that the steatotic liver microenvironment regulates Th17 cell accrual, metabolism, and competence toward an ihTh17 fate. Modulation of these pathways holds potential for development of novel therapeutic strategies for NAFLD.
•Obesity promotes accrual of a unique inflammatory hepatic Th17 (ihTh17) cell subset
•ihTh17 cells are sufficient to exacerbate NAFLD progression
•The activation of CXCR3 axis in steatotic liver promotes ihTh17 cell accrual
•Glycolysis-associated PKM2 activity shapes the ihTh17 inflammatory potential
Non-alcoholic fatty liver disease (NAFLD) is associated with increased inflammation. Moreno-Fernandez et al. now show that the steatotic liver microenvironment gives rise to a distinct inflammatory hepatic Th17 (ihTh17) cell subset, which preferentially utilizes the glycolytic pathway to fuel tissue inflammation and promote NAFLD progression.
Runt-related transcription factor 1 (Runx1) can act as both an activator and a repressor. Here we show that CRISPR-mediated deletion of Runx1 in mouse metanephric mesenchyme-derived mK4 cells results in large-scale genome-wide changes to chromatin accessibility and gene expression. Open chromatin regions near down-regulated loci enriched for Runx sites in mK4 cells lose chromatin accessibility in Runx1 knockout cells, despite remaining Runx2-bound. Unexpectedly, regions near upregulated genes are depleted of Runx sites and are instead enriched for Zeb transcription factor binding sites. Re-expressing Zeb2 in Runx1 knockout cells restores suppression, and CRISPR-mediated deletion of Zeb1 and Zeb2 phenocopies the gained expression and chromatin accessibility changes seen in Runx1KO due in part to subsequent activation of factors like Grhl2. These data confirm that Runx1 activity is uniquely needed to maintain open chromatin at many loci, and demonstrate that Zeb proteins are required and sufficient to maintain Runx1-dependent genome-scale repression.
Cis-regulatory elements are coordinated to regulate the expression of their targeted genes. However, the joint measurement of cis-regulatory elements' activities and their interactions in spatial proximity is limited by the current sequencing approaches. We describe a method, NOMe-HiC, which simultaneously captures single-nucleotide polymorphisms, DNA methylation, chromatin accessibility (GpC methyltransferase footprints), and chromosome conformation changes from the same DNA molecule, together with the transcriptome, in a single assay. NOMe-HiC shows high concordance with state-of-the-art mono-omic assays across different molecular measurements and reveals coordinated chromatin accessibility at distal genomic segments in spatial proximity and novel types of long-range allele-specific chromatin accessibility.
Our understanding of gene regulation in plants is constrained by our limited knowledge of plant cis-regulatory DNA and its dynamics. We mapped DNase I hypersensitive sites (DHSs) in A. thaliana seedlings and used genomic footprinting to delineate ∼700,000 sites of in vivo transcription factor (TF) occupancy at nucleotide resolution. We show that variation associated with 72 diverse quantitative phenotypes localizes within DHSs. TF footprints encode an extensive cis-regulatory lexicon subject to recent evolutionary pressures, and widespread TF binding within exons may have shaped codon usage patterns. The architecture of A. thaliana TF regulatory networks is strikingly similar to that of animals in spite of diverged regulatory repertoires. We analyzed regulatory landscape dynamics during heat shock and photomorphogenesis, disclosing thousands of environmentally sensitive elements and enabling mapping of key TF regulatory circuits underlying these fundamental responses. Our results provide an extensive resource for the study of A. thaliana gene regulation and functional biology.
•A. thaliana regulatory DNA, TF footprints, and cis-regulatory lexicon are elucidated
•TF binding in protein-coding exons may have shaped A. thaliana codon usage
•A. thaliana TF network architecture is strikingly similar to human
•Light- and heat-cued regulatory DNA dynamics and TF network remodeling are revealed
Our understanding of plant gene regulation is constrained by our limited knowledge of plant cis-regulatory DNA and its dynamics in response to environmental cues. Sullivan et al. now establish nucleotide-resolution regulatory DNA landscapes for A. thaliana seedlings before and after exposure to light and heat, key environmental cues shaping plant growth and development. This study generates genome-wide, condition- and tissue-specific maps of TF occupancy, constructs condition-specific TF networks, and identifies hundreds of de novo TF motif models.
Transcription factors read the genome, fundamentally connecting DNA sequence to gene expression across diverse cell types. Determining how, where, and when TFs bind chromatin will advance our understanding of gene regulatory networks and cellular behavior. The 2017 ENCODE-DREAM in vivo Transcription-Factor Binding Site (TFBS) Prediction Challenge highlighted the value of chromatin accessibility data to TFBS prediction, establishing state-of-the-art methods for TFBS prediction from DNase-seq. However, the more recent Assay-for-Transposase-Accessible-Chromatin (ATAC)-seq has surpassed DNase-seq as the most widely-used chromatin accessibility profiling method. Furthermore, ATAC-seq is the only such technique available at single-cell resolution from standard commercial platforms. While ATAC-seq datasets grow exponentially, suboptimal motif scanning is unfortunately the most common method for TFBS prediction from ATAC-seq. To enable community access to state-of-the-art TFBS prediction from ATAC-seq, we (1) curated an extensive benchmark dataset (127 TFs) for ATAC-seq model training and (2) built "maxATAC", a suite of user-friendly, deep neural network models for genome-wide TFBS prediction from ATAC-seq in any cell type. With models available for 127 human TFs, maxATAC is the largest collection of high-performance TFBS prediction models for ATAC-seq. maxATAC performance extends to primary cells and single-cell ATAC-seq, enabling improved TFBS prediction in vivo. We demonstrate maxATAC's capabilities by identifying TFBS associated with allele-dependent chromatin accessibility at atopic dermatitis genetic risk loci.
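For context, the motif-scanning baseline that this abstract describes as suboptimal can be sketched as sliding a position weight matrix along both strands of a sequence (for example, one ATAC-seq peak) and reporting windows that score at or above a cutoff. This is a generic illustration, not maxATAC or any published scanner. The matrix values, the threshold, and the names `AAC_PWM`, `reverse_complement`, and `scan_motif` are assumptions made for the example.

```python
from typing import Dict, List, Tuple

PWM = List[Dict[str, float]]
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Hypothetical log-odds matrix for the 3-bp motif AAC (illustrative values only).
AAC_PWM: PWM = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
]


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase ACGT sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def scan_motif(seq: str, pwm: PWM, threshold: float) -> List[Tuple[int, str, float]]:
    """Return (forward-strand start, strand, score) for windows scoring >= threshold."""
    width = len(pwm)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for start in range(len(strand_seq) - width + 1):
            score = sum(pwm[i][strand_seq[start + i]] for i in range(width))
            if score >= threshold:
                # Convert reverse-strand offsets back to forward-strand coordinates.
                pos = start if strand == "+" else len(seq) - width - start
                hits.append((pos, strand, score))
    return sorted(hits)


if __name__ == "__main__":
    # Scan a short made-up sequence and print each hit.
    for hit in scan_motif("CCAACCCGTTCC", AAC_PWM, threshold=3.0):
        print(hit)
```

A scan like this considers sequence alone and ignores the accessibility signal around each site.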