The iterative threading assembly refinement (I-TASSER) server is an integrated platform for automated protein structure and function prediction based on the sequence-to-structure-to-function paradigm. Starting from an amino acid sequence, I-TASSER first generates three-dimensional (3D) atomic models from multiple threading alignments and iterative structural assembly simulations. The function of the protein is then inferred by structurally matching the 3D models with other known proteins. The output from a typical server run contains full-length secondary and tertiary structure predictions, and functional annotations on ligand-binding sites, Enzyme Commission numbers and Gene Ontology terms. An estimate of the accuracy of the predictions is provided based on the confidence score of the modeling. This protocol provides new insights and guidelines for designing online server systems for state-of-the-art protein structure and function prediction. The server is available at http://zhanglab.ccmb.med.umich.edu/I-TASSER.
Sequencing data has become a standard measure of diverse cellular activities. For example, gene expression is accurately measured by RNA sequencing (RNA-Seq) libraries, protein-DNA interactions are captured by chromatin immunoprecipitation sequencing (ChIP-Seq), protein-RNA interactions by crosslinking immunoprecipitation sequencing (CLIP-Seq) or RNA immunoprecipitation sequencing (RIP-Seq), and DNA accessibility by assay for transposase-accessible chromatin (ATAC-Seq), DNase or MNase sequencing libraries. Processing the data from these sequencing techniques involves library-specific approaches. However, in all cases, once the sequencing libraries are processed, the result is a count table specifying the estimated number of reads originating from each genomic locus. Differential analysis to determine which loci have different cellular activity under different conditions starts with the count table and iterates through a cycle of data assessment, preparation and analysis. Such complex analysis often relies on multiple programs and is therefore a challenge for those without programming skills.
We developed DEBrowser as an R Bioconductor project to interactively visualize every step of the differential analysis, without programming. The application provides a rich and interactive web-based graphical user interface built on R's Shiny infrastructure. DEBrowser allows users to visualize data with various types of graphs that can be explored further by selecting and re-plotting any desired subset of data. Using the visualization approaches provided, users can identify and correct technical variations such as batch effects and sequencing depth that affect differential analysis. We show DEBrowser's ease of use by reproducing the analysis of two previously published data sets.
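The effect of sequencing depth on a count table can be illustrated with a minimal sketch. The table, the sample names, and the counts-per-million (CPM) scaling below are assumptions chosen for illustration; they are not taken from DEBrowser, which may offer other normalization methods.

```python
# Hypothetical count table: rows are genomic loci, columns are samples.
counts = {
    "geneA": {"control": 100, "treated": 400},
    "geneB": {"control": 300, "treated": 600},
    "geneC": {"control": 600, "treated": 1000},
}


def library_sizes(table):
    """Sum the reads of every sample (its sequencing depth)."""
    sizes = {}
    for row in table.values():
        for sample, n in row.items():
            sizes[sample] = sizes.get(sample, 0) + n
    return sizes


def counts_per_million(table):
    """Scale each count by its sample's depth so samples are comparable."""
    sizes = library_sizes(table)
    return {
        locus: {s: n * 1_000_000 / sizes[s] for s, n in row.items()}
        for locus, row in table.items()
    }


cpm = counts_per_million(counts)
```

In this toy table the treated sample was sequenced twice as deeply, so the raw count of geneB doubles even though its CPM value is the same in both samples. This is the kind of technical variation that has to be accounted for before loci are compared across conditions.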
DEBrowser is a flexible, intuitive, web-based analysis platform that enables an iterative and interactive analysis of count data without any requirement of programming knowledge.
The emergence of high-throughput technologies that produce vast amounts of genomic data, such as next-generation sequencing (NGS), is transforming biological research. The dramatic increase in the volume of data, together with the variety and continuous change of data processing tools, algorithms and databases, makes analysis the main bottleneck for scientific discovery. The processing of high-throughput datasets typically involves many different computational programs, each of which performs a specific step in a pipeline. Given the wide range of applications and organizational infrastructures, there is a great need for highly parallel, flexible, portable, and reproducible data processing frameworks. Several platforms currently exist for the design and execution of complex pipelines. Unfortunately, current platforms lack the combination of parallelism, portability, flexibility and/or reproducibility required by the current research environment. To address these shortcomings, workflow frameworks that provide a platform to develop and share portable pipelines have recently arisen. We complement these new platforms by providing a graphical user interface to create, maintain, and execute complex pipelines. Such a platform will simplify robust and reproducible workflow creation for non-technical users as well as provide a robust platform to maintain pipelines for large organizations.
To simplify development, maintenance, and execution of complex pipelines, we created DolphinNext. DolphinNext facilitates building and deploying complex pipelines using a modular approach implemented in a graphical interface that relies on the powerful Nextflow workflow framework. It provides:
1. A drag-and-drop user interface that visualizes pipelines and allows users to create pipelines without familiarity with the underlying programming languages.
2. Modules to execute and monitor pipelines in distributed computing environments such as high-performance clusters and/or the cloud.
3. Reproducible pipelines with version tracking and stand-alone versions that can be run independently.
4. Modular process design with process revisioning support to increase reusability and pipeline development efficiency.
5. Pipeline sharing with GitHub and automated testing.
6. Extensive reports with R-markdown and Shiny support for interactive data visualization and analysis.
DolphinNext is a flexible, intuitive, web-based data processing and analysis platform that enables creating, deploying, sharing, and executing complex Nextflow pipelines with extensive revisioning and interactive reporting to enhance reproducible results.
Following testicular spermatogenesis, mammalian sperm continue to mature in a long epithelial tube known as the epididymis, which plays key roles in remodeling sperm protein, lipid, and RNA composition. To understand the roles of the epididymis in reproductive biology, we generated a single-cell atlas of the murine epididymis and vas deferens. We recovered key epithelial cell types including principal cells, clear cells, and basal cells, along with associated support cells that include fibroblasts, smooth muscle, macrophages and other immune cells. Moreover, our data illuminate extensive regional specialization of principal cell populations across the length of the epididymis. In addition to region-specific specialization of principal cells, we find evidence for functionally specialized subpopulations of stromal cells, and, most notably, two distinct populations of clear cells. Our dataset extends existing knowledge of epididymal biology, and provides a wealth of information on potential regulatory and signaling factors that bear future investigation.
Several recent studies link parental environments to phenotypes in subsequent generations. In this work, we investigate the mechanism by which paternal diet affects offspring metabolism. Protein restriction in mice affects small RNA (sRNA) levels in mature sperm, with decreased let-7 levels and increased amounts of 5’ fragments of glycine transfer RNAs (tRNAs). In testicular sperm, tRNA fragments are scarce but increase in abundance as sperm mature in the epididymis. Epididymosomes (vesicles that fuse with sperm during epididymal transit) carry RNA payloads matching those of mature sperm and can deliver RNAs to immature sperm in vitro. Functionally, tRNA-glycine-GCC fragments repress genes associated with the endogenous retroelement MERVL, in both embryonic stem cells and embryos. Our results shed light on sRNA biogenesis and its dietary regulation during posttesticular sperm maturation, and they also link tRNA fragments to regulation of endogenous retroelements active in the preimplantation embryo.
Single-cell sequencing technologies have revealed an unexpectedly broad repertoire of cells required to mediate complex functions in multicellular organisms. Despite the multiple roles of adipose tissue in maintaining systemic metabolic homeostasis, adipocytes are thought to be largely homogenous with only 2 major subtypes recognized in humans so far. Here we report the existence and characteristics of 4 distinct human adipocyte subtypes, and of their respective mesenchymal progenitors. The phenotypes of these distinct adipocyte subtypes are differentially associated with key adipose tissue functions, including thermogenesis, lipid storage, and adipokine secretion. The transcriptomic signature of “brite/beige” thermogenic adipocytes reveals mechanisms for iron accumulation and protection from oxidative stress, necessary for mitochondrial biogenesis and respiration upon activation. Importantly, this signature is enriched in human supraclavicular adipose tissue, confirming that these cells comprise thermogenic depots in vivo, and explaining previous findings of a rate-limiting role of iron in adipose tissue browning. The mesenchymal progenitors that give rise to beige/brite adipocytes express a unique set of cytokines and transcriptional regulators involved in immune cell modulation of adipose tissue browning. Unexpectedly, we also find adipocyte subtypes specialized for high-level expression of the adipokines adiponectin or leptin, associated with distinct transcription factors previously implicated in adipocyte differentiation. The finding of a broad adipocyte repertoire derived from a distinct set of mesenchymal progenitors, and of the transcriptional regulators that can control their development, provides a framework for understanding human adipose tissue function and its role in metabolic disease.
In addition to sculpting eukaryotic transcripts by removing introns, pre-mRNA splicing greatly impacts protein composition of the emerging mRNP. The exon junction complex (EJC), deposited upstream of exon-exon junctions after splicing, is a major constituent of spliced mRNPs. Here, we report a comprehensive analysis of the endogenous human EJC protein and RNA interactomes. We confirm that the major “canonical” EJC occupancy site in vivo lies 24 nucleotides upstream of exon junctions and that the majority of exon junctions carry an EJC. Unexpectedly, we find that endogenous EJCs multimerize with one another and with numerous SR proteins to form megadalton-sized complexes in which SR proteins are super-stoichiometric to EJC core factors. This tight physical association may explain known functional parallels between EJCs and SR proteins. Further, their protection of long mRNA stretches from nuclease digestion suggests that endogenous EJCs and SR proteins cooperate to promote mRNA packaging and compaction.
► EJCs reside at ∼80% of exon-exon junctions in human mRNAs
► EJCs and SR proteins multimerize to form high molecular weight complexes
► EJCs stabilize SR protein association with polyA+ RNA
► This EJC/SR protein collaboration likely functions in mRNA compaction
Exon junction complexes associate with the majority of splice junctions across human mRNAs, where they multimerize and interact directly with SR proteins. These associations lead to the formation of higher-order architecture within spliced mRNPs.
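The canonical occupancy site reported above, 24 nucleotides upstream of each exon-exon junction, can be located on a spliced transcript with a short sketch. The function name, the 1-based coordinates, and the convention that the last nucleotide of an exon is position -1 relative to the junction are all assumptions made for illustration, not details given in the abstract.

```python
from itertools import accumulate

EJC_OFFSET = 24  # nucleotides upstream of the junction, as reported above


def canonical_ejc_sites(exon_lengths, offset=EJC_OFFSET):
    """Return 1-based spliced-mRNA positions `offset` nt upstream of each junction.

    Assumed convention: a junction sits immediately after the last nucleotide
    of every exon except the final one, and that last nucleotide is position -1.
    Sites that would fall before the start of the transcript are skipped.
    """
    exon_ends = list(accumulate(exon_lengths))[:-1]  # the last exon has no junction
    sites = [end - offset + 1 for end in exon_ends]
    return [pos for pos in sites if pos >= 1]


# Hypothetical transcript with three exons of 100, 50 and 200 nt.
sites = canonical_ejc_sites([100, 50, 200])
```

For this hypothetical transcript the junctions follow positions 100 and 150, so the sketch reports positions 77 and 127. A single-exon transcript has no junction and yields an empty list.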
Understanding distinct gene expression patterns of normal adult and developing fetal human pancreatic α- and β-cells is crucial for developing stem cell therapies, islet regeneration strategies, and therapies designed to increase β-cell function in patients with diabetes (type 1 or 2). Toward that end, we have developed methods to highly purify α-, β-, and δ-cells from human fetal and adult pancreata by intracellular staining for the cell-specific hormone content, sorting the subpopulations by flow cytometry, and, using next-generation RNA sequencing, we report the detailed transcriptomes of fetal and adult α- and β-cells. We observed that human islet composition was not influenced by age, sex, or BMI, and transcripts for inflammatory gene products were noted in fetal β-cells. In addition, within highly purified adult glucagon-expressing α-cells, we observed surprisingly high insulin mRNA expression, but not insulin protein expression. This transcriptome analysis from highly purified islet α- and β-cell subsets from fetal and adult pancreata offers clear implications for strategies that seek to increase insulin expression in type 1 and type 2 diabetes.
Significant progress has revealed transcriptional inputs that underlie regulation of artery and vein endothelial cell fates. However, little is known concerning genome-wide regulation of this process. Therefore, such studies are warranted to address this gap.
To identify and characterize artery- and vein-specific endothelial enhancers in the human genome, thereby gaining insights into mechanisms by which blood vessel identity is regulated.
Using chromatin immunoprecipitation and deep sequencing for markers of active chromatin in human arterial and venous endothelial cells, we identified several thousand artery- and vein-specific regulatory elements. Computational analysis revealed that NR2F2 (nuclear receptor subfamily 2, group F, member 2) sites were overrepresented in vein-specific enhancers, suggesting a direct role in promoting vein identity. Subsequent integration of chromatin immunoprecipitation and deep sequencing data sets with RNA sequencing revealed that NR2F2 regulated 3 distinct aspects related to arteriovenous identity. First, consistent with previous genetic observations, NR2F2 directly activated enhancer elements flanking cell cycle genes to drive their expression. Second, NR2F2 was essential to directly activate vein-specific enhancers and their associated genes. Our genomic approach further revealed that NR2F2 acts with ERG (ETS-related gene) at many of these sites to drive vein-specific gene expression. Finally, NR2F2 directly repressed only a small number of artery enhancers in venous cells to prevent their activation, including a distal element upstream of the artery-specific transcription factor, HEY2 (hes related family bHLH transcription factor with YRPW motif 2). In arterial endothelial cells, this enhancer was normally bound by ERG, which was also required for arterial HEY2 expression. By contrast, in venous endothelial cells, NR2F2 was bound to this site, together with ERG, and prevented its activation.
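Overrepresentation of a binding-site motif in one enhancer class is often assessed with a hypergeometric test. The sketch below shows that general idea only; the abstract does not state which test was used, and the function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total, with_motif, selected, selected_with_motif):
    """Upper-tail hypergeometric p-value, P(X >= selected_with_motif).

    Models drawing `selected` enhancers without replacement from `total`
    enhancers, of which `with_motif` contain the motif.
    """
    upper = min(with_motif, selected)
    favourable = sum(
        comb(with_motif, k) * comb(total - with_motif, selected - k)
        for k in range(selected_with_motif, upper + 1)
    )
    return favourable / comb(total, selected)


# Hypothetical counts: 5000 enhancers in total, 1000 of them with a motif site;
# 2000 enhancers are vein-specific, and 600 of those carry the motif.
p_value = hypergeom_enrichment_p(5000, 1000, 2000, 600)
```

With these made-up numbers about 400 motif-containing enhancers would be expected among the vein-specific set by chance, so observing 600 gives a very small p-value. Real analyses typically also correct for testing many motifs at once.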
By leveraging a genome-wide approach, we revealed mechanistic insights into how NR2F2 functions in multiple roles to maintain venous identity. Importantly, characterization of its role at a crucial artery enhancer upstream of HEY2 established a novel mechanism by which artery-specific expression can be achieved.
Human immunodeficiency virus 1 (HIV-1) infection is associated with heightened inflammation and excess risk of cardiovascular disease, cancer and other complications. These pathologies persist despite antiretroviral therapy. In two independent cohorts, we found that innate lymphoid cells (ILCs) were depleted in the blood and gut of people with HIV-1, even with effective antiretroviral therapy. ILC depletion was associated with neutrophil infiltration of the gut lamina propria, type 1 interferon activation, increased microbial translocation and natural killer (NK) cell skewing towards an inflammatory state, with chromatin structure and phenotype typical of WNT transcription factor TCF7-dependent memory T cells. Cytokines that are elevated during acute HIV-1 infection reproduced the ILC and NK cell abnormalities ex vivo. These results show that inflammatory cytokines associated with HIV-1 infection irreversibly disrupt ILCs. This results in loss of gut epithelial integrity, microbial translocation and memory NK cells with heightened inflammatory potential, and explains the chronic inflammation in people with HIV-1.