System-wide profiling of genes and proteins in mammalian cells produces lists of differentially expressed genes/proteins that need to be further analyzed for their collective functions in order to extract new knowledge. Once unbiased lists of genes or proteins are generated from such experiments, these lists are used as input for computing enrichment with existing lists created from prior knowledge organized into gene-set libraries. While many enrichment analysis tools and gene-set library databases have been developed, there is still room for improvement.
Here, we present Enrichr, an integrative web-based and mobile software application that includes new gene-set libraries, an alternative approach to rank enriched terms, and various interactive visualization approaches to display enrichment results using the JavaScript library, Data Driven Documents (D3). The software can also be embedded into any tool that performs gene list analysis. We applied Enrichr to analyze nine cancer cell lines by comparing their enrichment signatures to the enrichment signatures of matched normal tissues. We observed a common pattern of upregulation of the polycomb group PRC2 and enrichment for the histone mark H3K27me3 in many cancer cell lines, as well as alterations in Toll-like receptor and interleukin signaling in K562 cells when compared with normal myeloid CD33+ cells. Such analyses provide global visualization of critical differences between normal tissues and cancer cell lines but can be applied to many other scenarios.
Enrichr is an easy-to-use, intuitive, web-based enrichment analysis tool that provides various types of visualization summaries of the collective functions of gene lists. Enrichr is open source and freely available online at: http://amp.pharm.mssm.edu/Enrichr.
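The idea of computing enrichment of an input gene list against a gene-set library can be sketched as follows. This is a minimal illustration using the standard hypergeometric test (equivalent to a one-sided Fisher exact test); it is not Enrichr's own ranking approach, which is not specified in this abstract. The function names, the toy library, its term names, and the background size of 20,000 genes are all assumptions made for the example.

```python
from math import comb


def hypergeom_pvalue(overlap, list_size, set_size, background_size):
    """P(X >= overlap) when list_size genes are drawn at random from a
    background that contains set_size members of the gene set."""
    max_overlap = min(list_size, set_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(set_size, k) * comb(background_size - set_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def rank_terms(gene_list, library, background_size):
    """Return (term, p-value, overlapping genes) sorted by ascending p-value."""
    genes = set(gene_list)
    results = []
    for term, members in library.items():
        member_set = set(members)
        overlap = genes & member_set
        p = hypergeom_pvalue(len(overlap), len(genes), len(member_set), background_size)
        results.append((term, p, sorted(overlap)))
    return sorted(results, key=lambda r: r[1])


# Hypothetical toy library and input list, for illustration only.
toy_library = {
    "PRC2_targets": ["EZH2", "SUZ12", "EED", "HOXA9"],
    "TLR_signaling": ["TLR4", "MYD88", "IRAK1"],
}
toy_list = ["EZH2", "SUZ12", "EED", "TP53"]

for term, p, overlap in rank_terms(toy_list, toy_library, background_size=20000):
    print(term, p, overlap)
```

In practice the p-values would also be corrected for multiple testing across all terms in the library, which this sketch omits.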
Purpose: Cyberbullying differs from face-to-face bullying and may negatively influence adolescent mental health, but there is a lack of definitive research on this topic. This study examines longitudinal associations between cyberbullying involvement and adolescent mental health.
Methods: Participants were 2,480 teenagers taking part in the Olympic Regeneration in East London study. We collected information from participants when they were 12–13 years old and again 1 year later to examine links between involvement in cyberbullying and future symptoms of depression and social anxiety, and mental well-being.
Results: At baseline, 14% reported being cybervictims, 8% reported being cyberbullies, and 20% reported being cyberbully-victims in the previous year. Compared to uninvolved adolescents, cybervictims and cyberbully-victims were significantly more likely to report symptoms of depression (cybervictims: odds ratio [OR] = 1.44, 95% confidence interval [CI] 1.00, 2.06; cyberbully-victims: OR = 1.54, 95% CI 1.13, 2.09) and social anxiety (cybervictims: OR = 1.52, 95% CI 1.11, 2.07; cyberbully-victims: OR = 1.44, 95% CI 1.10, 1.89) but not below average well-being (cybervictims: relative risk ratio = 1.28, 95% CI .86, 1.91; cyberbully-victims: relative risk ratio = 1.38, 95% CI .95, 1.99) at 1 year follow-up, after adjustment for confounding factors including baseline mental health.
Conclusions: This study emphasizes the high prevalence of cyberbullying and the potential of cybervictimization as a risk factor for future depressive symptoms, social anxiety symptoms, and below average well-being among adolescents. Future research should identify protective factors and possible interventions to reduce adolescent cyberbullying.
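For readers less familiar with how an odds ratio and its 95% confidence interval are obtained, the sketch below computes an unadjusted odds ratio from a 2x2 table using the standard log-odds (Woolf) interval. It does not reproduce the study's estimates, which were adjusted for confounding factors; the function name and the counts are hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio_ci(exposed_cases, exposed_noncases,
                  unexposed_cases, unexposed_noncases, z=1.96):
    """Unadjusted odds ratio with a Wald-type confidence interval
    computed on the log-odds scale (z=1.96 gives roughly 95%)."""
    odds_ratio = (exposed_cases * unexposed_noncases) / (
        exposed_noncases * unexposed_cases
    )
    # Standard error of log(OR) for a 2x2 table.
    se = sqrt(
        1 / exposed_cases
        + 1 / exposed_noncases
        + 1 / unexposed_cases
        + 1 / unexposed_noncases
    )
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts: 40 of 100 exposed and 100 of 400 unexposed
# participants report symptoms at follow-up.
or_, lo, hi = odds_ratio_ci(40, 60, 100, 300)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f}, {hi:.2f}")
```

An interval whose lower bound sits at or near 1.00, as for the cybervictim depression estimate reported above, indicates an association at the margin of conventional statistical significance.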
Adverse drug reactions (ADRs) are a central consideration during drug development. Here we present a machine learning classifier to prioritize ADRs for approved drugs and pre-clinical small-molecule compounds by combining chemical structure (CS) and gene expression (GE) features. The GE data is from the Library of Integrated Network-based Cellular Signatures (LINCS) L1000 dataset that measured changes in GE before and after treatment of human cells with over 20 000 small-molecule compounds including most of the FDA-approved drugs. Using various benchmarking methods, we show that the integration of GE data with the CS of the drugs can significantly improve the predictability of ADRs. Moreover, transforming GE features to enrichment vectors of biological terms further improves the predictive capability of the classifiers. The most predictive biological-term features can assist in understanding the drug mechanisms of action. Finally, we applied the classifier to all >20 000 small molecules profiled, and developed a web portal for browsing and searching predictive small-molecule/ADR connections.
The interface for the adverse event predictions for the >20 000 LINCS compounds is available at http://maayanlab.net/SEP-L1000/. Contact: avi.maayan@mssm.edu
Supplementary data are available at Bioinformatics online.
Definitive hematopoiesis emerges during embryogenesis via an endothelial-to-hematopoietic transition. We attempted to induce this process in mouse fibroblasts by screening a panel of factors for hemogenic activity. We identified a combination of four transcription factors, Gata2, Gfi1b, cFos, and Etv6, that efficiently induces endothelial-like precursor cells, with the subsequent appearance of hematopoietic cells. The precursor cells express a human CD34 reporter, Sca1, and Prominin1 within a global endothelial transcription program. Emergent hematopoietic cells possess nascent hematopoietic stem cell gene-expression profiles and cell-surface phenotypes. After transgene silencing and reaggregation culture, the specified cells generate hematopoietic colonies in vitro. Thus, we show that a simple combination of transcription factors is sufficient to induce a complex, dynamic, and multistep developmental program in vitro. These findings provide insights into the specification of definitive hemogenesis and a platform for future development of patient-specific stem and progenitor cells, as well as more-differentiated blood products.
• Gata2, Gfi1b, cFos, and Etv6 induce a hemogenic program in fibroblasts
• Initial production of Sca1+Prom1+ endothelial-like precursors
• Progression to emergence of hematopoietic cells with HSC features
• Emergent cells generate colonies in vitro after reaggregation culture
A cocktail of four transcription factors, Gata2, cFos, Gfi1b, and Etv6, can induce a hemogenic program in mouse fibroblasts, leading to formation of an endothelium that gives rise to hematopoietic cells.
Gene expression data are accumulating exponentially in public repositories. Reanalysis and integration of themed collections from these studies may provide new insights, but requires further human curation. Here we report a crowdsourcing project to annotate and reanalyse a large number of gene expression profiles from Gene Expression Omnibus (GEO). Through a massive open online course on Coursera, over 70 participants from over 25 countries identify and annotate 2,460 single-gene perturbation signatures, 839 disease versus normal signatures, and 906 drug perturbation signatures. All these signatures are unique and are manually validated for quality. Global analysis of these signatures confirms known associations and identifies novel associations between genes, diseases and drugs. The manually curated signatures are used as a training set to develop classifiers for extracting similar signatures from the entire GEO repository. We develop a web portal to serve these signatures for query, download and visualization.
Identifying differentially expressed genes (DEG) is a fundamental step in studies that perform genome-wide expression profiling. Typically, DEG are identified by univariate approaches such as Significance Analysis of Microarrays (SAM) or Linear Models for Microarray Data (LIMMA) for processing cDNA microarrays, and differential gene expression analysis based on the negative binomial distribution (DESeq) or Empirical analysis of Digital Gene Expression data in R (edgeR) for RNA-seq profiling.
Here we present a new geometrical multivariate approach to identify DEG called the Characteristic Direction. We demonstrate that the Characteristic Direction method is significantly more sensitive than existing methods for identifying DEG in the context of transcription factor (TF) and drug perturbation responses over a large number of microarray experiments. We also benchmarked the Characteristic Direction method using synthetic data, as well as RNA-Seq data. A large collection of microarray expression data from TF perturbations (73 experiments) and drug perturbations (130 experiments) extracted from the Gene Expression Omnibus (GEO), as well as an RNA-Seq study that profiled genome-wide gene expression and STAT3 DNA binding in two subtypes of diffuse large B-cell lymphoma, were used for benchmarking the method using real data. ChIP-Seq data identifying DNA binding sites of the perturbed TFs, as well as known drug targets of the perturbing drugs, were used as a prior-knowledge silver standard for validation. In all cases the Characteristic Direction DEG calling method outperformed other methods. We find that when drugs are applied to cells in various contexts, the proteins that interact with the drug targets are differentially expressed and more of the corresponding genes are discovered by the Characteristic Direction method. In addition, we show that the Characteristic Direction conceptualization can be used to perform improved gene set enrichment analyses when compared with the gene-set enrichment analysis (GSEA) and the hypergeometric test.
The application of the Characteristic Direction method may shed new light on relevant biological mechanisms that would have remained undiscovered by the current state-of-the-art DEG methods. The method is freely accessible via various open source code implementations using four popular programming languages: R, Python, MATLAB and Mathematica, all available at: http://www.maayanlab.net/CD.
Protein-protein, cell signaling, metabolic, and transcriptional interaction networks are useful for identifying connections between lists of experimentally identified genes/proteins. However, besides physical or co-expression interactions there are many ways in which pairs of genes, or their protein products, can be associated. By systematically incorporating knowledge on shared properties of genes from diverse sources to build functional association networks (FANs), researchers may be able to identify additional functional interactions between groups of genes that are not readily apparent.
Genes2FANs is a web-based tool and a database that utilizes 14 carefully constructed FANs and a large-scale protein-protein interaction (PPI) network to build subnetworks that connect lists of human and mouse genes. The FANs are created from mammalian gene set libraries where mouse genes are converted to their human orthologs. The tool takes as input a list of human or mouse Entrez gene symbols to produce a subnetwork and a ranked list of intermediate genes that are used to connect the query input list. In addition, users can enter any PubMed search term and then the system automatically converts the returned results to gene lists using GeneRIF. This gene list is then used as input to generate a subnetwork from the user's PubMed query. As a case study, we applied Genes2FANs to connect disease genes from 90 well-studied disorders. We find an inverse correlation between the counts of links connecting disease genes through PPI and links connecting disease genes through FANs, separating diseases into two categories.
Genes2FANs is a useful tool for interpreting the relationships between gene/protein lists in the context of their various functions and networks. Combining functional association interactions with physical PPIs can be useful for revealing new biology and help form hypotheses for further experimentation. Our finding that disease genes in many cancers are mostly connected through PPIs whereas other complex diseases, such as autism and type-2 diabetes, are mostly connected through FANs without PPIs, can guide better strategies for disease gene discovery. Genes2FANs is available at: http://actin.pharm.mssm.edu/genes2FANs.
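One way to picture how a functional association network can be derived from a gene-set library, and how intermediate genes connecting a query list can be ranked, is sketched below. This is an illustrative assumption rather than the Genes2FANs implementation: the abstract does not state how edges are weighted or how intermediates are scored, so the shared-term count, the `min_shared` threshold, the function names, and the toy library are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_fan(library, min_shared=1):
    """Link two genes when they co-occur in at least min_shared gene sets.
    Returns {(gene_a, gene_b): number_of_shared_terms} with gene_a < gene_b."""
    shared = defaultdict(int)
    for members in library.values():
        for a, b in combinations(sorted(set(members)), 2):
            shared[(a, b)] += 1
    return {pair: n for pair, n in shared.items() if n >= min_shared}


def intermediates(fan, query):
    """Rank non-query genes by how many query genes they are linked to,
    keeping only genes that connect at least two query genes."""
    query = set(query)
    neighbors = defaultdict(set)
    for a, b in fan:
        if a in query and b not in query:
            neighbors[b].add(a)
        elif b in query and a not in query:
            neighbors[a].add(b)
    ranked = [(gene, len(hits)) for gene, hits in neighbors.items() if len(hits) >= 2]
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


# Hypothetical toy gene-set library.
toy_library = {
    "term_A": ["G1", "G2", "G3"],
    "term_B": ["G2", "G3", "G4"],
    "term_C": ["G4", "G5"],
}

fan = build_fan(toy_library)
print(intermediates(fan, ["G1", "G4"]))
```

Here G2 and G3 each link both query genes, so they are returned as intermediates, while G5 touches only one query gene and is dropped.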
Peatlands provide important ecosystem services including carbon storage and biodiversity conservation. Remote sensing shows potential for monitoring peatlands, but most off-the-shelf data products are developed for unsaturated environments and it is unclear how well they can perform in peatland ecosystems. Sphagnum moss is an important peatland genus with specific characteristics which can affect spectral reflectance, and we hypothesized that the prevalence of Sphagnum in a peatland could affect the spectral signature of the area. This article combines results from both laboratory and field experiments to assess the relationship between spectral indices and the moisture content and gross primary productivity (GPP) of peatland (blanket bog) vegetation species. The aim was to consider how well the selected indices perform under a range of conditions, and whether Sphagnum has a significant impact on the relationships tested. We found that both water indices tested, the normalized difference water index (NDWI) and the floating water band index (fWBI), were sensitive to the water content changes in Sphagnum moss in the laboratory, and there was little difference between them. Most of the vegetation indices tested, namely the normalized difference vegetation index (NDVI), enhanced vegetation index (EVI), structure insensitive pigment index (SIPI), and chlorophyll index (CIm), were found to have a strong relationship with GPP both in the laboratory and in the field. The NDVI and EVI are useful for large-scale estimation of GPP, but are sensitive to the proportion of Sphagnum present. The CIm is less affected by different species proportions and might therefore be the best to use in areas where vegetation species cover is unknown. The photochemical reflectance index (PRI) is shown to be best suited to small-scale studies of single species.
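For reference, three of the indices named above can be written as simple band ratios, sketched here in their commonly used textbook forms. The exact wavelengths and coefficients used in the study are not given in this abstract, so the band choices (a near-infrared and a shortwave-infrared band for NDWI) and the default EVI coefficients are assumptions, as are the function names and the sample reflectance values.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, soil=1.0):
    """Enhanced vegetation index with the commonly used default coefficients."""
    return gain * (nir - red) / (nir + c1 * red - c2 * blue + soil)


def ndwi(nir, swir):
    """Normalized difference water index in its NIR/SWIR form; the study
    may have used different bands."""
    return (nir - swir) / (nir + swir)


# Hypothetical reflectance values (fractions between 0 and 1).
sample = {"nir": 0.5, "red": 0.1, "blue": 0.05, "swir": 0.3}

print("NDVI:", ndvi(sample["nir"], sample["red"]))
print("EVI:", evi(sample["nir"], sample["red"], sample["blue"]))
print("NDWI:", ndwi(sample["nir"], sample["swir"]))
```

Because all three are ratios of reflectances, they can be applied unchanged to laboratory spectra, field spectrometer readings, or per-pixel satellite values, which is what makes comparisons across scales like those described above possible.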
Resistance to infection is critically dependent on the ability of pattern recognition receptors to recognize microbial invasion and induce protective immune responses. One such family of receptors are the C-type lectins, which are central to antifungal immunity. These receptors activate key effector mechanisms upon recognition of conserved fungal cell-wall carbohydrates. However, several other immunologically active fungal ligands have been described; these include melanin, for which the mechanism of recognition is hitherto undefined. Here we identify a C-type lectin receptor, melanin-sensing C-type lectin receptor (MelLec), that has an essential role in antifungal immunity through recognition of the naphthalene-diol unit of 1,8-dihydroxynaphthalene (DHN)-melanin. MelLec recognizes melanin in conidial spores of Aspergillus fumigatus as well as in other DHN-melanized fungi. MelLec is ubiquitously expressed by CD31+ endothelial cells in mice, and is also expressed by a sub-population of these cells that co-express epithelial cell adhesion molecule and are detected only in the lung and the liver. In mouse models, MelLec was required for protection against disseminated infection with A. fumigatus. In humans, MelLec is also expressed by myeloid cells, and we identified a single nucleotide polymorphism of this receptor that negatively affected myeloid inflammatory responses and significantly increased the susceptibility of stem-cell transplant recipients to disseminated Aspergillus infections. MelLec therefore recognizes an immunologically active component commonly found on fungi and has an essential role in protective antifungal immunity in both mice and humans.