Pseudomonas aeruginosa and Candida albicans are opportunistic pathogens whose interactions involve the secreted products ethanol and phenazines. Here, we describe the role of ethanol in mixed-species co-cultures by dual-seq analyses. P. aeruginosa and C. albicans transcriptomes were assessed after growth in mono-culture or co-culture with either ethanol-producing C. albicans or a C. albicans mutant lacking the primary ethanol dehydrogenase, Adh1. Analysis of the RNA-Seq data using KEGG pathway enrichment and eADAGE methods revealed several P. aeruginosa responses to C. albicans-produced ethanol including the induction of a non-canonical low-phosphate response regulated by PhoB. C. albicans wild type, but not C. albicans adh1Δ/Δ, induces P. aeruginosa production of 5-methyl-phenazine-1-carboxylic acid (5-MPCA), which forms a red derivative within fungal cells and exhibits antifungal activity. Here, we show that C. albicans adh1Δ/Δ no longer activates P. aeruginosa PhoB and PhoB-regulated phosphatase activity, that exogenous ethanol complements this defect, and that ethanol is sufficient to activate PhoB in single-species P. aeruginosa cultures at permissive phosphate levels. The intersection of ethanol and phosphate in co-culture is inversely reflected in C. albicans; C. albicans adh1Δ/Δ had increased expression of genes regulated by Pho4, the C. albicans transcription factor that responds to low phosphate, and Pho4-dependent phosphatase activity. Together, these results show that C. albicans-produced ethanol stimulates P. aeruginosa PhoB activity and 5-MPCA-mediated antagonism, and that both responses are dependent on local phosphate concentrations. Further, our data suggest that phosphate scavenging by one species improves phosphate access for the other, thus highlighting the complex dynamics at play in microbial communities.
The
genome encodes more than 50 proteins predicted to be involved in c-di-GMP signaling. Here, we demonstrated that, tested across 188 nutrients, these enzymes and effectors appeared capable of impacting biofilm formation. Transcriptional analysis of network members across ∼50 nutrient conditions indicates that altered gene expression can explain a subset of but not all biofilm formation responses to the nutrients. Additional organization of the network is likely achieved through physical interaction, as determined via probing ∼2,000 interactions by bacterial two-hybrid assays. Our analysis revealed a multimodal regulatory strategy using combinations of ligand-mediated signals, protein-protein interaction, and/or transcriptional regulation to fine-tune c-di-GMP-mediated responses. These results create a profile of a large c-di-GMP network that is used to make important cellular decisions, opening the door to future model building and the ability to engineer this complex circuitry in other bacteria.
Cyclic diguanylate (c-di-GMP) is a key signaling molecule regulating bacterial biofilm formation, and many microbes have up to dozens of proteins that make, break, or bind this dinucleotide. A major open issue in the field is how signaling specificity is conferred in the unpartitioned space of a bacterial cell. Here, we took a systems approach, using mutational analysis, transcriptional studies, and bacterial two-hybrid analysis to interrogate this network. We found that a majority of enzymes are capable of impacting biofilm formation in a context-dependent manner, and we revealed examples of two or more modes of regulation (i.e., transcriptional control with protein-protein interaction) being utilized to generate an observable impact on biofilm formation.
Pseudomonas aeruginosa is an opportunistic pathogen that causes difficult-to-treat infections. Two well-studied divergent P. aeruginosa strain types, PAO1 and PA14, have significant genomic heterogeneity, including diverse accessory genes present in only some strains. Genome content comparisons find core genes that are conserved across both PAO1 and PA14 strains and accessory genes that are present in only a subset of PAO1 and PA14 strains. Here, we use recently assembled transcriptome compendia of publicly available P. aeruginosa RNA sequencing (RNA-seq) samples to create two smaller compendia consisting of only strain PAO1 or strain PA14 samples, each aligned to its cognate reference genome. We confirmed strain annotations and identified other samples for inclusion by assessing each sample's median expression of PAO1-only or PA14-only accessory genes. We then compared the patterns of core gene expression in each strain. To do so, we developed a method that analyzes each gene in terms of which other genes show similar expression patterns across strain types. We found that some core genes had consistent correlated expression patterns across both compendia, while others were less stable in an interstrain comparison. For each accessory gene, we also determined core genes with correlated expression patterns. We found that stable core genes had fewer coexpressed neighbors that were accessory genes. Overall, this approach for analyzing expression patterns across strain types can be extended to other groups of genes, like phage genes, or applied for analyzing patterns beyond groups of strains, such as samples with different traits, to reveal a deeper understanding of regulation.
Pseudomonas aeruginosa is a ubiquitous pathogen. There is much diversity among P. aeruginosa strains, including two divergent but well-studied strains, PAO1 and PA14. Understanding how these different strain-level traits manifest is important for identifying targets that regulate different traits of interest. With the availability of thousands of PAO1 and PA14 samples, we created two strain-specific RNA-seq compendia, each containing hundreds of samples from PAO1 or PA14 strains. Using an approach that we developed, we compared the expression patterns of core genes that are conserved in both strain types and determined which core genes have expression patterns similar to those of accessory genes that are unique to one strain or the other. We found a subset of core genes with different transcriptional patterns across PAO1 and PA14 strains and identified those core genes with expression patterns similar to those of strain-specific accessory genes.
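The strain-annotation check described above (comparing each sample's median expression of PAO1-only and PA14-only accessory genes) can be illustrated with a minimal sketch. The gene identifiers, the `min_gap` threshold, and the "ambiguous" label are assumptions introduced for illustration and are not taken from the study.

```python
from statistics import median


def assign_strain(sample, pao1_only, pa14_only, min_gap=1.0):
    """Label a sample by its median expression of strain-specific accessory genes.

    sample    : dict mapping gene id -> expression value (e.g. log-scaled)
    pao1_only : gene ids assumed to be present only in PAO1
    pa14_only : gene ids assumed to be present only in PA14
    min_gap   : hypothetical minimum difference between the two medians
    """
    # Genes missing from the sample are treated as not expressed (0.0).
    pao1_median = median(sample.get(g, 0.0) for g in pao1_only)
    pa14_median = median(sample.get(g, 0.0) for g in pa14_only)
    if pao1_median - pa14_median >= min_gap:
        return "PAO1"
    if pa14_median - pao1_median >= min_gap:
        return "PA14"
    return "ambiguous"


# Hypothetical sample with high expression of PAO1-only genes.
example = {"pao1_a": 5.0, "pao1_b": 6.0, "pa14_a": 0.1, "pa14_b": 0.0}
label = assign_strain(example, ["pao1_a", "pao1_b"], ["pa14_a", "pa14_b"])
```

Using the median rather than the mean keeps a single highly expressed accessory gene from deciding the label on its own.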
Thousands of Pseudomonas aeruginosa RNA sequencing (RNA-seq) gene expression profiles are publicly available via the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA). In this work, the transcriptional profiles from hundreds of studies performed by over 75 research groups were reanalyzed in aggregate to create a powerful tool for hypothesis generation and testing. Raw sequence data were uniformly processed using the Salmon pseudoaligner, and this read mapping method was validated by comparison to a direct alignment method. We developed filtering criteria to exclude samples with aberrant levels of housekeeping gene expression or an unexpected number of genes with no reported values and normalized the filtered compendia using the ratio-of-medians method. The filtering and normalization steps greatly improved gene expression correlations for genes within the same operon or regulon across the 2,333 samples. Since the RNA-seq data were generated using diverse strains, we report the effects of mapping samples to noncognate reference genomes by separately analyzing all samples mapped to cDNA reference genomes for strains PAO1 and PA14, two divergent strains that were used to generate most of the samples. Finally, we developed an algorithm to incorporate new data as they are deposited into the SRA. Our processing and quality control methods provide a scalable framework for taking advantage of the troves of biological information hibernating in the depths of microbial gene expression data and yield useful tools for P. aeruginosa RNA-seq data to be leveraged for diverse research goals.
Pseudomonas aeruginosa is a causative agent of a wide range of infections, including chronic infections associated with cystic fibrosis. These P. aeruginosa infections are difficult to treat and often have negative outcomes. To aid in the study of this problematic pathogen, we mapped, filtered for quality, and normalized thousands of P. aeruginosa RNA-seq gene expression profiles that were publicly available via the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA). The resulting compendia facilitate analyses across experiments, strains, and conditions. Ultimately, the workflow that we present could be applied to analyses of other microbial species.
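The filter-then-normalize workflow described above can be sketched as follows. This is an illustration under stated assumptions, not the published pipeline: the thresholds, the toy gene names, and the simple median-based scaling in `scale_to_common_median` are all hypothetical, and the scaling is a stand-in that may differ from the ratio-of-medians procedure used in the study.

```python
from statistics import median


def filter_samples(compendium, housekeeping, hk_min, hk_max, max_zero_fraction):
    """Keep samples with plausible housekeeping expression and few zero-count genes.

    compendium : dict mapping sample name -> dict of gene id -> count
    All thresholds are hypothetical and would need to be tuned on real data.
    """
    kept = {}
    for name, counts in compendium.items():
        hk_median = median(counts[g] for g in housekeeping)
        zero_fraction = sum(1 for v in counts.values() if v == 0) / len(counts)
        if hk_min <= hk_median <= hk_max and zero_fraction <= max_zero_fraction:
            kept[name] = counts
    return kept


def scale_to_common_median(compendium):
    """Scale each sample so its median nonzero count matches the median across samples."""
    sample_medians = {
        name: median(v for v in counts.values() if v > 0)
        for name, counts in compendium.items()
    }
    target = median(sample_medians.values())
    return {
        name: {g: v * target / sample_medians[name] for g, v in counts.items()}
        for name, counts in compendium.items()
    }


# Toy compendium: s3 has no housekeeping signal and mostly zero counts.
toy = {
    "s1": {"hk": 100, "a": 10, "b": 20},
    "s2": {"hk": 200, "a": 20, "b": 40},
    "s3": {"hk": 0, "a": 0, "b": 5},
}
kept = filter_samples(toy, ["hk"], hk_min=50, hk_max=500, max_zero_fraction=0.5)
normalized = scale_to_common_median(kept)
```

In this toy case, s1 and s2 differ only by sequencing depth, so after scaling their values for each gene coincide.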
Chronic Pseudomonas aeruginosa lung infections are a feature of cystic fibrosis (CF) that many patients experience even with the advent of highly effective modulator therapies. Identifying factors that impact P. aeruginosa in the CF lung could yield novel strategies to eradicate infection or otherwise improve outcomes. To complement published P. aeruginosa studies using laboratory models or RNA isolated from sputum, we analyzed transcripts of strain PAO1 after incubation in sputum from different CF donors prior to RNA extraction. We compared PAO1 gene expression in this "spike-in" sputum model to that for P. aeruginosa grown in synthetic cystic fibrosis sputum medium to determine key genes, which are among the most differentially expressed or most highly expressed. Using the key genes, gene sets with correlated expression were determined using the gene expression analysis tool eADAGE. Gene sets were used to analyze the activity of specific pathways in P. aeruginosa grown in sputum from different individuals. Gene sets that we found to be more active in sputum showed similar activation in published data that included P. aeruginosa RNA isolated from sputum relative to corresponding in vitro reference cultures. In the ex vivo samples, P. aeruginosa had increased levels of genes related to zinc and iron acquisition which were suppressed by metal amendment of sputum. We also found a significant correlation between expression of the H1-type VI secretion system and CFTR corrector use by the sputum donor. An ex vivo sputum model or synthetic sputum medium formulation that imposes metal restriction may enhance future CF-related studies.
IMPORTANCE Identifying the gene expression programs used by P. aeruginosa to colonize the lungs of people with cystic fibrosis (CF) will illuminate new therapeutic strategies. To capture these transcriptional programs, we cultured the common P. aeruginosa laboratory strain PAO1 in expectorated sputum from CF patient donors. Through bioinformatic analysis, we defined sets of genes that are more transcriptionally active in real CF sputum compared to a synthetic cystic fibrosis sputum medium. Many of the most differentially active gene sets contained genes related to metal acquisition, suggesting that these gene sets play an active role in scavenging for metals in the CF lung environment which may be inadequately represented in some models. Future studies of P. aeruginosa transcript abundance in CF may benefit from the use of an expectorated sputum model or media supplemented with factors that induce metal restriction.
Genome-wide transcriptome profiling identifies genes that are prone to differential expression (DE) across contexts, as well as genes with changes specific to the experimental manipulation. Distinguishing genes that are specifically changed in a context of interest from common differentially expressed genes (DEGs) allows more efficient prediction of which genes are specific to a given biological process under scrutiny. Currently, common DEGs or pathways can only be identified through the laborious manual curation of experiments, an inordinately time-consuming endeavor. Here we pioneer an approach, Specific cOntext Pattern Highlighting In Expression data (SOPHIE), for distinguishing between common and specific transcriptional patterns using a generative neural network to create a background set of experiments from which a null distribution of gene and pathway changes can be generated. We apply SOPHIE to diverse datasets including those from human, human cancer, and bacterial pathogen Pseudomonas aeruginosa. SOPHIE identifies common DEGs in concordance with previously described, manually and systematically determined common DEGs. Further molecular validation indicates that SOPHIE detects highly specific but low-magnitude biologically relevant transcriptional changes. SOPHIE’s measure of specificity can complement log2 fold change values generated from traditional DE analyses. For example, by filtering the set of DEGs, one can identify genes that are specifically relevant to the experimental condition of interest. Consequently, these results can inform future research directions. All scripts used in these analyses are available at https://github.com/greenelab/generic-expression-patterns. Users can access https://github.com/greenelab/sophie to run SOPHIE on their own data.
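The idea of scoring a gene's observed change against a null distribution built from background experiments can be sketched in a few lines. This is not the SOPHIE implementation: the background list below is hand-written where SOPHIE would supply experiments simulated by a generative neural network, and the z-score on absolute log2 fold changes is one assumed way to express specificity. The function name and gene names are hypothetical.

```python
from statistics import mean, stdev


def specificity_z_scores(observed_lfc, background_lfcs):
    """Compare each gene's observed |log2FC| with its |log2FC| in background experiments.

    observed_lfc    : dict mapping gene -> log2 fold change in the experiment of interest
    background_lfcs : list of dicts with the same genes, one per background experiment
    """
    scores = {}
    for gene, observed in observed_lfc.items():
        null = [abs(experiment[gene]) for experiment in background_lfcs]
        mu, sigma = mean(null), stdev(null)
        # A gene that changes in every background experiment gets a low score
        # even when its observed fold change is large.
        scores[gene] = (abs(observed) - mu) / sigma if sigma > 0 else float("nan")
    return scores


# geneA changes strongly in every background experiment (a "common" DEG);
# geneB barely changes in the background, so the same observed change is specific.
observed = {"geneA": 2.0, "geneB": 2.0}
background = [
    {"geneA": 1.8, "geneB": 0.1},
    {"geneA": 2.2, "geneB": -0.2},
    {"geneA": -2.0, "geneB": 0.3},
]
scores = specificity_z_scores(observed, background)
```

Both genes have the same observed fold change, so a traditional DE ranking would not separate them; the background comparison does.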
A gene expression compendium is a heterogeneous collection of gene expression experiments assembled from data collected for diverse purposes. The widely varied experimental conditions and genetic backgrounds across samples create a tremendous opportunity for gaining a systems-level understanding of the transcriptional responses that influence phenotypes. Variety in experimental design is particularly important for studying microbes, where the transcriptional responses integrate many signals and demonstrate plasticity across strains, including responses to which nutrients are available and which microbes are present. Advances in high-throughput measurement technology have made it feasible to construct compendia for many microbes. In this review we discuss how these compendia are constructed and analyzed to reveal transcriptional patterns.
Researchers studying cystic fibrosis (CF) pathogens have produced numerous RNA-seq datasets which are available in the Gene Expression Omnibus (GEO). Although these studies are publicly available, substantial computational expertise and manual effort are required to compare similar studies, visualize gene expression patterns within studies, and use published data to generate new experimental hypotheses. Furthermore, it is difficult to filter available studies by domain-relevant attributes such as strain, treatment, or media, or for a researcher to assess how a specific gene responds to various experimental conditions across studies. To reduce these barriers to data re-analysis, we have developed an R Shiny application called CF-Seq, which works with a compendium of 128 studies and 1,322 individual samples from 13 clinically relevant CF pathogens. The application allows users to filter studies by experimental factors and to view complex differential gene expression analyses at the click of a button. Here we present a series of use cases that demonstrate the application is a useful and efficient tool for new hypothesis generation. (CF-Seq: http://scangeo.dartmouth.edu/CFSeq/ ).
species are among a number of freshwater Gram-negative violacein-producing bacteria.
and
have had their whole genomes sequenced and annotated. This is the first report of a draft whole-genome sequence of a violacein-producing
strain that was isolated from the Hudson Valley watershed.
Investigators often interpret genome-wide data by analyzing the expression levels of genes within pathways. While this within-pathway analysis is routine, the products of any one pathway can affect the activity of other pathways. Past efforts to identify relationships between biological processes have evaluated overlap in knowledge bases or evaluated changes that occur after specific treatments. Individual experiments can highlight condition-specific pathway-pathway relationships; however, constructing a complete network of such relationships across many conditions requires analyzing results from many studies.
We developed the PathCORE-T framework by implementing existing methods to identify pathway-pathway transcriptional relationships evident across a broad data compendium. PathCORE-T is applied to the output of feature construction algorithms; it identifies pairs of pathways observed in features more than expected by chance as
. We demonstrate PathCORE-T by analyzing an existing eADAGE model of a microbial compendium and building and analyzing NMF features from the TCGA dataset of 33 cancer types. The PathCORE-T framework includes a demonstration web interface, with source code, that users can launch to (1) visualize the network and (2) review the expression levels of associated genes in the original data. PathCORE-T creates and displays the network of globally co-occurring pathways based on features observed in a machine learning analysis of gene expression data.
The PathCORE-T framework identifies transcriptionally co-occurring pathways from the results of unsupervised analysis of gene expression data and visualizes the relationships between pathways as a network. PathCORE-T recapitulated previously described pathway-pathway relationships and suggested experimentally testable additional hypotheses that remain to be explored.
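The notion of pathway pairs appearing together in features more often than expected by chance can be sketched with a simple permutation test. This is an illustration only and not the PathCORE-T code: the input is assumed to be a list of pathway sets already found enriched in each feature, and the null model (shuffling pathway labels across features while keeping feature sizes fixed) is an assumption that may differ from the framework's own procedure.

```python
import random
from collections import Counter
from itertools import combinations


def count_pairs(features):
    """Count how many features contain each unordered pair of pathways."""
    counts = Counter()
    for pathways in features:
        for pair in combinations(sorted(set(pathways)), 2):
            counts[pair] += 1
    return counts


def permutation_p_values(features, n_permutations=1000, seed=0):
    """Estimate, for each observed pair, how often shuffled data co-occur at least as much."""
    rng = random.Random(seed)
    observed = count_pairs(features)
    sizes = [len(set(f)) for f in features]
    pool = [p for f in features for p in set(f)]
    at_least_as_large = Counter()
    for _ in range(n_permutations):
        shuffled = pool[:]
        rng.shuffle(shuffled)
        # Rebuild features of the original sizes from the shuffled labels.
        permuted, start = [], 0
        for size in sizes:
            permuted.append(shuffled[start:start + size])
            start += size
        null = count_pairs(permuted)
        for pair, obs in observed.items():
            if null[pair] >= obs:
                at_least_as_large[pair] += 1
    # Add-one correction keeps every estimated p-value above zero.
    return {
        pair: (at_least_as_large[pair] + 1) / (n_permutations + 1)
        for pair in observed
    }


# Hypothetical pathway labels per feature.
toy_features = [["A", "B"], ["A", "B", "C"], ["C", "D"]]
pair_counts = count_pairs(toy_features)
p_values = permutation_p_values(toy_features, n_permutations=50)
```

Pairs with small estimated p-values would be the candidates for edges in a co-occurrence network; with real data, a multiple-testing correction would be needed before drawing that network.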