Abstract
Cell morphological phenotypes, including shape, size, intensity, and texture of cellular compartments, have been shown to change in response to perturbation with small molecule compounds. Image-based cell profiling or cell morphological profiling has been used to associate changes of cell morphological features with alterations in cellular function and to infer molecular mechanisms of action. Recently, the Library of Integrated Network-based Cellular Signatures (LINCS) Project has measured gene expression and performed image-based cell profiling on cell lines treated with 9515 unique compounds. These data provide an opportunity to study the interdependence between transcription and cell morphology. Previous methods to investigate cell phenotypes have focused on targeting candidate genes as components of known pathways, RNAi morphological profiling, and cataloging morphological defects; however, these methods do not provide an explicit model to link transcriptomic changes with corresponding alterations in morphology. To address this, we propose a cell morphology enrichment analysis to assess the association between transcriptomic alterations and changes in cell morphology. Additionally, for a new transcriptomic query, our approach can be used to predict associated changes in cellular morphology. We demonstrate the utility of our method by applying it to cell morphological changes in a human bone osteosarcoma cell line.
On non-detects in qPCR data
McCall, Matthew N; McMurray, Helene R; Land, Hartmut ...
Bioinformatics, 08/2014, Volume: 30, Issue: 16
Journal Article
Peer reviewed
Open access
Quantitative real-time PCR (qPCR) is one of the most widely used methods to measure gene expression. Despite extensive research in qPCR laboratory protocols, normalization and statistical analysis, little attention has been given to qPCR non-detects: those reactions failing to produce a minimum amount of signal.
We show that the common methods of handling qPCR non-detects lead to biased inference. Furthermore, we show that non-detects do not represent data missing completely at random and likely represent missing data occurring not at random. We propose a model of the missing data mechanism and develop a method to directly model non-detects as missing data. Finally, we show that our approach results in a sizeable reduction in bias when estimating both absolute and differential gene expression.
The proposed algorithm is implemented in the R package, nondetects. This package also contains the raw data for the three example datasets used in this manuscript. The package is freely available at http://mnmccall.com/software and as part of the Bioconductor project.
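The bias described in this abstract can be illustrated with a small simulation. The sketch below is not the model implemented in the nondetects package. It assumes, purely for illustration, that the two handling strategies being compared are substituting the maximum cycle number (40 here) and dropping non-detects, and that the chance of a non-detect rises with the true Ct through a logistic curve. The function name and every numeric setting are hypothetical.

```python
import math
import random


def simulate_nondetect_bias(n=20000, true_mean=33.0, sd=2.0,
                            max_cycles=40.0, seed=1):
    """Compare two simple ways of handling non-detects on simulated Ct values.

    Assumption (not from the source): the probability that a reaction is a
    non-detect follows a logistic curve in the true Ct, so the data are
    missing not at random.
    """
    rng = random.Random(seed)
    true_ct = [rng.gauss(true_mean, sd) for _ in range(n)]

    observed = []
    for ct in true_ct:
        # Higher true Ct (lower expression) means a higher chance of no signal.
        p_missing = 1.0 / (1.0 + math.exp(-(ct - 35.0)))
        if ct >= max_cycles or rng.random() < p_missing:
            observed.append(None)  # non-detect
        else:
            observed.append(ct)

    detected = [ct for ct in observed if ct is not None]
    substituted = [max_cycles if ct is None else ct for ct in observed]

    return {
        "true": sum(true_ct) / n,
        "drop": sum(detected) / len(detected),   # non-detects excluded
        "substitute": sum(substituted) / n,      # non-detects set to 40
        "nondetect_rate": 1.0 - len(detected) / n,
    }


if __name__ == "__main__":
    for name, value in simulate_nondetect_bias().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, dropping non-detects pulls the mean Ct below the true value, because the reactions removed are preferentially the high-Ct ones, while substituting 40 pushes it above, because most non-detects have a true Ct well under 40.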
The sources of gene expression variability in human tissues are thought to be a complex interplay of technical, compositional, and disease-related factors. To better understand these contributions, we investigated expression variability in a relatively homogeneous tissue expression dataset from the Genotype-Tissue Expression (GTEx) resource. In addition to identifying technical sources, such as sequencing date and post-mortem interval, we also identified several biological sources of variation. An in-depth analysis of the 175 genes with the greatest variation among 133 lung tissue samples identified five distinct clusters of highly correlated genes. One large cluster included surfactant genes (SFTPA1, SFTPA2, and SFTPC), which are expressed exclusively in type II pneumocytes, cells that proliferate in ventilator associated lung injury. High surfactant expression was strongly associated with death on a ventilator and type II pneumocyte hyperplasia. A second large cluster included dynein (DNAH9 and DNAH12) and mucin (MUC5B and MUC16) genes, which are exclusive to the respiratory epithelium and goblet cells of bronchial structures. This indicates heterogeneous bronchiole sampling due to the harvesting location in the lung. A small cluster included acute-phase reactant genes (SAA1, SAA2, and SAA2–SAA4). The final two small clusters were technical and gender related. To summarize, in a collection of normal lung samples, we found that tissue heterogeneity caused by harvesting location (medial or lateral lung) and late therapeutic intervention (mechanical ventilation) were major contributors to expression variation. These unexpected sources of variation were the result of altered cell ratios in the tissue samples, an underappreciated source of expression variation.
miR-143 and miR-145 are co-expressed microRNAs (miRNAs) that have been extensively studied as potential tumor suppressors. These miRNAs are highly expressed in the colon and are consistently reported as being downregulated in colorectal and other cancers. Through regulation of multiple targets, they elicit potent effects on cancer cell growth and tumorigenesis. Importantly, a recent discovery demonstrates that miR-143 and miR-145 are not expressed in colonic epithelial cells; rather, these two miRNAs are highly expressed in mesenchymal cells such as fibroblasts and smooth muscle cells. The expression patterns of miR-143 and miR-145 and other miRNAs were initially determined from tissue level data without consideration that multiple different cell types, each with their own unique miRNA expression patterns, make up each tissue. Herein, we discuss the early reports on the identification of dysregulated miR-143 and miR-145 expression in colorectal cancer and how lack of consideration of cellular composition of normal tissue led to the misconception that these miRNAs are downregulated in cancer. We evaluate mechanistic data from miR-143/145 studies in the context of their cell type-restricted expression pattern and the potential of these miRNAs to be considered tumor suppressors. Further, we examine other examples of miRNAs being investigated in inappropriate cell types modulating pathways in a non-biological fashion. Our review highlights the importance of determining the cellular expression pattern of each miRNA, so that downstream studies are conducted in the appropriate cell type.
MicroRNAs (miRNAs) are small (∼22-nt), stable RNAs that critically modulate post-transcriptional gene regulation. MicroRNAs can be found in the blood as components of serum, plasma and peripheral blood mononuclear cells (PBMCs). Many microRNAs have been reported to be specific biomarkers in a variety of non-neoplastic diseases. To date, no one has globally evaluated these proposed clinical biomarkers for general quality or disease specificity. We hypothesized that the cellular source of circulating microRNAs should correlate with cells involved in specific non-neoplastic disease processes. Appropriate cell expression data would inform on the quality and usefulness of each microRNA as a biomarker for specific diseases. We further hypothesized a useful clinical microRNA biomarker would have specificity to a single disease.
We identified 416 microRNA biomarkers, of which 192 were unique, in 104 publications covering 57 diseases. One hundred and thirty-nine microRNAs (33%) represented biologically plausible biomarkers, corresponding to non-ubiquitous microRNAs expressed in disease-appropriate cell types. However, at a global level, many of these microRNAs were reported as "specific" biomarkers for two or more unrelated diseases with 6 microRNAs (miR-21, miR-16, miR-146a, miR-155, miR-126 and miR-223) being reported as biomarkers for 9 or more distinct diseases. Other biomarkers corresponded to common patterns of cellular injury, such as the liver-specific microRNA, miR-122, which was elevated in a disparate set of diseases that injure the liver primarily or secondarily including hepatitis B, hepatitis C, sepsis, and myocardial infarction.
Only a subset of reported blood-based microRNA biomarkers have specificity for a particular disease. The remainder of the reported non-neoplastic biomarkers are either biologically implausible, non-specific, or uninterpretable due to limitations of our current understanding of microRNA expression.
The mitochondrial unfolded protein response (UPRmt) is a cytoprotective signaling pathway triggered by mitochondrial dysfunction. UPRmt activation upregulates chaperones, proteases, antioxidants, and glycolysis at the gene level to restore proteostasis and cell energetics. Activating transcription factor 5 (ATF5) is a proposed mediator of the mammalian UPRmt. Herein, we hypothesized pharmacological UPRmt activation may protect against cardiac ischemia-reperfusion (I/R) injury in an ATF5-dependent manner. Accordingly, in vivo administration of the UPRmt inducers oligomycin or doxycycline 6 h before ex vivo I/R injury (perfused heart) was cardioprotective in wild-type but not global Atf5-/- mice. Acute ex vivo UPRmt activation was not cardioprotective, and loss of ATF5 did not impact baseline I/R injury without UPRmt induction. In vivo UPRmt induction significantly upregulated many known UPRmt-linked genes (cardiac quantitative PCR and Western blot analysis), and RNA-Seq revealed a UPRmt-induced ATF5-dependent gene set, which may contribute to cardioprotection. This is the first in vivo proof of a role for ATF5 in the mammalian UPRmt and the first demonstration that UPRmt is a cardioprotective drug target.
Cardioprotection can be induced by drugs that activate the mitochondrial unfolded protein response (UPRmt). UPRmt protection is dependent on activating transcription factor 5 (ATF5). This is the first in vivo evidence for a role of ATF5 in the mammalian UPRmt.
Abstract
Motivation
Current methods used to analyze real-time quantitative polymerase chain reaction (qPCR) data exhibit systematic deviations from the assumed model over the progression of the reaction. Slight variations in the amount of the initial target molecule or in early amplifications are likely responsible for these deviations. Commonly used 4- and 5-parameter sigmoidal models appear to be particularly susceptible to this issue, often displaying patterns of autocorrelation in the residuals. The presence of this phenomenon, even for technical replicates, suggests that these parametric models may be misspecified. Specifically, they do not account for the sequential dependent nature of the amplification process that underlies qPCR fluorescence measurements.
Results
We demonstrate that a Smooth Transition Autoregressive (STAR) model addresses this limitation by explicitly modeling the dependence between cycles and the gradual transition between amplification regimes. In summary, application of a STAR model to qPCR amplification data improves model fit and reduces autocorrelation in the residuals.
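The idea of a gradual transition between amplification regimes can be sketched as a noise-free logistic STAR recursion. This is only an illustration under stated assumptions, not the model fitted in the manuscript: the transition variable is taken to be the previous cycle's fluorescence, the two regimes are assumed to be doubling and a flat plateau, the error term is omitted, and all function names and parameter values are hypothetical.

```python
import math


def logistic_transition(s, gamma, c):
    """Smooth weight in (0, 1): near 0 well below c, near 1 well above c."""
    return 1.0 / (1.0 + math.exp(-gamma * (s - c)))


def simulate_lstar_curve(n_cycles=40, f0=1e-6, growth_ar=2.0,
                         plateau_ar=1.0, gamma=10.0, c=0.5):
    """Deterministic two-regime logistic STAR recursion (no error term).

    Each cycle depends on the previous one:
        F_t = (1 - G) * growth_ar * F_{t-1} + G * plateau_ar * F_{t-1}
    where G = logistic_transition(F_{t-1}, gamma, c).
    All parameter values are illustrative assumptions.
    """
    curve = [f0]
    for _ in range(n_cycles):
        prev = curve[-1]
        g = logistic_transition(prev, gamma, c)
        curve.append((1.0 - g) * growth_ar * prev + g * plateau_ar * prev)
    return curve


if __name__ == "__main__":
    for cycle, value in enumerate(simulate_lstar_curve()):
        print(cycle, f"{value:.6g}")
```

Because each value is built from the previous one, the dependence between cycles is part of the model itself, and the autoregressive coefficient moves smoothly from the growth regime to the plateau regime rather than switching at a fixed cycle.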
Availability and implementation
R scripts to reproduce all the analyses and results described in this manuscript can be found at: https://github.com/bhsu4/GAPDH.SO.
Supplementary information
Supplementary data are available at Bioinformatics online.
Toward the human cellular microRNAome
McCall, Matthew N; Kim, Min-Sik; Adil, Mohammed ...
Genome research, 10/2017, Volume: 27, Issue: 10
Journal Article
Peer reviewed
Open access
MicroRNAs are short RNAs that serve as regulators of gene expression and are essential components of normal development as well as modulators of disease. MicroRNAs generally act cell-autonomously, and thus their localization to specific cell types is needed to guide our understanding of microRNA activity. Current tissue-level data have caused considerable confusion, and comprehensive cell-level data do not yet exist. Here, we establish the landscape of human cell-specific microRNA expression. This project evaluated 8 billion small RNA-seq reads from 46 primary cell types, 42 cancer or immortalized cell lines, and 26 tissues. It identified both specific and ubiquitous patterns of expression that strongly correlate with adjacent superenhancer activity. Analysis of unaligned RNA reads uncovered 207 unknown minor strand (passenger) microRNAs of known microRNA loci and 495 novel putative microRNA loci. Although cancer cell lines generally recapitulated the expression patterns of matched primary cells, their isomiR sequence families exhibited increased disorder, suggesting DROSHA- and DICER1-dependent microRNA processing variability. Cell-specific patterns of microRNA expression were used to de-convolute variable cellular composition of colon and adipose tissue samples, highlighting one use of these cell-specific microRNA expression data. Characterization of cellular microRNA expression across a wide variety of cell types provides a new understanding of this critical regulatory RNA species.
Frozen robust multiarray analysis (fRMA)
McCall, Matthew N; Bolstad, Benjamin M; Irizarry, Rafael A
Biostatistics, 04/2010, Volume: 11, Issue: 2
Journal Article
Peer reviewed
Open access
Robust multiarray analysis (RMA) is the most widely used preprocessing algorithm for Affymetrix and Nimblegen gene expression microarrays. RMA performs background correction, normalization, and summarization in a modular way. The last 2 steps require multiple arrays to be analyzed simultaneously. The ability to borrow information across samples provides RMA various advantages. For example, the summarization step fits a parametric model that accounts for probe effects, assumed to be fixed across arrays, and improves outlier detection. Residuals, obtained from the fitted model, permit the creation of useful quality metrics. However, the dependence on multiple arrays has 2 drawbacks: (1) RMA cannot be used in clinical settings where samples must be processed individually or in small batches and (2) data sets preprocessed separately are not comparable. We propose a preprocessing algorithm, frozen RMA (fRMA), which allows one to analyze microarrays individually or in small batches and then combine the data for analysis. This is accomplished by utilizing information from the large publicly available microarray databases. In particular, estimates of probe-specific effects and variances are precomputed and frozen. Then, with new data sets, these are used in concert with information from the new arrays to normalize and summarize the data. We find that fRMA is comparable to RMA when the data are analyzed as a single batch and outperforms RMA when analyzing multiple batches. The methods described here are implemented in the R package fRMA and are currently available for download from the software section of http://rafalab.jhsph.edu.
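A rough sketch of how precomputed, frozen quantities allow a single array to be processed on its own is given below. It is not the fRMA package's API or its exact estimator. It assumes log2 probe intensities that have already been background corrected, a frozen reference distribution with one value per probe, and frozen per-probe effects. A plain median stands in for the robust, variance-weighted summary described in the abstract, and all names and numbers are hypothetical.

```python
from statistics import median


def normalize_to_frozen_reference(log_intensities, reference_quantiles):
    """Map one array onto a frozen reference distribution by rank.

    The k-th smallest intensity on the array is replaced by the k-th smallest
    reference value, so no other arrays are needed at normalization time.
    """
    if len(log_intensities) != len(reference_quantiles):
        raise ValueError("array and reference must have the same length")
    order = sorted(range(len(log_intensities)),
                   key=lambda i: log_intensities[i])
    reference = sorted(reference_quantiles)
    normalized = [0.0] * len(log_intensities)
    for rank, index in enumerate(order):
        normalized[index] = reference[rank]
    return normalized


def summarize_probeset(normalized, probe_indices, frozen_probe_effects):
    """Summarize one probe set after removing frozen probe effects.

    Assumption: the median is used here as a simple robust stand-in for the
    weighted robust estimate that would also use frozen probe variances.
    """
    return median(normalized[i] - frozen_probe_effects[i]
                  for i in probe_indices)


if __name__ == "__main__":
    # Toy single-array example with four probes in one probe set.
    array = [7.9, 10.2, 9.1, 6.5]
    frozen_reference = [6.0, 8.0, 9.0, 10.0]
    frozen_effects = [-0.5, 1.0, 0.0, -2.0]
    norm = normalize_to_frozen_reference(array, frozen_reference)
    print(summarize_probeset(norm, [0, 1, 2, 3], frozen_effects))
```

Since both the reference distribution and the probe effects are fixed in advance, two arrays processed separately with the same frozen values end up on the same scale, which is the property that makes separately preprocessed batches comparable.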