Long noncoding RNAs (lncRNAs) are emerging as important regulators of tissue physiology and disease processes including cancer. To delineate genome-wide lncRNA expression, we curated 7,256 RNA sequencing (RNA-seq) libraries from tumors, normal tissues and cell lines comprising over 43 Tb of sequence from 25 independent studies. We applied ab initio assembly methodology to this data set, yielding a consensus human transcriptome of 91,013 expressed genes. Over 68% (58,648) of genes were classified as lncRNAs, of which 79% were previously unannotated. About 1% (597) of the lncRNAs harbored ultraconserved elements, and 7% (3,900) overlapped disease-associated SNPs. To prioritize lineage-specific, disease-associated lncRNA expression, we employed non-parametric differential expression testing and nominated 7,942 lineage- or cancer-associated lncRNA genes. The lncRNA landscape characterized here may shed light on normal biology and cancer pathogenesis and may be valuable for future biomarker development.
DNA microarrays have been widely applied to cancer transcriptome analysis; however, the majority of such data are not easily accessible or comparable. Furthermore, several important analytic approaches have been applied to microarray analysis; however, their application is often limited. To overcome these limitations, we have developed Oncomine, a bioinformatics initiative aimed at collecting, standardizing, analyzing, and delivering cancer transcriptome data to the biomedical research community. Our analysis has identified the genes, pathways, and networks deregulated across 18,000 cancer gene expression microarrays, spanning the majority of cancer types and subtypes. Here, we provide an update on the initiative, describe the database and analysis modules, and highlight several notable observations. Results from this comprehensive analysis are available at http://www.oncomine.org.
Recurrent gene fusions are a prevalent class of mutations arising from the juxtaposition of 2 distinct regions, which can generate novel functional transcripts that could serve as valuable therapeutic targets in cancer. Therefore, we aim to establish a sensitive, high-throughput methodology to comprehensively catalog functional gene fusions in cancer by evaluating a paired-end transcriptome sequencing strategy. Not only did a paired-end approach provide a greater dynamic range in comparison with single read based approaches, but it clearly distinguished the high-level "driving" gene fusions, such as BCR-ABL1 and TMPRSS2-ERG, from potential lower level "passenger" gene fusions. Also, the comprehensiveness of a paired-end approach enabled the discovery of 12 previously undescribed gene fusions in 4 commonly used cell lines that eluded previous approaches. Using the paired-end transcriptome sequencing approach, we observed read-through mRNA chimeras, tissue-type restricted chimeras, converging transcripts, diverging transcripts, and overlapping mRNA transcripts. Last, we successfully used paired-end transcriptome sequencing to detect previously undescribed ETS gene fusions in prostate tumors. Together, this study establishes a highly specific and sensitive approach for accurately and comprehensively cataloguing chimeras within a sample using paired-end transcriptome sequencing.
Prostate cancer is the most frequently diagnosed cancer in American men. Screening for prostate-specific antigen (PSA) has led to earlier detection of prostate cancer, but elevated serum PSA levels may be present in non-malignant conditions such as benign prostatic hyperplasia (BPH). Characterization of gene-expression profiles that molecularly distinguish prostatic neoplasms may identify genes involved in prostate carcinogenesis, elucidate clinical biomarkers, and lead to an improved classification of prostate cancer. Using microarrays of complementary DNA, we examined gene-expression profiles of more than 50 normal and neoplastic prostate specimens and three common prostate-cancer cell lines. Signature expression profiles of normal adjacent prostate (NAP), BPH, localized prostate cancer, and metastatic, hormone-refractory prostate cancer were determined. Here we establish many associations between genes and prostate cancer. We assessed two of these genes, hepsin (a transmembrane serine protease) and pim-1 (a serine/threonine kinase), at the protein level using tissue microarrays consisting of over 700 clinically stratified prostate-cancer specimens. Expression of hepsin and pim-1 proteins was significantly correlated with measures of clinical outcome. Thus, the integration of cDNA microarray, high-density tissue microarray, and linked clinical and pathology data is a powerful approach to molecular profiling of human cancer.
Prostate cancer is a leading cause of cancer-related death in males and is second only to lung cancer. Although effective surgical and radiation treatments exist for clinically localized prostate cancer, metastatic prostate cancer remains essentially incurable. Here we show, through gene expression profiling, that the polycomb group protein enhancer of zeste homolog 2 (EZH2) is overexpressed in hormone-refractory, metastatic prostate cancer. Small interfering RNA (siRNA) duplexes targeted against EZH2 reduce the amounts of EZH2 protein present in prostate cells and also inhibit cell proliferation in vitro. Ectopic expression of EZH2 in prostate cells induces transcriptional repression of a specific cohort of genes. Gene silencing mediated by EZH2 requires the SET domain and is attenuated by inhibiting histone deacetylase activity. Amounts of both EZH2 messenger RNA and EZH2 protein are increased in metastatic prostate cancer; in addition, clinically localized prostate cancers that express higher concentrations of EZH2 show a poorer prognosis. Thus, dysregulated expression of EZH2 may be involved in the progression of prostate cancer, as well as being a marker that distinguishes indolent prostate cancer from those at risk of lethal progression.
DNA microarrays have been widely applied to cancer transcriptome analysis. The Oncomine database contains a large collection of such data, as well as hundreds of derived gene-expression signatures. We studied the regulatory mechanisms responsible for gene deregulation in these cancer signatures by searching for the coordinate regulation of genes with common transcription factor binding sites. We found that genes with binding sites for the archetypal cancer transcription factor, E2F, were disproportionately overexpressed in a wide variety of cancers, whereas genes with binding sites for other transcription factors, such as Myc-Max, c-Rel and ATF, were disproportionately overexpressed in specific cancer types. These results suggest that alterations in pathways activating these transcription factors may be responsible for the observed gene deregulation and cancer pathogenesis.
The increasing availability and maturity of DNA microarray technology has led to an explosion of cancer profiling studies. To extract maximum value from the accumulating mass of publicly available cancer gene expression data, methods are needed to evaluate, integrate, and intervalidate multiple datasets. Here we demonstrate a statistical model for performing meta-analysis of independent microarray datasets. Implementation of this model revealed that four prostate cancer gene expression datasets shared significantly similar results, independent of the method and technology used (i.e., spotted cDNA versus oligonucleotide). This interstudy cross-validation approach generated a cohort of genes that were consistently and significantly dysregulated in prostate cancer. Bioinformatic investigation of these genes revealed a synchronous network of transcriptional regulation in the polyamine and purine biosynthesis pathways. Beyond the specific implications for prostate cancer, this work establishes a much-needed model for the evaluation, cross-validation, and comparison of multiple cancer profiling studies.
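The abstract above does not spell out its statistical model. As an illustration only, one common way to pool evidence for a single gene across independent studies is Fisher's combined probability test, sketched below. The function name `fisher_combined_p` and the example p-values are hypothetical and are not taken from the study.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent per-study p-values for one gene (Fisher's method).

    Assumption: this is an illustrative stand-in, not the model used in the
    abstract. The statistic X = -2 * sum(ln p_i) follows a chi-square
    distribution with 2k degrees of freedom under the null hypothesis.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in the interval (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2

    # For an even number of degrees of freedom (2k), the chi-square survival
    # function has the closed form exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


if __name__ == "__main__":
    # Hypothetical p-values for one gene in four independent datasets.
    print(fisher_combined_p([0.04, 0.01, 0.20, 0.03]))
```

With a single study the combined value equals the input p-value, which is a quick sanity check on the implementation.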
Global molecular profiling of cancers has shown broad utility in delineating pathways and processes underlying disease, in predicting prognosis and response to therapy, and in suggesting novel treatments. To gain further insights from such data, we have integrated and analyzed a comprehensive collection of “molecular concepts” representing > 2500 cancer-related gene expression signatures from Oncomine and manual curation of the literature, drug treatment signatures from the Connectivity Map, target gene sets from genome-scale regulatory motif analyses, and reference gene sets from several gene and protein annotation databases. We computed pairwise association analysis on all 13,364 molecular concepts and identified > 290,000 significant associations, generating hypotheses that link cancer types and subtypes, pathways, mechanisms, and drugs. To navigate a network of associations, we developed an analysis platform, the Molecular Concepts Map. We demonstrate the utility of the approach by highlighting molecular concepts analyses of Myc pathway activation, breast cancer relapse, and retinoic acid treatment.
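The abstract above does not name the test behind its pairwise association analysis. A minimal sketch of one plausible choice, a one-sided hypergeometric (Fisher's exact) test on the overlap between two gene sets, is shown here. The function name `overlap_p_value`, the gene identifiers, and the universe size are hypothetical.

```python
from math import comb


def overlap_p_value(set_a, set_b, universe_size):
    """One-sided hypergeometric p-value for the overlap of two gene sets.

    Assumption: an illustrative test, not necessarily the one used in the
    abstract. Returns the probability of seeing an overlap at least as
    large as the observed one if set_b were drawn at random from a
    universe of `universe_size` genes containing set_a.
    """
    a, b = set(set_a), set(set_b)
    if universe_size < len(a | b):
        raise ValueError("universe must contain every gene in both sets")

    observed = len(a & b)
    n_a, n_b = len(a), len(b)
    # Sum the upper tail: overlaps from the observed size up to the maximum.
    tail = sum(
        comb(n_a, i) * comb(universe_size - n_a, n_b - i)
        for i in range(observed, min(n_a, n_b) + 1)
    )
    return tail / comb(universe_size, n_b)


if __name__ == "__main__":
    # Hypothetical concepts sharing three of four genes in a 1,000-gene universe.
    concept_1 = {"G1", "G2", "G3", "G4"}
    concept_2 = {"G2", "G3", "G4", "G5"}
    print(overlap_p_value(concept_1, concept_2, universe_size=1000))
```

If every unordered pair of the 13,364 concepts were tested, that would be `comb(13364, 2)` = 89,291,566 comparisons, so any such analysis needs a multiple-testing correction before associations are called significant.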
CONTEXT Molecular profiling of prostate cancer has led to the identification
of candidate biomarkers and regulatory genes. Discoveries from these genome-scale
approaches may have applicability in the analysis of diagnostic prostate specimens. OBJECTIVES To determine the expression and clinical utility of α-methylacyl
coenzyme A racemase (AMACR), a gene identified as
being overexpressed in prostate cancer by global profiling strategies. DESIGN Four gene expression data sets from independent DNA microarray analyses
were examined to identify genes expressed in prostate cancer (n = 128 specimens).
A lead candidate gene, AMACR, was validated at the
transcript level by reverse transcriptase polymerase chain reaction (RT-PCR)
and at the protein level by immunoblot and immunohistochemical analysis. AMACR levels were examined using prostate cancer tissue
microarrays in 342 samples representing different stages of prostate cancer
progression. Protein expression was characterized as negative (score = 1),
weak (2), moderate (3), or strong (4). Clinical utility of AMACR was evaluated using 94 prostate needle biopsy specimens. MAIN OUTCOME MEASURES Messenger RNA transcript and protein levels of AMACR; sensitivity and specificity of AMACR as
a tissue biomarker for prostate cancer in needle biopsy specimens. RESULTS Three of 4 independent DNA microarray analyses (n = 128 specimens) revealed
significant overexpression of AMACR in prostate cancer
(P<.001). AMACR up-regulation
in prostate cancer was confirmed by both RT-PCR and immunoblot analysis. Immunohistochemical
analysis demonstrated an increased expression of AMACR
in malignant prostate epithelia relative to benign epithelia. Tissue microarrays
to assess AMACR expression in specimens consisting
of benign prostate (n = 108 samples), atrophic prostate (n = 26), prostatic
intraepithelial neoplasia (n = 75), localized prostate cancer (n = 116), and
metastatic prostate cancer (n = 17) demonstrated mean AMACR protein staining
intensity of 1.31 (95% confidence interval, 1.23-1.40), 2.33 (95% CI, 2.13-2.52),
2.67 (95% CI, 2.52-2.81), 3.20 (95% CI, 3.10-3.28), and 2.50 (95% CI, 2.20-2.80),
respectively (P<.001). Pairwise comparisons demonstrated
significant differences in staining intensity between clinically localized
prostate cancer compared with benign prostate tissue, with mean expression
scores of 3.2 and 1.3, respectively (mean difference, 1.9; 95% CI, 1.7-2.1; P<.001). Using moderate or strong staining intensity
as positive (score = 3 or 4), evaluation of AMACR protein expression in 94
prostate needle biopsy specimens demonstrated 97% sensitivity and 100% specificity
for detecting prostate cancer. CONCLUSIONS AMACR was shown to be overexpressed in prostate
cancer using independent experimental methods and prostate cancer specimens. AMACR may be useful in the interpretation of prostate needle
biopsy specimens that are diagnostically challenging.
The endothelium plays a critical role in the inflammatory process. The complement activation product, C5a, is known to have proinflammatory effects on the endothelium, but the molecular mechanisms remain unclear. We have used cDNA microarray analysis to assess gene expression in human umbilical vein endothelial cells (HUVECs) that were stimulated with human C5a in vitro. Chip analyses were confirmed by reverse transcriptase-polymerase chain reaction and by Western blot analysis. Gene activation responses were remarkably similar to gene expression patterns of HUVECs stimulated with human tumor necrosis factor-α or bacterial lipopolysaccharide. HUVECs stimulated with C5a showed progressive increases in gene expression for cell adhesion molecules (eg, E-selectin, ICAM-1, VCAM-1), cytokines/chemokines, and related receptors (eg, VEGFC, IL-6, IL-18R). Surprisingly, HUVECs showed little evidence for up-regulation of complement-related genes. There were transient increases in gene expression associated with broad functional activities. The three agonists used also caused down-regulation of genes that regulate angiogenesis and drug metabolism. With a single exception, C5a caused little evidence of activation of complement-related genes. These studies indicate that endothelial cells respond robustly to C5a by activation of genes related to progressive expression of cell adherence molecules, and cytokines and chemokines in a manner similar to responses induced by tumor necrosis factor-α and lipopolysaccharide.