Small-cell lung cancer (SCLC), an aggressive neuroendocrine tumor with early dissemination and dismal prognosis, accounts for 15-20% of lung cancer cases and ∼200,000 deaths each year. Most cases are inoperable, and biopsies to investigate SCLC biology are rarely obtainable. Circulating tumor cells (CTCs), which are prevalent in SCLC, present a readily accessible 'liquid biopsy'. Here we show that CTCs from patients with either chemosensitive or chemorefractory SCLC are tumorigenic in immune-compromised mice, and the resultant CTC-derived explants (CDXs) mirror the donor patient's response to platinum and etoposide chemotherapy. Genomic analysis of isolated CTCs revealed considerable similarity to the corresponding CDX. Most marked differences were observed between CDXs from patients with different clinical outcomes. These data demonstrate that CTC molecular analysis via serial blood sampling could facilitate delivery of personalized medicine for SCLC. CDXs are readily passaged, and these unique mouse models provide tractable systems for therapy testing and understanding drug resistance mechanisms.
Used alone, the MAS5.0 algorithm for generating expression summaries has been criticized for high false-positive rates resulting from exaggerated variance at low intensities.
Here we show, with replicated cell line data, that, when used alongside detection calls, MAS5 can be both selective and sensitive. We identified a set of differentially expressed transcripts that were found to be changing by MAS5, but unchanging by RMA and GCRMA. Subsequent analysis by real time PCR confirmed these changes. In addition, with the Latin square datasets often used to assess expression summary algorithms, filtered MAS5.0 was found to have performance approaching that of its peers.
When used alongside detection calls, MAS5 is a sensitive and selective algorithm for identifying differentially expressed genes.
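One way to picture filtering by detection calls is sketched below. The abstract does not spell out the filtering rule, so the rule used here (keep a probe set only if it is called Present in every replicate of at least one condition) is an assumption, as are the function names, the condition labels and the toy calls.

```python
from typing import Dict, List


def all_present(calls: List[str]) -> bool:
    """True if there is at least one replicate and every call is 'P' (Present)."""
    return bool(calls) and all(call == "P" for call in calls)


def filter_by_detection(calls: Dict[str, Dict[str, List[str]]]) -> List[str]:
    """Return probe sets called Present in all replicates of at least one condition.

    `calls` maps probe set ID -> condition name -> list of detection calls
    ('P', 'M' or 'A'). This layout and the rule itself are illustrative
    assumptions, not the procedure reported in the abstract.
    """
    return [
        probe_set
        for probe_set, by_condition in calls.items()
        if any(all_present(replicates) for replicates in by_condition.values())
    ]


# Hypothetical detection calls for three probe sets, three replicates each.
example_calls = {
    "ps_1": {"line_A": ["P", "P", "P"], "line_B": ["A", "A", "M"]},
    "ps_2": {"line_A": ["A", "M", "A"], "line_B": ["A", "A", "A"]},
    "ps_3": {"line_A": ["P", "A", "P"], "line_B": ["P", "P", "P"]},
}
kept = filter_by_detection(example_calls)
```

Only the probe sets that survive such a filter would then be tested for differential expression, which is how low-intensity, high-variance probe sets are kept out of the candidate list.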
RNA-Seq exploits the rapid generation of gigabases of sequence data by Massively Parallel Nucleotide Sequencing, allowing for the mapping and digital quantification of whole transcriptomes. Whilst previous comparisons between RNA-Seq and microarrays have been performed at the level of gene expression, in this study we adopt a more fine-grained approach. Using RNA samples from a normal human breast epithelial cell line (MCF-10a) and a breast cancer cell line (MCF-7), we present a comprehensive comparison between RNA-Seq data generated on the Applied Biosystems SOLiD platform and data from Affymetrix Exon 1.0ST arrays. The use of Exon arrays makes it possible to assess the performance of RNA-Seq in two key areas: detection of expression at the granularity of individual exons, and discovery of transcription outside annotated loci.
We found a high degree of correspondence between the two platforms in terms of exon-level fold changes and detection. For example, over 80% of exons detected as expressed in RNA-Seq were also detected on the Exon array, and 91% of exons flagged as changing from Absent to Present on at least one platform had fold-changes in the same direction. The greatest detection correspondence was seen when the read count threshold at which to flag exons Absent in the SOLiD data was set to t<1, suggesting that the background error rate is extremely low in RNA-Seq. We also found RNA-Seq to be more sensitive than the Exon array in detecting differentially expressed exons, reflecting the wider dynamic range achievable on the SOLiD platform. In addition, we find significant evidence of novel protein coding regions outside known exons, 93% of which map to Exon array probesets, and are able to infer the presence of thousands of novel transcripts through the detection of previously unreported exon-exon junctions.
By focusing on exon-level expression, we present the most fine-grained comparison between RNA-Seq and microarrays to date. Overall, our study demonstrates that data from a SOLiD RNA-Seq experiment are sufficient to generate results comparable to those produced from Affymetrix Exon arrays, even using only a single replicate from each platform, and when presented with a large genome.
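The detection comparison described above can be illustrated with a minimal sketch: an exon is flagged Absent in the RNA-Seq data when its read count falls below a threshold t, and concordance is the fraction of RNA-Seq-Present exons that are also Present on the array. The function name, input layout and toy values are assumptions for illustration only.

```python
from typing import Dict, Set


def detection_concordance(read_counts: Dict[str, int],
                          array_present: Set[str],
                          t: int = 1) -> float:
    """Fraction of exons Present in RNA-Seq that are also Present on the array.

    An exon is flagged Absent in RNA-Seq when its read count is below `t`,
    so t=1 means any exon with at least one read counts as Present.
    """
    seq_present = {exon for exon, n in read_counts.items() if n >= t}
    if not seq_present:
        return 0.0
    return len(seq_present & array_present) / len(seq_present)


# Hypothetical read counts and array detection calls for five exons.
reads = {"e1": 0, "e2": 3, "e3": 1, "e4": 10, "e5": 2}
on_array = {"e1", "e2", "e3", "e4"}

concordance_t1 = detection_concordance(reads, on_array, t=1)
concordance_t3 = detection_concordance(reads, on_array, t=3)
```

Sweeping t over a range of values and recording the concordance at each step is one way to find the threshold at which the two platforms agree best.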
The number of gene expression studies in the public domain is rapidly increasing, representing a highly valuable resource. However, dataset-specific bias precludes meta-analysis at the raw transcript level, even when the RNA is from comparable sources and has been processed on the same microarray platform using similar protocols. Here, we demonstrate, using Affymetrix data, that much of this bias can be removed, allowing multiple datasets to be legitimately combined for meaningful meta-analyses.
A series of validation datasets comparing breast cancer and normal breast cell lines (MCF7 and MCF10A) were generated to examine the variability between datasets generated using different amounts of starting RNA, alternative protocols, different generations of Affymetrix GeneChip or scanning hardware. We demonstrate that systematic, multiplicative biases are introduced at the RNA, hybridization and image-capture stages of a microarray experiment. Simple batch mean-centering was found to significantly reduce the level of inter-experimental variation, allowing raw transcript levels to be compared across datasets with confidence. By accounting for dataset-specific bias, we were able to assemble the largest gene expression dataset of primary breast tumours to date (1107 tumours), from six previously published studies. Using this meta-dataset, we demonstrate that combining greater numbers of datasets or tumours leads to a greater overlap in differentially expressed genes and more accurate prognostic predictions. However, this is highly dependent upon the composition of the datasets and patient characteristics.
Multiplicative, systematic biases are introduced at many stages of microarray experiments. When these are reconciled, raw data can be directly integrated from different gene expression datasets leading to new biological findings with increased statistical power.
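A minimal sketch of batch mean-centering follows. It assumes the centering is done per gene on log2 intensities, so that a multiplicative bias becomes an additive offset that subtraction removes; the abstract does not give these details, and the function name, input layout and toy values are hypothetical.

```python
import math
from statistics import mean
from typing import Dict, List


def mean_center(batch: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Subtract each gene's within-batch mean from its log2 intensities.

    `batch` maps a gene name to its raw intensities across the samples of
    ONE dataset (an assumed input layout).
    """
    centred = {}
    for gene, intensities in batch.items():
        logged = [math.log2(x) for x in intensities]
        gene_mean = mean(logged)
        centred[gene] = [value - gene_mean for value in logged]
    return centred


# Toy example: batch_b is batch_a with a 3x multiplicative bias applied.
batch_a = {"geneX": [100.0, 200.0, 400.0]}
batch_b = {"geneX": [300.0, 600.0, 1200.0]}

centred_a = mean_center(batch_a)
centred_b = mean_center(batch_b)
```

After centering, the two toy batches carry the same values (about -1, 0 and 1 on the log2 scale), so the 3x offset no longer separates them. The approach assumes that each batch has a comparable sample composition, which matches the caveat above about dataset composition and patient characteristics.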
Microarray gene expression profiling of formalin-fixed paraffin-embedded (FFPE) tissues is a new and evolving technique. This report compares transcript detection rates on Affymetrix U133 Plus 2.0 and Human Exon 1.0 ST GeneChips across several RNA extraction and target labeling protocols, using routinely collected archival FFPE samples. All RNA extraction protocols tested (Ambion-Optimum, Ambion-RecoverAll, and Qiagen-RNeasy FFPE) provided extracts suitable for microarray hybridization. Compared with Affymetrix One-Cycle labeled extracts, NuGEN system protocols utilizing oligo(dT) and random hexamer primers, and cDNA target preparations instead of cRNA, achieved percent present rates up to 55% on Plus 2.0 arrays. Based on two paired-sample analyses, at 90% specificity this equalled an average 30 percentage-point increase (from 50% to 80%) in FFPE transcript sensitivity relative to fresh frozen tissues, which we have assumed to have 100% sensitivity and specificity. The high content of Exon arrays, with multiple probe sets per exon, improved FFPE sensitivity to 92% at 96% specificity, corresponding to an absolute increase of ∼600 genes over Plus 2.0 arrays. While larger series are needed to confirm high correspondence between fresh-frozen and FFPE expression patterns, these data suggest that both Plus 2.0 and Exon arrays are suitable platforms for FFPE microarray expression analyses.
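The sensitivity and specificity figures above treat the fresh-frozen detection calls as the reference standard. A minimal sketch of that calculation for one paired sample is given below; the function name, the set-based input layout and the toy transcript IDs are assumptions for illustration.

```python
from typing import Set, Tuple


def sensitivity_specificity(all_transcripts: Set[str],
                            reference_present: Set[str],
                            test_present: Set[str]) -> Tuple[float, float]:
    """Sensitivity and specificity of `test_present` against a reference.

    The reference (here, fresh-frozen detection calls) is assumed to be
    100% sensitive and specific, as stated in the text.
    """
    reference_absent = all_transcripts - reference_present
    true_pos = len(test_present & reference_present)
    true_neg = len(reference_absent - test_present)
    sensitivity = true_pos / len(reference_present)
    specificity = true_neg / len(reference_absent)
    return sensitivity, specificity


# Hypothetical paired sample: ten transcripts, five detected in fresh-frozen.
transcripts = {f"t{i}" for i in range(1, 11)}
fresh_frozen = {"t1", "t2", "t3", "t4", "t5"}
ffpe = {"t1", "t2", "t3", "t4", "t6"}

sens, spec = sensitivity_specificity(transcripts, fresh_frozen, ffpe)
```

In this toy case the FFPE sample recovers four of the five reference transcripts and wrongly detects one of the five reference-absent transcripts, giving 80% for both measures.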
Strand‐specific RNA sequencing of S. pombe revealed a highly structured programme of ncRNA expression at over 600 loci. Waves of antisense transcription accompanied sexual differentiation. A substantial proportion of ncRNA arose from mechanisms previously considered to be largely artefactual, including improper 3′ termination and bidirectional transcription. Constitutive induction of the entire spk1+, spo4+, dis1+ and spo6+ antisense transcripts from an integrated, ectopic, locus disrupted their respective meiotic functions. This ability of antisense transcripts to disrupt gene function when expressed in trans suggests that cis production at native loci during sexual differentiation may also control gene function. Consistently, insertion of a marker gene adjacent to the dis1+ antisense start site mimicked ectopic antisense expression in reducing the levels of this microtubule regulator and abolishing the microtubule‐dependent ‘horsetail’ stage of meiosis. Antisense production had no impact at any of these loci when the RNA interference (RNAi) machinery was removed. Thus, far from being simply ‘genome chatter’, this extensive ncRNA landscape constitutes a fundamental component in the controls that drive the complex programme of sexual differentiation in S. pombe.
Strand‐specific RNA sequencing of S. pombe reveals a highly structured programme of ncRNA expression at over 600 loci. Functional investigations show that this extensive ncRNA landscape controls the complex programme of sexual differentiation in S. pombe.
Synopsis
Regulation of the RNA profile is a principal control driving sexual differentiation in the fission yeast Schizosaccharomyces pombe. Before transcription, RNAi‐mediated formation of heterochromatin is used to suppress expression, while post‐transcription, regulation is achieved via the active stabilisation or destruction of transcripts, and through at least two distinct types of splicing control (Mata et al, 2002; Shimoseki and Shimoda, 2001; Averbeck et al, 2005; Mata and Bähler, 2006; Xue‐Franzen et al, 2006; Moldon et al, 2008; Djupedal et al, 2009; Amorim et al, 2010; Grewal, 2010; Cremona et al, 2011).
Around 94% of the S. pombe genome is transcribed (Wilhelm et al, 2008). While many of these transcripts encode proteins (Wood et al, 2002; Bitton et al, 2011), the majority have no known function. We used a strand‐specific protocol to sequence total RNA extracts taken from vegetatively growing cells, and at different points during a time course of sexual differentiation. The resulting data redefined existing gene coordinates and identified additional transcribed loci. The frequency of reads at each of these was used to monitor transcript abundance.
Transcript levels at 6599 loci changed in at least one sample (G‐statistic; False Discovery Rate <5%). 4231 (72.3%), of which 4011 map to protein‐coding genes, while 809 loci were antisense to a known gene. Comparisons between haploid and diploid strains identified changes in transcript levels at over 1000 loci.
At 354 loci, greater antisense abundance was observed relative to sense, in at least one sample (putative antisense regulatory transcripts—ARTs). Since antisense mechanisms are known to modulate sense transcript expression through a variety of inhibitory mechanisms (Faghihi and Wahlestedt, 2009), we postulated that the waves of antisense expression activated at different stages during meiosis might be regulating protein expression.
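The criterion for a putative ART (antisense more abundant than sense in at least one sample) can be sketched as follows. The function name, the per-sample (sense, antisense) layout and the toy read counts are assumptions; the sketch also ignores normalisation and significance testing, which a real analysis would need.

```python
from typing import Dict, List, Tuple


def putative_art_loci(counts: Dict[str, List[Tuple[float, float]]]) -> List[str]:
    """Return loci where antisense abundance exceeds sense in any sample.

    `counts` maps a locus to one (sense, antisense) abundance pair per
    sample (hypothetical layout).
    """
    return [
        locus
        for locus, samples in counts.items()
        if any(antisense > sense for sense, antisense in samples)
    ]


# Hypothetical abundances across three samples of a time course.
abundance = {
    "locus_1": [(120.0, 10.0), (80.0, 95.0), (60.0, 20.0)],
    "locus_2": [(50.0, 5.0), (55.0, 6.0), (40.0, 4.0)],
    "locus_3": [(0.0, 12.0), (3.0, 1.0), (9.0, 2.0)],
}
art_candidates = putative_art_loci(abundance)
```

Here locus_1 and locus_3 qualify because antisense outweighs sense in one sample each, while locus_2 never does.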
To ask whether transcription factors that drive sense‐transcript levels influenced ART production, we performed RNA‐seq of a pat1.114 diploid meiosis in the absence of the transcription factors Atf21 and Atf31 (responsible for late meiotic transcription; Mata et al, 2002). Transcript levels at 185 ncRNA loci showed significant changes in the knockout backgrounds. Although meiotic progression is largely unaffected by removal of Atf21 and Atf31, viability of the resulting spores was significantly diminished, indicating that Atf21‐ and Atf31‐mediated events are critical to efficient sexual differentiation.
If changes to relative antisense/sense transcript levels during a particular phase of sexual differentiation were to regulate protein expression, then the continued presence of the antisense at points in the differentiation programme where it would normally be absent should abolish protein function during this phase. We tested this hypothesis at four loci representing the three means of antisense production: convergent gene expression, improper termination and nascent transcription from an independent locus. Induction of the natural antisense transcripts that opposed spo4+, spo6+ and dis1+ (Figures 3 and 7) in trans from a heterologous locus phenocopied a loss of function of the target protein. ART overexpression decreased Dis1 protein levels. Antisense transcription opposing spk1+ originated from improper termination of the sense ups1+ transcript on the opposite strand (Figure 3B, left locus). Expression of either the natural full‐length ups1+ transcript or a truncated version, restricted to the portion of ups1+ overlapping spk1+ (Figure 3, orange transcripts) in trans from a heterologous locus phenocopied the spk1.Δ differentiation deficiency. Convergent transcription from a neighbouring gene on the opposing strand is, therefore, an effective mechanism to generate RNAi‐mediated (below) silencing in fission yeast. Further analysis of the data revealed, for many loci, substantial changes in UTR length over the course of meiosis, suggesting that UTR dynamics may have an active role in regulating gene expression by controlling the transcriptional overlap between convergent adjacent gene pairs.
The RNAi machinery (Grewal, 2010) was required for antisense suppression at each of the dis1, spk1, spo4 and spo6 loci, as antisense to each locus had no impact in ago1.Δ, dcr1.Δ and rdp1.Δ backgrounds. We conclude that RNAi control has a key role in maintaining the fidelity of sexual differentiation in fission yeast. The histone H3 methyl transferase Clr4 was required for antisense control from a heterologous locus.
Thus, a significant portion of the impact of ncRNA upon sexual differentiation arises from antisense gene silencing. Importantly, in contrast to the extensively characterised ability of the RNAi machinery to operate in cis at a target locus in S. pombe (Grewal, 2010), each case of gene silencing generated here could be achieved in trans by expression of the antisense transcript from a single heterologous locus elsewhere in the genome.
Integration of an antibiotic marker gene immediately downstream of the dis1+ locus instigated antisense control in an orientation‐dependent manner. PCR‐based gene tagging approaches are widely used to fuse the coding sequences of epitope or protein tags to a gene of interest. Not only do these tagging approaches disrupt normal 3′UTR controls, but the insertion of a heterologous marker gene immediately downstream of an ORF can clearly have a significant impact upon transcriptional control of the resulting fusion protein. Thus, PCR tagging approaches can no longer be viewed as benign manipulations of a locus that only result in the production of a tagged protein product.
Repression of Dis1 function by gene deletion or antisense control revealed a key role for this conserved microtubule regulator in driving the horsetail nuclear migrations that promote recombination during meiotic prophase.
Non‐coding transcripts have often been viewed as simple ‘chatter’, maintained solely because evolutionary pressures have not been strong enough to force their elimination from the system. Our data show that phenomena such as improper termination and bidirectional transcription are not simply interesting artifacts arising from the complexities of transcription or genome history, but have a critical role in regulating gene expression in the current genome. Given the widespread use of RNAi, it is reasonable to anticipate that future analyses will establish ARTs to have equal importance in other organisms, including vertebrates.
These data highlight the need to modify our concept of a gene from that of a spatially distinct locus. This view is becoming increasingly untenable. Not only are the 5′ and 3′ ends of many genes indistinct, but this lack of a hard and fast boundary is actively used by cells to control the transcription of adjacent and overlapping loci, and thus to regulate critical events in the life of a cell.
The model eukaryote S. pombe features substantial numbers of ncRNAs, many of which are antisense regulatory transcripts (ARTs), ncRNAs expressed on the opposing strand to coding sequences.
Individual ARTs are generated during the mitotic cycle, or at discrete stages of sexual differentiation to downregulate the levels of proteins that drive and coordinate sexual differentiation.
Antisense transcription occurring from events such as bidirectional transcription is not simply artefactual ‘chatter’; it performs a critical role in regulating gene expression.
The challenge of gene expression studies is to reliably quantify levels of transcripts, but this is hindered by a number of factors including sample availability, handling and storage. The PAXgene Blood RNA System includes a stabilizing additive in a plastic evacuated tube, but requires 2.5 mL blood, which makes routine implementation impractical for paediatric use. The aim of this study was to modify the PAXgene Blood RNA System kit protocol for application to small, sick children, without compromising RNA integrity, and subsequently to perform quantitative analysis of ICAM and interleukin-6 gene expression. Aliquots of 0.86 mL PAXgene reagent were put into microtubes and 0.3 mL whole blood added to maintain the same recommended proportions as in the PAXgene evacuated tube system. RNA quality was assessed using the Agilent BioAnalyser 2100 and an in-house TaqMan assay which measures GAPDH transcript integrity by determining 3' to 5' ratios. qPCR analysis was performed on an additional panel of 7 housekeeping genes. Three reference genes (HPRT1, YWHAZ and GAPDH) were identified using the GeNORM algorithm and were subsequently used to normalise target gene expression levels. ICAM-1 and IL-6 gene expression were measured in 87 Malawian children with invasive pneumococcal disease.
Total RNA yield was between 1,114 and 2,950 ng and the BioAnalyser 2100 demonstrated discernible 18S and 28S bands. The cycle threshold values obtained for the seven housekeeping genes were between 15 and 30 and showed good consistency. Median relative ICAM and IL-6 gene expression were significantly reduced in non-survivors compared to survivors (ICAM: 3.56 vs 4.41, p = 0.04, and IL-6: 2.16 vs 6.73, p = 0.02).
We have successfully modified the PAXgene blood collection system for use in small children and demonstrated preservation of RNA integrity and successful quantitative real-time PCR analysis.
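Normalising a target gene to several reference genes is commonly done by dividing the target quantity by the geometric mean of the reference quantities. The sketch below shows that arithmetic under two assumptions that the abstract does not state: every assay has the same amplification efficiency (2.0, i.e. perfect doubling per cycle), and relative quantities are computed as efficiency raised to the power of minus Ct. The function name and the Ct values are hypothetical, and the output scale is not claimed to match the values reported above.

```python
from typing import Sequence


def relative_expression(ct_target: float,
                        ct_references: Sequence[float],
                        efficiency: float = 2.0) -> float:
    """Target quantity divided by the geometric mean of reference quantities.

    With quantity = efficiency ** (-Ct), the geometric mean of the reference
    quantities equals efficiency ** (-mean(Ct_ref)), so the ratio reduces to
    efficiency ** (mean(Ct_ref) - Ct_target).
    """
    mean_reference_ct = sum(ct_references) / len(ct_references)
    return efficiency ** (mean_reference_ct - ct_target)


# Hypothetical Ct values: one target gene and three reference genes.
ratio = relative_expression(ct_target=24.0, ct_references=[20.0, 22.0, 24.0])
```

A higher target Ct relative to the references gives a ratio below 1, so in this toy case the target is at a quarter of the reference level.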
Exon arrays aim to provide comprehensive gene expression data at the level of individual exons, similar to that provided on a per-gene basis by existing expression arrays. This report describes the performance of Affymetrix GeneChip® Human Exon 1.0 ST array by using replicated RNA samples from two human cell lines, MCF7 and MCF10A, hybridized both to Exon 1.0 ST and to HG-U133 Plus2 arrays. Cross-comparison between array types requires an appropriate mapping to be found between individual probe sets. Three possible mappings were considered, reflecting different strategies for dealing with probe sets that target different parts of the same transcript. Irrespective of the mapping used, Exon 1.0 ST and HG-U133 Plus2 arrays show a high degree of correspondence. More than 80% of HG-U133 Plus2 probe sets may be mapped to the Exon chip, and fold changes are found well preserved for over 96% of those probe sets detected present. Since HG-U133 Plus2 arrays have already been extensively validated, these results lend a significant degree of confidence to exon arrays.
Management and conservation of ecosystems relies on biodiversity data; however, broad-scale biological data are often limited. Predictive modelling using environmental variables has recently proven a valuable tool in addressing this gap. Wave exposure is a particularly important environmental variable that structures shallow reef systems, but it is rarely quantified across the large areas often used for predictive studies. Therefore, we investigated approaches that quantify exposure and can be readily applied across a large area. We generated 6 quantitative indices that emphasise different aspects of exposure using a numerical wave model and cartographic fetch models. The utility of these indices for predictive modelling in shallow temperate reef systems was assessed by how well they explained community and genera-level algal patterns in Tasmania, Australia, which is a region that experiences a wide range of wave exposure conditions. Exposure indices were significant predictors of algal patterns, explaining up to 18% of community level patterns and up to 37% of the variance associated with the occurrence and cover of algal genera. Fetch-based indices in particular appear to be a viable option for quantifying exposure on shallow reefs. These indices can be generated within a Geographic Information System (GIS) program for specific sites of interest, along coastlines or on a grid, and are potentially accessible to ecologists. Quantification of exposure across broad regions using fetch indices will allow ecologists to make advances in predictive modelling studies, but also facilitate studies that test the generality of hypotheses and mechanisms driving patterns previously observed using qualitative measures.
The desire to perform microarray experiments with small starting amounts of RNA has led to the development of a variety of protocols for preparing and amplifying mRNA. This has consequences not only for the standardization of experimental design, but also for reproducibility and comparability between experiments. Here we investigate the differences between the Affymetrix standard and small sample protocols and address the data analysis issues that arise when comparing samples and experiments that have been processed in different ways. We show that data generated on the same platform using different protocols are not directly comparable. Further, protocols introduce systematic biases that can be largely accounted for by using the correct data analysis techniques.