Accurate fusion transcript detection is essential for comprehensive characterization of cancer transcriptomes. Over the last decade, multiple bioinformatic tools have been developed to predict fusions from RNA-seq, based on either read mapping or de novo fusion transcript assembly.
We benchmark 23 different methods including applications we develop, STAR-Fusion and TrinityFusion, leveraging both simulated and real RNA-seq. Overall, STAR-Fusion, Arriba, and STAR-SEQR are the most accurate and fastest for fusion detection on cancer transcriptomes.
The lower accuracy of de novo assembly-based methods notwithstanding, they are useful for reconstructing fusion isoforms and tumor viruses, both of which are important in cancer research.
High-throughput sequencing of cDNA libraries (RNA-Seq) has proven to be a highly effective approach for studying bacterial transcriptomes. A central challenge in designing RNA-Seq-based experiments is estimating a priori the number of reads per sample needed to detect and quantify thousands of individual transcripts with a large dynamic range of abundance.
We have conducted a systematic examination of how changes in the number of RNA-Seq reads per sample influence both profiling of a single bacterial transcriptome and the comparison of gene expression among samples. Our findings suggest that the number of reads typically produced in a single lane of the Illumina HiSeq sequencer far exceeds the number needed to saturate the annotated transcriptomes of diverse bacteria growing in monoculture. Moreover, as sequencing depth increases, so too does the detection of cDNAs that likely correspond to spurious transcripts or genomic DNA contamination. Finally, even when dozens of barcoded individual cDNA libraries are sequenced in a single lane, the vast majority of transcripts in each sample can be detected and numerous genes differentially expressed between samples can be identified.
Our analysis provides a guide for the many researchers seeking to determine the appropriate sequencing depth for RNA-Seq-based studies of diverse bacterial species.
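The idea of transcriptome saturation described above can be illustrated with a small subsampling sketch. This is not the authors' pipeline: the function name `detection_curve`, the toy abundance model (200 hypothetical genes with abundance falling off as 1/rank) and the depth values are all assumptions chosen for illustration. The sketch shuffles the reads once and counts how many genes are detected in progressively larger nested subsamples.

```python
import random


def detection_curve(read_gene_ids, depths, min_reads=1, seed=0):
    """Count genes detected (>= min_reads reads) at each sequencing depth.

    read_gene_ids: one gene identifier per mapped read (hypothetical input).
    depths: subsample sizes; nested prefixes of one shuffled read list are used,
            so the detected-gene count can never decrease as depth grows.
    """
    rng = random.Random(seed)
    shuffled = list(read_gene_ids)
    rng.shuffle(shuffled)

    counts = {}   # gene -> reads seen so far
    curve = {}    # depth -> number of detected genes
    used = 0
    for depth in sorted(depths):
        depth = min(depth, len(shuffled))
        # Add only the reads between the previous depth and this one.
        for gene in shuffled[used:depth]:
            counts[gene] = counts.get(gene, 0) + 1
        used = max(used, depth)
        curve[depth] = sum(1 for c in counts.values() if c >= min_reads)
    return curve


# Toy data (assumption): 200 genes, abundance proportional to 1 / rank.
rng = random.Random(42)
n_genes = 200
weights = [1.0 / (rank + 1) for rank in range(n_genes)]
reads = rng.choices(range(n_genes), weights=weights, k=20000)

curve = detection_curve(reads, depths=[100, 1000, 10000, 20000])
for depth, n_detected in curve.items():
    print(depth, n_detected)
```

With real data, the per-read gene assignments would come from an alignment and annotation step; a curve that flattens well below the available depth is the kind of saturation the abstract refers to.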
Motivation: Chimeric DNA sequences often form during polymerase chain reaction amplification, especially when sequencing single regions (e.g. 16S rRNA or fungal Internal Transcribed Spacer) to assess diversity or compare populations. Undetected chimeras may be misinterpreted as novel species, causing inflated estimates of diversity and spurious inferences of differences between populations. Detection and removal of chimeras is therefore of critical importance in such experiments.
Results: We describe UCHIME, a new program that detects chimeric sequences with two or more segments. UCHIME either uses a database of chimera-free sequences or detects chimeras de novo by exploiting abundance data. UCHIME has better sensitivity than ChimeraSlayer (previously the most sensitive database method), especially with short, noisy sequences. In testing on artificial bacterial communities with known composition, UCHIME de novo sensitivity is shown to be comparable to Perseus. UCHIME is >100× faster than Perseus and >1000× faster than ChimeraSlayer.
Contact: robert@drive5.com
Availability: Source, binaries and data: http://drive5.com/uchime.
Supplementary information: Supplementary data are available at Bioinformatics online.
To compare screening recall rates and cancer detection rates of tomosynthesis plus conventional digital mammography to those of conventional digital mammography alone.
All patients presenting for screening mammography between October 1, 2011, and September 30, 2012, at four clinical sites were reviewed in this HIPAA-compliant retrospective study, for which the institutional review board granted approval and waived the requirement for informed consent. Patients at sites with digital tomosynthesis were offered screening with digital mammography plus tomosynthesis. Patients at sites without tomosynthesis underwent conventional digital mammography. Recall rates were calculated and stratified according to breast density and patient age. Cancer detection rates were calculated and stratified according to the presence of a risk factor for breast cancer. The Fisher exact test was used to compare the two groups. Multivariate logistic regression was used to assess the effect of screening method, breast density, patient age, and cancer risk on the odds of recall from screening.
A total of 13 158 patients presented for screening mammography; 6100 received tomosynthesis. The overall recall rate was 8.4% for patients in the tomosynthesis group and 12.0% for those in the conventional mammography group (P < .01). The addition of tomosynthesis reduced recall rates for all breast density and patient age groups, with significant differences (P < .05) found for scattered fibroglandular, heterogeneously dense, and extremely dense breasts and for patients younger than 40 years, those aged 40-49 years, those aged 50-59 years, and those aged 60-69 years. These findings persisted when multivariate logistic regression was used to control for differences in age, breast density, and elevated risk of breast cancer. The cancer detection rate was 5.7 per 1000 in patients receiving tomosynthesis versus 5.2 per 1000 in patients receiving conventional mammography alone (P = .70).
Patients undergoing tomosynthesis plus digital mammography had significantly lower screening recall rates. The greatest reductions were for those younger than 50 years and those with dense breasts. A nonsignificant 9.5% increase in cancer detection was observed in the tomosynthesis group.
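As a quick check of the relative differences implied by the reported rates, the arithmetic can be sketched as follows. The helper `relative_change` is a hypothetical name introduced here, and the inputs are the rounded rates quoted above rather than the underlying patient counts, so the results are approximate.

```python
def relative_change(new, old):
    """Fractional change of `new` relative to `old`."""
    return (new - old) / old


# Recall rate: 8.4% with tomosynthesis vs 12.0% with conventional mammography.
recall_change = relative_change(8.4, 12.0)

# Cancer detection: 5.7 vs 5.2 per 1000 screened.
detection_change = relative_change(5.7, 5.2)

print(f"recall rate change: {recall_change:+.1%}")
print(f"cancer detection change: {detection_change:+.1%}")
```

From the rounded rates, recall falls by about 30% in relative terms and cancer detection rises by about 9.6%, close to the reported 9.5%; the small gap presumably reflects rounding of the published rates.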
A fundamental goal of genomics is to identify the complete set of expressed proteins. Automated annotation strategies rely on assumptions about protein-coding sequences (CDSs), e.g., they are conserved, do not overlap, and exceed a minimum length. However, an increasing number of newly discovered proteins violate these rules. Here we present an experimental and analytical framework, based on ribosome profiling and linear regression, for systematic identification and quantification of translation. Application of this approach to lipopolysaccharide-stimulated mouse dendritic cells and HCMV-infected human fibroblasts identifies thousands of novel CDSs, including micropeptides and variants of known proteins, that bear the hallmarks of canonical translation and exhibit translation levels and dynamics comparable to those of annotated CDSs. Remarkably, many translation events are identified in both mouse and human cells even when the peptide sequence is not conserved. Our work thus reveals an unexpected complexity to mammalian translation suited to provide both conserved regulatory or protein-based functions.
• ORF-RATER robustly identifies and quantifies translation from ribosome profiling data
• ORF-RATER reveals thousands of novel micropeptides and variants of mammalian proteins
• Hundreds of novel CDSs show evidence of protein-coding conservation among mammals
• Many ORFs are translated in both mice and humans but lack protein-coding conservation
Fields et al. describe a ribosome profiling-based approach for empirical annotation of protein-coding regions of the genome. Of the thousands of previously unknown translated ORFs they identify in mouse and human, many are conserved or dynamically regulated. Surprisingly, a considerable subset is translated in both species despite weak sequence conservation.
Both intrinsic cell state changes and variations in the composition of stem cell populations have been implicated as contributors to aging. We used single-cell RNA-seq to dissect variability in hematopoietic stem cell (HSC) and hematopoietic progenitor cell populations from young and old mice from two strains. We found that cell cycle dominates the variability within each population and that there is a lower frequency of cells in the G1 phase among old compared with young long-term HSCs, suggesting that they traverse through G1 faster. Moreover, transcriptional changes in HSCs during aging are inversely related to those upon HSC differentiation, such that old short-term HSCs (ST-HSCs) resemble young long-term HSCs (LT-HSCs), suggesting that they exist in a less differentiated state. Our results indicate both compositional changes and intrinsic, population-wide changes with age and are consistent with a model where a relationship between cell cycle progression and self-renewal versus differentiation of HSCs is affected by aging and may contribute to the functional decline of old HSCs.
De novo assembly of RNA-seq data enables researchers to study transcriptomes without the need for a genome sequence; this approach can be usefully applied, for instance, in research on 'non-model organisms' of ecological and evolutionary importance, cancer samples or the microbiome. In this protocol we describe the use of the Trinity platform for de novo transcriptome assembly from RNA-seq data in non-model organisms. We also present Trinity-supported companion utilities for downstream applications, including RSEM for transcript abundance estimation, R/Bioconductor packages for identifying differentially expressed transcripts across samples and approaches to identify protein-coding genes. In the procedure, we provide a workflow for genome-independent transcriptome analysis leveraging the Trinity platform. The software, documentation and demonstrations are freely available from http://trinityrnaseq.sourceforge.net. The run time of this protocol is highly dependent on the size and complexity of data to be analyzed. The example data set analyzed in the procedure detailed herein can be processed in less than 5 h.
The End of History Illusion (EoHI) is the tendency to report that a greater amount of change occurred in the past than is predicted to occur in the future. We investigated whether cultural differences exist in the magnitude of the EoHI for self-reported life satisfaction and personality traits. We found an effect of culture such that the difference between reported past and predicted future change was greater for U.S. Americans than Japanese, and that individual differences in two aspects of the self (self-esteem and self-concept clarity) mediated the link between culture and the magnitude of the EoHI. We also found a robust cultural difference in perceptions of past change; U.S. Americans tended to think about the past more negatively than their Japanese counterparts. These findings yield new insight into the link between cultural context and the way people remember the past and imagine the future.
Worldviews about humans' relationship with the natural world play an important role in psychological health. However, very little is currently known regarding the way worldviews about nature are linked with psychological health during a severe natural disaster and how this link may differ according to cultural context. In this study, we measured individual differences in worldviews about nature and psychological health during the 2020 COVID-19 pandemic within two different cultural contexts (Japan and United States). We found that across Japanese and American cultural contexts, holding a harmony-with-nature worldview was positively associated with improved psychological health during the COVID-19 pandemic. We also found that culture moderated the link between mastery-over-nature worldviews and negative affect. Americans showed a stronger link between mastery-over-nature worldviews and negative affect than Japanese. These findings support the biophilia hypothesis and contribute to theories differentiating Japanese and American cultural contexts based on naïve dialecticism and susceptibility to cognitive dissonance.