Accurate fusion transcript detection is essential for comprehensive characterization of cancer transcriptomes. Over the last decade, multiple bioinformatic tools have been developed to predict fusions from RNA-seq, based on either read mapping or de novo fusion transcript assembly.
We benchmark 23 different methods including applications we develop, STAR-Fusion and TrinityFusion, leveraging both simulated and real RNA-seq. Overall, STAR-Fusion, Arriba, and STAR-SEQR are the most accurate and fastest for fusion detection on cancer transcriptomes.
The lower accuracy of de novo assembly-based methods notwithstanding, they are useful for reconstructing fusion isoforms and tumor viruses, both of which are important in cancer research.
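As a rough illustration of how fusion callers can be compared against a truth set, the sketch below computes precision, recall and F1 for a list of predicted fusions. The abstract does not state which accuracy metrics the benchmark used, so this scoring scheme, the function name `score_fusion_calls` and the `GENEA--GENEB` naming convention are assumptions made for the example.

```python
def score_fusion_calls(predicted, truth):
    """Return (precision, recall, f1) for predicted fusions versus a truth set.

    Both arguments are iterables of fusion names such as "BCR--ABL1"
    (a hypothetical naming convention chosen for this sketch).
    """
    predicted, truth = set(predicted), set(truth)
    true_pos = len(predicted & truth)    # fusions called and present in the truth set
    false_pos = len(predicted - truth)   # fusions called but absent from the truth set
    false_neg = len(truth - predicted)   # true fusions that were missed

    precision = true_pos / (true_pos + false_pos) if predicted else 0.0
    recall = true_pos / (true_pos + false_neg) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    calls = ["BCR--ABL1", "EML4--ALK", "FOO--BAR"]
    known = ["BCR--ABL1", "EML4--ALK", "TMPRSS2--ERG", "KIF5B--RET"]
    print(score_fusion_calls(calls, known))
```

Exact string matching is the simplest possible comparison; a real benchmark would also have to decide how to treat reciprocal fusions and paralogous partner genes.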
Motivation: Chimeric DNA sequences often form during polymerase chain reaction amplification, especially when sequencing single regions (e.g. 16S rRNA or fungal Internal Transcribed Spacer) to assess diversity or compare populations. Undetected chimeras may be misinterpreted as novel species, causing inflated estimates of diversity and spurious inferences of differences between populations. Detection and removal of chimeras is therefore of critical importance in such experiments.
Results: We describe UCHIME, a new program that detects chimeric sequences with two or more segments. UCHIME either uses a database of chimera-free sequences or detects chimeras de novo by exploiting abundance data. UCHIME has better sensitivity than ChimeraSlayer (previously the most sensitive database method), especially with short, noisy sequences. In testing on artificial bacterial communities with known composition, UCHIME de novo sensitivity is shown to be comparable to Perseus. UCHIME is >100× faster than Perseus and >1000× faster than ChimeraSlayer.
Contact: robert@drive5.com
Availability: Source, binaries and data: http://drive5.com/uchime.
Supplementary information: Supplementary data are available at Bioinformatics online.
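The idea of database-based detection of a two-segment chimera can be sketched as follows. This is a deliberately simplified toy, not UCHIME's actual scoring: it assumes the query and all reference sequences are already aligned to the same length, counts plain character matches, and uses a hypothetical `min_gain` threshold. All function names are invented for the example.

```python
from itertools import permutations


def identity(a, b):
    """Number of matching positions between two equal-length strings."""
    return sum(x == y for x, y in zip(a, b))


def best_two_parent_model(query, references):
    """Find the parent pair and breakpoint that best explain the query.

    references maps a name to a sequence pre-aligned to the query (assumption).
    Returns (score, left_parent, right_parent, breakpoint) or None.
    """
    best = None
    for (name_a, seq_a), (name_b, seq_b) in permutations(references.items(), 2):
        for bp in range(1, len(query)):
            # Left segment is compared with parent A, right segment with parent B.
            score = identity(query[:bp], seq_a[:bp]) + identity(query[bp:], seq_b[bp:])
            if best is None or score > best[0]:
                best = (score, name_a, name_b, bp)
    return best


def looks_chimeric(query, references, min_gain=3):
    """Flag the query if a two-parent model beats the best single parent.

    min_gain is an arbitrary illustrative threshold, in matching positions.
    """
    best_single = max(identity(query, seq) for seq in references.values())
    model = best_two_parent_model(query, references)
    if model is None:
        return False, None
    return model[0] - best_single >= min_gain, model


if __name__ == "__main__":
    refs = {"A": "AAAAAAAAAA", "B": "CCCCCCCCCC"}
    print(looks_chimeric("AAAAACCCCC", refs))
    print(looks_chimeric("AAAAAAAAAA", refs))
```

The brute-force search over every parent pair and breakpoint is only practical for tiny inputs; it is meant to show the comparison between a chimeric model and the best single parent, nothing more.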
High-throughput sequencing of cDNA libraries (RNA-Seq) has proven to be a highly effective approach for studying bacterial transcriptomes. A central challenge in designing RNA-Seq-based experiments is estimating a priori the number of reads per sample needed to detect and quantify thousands of individual transcripts with a large dynamic range of abundance.
We have conducted a systematic examination of how changes in the number of RNA-Seq reads per sample influence both profiling of a single bacterial transcriptome and the comparison of gene expression among samples. Our findings suggest that the number of reads typically produced in a single lane of the Illumina HiSeq sequencer far exceeds the number needed to saturate the annotated transcriptomes of diverse bacteria growing in monoculture. Moreover, as sequencing depth increases, so too does the detection of cDNAs that likely correspond to spurious transcripts or genomic DNA contamination. Finally, even when dozens of barcoded individual cDNA libraries are sequenced in a single lane, the vast majority of transcripts in each sample can be detected and numerous genes differentially expressed between samples can be identified.
Our analysis provides a guide for the many researchers seeking to determine the appropriate sequencing depth for RNA-Seq-based studies of diverse bacterial species.
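One way to explore the depth question on existing data is to subsample mapped reads and count how many genes remain detectable at each depth. The sketch below is a minimal illustration under stated assumptions: each read has already been assigned to a single gene, a gene counts as detected once it has at least `min_reads` reads, and the function name `saturation_curve` is invented here. It is not the procedure used in the study.

```python
import random
from collections import Counter


def saturation_curve(read_assignments, depths, min_reads=1, seed=0):
    """Return [(depth, genes_detected), ...] for nested random subsamples.

    read_assignments: one gene identifier per mapped read (assumption).
    The reads are shuffled once and each depth uses a prefix of that shuffle,
    so the number of detected genes never decreases as depth grows.
    """
    reads = list(read_assignments)
    random.Random(seed).shuffle(reads)
    curve = []
    for depth in sorted(depths):
        if depth > len(reads):
            raise ValueError("requested depth exceeds the number of reads")
        counts = Counter(reads[:depth])
        detected = sum(1 for n in counts.values() if n >= min_reads)
        curve.append((depth, detected))
    return curve


if __name__ == "__main__":
    # Toy library with a large dynamic range of abundance.
    reads = ["geneA"] * 90 + ["geneB"] * 9 + ["geneC"] * 1
    for depth, detected in saturation_curve(reads, [10, 50, 100]):
        print(depth, detected)
```

A curve that flattens well before the full depth suggests that additional reads mostly add coverage to transcripts that are already detected.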
Both intrinsic cell state changes and variations in the composition of stem cell populations have been implicated as contributors to aging. We used single-cell RNA-seq to dissect variability in hematopoietic stem cell (HSC) and hematopoietic progenitor cell populations from young and old mice from two strains. We found that cell cycle dominates the variability within each population and that there is a lower frequency of cells in the G1 phase among old compared with young long-term (LT) HSCs, suggesting that they traverse through G1 faster. Moreover, transcriptional changes in HSCs during aging are inversely related to those upon HSC differentiation, such that old short-term (ST) HSCs resemble young LT-HSCs, suggesting that they exist in a less differentiated state. Our results indicate both compositional changes and intrinsic, population-wide changes with age and are consistent with a model where a relationship between cell cycle progression and self-renewal versus differentiation of HSCs is affected by aging and may contribute to the functional decline of old HSCs.
A fundamental goal of genomics is to identify the complete set of expressed proteins. Automated annotation strategies rely on assumptions about protein-coding sequences (CDSs), e.g., they are conserved, do not overlap, and exceed a minimum length. However, an increasing number of newly discovered proteins violate these rules. Here we present an experimental and analytical framework, based on ribosome profiling and linear regression, for systematic identification and quantification of translation. Application of this approach to lipopolysaccharide-stimulated mouse dendritic cells and HCMV-infected human fibroblasts identifies thousands of novel CDSs, including micropeptides and variants of known proteins, that bear the hallmarks of canonical translation and exhibit translation levels and dynamics comparable to those of annotated CDSs. Remarkably, many translation events are identified in both mouse and human cells even when the peptide sequence is not conserved. Our work thus reveals an unexpected complexity to mammalian translation suited to provide both conserved regulatory or protein-based functions.
•ORF-RATER robustly identifies and quantifies translation from ribosome profiling data
•ORF-RATER reveals thousands of novel micropeptides and variants of mammalian proteins
•Hundreds of novel CDSs show evidence of protein-coding conservation among mammals
•Many ORFs are translated in both mice and humans but lack protein-coding conservation
Fields et al. describe a ribosome profiling-based approach for empirical annotation of protein-coding regions of the genome. Of the thousands of previously unknown translated ORFs they identify in mouse and human, many are conserved or dynamically regulated. Surprisingly, a considerable subset is translated in both species despite weak sequence conservation.
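The role of linear regression in quantifying translation from overlapping ORFs can be illustrated with a small sketch: the observed ribosome footprint density over a region is modelled as a weighted sum of the expected profile of each candidate ORF, and the fitted weights serve as translation estimates. This is an assumption-laden toy, not the ORF-RATER implementation. The profiles, the function name `fit_orf_weights` and the use of ordinary (unconstrained) least squares are all choices made for the example.

```python
def fit_orf_weights(profiles, observed):
    """Ordinary least-squares fit of observed density to expected ORF profiles.

    profiles: list of k expected per-position profiles, one per candidate ORF.
    observed: per-position footprint density over the same positions.
    Solves the normal equations (P P^T) w = P y by Gauss-Jordan elimination.
    """
    k = len(profiles)
    gram = [[sum(a * b for a, b in zip(profiles[i], profiles[j])) for j in range(k)]
            for i in range(k)]
    rhs = [sum(a * b for a, b in zip(profiles[i], observed)) for i in range(k)]
    aug = [row + [r] for row, r in zip(gram, rhs)]

    for col in range(k):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("expected profiles are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


if __name__ == "__main__":
    # Two overlapping ORFs in different reading frames (made-up periodic profiles).
    orf1 = [0.8, 0.1, 0.1, 0.8, 0.1, 0.1]
    orf2 = [0.1, 0.8, 0.1, 0.1, 0.8, 0.1]
    density = [3.0 * a + 0.5 * b for a, b in zip(orf1, orf2)]
    print(fit_orf_weights([orf1, orf2], density))
```

Because the two profiles peak in different frames, the fit can separate their contributions even though the ORFs cover the same positions.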
Mammals have extremely limited regenerative capabilities; however, axolotls are profoundly regenerative and can replace entire limbs. The mechanisms underlying limb regeneration remain poorly understood, partly because the enormous and incompletely sequenced genomes of axolotls have hindered the study of genes facilitating regeneration. We assembled and annotated a de novo transcriptome using RNA-sequencing profiles for a broad spectrum of tissues that is estimated to have near-complete sequence information for 88% of axolotl genes. We devised expression analyses that identified the axolotl orthologs of cirbp and kazald1 as highly expressed and enriched in blastemas. Using morpholino anti-sense oligonucleotides, we find evidence that cirbp plays a cytoprotective role during limb regeneration whereas manipulation of kazald1 expression disrupts regeneration. Our transcriptome and annotation resources greatly complement previous transcriptomic studies and will be a valuable resource for future research in regenerative biology.
•Creation of a transcriptome with near-complete sequence data for 88% of axolotl genes
•Expression analyses identify tissue-enriched transcripts for key tissues
•The RNA-binding protein cirbp plays a cytoprotective role in limb regeneration
•Knockdown and overexpression of kazald1 in blastema cells impair limb regeneration
Discovery of genes driving axolotl limb regeneration has been challenging, due to limited genomic resources. Bryant et al. have created a transcriptome with near-complete sequence information for most axolotl genes, identified transcriptional profiles that distinguish blastemas from differentiated limb tissues, and uncovered functional roles for cirbp and kazald1 in limb regeneration.
De novo assembly of RNA-seq data enables researchers to study transcriptomes without the need for a genome sequence; this approach can be usefully applied, for instance, in research on 'non-model organisms' of ecological and evolutionary importance, cancer samples or the microbiome. In this protocol we describe the use of the Trinity platform for de novo transcriptome assembly from RNA-seq data in non-model organisms. We also present Trinity-supported companion utilities for downstream applications, including RSEM for transcript abundance estimation, R/Bioconductor packages for identifying differentially expressed transcripts across samples and approaches to identify protein-coding genes. In the procedure, we provide a workflow for genome-independent transcriptome analysis leveraging the Trinity platform. The software, documentation and demonstrations are freely available from http://trinityrnaseq.sourceforge.net. The run time of this protocol is highly dependent on the size and complexity of data to be analyzed. The example data set analyzed in the procedure detailed herein can be processed in less than 5 h.
Bacterial diversity among environmental samples is commonly assessed with PCR-amplified 16S rRNA gene (16S) sequences. Perceived diversity, however, can be influenced by sample preparation, primer selection, and formation of chimeric 16S amplification products. Chimeras are hybrid products between multiple parent sequences that can be falsely interpreted as novel organisms, thus inflating apparent diversity. We developed a new chimera detection tool called Chimera Slayer (CS). CS detects chimeras with greater sensitivity than previous methods, performs well on short sequences such as those produced by the 454 Life Sciences (Roche) Genome Sequencer, and can scale to large data sets. By benchmarking CS performance against sequences derived from a controlled DNA mixture of known organisms and a simulated chimera set, we provide insights into the factors that affect chimera formation such as sequence abundance, the extent of similarity between 16S genes, and PCR conditions. Chimeras were found to reproducibly form among independent amplifications and contributed to false perceptions of sample diversity and the false identification of novel taxa, with less-abundant species exhibiting chimera rates exceeding 70%. Shotgun metagenomic sequences of our mock community appear to be devoid of 16S chimeras, supporting a role for shotgun metagenomics in validating novel organisms discovered in targeted sequence surveys.
This trial compared surgery with physical therapy (followed by surgery as needed) in patients with a meniscal tear and mild-to-moderate knee osteoarthritis. Functional outcomes and pain were similar in the two groups at 6 months; 30% of the patients assigned to physical therapy crossed over to surgery.
Symptomatic, radiographically confirmed osteoarthritis of the knee affects more than 9 million people in the United States.¹ Meniscal tears are also highly prevalent, with imaging evidence of a meniscal tear observed in 35% of persons older than 50 years of age; two thirds of these tears are asymptomatic.² Meniscal damage is especially prevalent among persons with osteoarthritis³,⁴ and is frequently treated surgically with arthroscopic partial meniscectomy. This procedure, in which the surgeon trims the torn meniscus back to a stable rim, is performed for a range of indications in more than 465,000 persons annually in the United States.⁵ The . . .
Humans and other mammals are limited in their natural abilities to regenerate lost body parts. By contrast, many salamanders are highly regenerative and can spontaneously replace lost limbs even as adults. Because salamander limbs are anatomically similar to human limbs, knowing how they regenerate should provide important clues for regenerative medicine. Although interest in understanding the mechanics of this process has never wavered, until recently researchers have been vexed by seemingly impenetrable logistics of working with these creatures at a molecular level. Chief among the problems has been the very large size of salamander genomes, and not a single salamander genome has been fully sequenced to date. Recently the enormous gap in sequence information has been bridged by approaches that leverage mRNA as the starting point. Together with functional experimentation, these data are rapidly enabling researchers to finally uncover the molecular mechanisms underpinning the astonishing biological process of limb regeneration.
The salamander experimental toolset has now largely caught up with the interest in understanding limb regeneration, finally allowing precise experimentation at a cellular and molecular level.
A huge amount of transcript data has emerged from which to gather clues about how limb regeneration occurs.
Differential gene expression analysis has enabled the identification of transcripts that are highly enriched, as well as highly repressed, in key tissues required for limb regeneration. These are prime starting points for hypothesis generation and functional experimentation.
Several genes whose involvement would not have been predicted by candidate gene approaches have now been implicated in limb regeneration, underscoring the need to take unbiased approaches to gene discovery.
De novo transcriptomes and reference-tissue sequence data are important new resources for the field.