Next-generation sequencing of cellular RNA (RNA-seq) is rapidly becoming the cornerstone of transcriptomic analysis. However, sequencing errors in the already short RNA-seq reads complicate bioinformatics analyses, in particular alignment and assembly. Error correction methods have been highly effective for whole-genome sequencing (WGS) reads, but are unsuitable for RNA-seq reads, owing to the variation in gene expression levels and alternative splicing.
We developed a k-mer based method, Rcorrector, to correct random sequencing errors in Illumina RNA-seq reads. Rcorrector uses a De Bruijn graph to compactly represent all trusted k-mers in the input reads. Unlike WGS read correctors, which use a global threshold to determine trusted k-mers, Rcorrector computes a local threshold at every position in a read.
Rcorrector has an accuracy higher than or comparable to existing methods, including the only other method (SEECER) designed for RNA-seq reads, and is more time and memory efficient. With a 5 GB memory footprint for 100 million reads, it can be run on virtually any desktop or server. The software is available free of charge under the GNU General Public License from https://github.com/mourisl/Rcorrector/.
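The contrast between a global and a local k-mer threshold can be illustrated with a minimal Python sketch. It is not Rcorrector's implementation: it uses a plain counter instead of a De Bruijn graph, and the local rule shown here (a k-mer is weak if its count falls below a fraction `frac` of the best-supported k-mer sharing its first k-1 bases) is a hypothetical simplification chosen only to show why one fixed cutoff misbehaves when expression levels differ. The function names, `frac`, and the toy reads are all assumptions.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurrence across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def weak_positions_global(read, counts, k, min_count):
    """Start positions of k-mers below one fixed, read-independent cutoff."""
    return [i for i in range(len(read) - k + 1)
            if counts[read[i:i + k]] < min_count]


def weak_positions_local(read, counts, k, frac=0.25):
    """Start positions of k-mers that are weak relative to their neighbours.

    Hypothetical local rule: compare each k-mer with the four k-mers that
    share its first k-1 bases and flag it if its count is below
    frac * (highest count among them).
    """
    weak = []
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        best = max(counts[kmer[:-1] + base] for base in "ACGT")
        if counts[kmer] < frac * best:
            weak.append(i)
    return weak


# Toy data: a highly expressed transcript (20 reads), one copy of it with a
# substitution at index 7, and a lowly expressed transcript (2 reads).
high = "ACGTACGGTCA"
high_with_error = "ACGTACGCTCA"
low = "TTGACCATGAA"
reads = [high] * 20 + [high_with_error] + [low] * 2
counts = count_kmers(reads, k=4)

print(weak_positions_global(low, counts, k=4, min_count=3))   # [0, 1, 2, 3, 4, 5, 6, 7]
print(weak_positions_local(low, counts, k=4))                 # []
print(weak_positions_local(high_with_error, counts, k=4))     # [4]
```

In this toy example a global cutoff of 3 rejects every k-mer of the lowly expressed transcript, while the local rule leaves it untouched and flags only the k-mer where the erroneous base enters the high-coverage read.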
Alternative splicing is widely recognized for its roles in regulating genes and creating gene diversity. However, despite many efforts, the repertoire of gene splicing variation is still incompletely characterized, even in humans. Here we describe a new computational system, ASprofile, and its application to RNA-seq data from Illumina’s Human Body Map project (>2.5 billion reads). Using the system, we identified putative alternative splicing events in 16 different human tissues, which provide a dynamic picture of splicing variation across the tissues. We detected 26,989 potential exon skipping events representing differences in splicing patterns among the tissues. A large proportion of the events (>60%) were novel, involving new exons (~3000), new introns (~16000), or both. When tracing these events across the sixteen tissues, only a small number (4-7%) appeared to be differentially expressed (‘switched’) between two tissues, while 30-45% showed little variation, and the remaining 50-65% were not present in one or both tissues compared. Novel exon skipping events appeared to be slightly less variable than known events, but were more tissue-specific. Our study represents the first effort to build a comprehensive catalog of alternative splicing in normal human tissues from RNA-seq data, while providing insights into the role of alternative splicing in shaping tissue transcriptome differences. The catalog of events and the ASprofile software are freely available from the Zenodo repository (http://zenodo.org/record/7068; doi: 10.5281/zenodo.7068) and from our web site http://ccb.jhu.edu/software/ASprofile.
Transcript assembly from RNA-seq reads is a critical step in gene expression and subsequent functional analyses. Here we present PsiCLASS, an accurate and efficient transcript assembler based on an approach that simultaneously analyzes multiple RNA-seq samples. PsiCLASS combines mixture statistical models for exonic feature selection across multiple samples with splice graph based dynamic programming algorithms and a weighted voting scheme for transcript selection. PsiCLASS achieves a significantly better sensitivity-precision tradeoff, and renders precision up to 2-3 fold higher than the StringTie system and Scallop plus TACO, the two best current approaches. PsiCLASS is efficient and scalable, assembling 667 GEUVADIS samples in 9 h, and has robust accuracy with large numbers of samples.
Next generation sequencing of cellular RNA is making it possible to characterize genes and alternative splicing in unprecedented detail. However, designing bioinformatics tools to accurately capture splicing variation has proven difficult. Current programs can find major isoforms of a gene but miss lower abundance variants, or are sensitive but imprecise. CLASS2 is a novel open source tool for accurate genome-guided transcriptome assembly from RNA-seq reads based on the model of splice graph. An extension of our program CLASS, CLASS2 jointly optimizes read patterns and the number of supporting reads to score and prioritize transcripts, implemented in a novel, scalable and efficient dynamic programming algorithm. When compared against reference programs, CLASS2 had the best overall accuracy and could detect up to twice as many splicing events with precision similar to the best reference program. Notably, it was the only tool to produce consistently reliable transcript models for a wide range of applications and sequencing strategies, including ribosomal RNA-depleted samples. Lightweight and multi-threaded, CLASS2 requires <3GB RAM and can analyze a 350 million read set within hours, and can be widely applied to transcriptomics studies ranging from clinical RNA sequencing, to alternative splicing analyses, and to the annotation of new genomes.
Tools for differential splicing detection have failed to provide a comprehensive and consistent view of splicing variation. We present MntJULiP, a novel method for comprehensive and accurate quantification of splicing differences between two or more conditions. MntJULiP detects both changes in intron splicing ratios and changes in absolute splicing levels with high accuracy, and can find classes of variation overlooked by other tools. MntJULiP identifies over 29,000 differentially spliced introns in 1398 GTEx brain samples, including 11,242 novel introns discovered in this dataset. Highly scalable, MntJULiP can process thousands of samples within hours to reveal splicing constituents of phenotypic differentiation.
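The distinction between a change in splicing ratio and a change in absolute splicing level can be made concrete with a small Python sketch. It shows only the two quantities being compared, not MntJULiP's statistical tests, which the abstract does not describe. Several things here are assumptions made for illustration: introns are grouped by a shared splice site, the absolute level of an intron is taken to be its raw supporting read count, and the function names and counts are invented.

```python
def splicing_ratios(group_counts):
    """Fraction of a group's reads supporting each intron.

    `group_counts` maps intron name -> read count for introns assumed to
    share a splice site (hypothetical grouping).
    """
    total = sum(group_counts.values())
    return {intron: (n / total if total else 0.0)
            for intron, n in group_counts.items()}


def compare_conditions(cond_a, cond_b):
    """Per-intron change in ratio and in raw count from cond_a to cond_b."""
    ratios_a = splicing_ratios(cond_a)
    ratios_b = splicing_ratios(cond_b)
    return {
        intron: {
            "delta_ratio": ratios_b.get(intron, 0.0) - ratios_a[intron],
            "delta_level": cond_b.get(intron, 0) - cond_a[intron],
        }
        for intron in cond_a
    }


baseline = {"intron_1": 30, "intron_2": 10}
scaled = {"intron_1": 60, "intron_2": 20}    # same ratios, doubled levels
shifted = {"intron_1": 20, "intron_2": 20}   # same total, different ratios

print(compare_conditions(baseline, scaled))
print(compare_conditions(baseline, shifted))
```

In the `scaled` condition the ratios are unchanged while every count doubles, so a ratio-only comparison would see nothing; in the `shifted` condition the group total is unchanged while the ratios move.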
Objective
This open‐label 12‐week study was conducted to evaluate the efficacy and safety of tofacitinib, a JAK inhibitor, in treatment‐refractory active dermatomyositis (DM).
Methods
Tofacitinib in extended‐release doses of 11 mg was administered daily to 10 subjects with DM. Prior to treatment, a complete washout of all steroid‐sparing agents was performed. The primary outcome measure was assessment of disease activity improvement based on the International Myositis Assessment and Clinical Studies group definition of improvement. Response rate was measured as the total improvement score according to the 2016 American College of Rheumatology (ACR)/European League Against Rheumatism (EULAR) myositis response criteria. Secondary outcome measures included Cutaneous Dermatomyositis Disease Area and Severity Index (CDASI) scores, chemokine levels, immunohistochemical analysis of STAT1 expression in the skin, RNA sequencing analysis, and safety.
Results
At 12 weeks, the primary outcome was met in all 10 subjects. Five (50%) of 10 subjects experienced moderate improvement in disease activity, and the other 50% experienced minimal improvement according to the 2016 ACR/EULAR myositis response criteria. The secondary outcome of the mean change in the CDASI activity score over 12 weeks was statistically significant (mean ± SD 28 ± 15.4 at baseline versus 9.5 ± 8.5 at 12 weeks) (P = 0.0005). Serum chemokine levels of CXCL9/CXCL10 showed a statistically significant change from baseline. A marked decrease in STAT1 signaling in association with suppression of interferon target gene expression was demonstrated in 3 of 9 skin biopsy samples from subjects with dermatomyositis. The mean ± SD level of creatine kinase in the 10 subjects at baseline was 82 ± 34.8 IU/liter, highlighting that disease activity was predominantly located in the skin.
Conclusion
This is the first prospective, open‐label clinical trial of tofacitinib in DM that demonstrates strong clinical efficacy of a pan‐JAK inhibitor, as measured by validated myositis response criteria. Future randomized controlled trials using JAK inhibitors should be considered for treating DM.
Lighter is a fast, memory-efficient tool for correcting sequencing errors. Lighter avoids counting k-mers. Instead, it uses a pair of Bloom filters, one holding a sample of the input k-mers and the other holding k-mers likely to be correct. As long as the sampling fraction is adjusted in inverse proportion to the depth of sequencing, Bloom filter size can be held constant while maintaining near-constant accuracy. Lighter is parallelized, uses no secondary storage, and is both faster and more memory-efficient than competing approaches while achieving comparable accuracy.
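The two-filter idea can be sketched in Python under stated assumptions. This is not Lighter's code: the Bloom filter is a minimal one built on `hashlib.sha256`, and the rule for promoting k-mers to the second filter (every base of the k-mer must be covered by at least `min_hits` sampled k-mers) is a hypothetical stand-in for whatever test the tool actually applies. The function names, filter sizes, and `min_hits` are likewise invented for illustration.

```python
import hashlib
import random


class BloomFilter:
    """Minimal Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _indexes(self, item):
        # Derive num_hashes bit positions by prefixing a seed to the item.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "little") % self.num_bits

    def add(self, item):
        for idx in self._indexes(item):
            self.bits[idx // 8] |= 1 << (idx % 8)

    def __contains__(self, item):
        return all(self.bits[idx // 8] & (1 << (idx % 8))
                   for idx in self._indexes(item))


def sampling_fraction(base_fraction, base_depth, depth):
    """Scale the sampling fraction in inverse proportion to sequencing depth."""
    return min(1.0, base_fraction * base_depth / depth)


def kmers(read, k):
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_filters(reads, k, alpha, min_hits, rng):
    """Return (sampled, trusted) Bloom filters without counting any k-mer."""
    sampled, trusted = BloomFilter(), BloomFilter()
    # Pass 1: keep each k-mer occurrence with probability alpha, so frequent
    # (likely correct) k-mers are far more likely to be stored than rare ones.
    for read in reads:
        for kmer in kmers(read, k):
            if rng.random() < alpha:
                sampled.add(kmer)
    # Pass 2 (hypothetical rule): a base is trusted if at least min_hits of
    # the k-mers covering it are in `sampled`; a k-mer is stored in `trusted`
    # when all of its bases are trusted.
    for read in reads:
        hits = [0] * len(read)
        read_kmers = kmers(read, k)
        for i, kmer in enumerate(read_kmers):
            if kmer in sampled:
                for j in range(i, i + k):
                    hits[j] += 1
        for i, kmer in enumerate(read_kmers):
            if all(h >= min_hits for h in hits[i:i + k]):
                trusted.add(kmer)
    return sampled, trusted


reads = ["ACGTACGGTCA"] * 20 + ["ACGTACGCTCA"]
alpha = sampling_fraction(base_fraction=0.1, base_depth=35, depth=70)
sampled, trusted = build_filters(reads, k=4, alpha=alpha, min_hits=2,
                                 rng=random.Random(0))
```

Because doubling the depth halves `alpha`, the expected number of distinct k-mers entering the first filter stays roughly level, which is the reason the abstract gives for being able to hold the filter size constant.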
Short telomere syndromes manifest as familial idiopathic pulmonary fibrosis; they are the most common premature aging disorders. We used genome-wide linkage to identify heterozygous loss of function of ZCCHC8, a zinc-knuckle containing protein, as a cause of autosomal dominant pulmonary fibrosis. ZCCHC8 associated with TR and was required for telomerase function. In ZCCHC8 knockout cells and in mutation carriers, genomically extended telomerase RNA (TR) accumulated at the expense of mature TR, consistent with a role for ZCCHC8 in mediating TR 3' end targeting to the nuclear RNA exosome. We generated Zcchc8-null mice and found that heterozygotes, similar to human mutation carriers, had TR insufficiency but an otherwise preserved transcriptome. In contrast, Zcchc8−/− mice developed progressive and fatal neurodevelopmental pathology with features of a ciliopathy. The Zcchc8−/− brain transcriptome was highly dysregulated, showing accumulation and 3' end misprocessing of other low-abundance RNAs, including those encoding cilia components as well as the intronless replication-dependent histones. Our data identify a novel cause of human short telomere syndromes-familial pulmonary fibrosis and uncover nuclear exosome targeting as an essential 3' end maturation mechanism that vertebrate TR shares with replication-dependent histones.
To determine if urinary microbial communities similar to those described in adults exist in children and to profile the urinary and gastrointestinal microbiome in children presenting to urology for both routine and complex urologic procedures.
Prepubertal boys (n = 20, ages 3 months-8 years; median age 15 months) who required elective urologic procedures were eligible. Urine samples were collected via sterile catheterization and fecal samples were obtained by rectal swabs. DNA was extracted from urine pellet and fecal samples and subjected to bacterial profiling via 16S rDNA Illumina sequencing and 16S rDNA quantitative polymerase chain reaction. We assessed within and between sample diversity and differential species abundance between samples.
Urine samples had low bacterial biomass that reflected the presence of bacterial populations. The most abundant genera detected in urine samples are not common to skin microbiota and several of the genera have been previously identified in the urinary microbiome of adults. We report presumably atypical compositional differences in both the urinary and gastrointestinal microbiome in children with prior antibiotic exposure and highlight an important case of a child who had undergone lifelong antibiotic treatment as prophylaxis for congenital abnormalities.
This study provides one of the first characterizations of the urinary microbiome in prepubertal males. Defining the baseline healthy microbiome in children may lay the foundation for understanding the long-term impact of factors such as antibiotic use in the development of a healthy microbiome as well as the development of future urologic and gastrointestinal diseases.
The breast cancer transcriptome acquires a myriad of regulation changes, and splicing is critical for the cell to "tailor-make" specific functional transcripts. We systematically revealed splicing signatures of the three most common types of breast tumors using RNA sequencing: TNBC, non-TNBC and HER2-positive breast cancer. We discovered subtype-specific differentially spliced genes and splice isoforms not previously recognized in the human transcriptome. Further, we showed that exon skipping and intron retention are the predominant splice events in breast cancer. In addition, we found that differential expression of primary transcripts and promoter switching are significantly deregulated in breast cancer compared to normal breast. We validated the presence of novel hybrid isoforms of critical molecules like CDK4, LARP1, ADD3, and PHLPP2. Our study provides the first comprehensive portrait of transcriptional and splicing signatures specific to breast cancer subtypes, as well as previously unknown transcripts that prompt the need for complete annotation of tissue- and disease-specific transcriptomes.