The use of massively parallel sequencing of maternal cfDNA for non-invasive prenatal testing (NIPT) of aneuploidy is widely available. Recently, the scope of testing has increased to include selected subchromosomal abnormalities, but the number of samples reported has been small. We developed a calling pipeline based on a segmentation algorithm for the detection of these rearrangements in maternal plasma. The same read depth used in our standard pipeline for aneuploidy NIPT detected 15/18 (83%) samples with pathogenic rearrangements > 6 Mb but only 2/10 samples with rearrangements < 6 Mb, unless they were maternally inherited. There were two false-positive calls in 534 samples with no known subchromosomal abnormalities (specificity 99.6%). Using higher read depths, we detected 29/31 fetal subchromosomal abnormalities, including the three samples with maternally inherited microduplications. We conclude that test sensitivity is a function of the fetal fraction, read depth, and size of the fetal CNV and that at least one of the two false negatives is due to a low fetal fraction. The lack of an independent method for determining fetal fraction, especially for female fetuses, leads to uncertainty in test sensitivity, which currently has implications for this technique’s future as a clinical diagnostic test. Furthermore, to be effective, NIPT must be able to detect chromosomal rearrangements across the whole genome with a very low false-positive rate. Because standard NIPT can only detect the majority of larger (>6 Mb) chromosomal rearrangements and requires knowledge of fetal fraction, we consider that it is not yet ready for routine clinical implementation.
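The dependence of sensitivity on fetal fraction, read depth, and CNV size can be illustrated with an idealized counting model. The sketch below is not the segmentation pipeline described in the abstract. It assumes uniformly distributed reads, pure Poisson counting noise, a one-copy fetal gain or loss, and an approximate genome size of 3100 Mb. The function names and the target z-score of 3 are illustrative assumptions.

```python
import math

GENOME_MB = 3100.0  # assumption: approximate human genome size in Mb


def expected_z(fetal_fraction, total_reads, cnv_size_mb, genome_mb=GENOME_MB):
    """Expected z-score for a one-copy fetal CNV under Poisson counting noise.

    A one-copy fetal change shifts the read count in the region by a
    factor of fetal_fraction / 2, and the Poisson standard deviation of
    the count is the square root of the expected count.
    """
    reads_in_region = total_reads * cnv_size_mb / genome_mb
    return (fetal_fraction / 2.0) * math.sqrt(reads_in_region)


def reads_needed(fetal_fraction, cnv_size_mb, target_z=3.0, genome_mb=GENOME_MB):
    """Total reads needed for the expected z-score to reach target_z."""
    reads_in_region = (2.0 * target_z / fetal_fraction) ** 2
    return math.ceil(reads_in_region * genome_mb / cnv_size_mb)


if __name__ == "__main__":
    # Hypothetical scenarios: the same read depth at different CNV sizes.
    for size_mb in (3, 6, 20):
        z = expected_z(fetal_fraction=0.10, total_reads=10_000_000, cnv_size_mb=size_mb)
        print(f"{size_mb} Mb CNV at 10% fetal fraction: expected z = {z:.2f}")
```

Under this model the expected signal scales linearly with fetal fraction but only with the square root of read depth and CNV size, so halving the fetal fraction requires roughly four times as many reads. Real data carry GC bias and overdispersion, so these figures are optimistic lower bounds on the required depth.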
Concerted examination of multiple collections of single-cell RNA sequencing (RNA-seq) data promises further biological insights that cannot be uncovered with individual datasets. Here we present scMerge, an algorithm that integrates multiple single-cell RNA-seq datasets using factor analysis of stably expressed genes and pseudoreplicates across datasets. Using a large collection of public datasets, we benchmark scMerge against published methods and demonstrate that it consistently provides improved cell type separation by removing unwanted factors; scMerge can also enhance biological discovery through robust data integration, which we show through the inference of development trajectory in a liver dataset collection.
Abstract
High-throughput single-cell RNA-seq (scRNA-seq) is a powerful tool for studying gene expression in single cells. Most current scRNA-seq bioinformatics tools focus on analysing overall expression levels, largely ignoring alternative mRNA isoform expression. We present a computational pipeline, Sierra, that readily detects differential transcript usage from data generated by commonly used polyA-captured scRNA-seq technology. We validate Sierra by comparing cardiac scRNA-seq cell types to bulk RNA-seq of matched populations, finding significant overlap in differential transcripts. Sierra detects differential transcript usage across human peripheral blood mononuclear cells and the Tabula Muris, and 3′ UTR shortening in cardiac fibroblasts. Sierra is available at https://github.com/VCCRI/Sierra.
Differences in cell-type composition across subjects and conditions often carry biological significance. Recent advancements in single-cell sequencing technologies enable cell types to be identified at the single-cell level, and as a result, cell-type composition of tissues can now be studied in exquisite detail. However, a number of challenges remain with cell-type composition analysis: none of the existing methods can identify cell types perfectly, and variability related to cell sampling exists in any single-cell experiment. This necessitates the development of methods for estimating uncertainty in cell-type composition.
We developed a novel single cell differential composition (scDC) analysis method that performs differential cell-type composition analysis via bootstrap resampling. scDC captures the uncertainty associated with cell-type proportions of each subject via bias-corrected and accelerated bootstrap confidence intervals. We assessed the performance of our method using a number of simulated datasets and synthetic datasets curated from publicly available single cell datasets. In simulated datasets, scDC correctly recovered the true cell-type proportions. In synthetic datasets, the cell-type compositions returned by scDC were highly concordant with reference cell-type compositions from the original data. Since the majority of datasets tested in this study have only 2 to 5 subjects per condition, the addition of confidence intervals enabled better comparisons of compositional differences between subjects and across conditions.
scDC is a novel statistical method for performing differential cell-type composition analysis for scRNA-seq data. It uses bootstrap resampling to estimate the standard errors associated with cell-type proportion estimates and performs significance testing through generalized linear models (GLM) and generalized linear mixed models (GLMM). We have made this method available to the scientific community as part of the scdney (Single Cell Data Integrative Analysis) R package, available from https://github.com/SydneyBioX/scdney.
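The bootstrap idea behind per-subject uncertainty in cell-type proportions can be sketched as follows. This is a minimal illustration and not the scDC implementation: it uses a plain percentile interval rather than the bias-corrected and accelerated (BCa) interval named in the abstract, it omits any re-clustering step, and the function name, parameters, and cell-type labels are hypothetical.

```python
import random
from collections import Counter


def bootstrap_proportion_ci(labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap intervals for cell-type proportions in one subject.

    labels: list with one cell-type label per cell.
    Returns {cell_type: (observed_proportion, lower, upper)}.
    """
    rng = random.Random(seed)
    n = len(labels)
    cell_types = sorted(set(labels))
    draws = {ct: [] for ct in cell_types}

    for _ in range(n_boot):
        # Resample cells with replacement and recompute every proportion.
        counts = Counter(rng.choices(labels, k=n))
        for ct in cell_types:
            draws[ct].append(counts[ct] / n)

    # Indices of the lower and upper percentiles, clamped to a valid range.
    lo_idx = max(0, int((alpha / 2) * n_boot))
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot) - 1)

    result = {}
    for ct in cell_types:
        ordered = sorted(draws[ct])
        observed = labels.count(ct) / n
        result[ct] = (observed, ordered[lo_idx], ordered[hi_idx])
    return result


if __name__ == "__main__":
    # Hypothetical subject with 100 annotated cells.
    cells = ["T cell"] * 60 + ["B cell"] * 30 + ["Monocyte"] * 10
    for cell_type, (obs, lo, hi) in bootstrap_proportion_ci(cells).items():
        print(f"{cell_type}: {obs:.2f} (95% interval {lo:.2f} to {hi:.2f})")
```

With only a few subjects per condition, intervals of this kind make it easier to judge whether an apparent difference in composition exceeds the sampling variability within each subject.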
Transcriptional analysis of brain tissue from people with molecularly defined causes of obesity may highlight disease mechanisms and therapeutic targets. We performed RNA sequencing of hypothalamus from individuals with Prader-Willi syndrome (PWS), a genetic obesity syndrome characterized by severe hyperphagia. We found that upregulated genes overlap with the transcriptome of mouse Agrp neurons that signal hunger, while downregulated genes overlap with the expression profile of Pomc neurons activated by feeding. Downregulated genes are expressed mainly in neuronal cells and contribute to neurogenesis, neurotransmitter release, and synaptic plasticity, while upregulated, predominantly microglial genes are involved in inflammatory responses. This transcriptional signature may be mediated by reduced brain-derived neurotrophic factor expression. Additionally, we implicate disruption of alternative splicing as a potential molecular mechanism underlying neuronal dysfunction in PWS. Transcriptomic analysis of the human hypothalamus may identify neural mechanisms involved in energy homeostasis and potential therapeutic targets for weight loss.
• Overlap between genes expressed in human PWS hypothalamus and mouse Agrp neurons
• Downregulated genes are involved in neuronal development
• SNORD116 deletion reduces neural development and survival in cells
• Alternative splicing is disturbed in PWS
Prader-Willi syndrome (PWS) is a genetic obesity syndrome. Bochukova et al. report gene expression changes in the hypothalamus of people with PWS that support neurodegeneration and neuroinflammation as key processes involved in this condition.
Non-invasive prenatal testing (NIPT) of fetal aneuploidy using cell-free fetal DNA is becoming part of routine clinical practice. RAPIDR (Reliable Accurate Prenatal non-Invasive Diagnosis R package) is an easy-to-use open-source R package that implements several published NIPT analysis methods. The input to RAPIDR is a set of sequence alignment files in the BAM format, and the outputs are calls for aneuploidy, including trisomies 13, 18, 21 and monosomy X as well as fetal sex. RAPIDR has been extensively tested with a large sample set as part of the RAPID project in the UK. The package contains quality control steps to make it robust for use in the clinical setting.
RAPIDR is implemented in R and can be freely downloaded via CRAN from here: http://cran.r-project.org/web/packages/RAPIDR/index.html.
kitty.lo@ucl.ac.uk
Supplementary data are available at Bioinformatics online.
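One widely published style of NIPT aneuploidy calling compares a sample's chromosome read fraction with a reference set of euploid samples and reports a z-score. The sketch below illustrates that general idea only. Whether it matches any method implemented in RAPIDR is an assumption, it is written in Python rather than R, and the function names, the "other" count bucket, and the threshold of 3 are hypothetical choices for illustration.

```python
from statistics import mean, stdev


def chromosome_fraction(counts, chrom):
    """Fraction of a sample's counted reads that map to one chromosome."""
    return counts[chrom] / sum(counts.values())


def z_score(test_counts, reference_counts, chrom):
    """Z-score of the test sample's chromosome fraction against euploid references."""
    ref_fractions = [chromosome_fraction(c, chrom) for c in reference_counts]
    test_fraction = chromosome_fraction(test_counts, chrom)
    return (test_fraction - mean(ref_fractions)) / stdev(ref_fractions)


def call_trisomy(z, threshold=3.0):
    """Flag a possible trisomy when the z-score exceeds the chosen threshold."""
    return z > threshold


if __name__ == "__main__":
    # Hypothetical per-sample read counts: chr21 versus all other autosomes.
    references = [
        {"chr21": n21, "other": 10_000 - n21}
        for n21 in (130, 131, 129, 130, 132, 128)
    ]
    sample = {"chr21": 140, "other": 9_860}
    z = z_score(sample, references, "chr21")
    print(f"chr21 z-score: {z:.2f}, flagged: {call_trisomy(z)}")
```

In practice the read counts would come from the BAM alignments after quality filtering and bias correction, and the reference set would need to be large enough for a stable standard deviation.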
To maximize the discovery potential of future synoptic surveys, especially in the field of transient science, it will be necessary to use automatic classification to identify some of the astronomical sources. The data mining technique of supervised classification is suitable for this problem. Here, we present a supervised learning method to automatically classify variable X-ray sources in the Second XMM-Newton Serendipitous Source Catalog (2XMMi-DR2). Random Forest is our classifier of choice since it is one of the most accurate learning algorithms available. Our training set consists of 873 variable sources and their features are derived from time series, spectra, and other multi-wavelength contextual information. The 10-fold cross-validation accuracy of the training data is ~97% on a 7-class data set. We applied the trained classification model to 411 unknown variable 2XMM sources to produce a probabilistically classified catalog. Using the classification margin and the Random Forest derived outlier measure, we identified 12 anomalous sources, of which 2XMM J180658.7-500250 appears to be the most unusual source in the sample. Its X-ray spectrum is suggestive of an ultraluminous X-ray source, but its variability makes it highly unusual. Machine-learned classification and anomaly detection will facilitate scientific discoveries in the era of all-sky surveys.
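The classification margin mentioned above can be illustrated with a short sketch. It assumes that, for an unlabeled source, the margin is the difference between the highest and second-highest class probabilities (vote fractions) returned by the forest. The exact definition and threshold used in the study are not given in the abstract, so the threshold, function names, class names, and source names below are hypothetical.

```python
def classification_margin(class_probs):
    """Difference between the top two class probabilities for one source."""
    ranked = sorted(class_probs.values(), reverse=True)
    return ranked[0] - ranked[1]


def flag_low_margin(sources, threshold=0.2):
    """Names of sources whose margin falls below the threshold.

    A low margin means the classifier could not clearly prefer one class,
    which makes the source a candidate for follow-up as a possible anomaly.
    """
    return [
        name
        for name, probs in sources.items()
        if classification_margin(probs) < threshold
    ]


if __name__ == "__main__":
    # Hypothetical class probabilities for two unlabeled sources.
    catalog = {
        "source_1": {"class_a": 0.90, "class_b": 0.05, "class_c": 0.05},
        "source_2": {"class_a": 0.40, "class_b": 0.35, "class_c": 0.25},
    }
    print(flag_low_margin(catalog))
```

A margin-based flag only captures ambiguity between known classes, which is presumably why the study pairs it with a separate Random Forest outlier measure.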
Abstract
Introduction: Cervical cancer is the fourth most common female cancer worldwide. The prognosis for women with advanced-stage or recurrent cervical cancer remains poor and response to treatment is variable. Standardized management protocols leave little room for individualization. We report on a novel blood-based liquid biopsy for specific PIK3CA mutations as a clinically useful biomarker in patients with invasive cervical cancer.
Methods: One hundred seventeen Hong Kong Chinese women with primary invasive cervical cancer and their pre-treatment plasma samples were investigated. Two PIK3CA mutations, p.E542K and p.E545K, were measured in cell-free DNA (cfDNA) extracted from plasma using droplet digital PCR. This liquid biopsy of PIK3CA in cervical cancer was correlated to clinico-pathological features to verify the potential of PIK3CA as a clinically useful molecular biomarker for predicting disease prognosis and monitoring for progression.
Results: PIK3CA mutations, either p.E542K or p.E545K, were detected in plasma cfDNA from 22.2% of the patients. PIK3CA mutation status was significantly correlated to median tumor size (p < 0.01). PIK3CA mutations detected in the plasma were significantly associated with decreased disease-free survival and overall survival (p < 0.05).
Conclusions: As a liquid molecular biopsy, analysis of circulating PIK3CA mutations shows promise as a way to refine risk stratification of individual patients with cervical cancer, and provides a platform for further research to offer individualized therapy with the purpose of improving outcomes.
Innate immune activation beyond the central nervous system is emerging as a vital component of the pathogenesis of neurodegeneration. Huntington's disease (HD) is a fatal neurodegenerative disorder caused by a CAG repeat expansion in the huntingtin gene. The systemic innate immune system is thought to act as a modifier of disease progression; however, the molecular mechanisms remain only partially understood. Here we use RNA-sequencing to perform whole transcriptome analysis of primary monocytes from thirty manifest HD patients and thirty-three control subjects, cultured with and without a proinflammatory stimulus. In contrast with previous studies that have required stimulation to elicit phenotypic abnormalities, we demonstrate significant transcriptional differences in HD monocytes in their basal, unstimulated state. This includes previously undetected increased resting expression of genes encoding numerous proinflammatory cytokines, such as IL6. Further pathway analysis revealed widespread resting enrichment of proinflammatory functional gene sets, while upstream regulator analysis coupled with Western blotting suggests that abnormal basal activation of the NFκB pathway plays a key role in mediating these transcriptional changes. That HD myeloid cells have a proinflammatory phenotype in the absence of stimulation is consistent with a priming effect of mutant huntingtin, whereby basal dysfunction leads to an exaggerated inflammatory response once a stimulus is encountered. These data advance our understanding of mutant huntingtin pathogenesis, establish resting myeloid cells as a key source of HD immune dysfunction, and further demonstrate the importance of systemic immunity in the potential treatment of HD and the wider study of neurodegeneration.
There is widespread transcriptional dysregulation in Huntington's disease (HD) brain, but analysis is inevitably limited by advanced disease and postmortem changes. However, mutant HTT is ubiquitously expressed and acts systemically, meaning blood, which is readily available and contains cells that are dysfunctional in HD, could act as a surrogate for brain tissue. We conducted an RNA-Seq transcriptomic analysis using whole blood from two HD cohorts, and performed gene set enrichment analysis using public databases and weighted correlation network analysis modules from HD and control brain datasets. We identified dysregulated gene sets in blood that replicated in the independent cohorts, correlated with disease severity, corresponded to the most significantly dysregulated modules in the HD caudate, the most prominently affected brain region, and significantly overlapped with the transcriptional signature of HD myeloid cells. High-throughput sequencing technologies and use of gene sets likely surmounted the limitations of previously inconsistent HD blood expression studies. Our results suggest transcription is disrupted in peripheral cells in HD through mechanisms that parallel those in brain. Immune upregulation in HD overlapped with that in Alzheimer's disease, suggesting a common pathogenic mechanism involving macrophage phagocytosis and microglial synaptic pruning, and raising the potential for shared therapeutic approaches.