Quantifying the differentiation potential of single cells is a task of critical importance. Here we demonstrate, using over 7,000 single-cell RNA-Seq profiles, that differentiation potency of a single cell can be approximated by computing the signalling promiscuity, or entropy, of a cell's transcriptome in the context of an interaction network, without the need for feature selection. We show that signalling entropy provides a more accurate and robust potency estimate than other entropy-based measures, driven in part by a subtle positive correlation between the transcriptome and connectome. Signalling entropy identifies known cell subpopulations of varying potency and drug-resistant cancer stem-cell phenotypes, including those derived from circulating tumour cells. It further reveals that expression heterogeneity within single-cell populations is regulated. In summary, signalling entropy allows in silico estimation of the differentiation potency and plasticity of single cells and bulk samples, providing a means to identify normal and cancer stem-cell phenotypes.
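As a rough illustration of what "entropy of a transcriptome in the context of an interaction network" can mean, the sketch below computes the entropy rate of a random walk on a network whose step probabilities are weighted by gene expression. The abstract does not state the formula, so this random-walk formulation, the function name `signalling_entropy_rate`, and the toy star network are all assumptions made for illustration, not the authors' implementation.

```python
import math

def signalling_entropy_rate(adj, expr):
    """Entropy rate of an expression-weighted random walk (illustrative).

    adj  : symmetric 0/1 adjacency matrix as a list of lists; every node
           must have at least one neighbour
    expr : strictly positive expression value for each node
    """
    n = len(expr)
    # Summed expression of each node's neighbours.
    nbr_sum = [sum(adj[i][j] * expr[j] for j in range(n)) for i in range(n)]
    # Row-stochastic matrix: a walker at node i moves to neighbour j with
    # probability proportional to the expression of j.
    p = [[adj[i][j] * expr[j] / nbr_sum[i] for j in range(n)] for i in range(n)]
    # Local entropy of each node's outgoing distribution.
    local = [-sum(q * math.log(q) for q in row if q > 0) for row in p]
    # Stationary distribution; because adj is symmetric, detailed balance
    # gives pi_i proportional to expr_i * nbr_sum_i.
    z = sum(expr[i] * nbr_sum[i] for i in range(n))
    pi = [expr[i] * nbr_sum[i] / z for i in range(n)]
    # Entropy rate: stationary-weighted average of the local entropies.
    return sum(pi[i] * local[i] for i in range(n))

# Toy star network: node 0 is a hub connected to nodes 1, 2 and 3.
star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
promiscuous = signalling_entropy_rate(star, [1.0, 1.0, 1.0, 1.0])
committed = signalling_entropy_rate(star, [1.0, 10.0, 0.1, 0.1])
print(promiscuous > committed)  # True
```

In this toy case, uniform expression spreads the hub's signalling evenly over its neighbours and gives the higher entropy rate, whereas expression concentrated on one neighbour lowers it. That matches the intuition that a promiscuous signalling state corresponds to higher potency.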
This Comment describes some of the common pitfalls encountered in deriving and validating predictive statistical models from high-dimensional data. It offers a fresh perspective on some key statistical issues, providing some guidelines to avoid pitfalls, and to help unfamiliar readers better assess the reliability and significance of their results.
DNA methylation changes that accrue in the stem cell pool of an adult tissue in line with the cumulative number of cell divisions may contribute to the observed variation in cancer risk among tissues and individuals. Thus, the construction of epigenetic "mitotic" clocks that can measure the lifetime number of stem cell divisions is of paramount interest.
Building upon a dynamic model of DNA methylation gain in unmethylated CpG-rich regions, we here derive a novel mitotic clock ("epiTOC2") that can directly estimate the cumulative number of stem cell divisions in a tissue. We compare epiTOC2 to a different mitotic model, based on hypomethylation at solo-WCGW sites ("HypoClock"), in terms of their ability to measure mitotic age of normal adult tissues and predict cancer risk.
Using epiTOC2, we estimate the intrinsic stem cell division rate for different normal tissue types, demonstrating excellent agreement (Pearson correlation = 0.92, R² = 0.85, P = 3e-6) with those derived from experiment. In contrast, HypoClock's estimates do not (Pearson correlation = 0.30, R² = 0.09, P = 0.29). We validate these results in independent datasets profiling normal adult tissue types. While both epiTOC2 and HypoClock correctly predict an increased mitotic rate in cancer, epiTOC2 is more robust and significantly better at discriminating preneoplastic lesions characterized by chronic inflammation, a major driver of tissue turnover and cancer risk. Our data suggest that DNA methylation loss at solo-WCGWs is significant only when cells are under high replicative stress and that epiTOC2 is a better mitotic age and cancer risk prediction model for normal adult tissues.
These results have profound implications for our understanding of epigenetic clocks and for developing cancer risk prediction or early detection assays. We propose that measurement of DNAm at the 163 epiTOC2 CpGs in adult pre-neoplastic lesions, and potentially in serum cell-free DNA, could provide the basis for building feasible pre-diagnostic or cancer risk assays. epiTOC2 is freely available from https://doi.org/10.5281/zenodo.2632938.
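To make the idea of a dynamic model of methylation gain more concrete, here is a minimal sketch of how a number of cell divisions could be read off from methylation values. It is not the published epiTOC2 estimator. It assumes a simple model in which each CpG starts at a ground-state methylation fraction `beta0` and every still-unmethylated site gains methylation with a small probability `delta` per division. The function names and all parameter values are hypothetical.

```python
import math

def expected_methylation(beta0, delta, n_divisions):
    # Assumed model: the unmethylated fraction (1 - beta0) shrinks by a
    # factor (1 - delta) at every cell division.
    return 1.0 - (1.0 - beta0) * (1.0 - delta) ** n_divisions

def estimate_divisions(betas, beta0s, deltas):
    # Invert the assumed model for each CpG, then average the estimates.
    estimates = [
        math.log((1.0 - b) / (1.0 - b0)) / math.log(1.0 - d)
        for b, b0, d in zip(betas, beta0s, deltas)
    ]
    return sum(estimates) / len(estimates)

# Hypothetical parameters for three CpGs (illustrative values, not fitted).
beta0s = [0.01, 0.02, 0.05]
deltas = [1e-4, 2e-4, 5e-5]

# Simulate noise-free methylation after 500 divisions, then recover the count.
observed = [expected_methylation(b0, d, 500) for b0, d in zip(beta0s, deltas)]
print(round(estimate_divisions(observed, beta0s, deltas)))
```

With real data the per-CpG estimates would be noisy, which is one reason to average over many CpGs rather than rely on a single site.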
The Illumina Infinium 450 k DNA Methylation Beadchip is a prime candidate technology for Epigenome-Wide Association Studies (EWAS). However, a difficulty associated with these beadarrays is that probes come in two different designs, characterized by widely different DNA methylation distributions and dynamic range, which may bias downstream analyses. A key statistical issue is therefore how best to adjust for the two different probe designs.
Here we propose a novel model-based intra-array normalization strategy for 450 k data, called BMIQ (Beta MIxture Quantile dilation), to adjust the beta-values of type2 design probes into a statistical distribution characteristic of type1 probes. The strategy involves application of a three-state beta-mixture model to assign probes to methylation states, subsequent transformation of probabilities into quantiles and finally a methylation-dependent dilation transformation to preserve the monotonicity and continuity of the data. We validate our method on cell-line data, fresh frozen and paraffin-embedded tumour tissue samples and demonstrate that BMIQ compares favourably with two competing methods. Specifically, we show that BMIQ improves the robustness of the normalization procedure, reduces the technical variation and bias of type2 probe values and successfully eliminates the type1 enrichment bias caused by the lower dynamic range of type2 probes. BMIQ will be useful as a preprocessing step for any study using the Illumina Infinium 450 k platform.
BMIQ is freely available from http://code.google.com/p/bmiq/.
a.teschendorff@ucl.ac.uk
Supplementary data are available at Bioinformatics online.
Abstract
Motivation
Clear identification of the primary tumor site is of great importance for subsequent site-specific targeted treatment and could improve patients' overall survival. Even though many gene-expression-based classifiers have been proposed to predict the tumor primary, only a few studies have focused on using DNA methylation (DNAm) profiles to develop classifiers, and none of them has compared the performance of classifiers based on different profiles.
Results
We introduced novel selection strategies to identify highly tissue-specific CpG sites and then used the random forest approach to construct classifiers that predict the origin of tumors. We also compared prediction performance by applying a similar strategy to miRNA expression profiles. Our analysis indicated that these classifiers had an accuracy of 96.05% (Maximum-Relevance-Maximum-Distance: 90.02-99.99%) or 95.31% (principal component analysis: 79.82-99.91%) on independent DNAm datasets, and an overall accuracy of 91.30% (range 79.33-98.74%) on independent miRNA test sets for predicting tumor origin. This suggests that our feature selection methods are very effective at identifying tissue-specific biomarkers and that the classifiers we developed can efficiently predict the origin of tumors. We also developed a user-friendly webserver that lets users predict the tumor origin by uploading a miRNA expression or DNAm profile of interest.
Availability and implementation
The webserver and the related data and code are accessible at http://server.malab.cn/MMCOP/.
Supplementary information
Supplementary data are available at Bioinformatics online.
On epigenetic stochasticity, entropy and cancer risk
Teschendorff, Andrew E
Philosophical transactions of the Royal Society of London. Series B. Biological sciences, 04/2024, Volume 379, Issue 1900
Journal Article
Peer reviewed
Open access
Epigenetic changes are known to accrue in normal cells as a result of ageing and cumulative exposure to cancer risk factors. Increasing evidence points towards age-related epigenetic changes being acquired in a quasi-stochastic manner, and that they may play a causal role in cancer development. Here, I describe the quasi-stochastic nature of DNA methylation (DNAm) changes in ageing cells as well as in normal cells at risk of neoplastic transformation, discussing the implications of this stochasticity for developing cancer risk prediction strategies, and in particular, how it may require a conceptual paradigm shift in how we select cancer risk markers. I also describe the mounting evidence that a significant proportion of DNAm changes in ageing and cancer development are related to cell proliferation, reflecting tissue-turnover and the opportunity this offers for predicting cancer risk via the development of epigenetic mitotic-like clocks. Finally, I describe how age-associated DNAm changes may be causally implicated in cancer development via an irreversible suppression of tissue-specific transcription factors that increases epigenetic and transcriptomic entropy, promoting a more plastic yet aberrant cancer stem-cell state. This article is part of a discussion meeting issue 'Causes and consequences of stochastic processes in development and disease'.
The Illumina Infinium HumanMethylation450 BeadChip is a new platform for high-throughput DNA methylation analysis. Several methods for normalization and processing of these data have been published recently. Here we present an integrated analysis pipeline offering a choice of the most popular normalization methods while also introducing new methods for calling differentially methylated regions and detecting copy number aberrations.
ChAMP is implemented as a Bioconductor package in R. The package and the vignette can be downloaded at bioconductor.org
The Illumina Infinium HumanMethylationEPIC BeadChip is the new platform for high-throughput DNA methylation analysis, effectively doubling the coverage compared to the older 450 K array. Here we present a significantly updated and improved version of the Bioconductor package ChAMP, which can be used to analyze EPIC and 450k data. Many enhanced functionalities have been added, including correction for cell-type heterogeneity, network analysis and a series of interactive graphical user interfaces.
ChAMP is a BioC package available from https://bioconductor.org/packages/release/bioc/html/ChAMP.html.
a.teschendorff@ucl.ac.uk or s.beck@ucl.ac.uk or a.feber@ucl.ac.uk.
Supplementary data are available at Bioinformatics online.
It is now well established that the genomic landscape of DNA methylation (DNAm) gets altered as a function of age, a process we here call 'epigenetic drift'. The biological, functional, clinical and evolutionary significance of this epigenetic drift, however, remains unclear. We here provide a brief review of epigenetic drift, focusing on the potential implications for ageing, stem cell biology and disease risk prediction. It has been demonstrated that epigenetic drift affects most of the genome, suggesting a global deregulation of DNAm patterns with age. A component of this drift is tissue-specific, allowing remarkably accurate age-predictive models to be constructed. Another component is tissue-independent, targeting stem cell differentiation pathways and affecting stem cells, which may explain the observed decline of stem cell function with age. Age-associated increases in DNAm target developmental genes, overlapping those associated with environmental disease risk factors and with disease itself, notably cancer. In particular, cancers and precursor cancer lesions exhibit aggravated age DNAm signatures. Epigenetic drift is also influenced by genetic factors. Thus, drift emerges as a promising biomarker for premature or biological ageing, and could potentially be used in geriatrics for disease risk prediction. Finally, we propose, in the context of human evolution, that epigenetic drift may represent a case of epigenetic thrift, or bet-hedging. In summary, this review demonstrates the growing importance of the 'ageing epigenome', with potentially far-reaching implications for understanding the effect of age on stem cell function and differentiation, as well as for disease prevention.