A special collection on multi-omics data sharing, launched today at Scientific Data, offers the scientific community a compendium of multi-omics datasets ready for reuse, which showcases the diversity of multi-omics projects and highlights innovative approaches for preprocessing, quality control, hosting and access.
Stephan Beck discusses recent developments in sharing personal genomes as part of the Personal Genome Project in the UK and globally, and how these efforts are advancing research.
The Illumina Infinium HumanMethylationEPIC BeadChip is the new platform for high-throughput DNA methylation analysis, effectively doubling the coverage compared to the older 450k array. Here we present a significantly updated and improved version of the Bioconductor package ChAMP, which can be used to analyze EPIC and 450k data. Many enhanced functionalities have been added, including correction for cell-type heterogeneity, network analysis and a series of interactive graphical user interfaces.
ChAMP is a BioC package available from https://bioconductor.org/packages/release/bioc/html/ChAMP.html.
a.teschendorff@ucl.ac.uk or s.beck@ucl.ac.uk or a.feber@ucl.ac.uk.
Supplementary data are available at Bioinformatics online.
An outstanding challenge of epigenome-wide association studies (EWASs) performed in complex tissues is the identification of the specific cell type(s) responsible for the observed differential DNA methylation. Here we present a statistical algorithm called CellDMC (https://github.com/sjczheng/EpiDISH), which can identify differentially methylated positions and the specific cell type(s) driving the differential methylation. We validated CellDMC on in silico mixtures of DNA methylation data generated with different technologies, as well as on real mixtures from epigenome-wide association and cancer epigenome studies. CellDMC achieved over 90% sensitivity and specificity in scenarios where current state-of-the-art methods did not identify differential methylation. By applying CellDMC to an EWAS performed in buccal swabs, we identified smoking-associated differentially methylated positions occurring in the epithelial compartment, which we validated in smoking-related lung cancer. CellDMC may be useful in the identification of causal DNA-methylation alterations in disease.
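One common way to attribute differential methylation to a specific cell type in bulk data is to fit, per CpG, a linear model with interaction terms between the phenotype and the estimated cell-type fractions. The sketch below illustrates that general idea on simulated data using only the Python standard library. It is not the CellDMC implementation from the EpiDISH package: the function names, the simulated means and effect sizes, and the use of plain least squares without significance testing are all assumptions made for illustration.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_interaction_model(beta, fractions, phenotype):
    """Fit beta ~ sum_k f_k * mu_k + sum_k f_k * phenotype * delta_k.

    Returns the per-cell-type effect estimates delta_k (point estimates only).
    No intercept is used because the fractions sum to one.
    """
    k = len(fractions[0])
    rows = [list(f) + [fk * p for fk in f] for f, p in zip(fractions, phenotype)]
    dim = 2 * k
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(dim)] for i in range(dim)]
    xty = [sum(r[i] * y for r, y in zip(rows, beta)) for i in range(dim)]
    return solve_linear_system(xtx, xty)[k:]


def simulate_mixtures(n=400, seed=0):
    """Simulate one CpG in bulk samples (all parameters are hypothetical)."""
    rng = random.Random(seed)
    mu = [0.2, 0.5, 0.8]       # baseline methylation per cell type
    delta = [0.2, 0.0, 0.0]    # phenotype effect only in cell type 0
    beta, fractions, phenotype = [], [], []
    for _ in range(n):
        raw = [rng.gammavariate(2.0, 1.0) for _ in mu]
        f = [v / sum(raw) for v in raw]          # Dirichlet-distributed fractions
        p = rng.randint(0, 1)                    # binary phenotype
        y = sum(fk * (m + p * d) for fk, m, d in zip(f, mu, delta))
        beta.append(y + rng.gauss(0.0, 0.01))    # small measurement noise
        fractions.append(f)
        phenotype.append(p)
    return beta, fractions, phenotype


effects = fit_interaction_model(*simulate_mixtures())
print([round(e, 3) for e in effects])
```

In this toy setting the estimate for the first cell type should land near the simulated effect of 0.2 while the other two stay near zero. A real analysis would also need standard errors and multiple-testing control across CpGs.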
Intra-sample cellular heterogeneity presents numerous challenges to the identification of biomarkers in large Epigenome-Wide Association Studies (EWAS). While a number of reference-based deconvolution algorithms have emerged, their potential remains underexplored and a comparative evaluation of these algorithms beyond tissues such as blood is still lacking.
Here we present a novel framework for reference-based inference, which leverages cell-type specific DNase Hypersensitive Site (DHS) information from the NIH Epigenomics Roadmap to construct an improved reference DNA methylation database. We show that this leads to a marginal but statistically significant improvement of cell-count estimates in whole blood as well as in mixtures involving epithelial cell-types. Using this framework we compare a widely used state-of-the-art reference-based algorithm (called constrained projection) to two non-constrained approaches, namely CIBERSORT and a method based on robust partial correlations. We conclude that the widely used constrained projection technique may not always be optimal. Instead, we find that the method based on robust partial correlations is generally more robust across a range of different tissue types and for realistic noise levels. We call the combined algorithm, which uses DHS data and robust partial correlations for inference, EpiDISH (Epigenetic Dissection of Intra-Sample Heterogeneity). Finally, we demonstrate the added value of EpiDISH in an EWAS of smoking.
Estimating cell-type fractions and subsequent inference in EWAS may benefit from the use of non-constrained reference-based cell-type deconvolution methods.
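The non-constrained, robust-regression idea can be sketched as follows: regress a sample's methylation profile on the reference profiles with a Huber-weighted fit, then set negative coefficients to zero and rescale so the fractions sum to one. This is a minimal standard-library illustration under stated assumptions, not the EpiDISH code: the function names, the tuning constant, the iteratively reweighted least squares loop and the simulated reference matrix are all hypothetical choices.

```python
import random
import statistics


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def weighted_least_squares(rows, y, w):
    """Solve the weighted normal equations (X' W X) coef = X' W y."""
    dim = len(rows[0])
    xtx = [[sum(wi * r[i] * r[j] for r, wi in zip(rows, w)) for j in range(dim)]
           for i in range(dim)]
    xty = [sum(wi * r[i] * yi for r, yi, wi in zip(rows, y, w)) for i in range(dim)]
    return solve_linear_system(xtx, xty)


def robust_fractions(profile, reference, k=1.345, n_iter=50):
    """Huber-weighted regression of a profile on reference cell-type profiles."""
    rows = [[1.0] + list(r) for r in reference]       # intercept + references
    coef = weighted_least_squares(rows, profile, [1.0] * len(profile))
    for _ in range(n_iter):
        resid = [yi - sum(c * x for c, x in zip(coef, r))
                 for r, yi in zip(rows, profile)]
        med = statistics.median(resid)
        scale = statistics.median(abs(e - med) for e in resid) / 0.6745
        if scale < 1e-12:
            break
        # Huber weights: full weight for small residuals, down-weight outliers.
        w = [min(1.0, k * scale / abs(e)) if e != 0 else 1.0 for e in resid]
        new_coef = weighted_least_squares(rows, profile, w)
        converged = max(abs(a - b) for a, b in zip(coef, new_coef)) < 1e-8
        coef = new_coef
        if converged:
            break
    clipped = [max(c, 0.0) for c in coef[1:]]          # drop intercept, clip
    total = sum(clipped)
    if total == 0:
        raise ValueError("all estimated coefficients were non-positive")
    return [c / total for c in clipped]


def simulate_profile(n_cpgs=300, seed=1):
    """Toy reference matrix and mixed profile with a few gross outliers."""
    rng = random.Random(seed)
    truth = [0.6, 0.3, 0.1]
    reference = [[rng.random() for _ in truth] for _ in range(n_cpgs)]
    profile = [sum(w * x for w, x in zip(truth, r)) + rng.gauss(0.0, 0.01)
               for r in reference]
    for i in range(15):                                # corrupt 5% of CpGs
        profile[i] = min(1.0, profile[i] + 0.5)
    return profile, reference, truth


profile, reference, truth = simulate_profile()
print([round(f, 3) for f in robust_fractions(profile, reference)])
```

The intercept is fitted and then discarded so that a global offset in the sample does not leak into the fraction estimates. Whether that matches a given package's behaviour should be checked against its documentation.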
The Illumina Infinium 450 k DNA Methylation Beadchip is a prime candidate technology for Epigenome-Wide Association Studies (EWAS). However, a difficulty associated with these beadarrays is that probes come in two different designs, characterized by widely different DNA methylation distributions and dynamic range, which may bias downstream analyses. A key statistical issue is therefore how best to adjust for the two different probe designs.
Here we propose a novel model-based intra-array normalization strategy for 450 k data, called BMIQ (Beta MIxture Quantile dilation), to adjust the beta-values of type2 design probes into a statistical distribution characteristic of type1 probes. The strategy involves application of a three-state beta-mixture model to assign probes to methylation states, subsequent transformation of probabilities into quantiles and finally a methylation-dependent dilation transformation to preserve the monotonicity and continuity of the data. We validate our method on cell-line data, fresh frozen and paraffin-embedded tumour tissue samples and demonstrate that BMIQ compares favourably with two competing methods. Specifically, we show that BMIQ improves the robustness of the normalization procedure, reduces the technical variation and bias of type2 probe values and successfully eliminates the type1 enrichment bias caused by the lower dynamic range of type2 probes. BMIQ will be useful as a preprocessing step for any study using the Illumina Infinium 450 k platform.
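The first step described above, assigning probes to methylation states with a three-state beta mixture, can be illustrated with a short sketch. It computes posterior state probabilities for a beta-value under a mixture whose parameters are fixed by hand. The state labels, mixture weights and shape parameters are hypothetical placeholders: in practice they would be estimated from the data, and the quantile and dilation steps are not shown here.

```python
import math

# Hypothetical mixture: state -> (weight, shape a, shape b).
# U = unmethylated, H = intermediate, M = methylated (labels chosen for illustration).
MIXTURE = {
    "U": (0.4, 2.0, 20.0),
    "H": (0.2, 10.0, 10.0),
    "M": (0.4, 20.0, 2.0),
}


def beta_log_pdf(x, a, b):
    """Log density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x)


def state_posteriors(beta_value, mixture=MIXTURE, eps=1e-6):
    """Posterior probability of each state given one probe's beta-value."""
    x = min(max(beta_value, eps), 1.0 - eps)   # keep away from 0 and 1
    log_terms = {s: math.log(w) + beta_log_pdf(x, a, b)
                 for s, (w, a, b) in mixture.items()}
    top = max(log_terms.values())              # subtract max for stability
    unnorm = {s: math.exp(v - top) for s, v in log_terms.items()}
    total = sum(unnorm.values())
    return {s: v / total for s, v in unnorm.items()}


def assign_state(beta_value, mixture=MIXTURE):
    """Return the state with the highest posterior probability."""
    post = state_posteriors(beta_value, mixture)
    return max(post, key=post.get)


for value in (0.05, 0.5, 0.95):
    print(value, assign_state(value))
```

Working in log space keeps the comparison stable for beta-values close to 0 or 1, where one component's density can be many orders of magnitude larger than the others.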
BMIQ is freely available from http://code.google.com/p/bmiq/.
a.teschendorff@ucl.ac.uk
Supplementary data are available at Bioinformatics online.
Cell type heterogeneity presents a challenge to the interpretation of epigenome data, compounded by the difficulty in generating reliable single-cell DNA methylomes for large numbers of cells and samples. We present EPISCORE, a computational algorithm that performs virtual microdissection of bulk tissue DNA methylation data at single cell-type resolution for any solid tissue. EPISCORE applies a probabilistic epigenetic model of gene regulation to a single-cell RNA-seq tissue atlas to generate a tissue-specific DNA methylation reference matrix, allowing quantification of cell-type proportions and cell-type-specific differential methylation signals in bulk tissue data. We validate EPISCORE in multiple epigenome studies and tissue types.
The Illumina Infinium HumanMethylation450 BeadChip is a new platform for high-throughput DNA methylation analysis. Several methods for normalization and processing of these data have been published recently. Here we present an integrated analysis pipeline offering a choice of the most popular normalization methods while also introducing new methods for calling differentially methylated regions and detecting copy number aberrations.
ChAMP is implemented as a Bioconductor package in R. The package and the vignette can be downloaded at bioconductor.org.
Epigenetic clocks comprise a set of CpG sites whose DNA methylation levels measure subject age. These clocks are acknowledged as a highly accurate molecular correlate of chronological age in humans and other vertebrates. Also, extensive research is aimed at their potential to quantify biological aging rates and test longevity or rejuvenating interventions. Here, we discuss key challenges to understand clock mechanisms and biomarker utility. This requires dissecting the drivers and regulators of age-related changes in single-cell, tissue- and disease-specific models, as well as exploring other epigenomic marks, longitudinal and diverse population studies, and non-human models. We also highlight important ethical issues in forensic age determination and predicting the trajectory of biological aging in an individual.
The cataloging of the vascular plants of the Americas has a centuries-long history, but it is only in recent decades that an overview of the entire flora has become possible. We present an integrated assessment of all known native species of vascular plants in the Americas. Twelve regional and national checklists, prepared over the past 25 years and including two large ongoing flora projects, were merged into a single list. Our publicly searchable checklist includes 124,993 species, 6227 genera, and 355 families, which correspond to 33% of the 383,671 vascular plant species known worldwide. In the past 25 years, the rate at which new species descriptions are added has averaged 744 annually for the Americas, and we can expect the total to reach about 150,000.
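The figures quoted above can be checked with a few lines of arithmetic. The sketch below recomputes the share of worldwide species and, purely as an illustrative assumption that the abstract does not make, estimates how long a constant rate of 744 new descriptions per year would take to reach the projected total.

```python
AMERICAS_SPECIES = 124_993
WORLD_SPECIES = 383_671
NEW_SPECIES_PER_YEAR = 744
PROJECTED_TOTAL = 150_000

# Share of the world's known vascular plant species found in the Americas.
share_percent = 100 * AMERICAS_SPECIES / WORLD_SPECIES

# Assumption for illustration only: the description rate stays constant.
years_to_projection = (PROJECTED_TOTAL - AMERICAS_SPECIES) / NEW_SPECIES_PER_YEAR

print(f"Share of world flora: {share_percent:.1f}%")
print(f"Years to reach projected total at a constant rate: {years_to_projection:.1f}")
```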