Illumina Infinium DNA Methylation BeadChips represent the most widely used genome-scale DNA methylation assays. Existing strategies for masking Infinium probes overlapping repeats or single nucleotide polymorphisms (SNPs) are based largely on ad hoc assumptions and subjective criteria. In addition, the recently introduced MethylationEPIC (EPIC) array expands on the utility of this platform, but has not yet been well characterized. We present in this paper an extensive characterization of probes on the EPIC and HM450 microarrays, including mappability to the latest genome build, genomic copy number of the 3′ nested subsequence and influence of polymorphisms including a previously unrecognized color channel switch for Type I probes. We show empirical evidence for exclusion criteria for underperforming probes, providing a sounder basis than current ad hoc criteria for exclusion. In addition, we describe novel probe uses, exemplified by the addition of a total of 1052 SNP probes to the existing 59 explicit SNP probes on the EPIC array and the use of these probes to predict ethnicity. Finally, we present an innovative out-of-band color channel application for the dual use of 62 371 probes as internal bisulfite conversion controls.
Multicellular eukaryotic genomes show enormous differences in size. A substantial part of this variation is due to the presence of transposable elements (TEs). They contribute significantly to a cell’s mass of DNA and have the potential to become involved in host gene control. We argue that the suppression of their activities by methylation of the C–phosphate–G (CpG) dinucleotide in DNA is essential for their long-term accommodation in the host genome and, therefore, to its expansion. An inevitable consequence of cytosine methylation is an increase in C-to-T transition mutations via deamination, which causes CpG loss. Cytosine deamination is often needed for TEs to take on regulatory functions in the host genome. Our study of the whole-genome sequences of 53 organisms showed a positive correlation between the size of a genome and the percentage of TEs it contains, as well as a negative correlation between size and the CpG observed/expected (O/E) ratio in both TEs and the host DNA. TEs are seldom found at promoters and transcription start sites, but they are found more at enhancers, particularly after they have accumulated C-to-T and other mutations. Therefore, the methylation of TE DNA allows for genome expansion and also leads to new opportunities for gene control by TE-based regulatory sites.
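The CpG observed/expected (O/E) ratio mentioned above can be illustrated with a minimal sketch. It assumes the commonly used definition, (number of CpG dinucleotides × sequence length) / (number of C × number of G); the abstract does not state which formula the study used, and the function name `cpg_obs_exp` is hypothetical.

```python
def cpg_obs_exp(seq: str) -> float:
    """CpG observed/expected ratio of a DNA sequence.

    Assumed definition (not confirmed by the abstract):
        O/E = (count of "CG" * length) / (count of "C" * count of "G")
    Returns 0.0 when the sequence contains no C or no G.
    """
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if c == 0 or g == 0:
        return 0.0
    return cpg * n / (c * g)


if __name__ == "__main__":
    # A CpG-depleted sequence gives a ratio below 1.
    print(cpg_obs_exp("CATGCATGCCGG"))
```

A ratio near 1 means CpGs occur about as often as base composition alone predicts; values well below 1 are consistent with the CpG loss through deamination that the abstract describes.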
Infinium methylation arrays are not available for the vast majority of non-human mammals. Moreover, even if species-specific arrays were available, probe differences between them would confound cross-species comparisons. To address these challenges, we developed the mammalian methylation array, a single custom array that measures up to 36k CpGs per species that are well conserved across many mammalian species. We designed a set of probes that can tolerate specific cross-species mutations. We annotate the array in over 200 species and report CpG island status and chromatin states in select species. Calibration experiments demonstrate the high fidelity in humans, rats, and mice. The mammalian methylation array has several strengths: it applies to all mammalian species even those that have not yet been sequenced, it provides deep coverage of conserved cytosines facilitating the development of epigenetic biomarkers, and it increases the probability that biological insights gained in one species will translate to others.
DNA methylation loss occurs frequently in cancer genomes, primarily within lamina-associated, late-replicating regions termed partially methylated domains (PMDs). We profiled 39 diverse primary tumors and 8 matched adjacent tissues using whole-genome bisulfite sequencing (WGBS) and analyzed them alongside 343 additional human and 206 mouse WGBS datasets. We identified a local CpG sequence context associated with preferential hypomethylation in PMDs. Analysis of CpGs in this context ('solo-WCGWs') identified previously undetected PMD hypomethylation in almost all healthy tissue types. PMD hypomethylation increased with age, beginning during fetal development, and appeared to track the accumulation of cell divisions. In cancer, PMD hypomethylation depth correlated with somatic mutation density and cell cycle gene expression, consistent with its reflection of mitotic history and suggesting its application as a mitotic clock. We propose that late replication leads to lifelong progressive methylation loss, which acts as a biomarker for cellular aging and which may contribute to oncogenesis.
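The abstract names the 'solo-WCGW' context without defining it. The sketch below rests on an assumed reading: a CpG flanked on both sides by A or T (W = A/T), with no other CpG within a fixed number of bases on either side. The 35 bp default, the way distance is measured, and the function name `solo_wcgw_positions` are illustrative assumptions, not details taken from the abstract.

```python
def solo_wcgw_positions(seq: str, flank: int = 35) -> list[int]:
    """Return 0-based positions of the C in each assumed solo-WCGW CpG.

    Assumptions (hypothetical, for illustration only):
      * W means A or T immediately before the C and after the G;
      * "solo" means no other CpG starts within `flank` bases of this one.
    CpGs at the sequence edge are skipped because a flank base is missing.
    """
    s = seq.upper()
    cpgs = [i for i in range(len(s) - 1) if s[i:i + 2] == "CG"]
    hits = []
    for i in cpgs:
        if i == 0 or i + 2 >= len(s):
            continue  # need one base on each side of the CpG
        if s[i - 1] not in "AT" or s[i + 2] not in "AT":
            continue  # not in a WCGW context
        if any(j != i and abs(j - i) <= flank for j in cpgs):
            continue  # a neighbouring CpG lies inside the window
        hits.append(i)
    return hits


if __name__ == "__main__":
    demo = "A" * 40 + "ACGT" + "A" * 40
    print(solo_wcgw_positions(demo))
```

Keeping the window size as a parameter makes it easy to test how sensitive the selected CpG set is to that assumption.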
Dissecting intercellular epigenetic differences is key to understanding tissue heterogeneity. Recent advances in single-cell DNA methylome profiling have presented opportunities to resolve this heterogeneity at the maximum resolution. While these advances enable us to explore frontiers of chromatin biology and better understand cell lineage relationships, they pose new challenges in data processing and interpretation. This review surveys the current state of computational tools developed for single-cell DNA methylome data analysis. We discuss critical components of single-cell DNA methylome data analysis, including data preprocessing, quality control, imputation, dimensionality reduction, cell clustering, supervised cell annotation, cell lineage reconstruction, gene activity scoring, and integration with transcriptome data. We also highlight unique aspects of single-cell DNA methylome data analysis and discuss how techniques common to other single-cell omics data analyses can be adapted to analyze DNA methylomes. Finally, we discuss existing challenges and opportunities for future development.
We present the genome-wide chromatin accessibility profiles of 410 tumor samples spanning 23 cancer types from The Cancer Genome Atlas (TCGA). We identify 562,709 transposase-accessible DNA elements that substantially extend the compendium of known cis-regulatory elements. Integration of ATAC-seq (the assay for transposase-accessible chromatin using sequencing) with TCGA multi-omic data identifies a large number of putative distal enhancers that distinguish molecular subtypes of cancers, uncovers specific driving transcription factors via protein-DNA footprints, and nominates long-range gene-regulatory interactions in cancer. These data reveal genetic risk loci of cancer predisposition as active DNA regulatory elements in cancer, identify gene-regulatory interactions underlying cancer immune evasion, and pinpoint noncoding mutations that drive enhancer activation and may affect patient survival. These results suggest a systematic approach to understanding the noncoding genome in cancer to advance diagnosis and therapy.
Oxford Nanopore sequencing can detect DNA methylation from the ionic current signal of single molecules, offering a unique advantage over conventional methods. Additionally, adaptive sampling, a software-controlled enrichment method for targeted sequencing, allows reduced representation methylation sequencing that can be applied to CpG islands or imprinted regions. Here we present DeepMod2, a comprehensive deep-learning framework for methylation detection using ionic current signal from Nanopore sequencing. DeepMod2 implements both a bidirectional long short-term memory (BiLSTM) model and a Transformer model and can analyze POD5 and FAST5 signal files generated on R9 and R10 flowcells. Additionally, DeepMod2 can run efficiently on central processing units (CPUs) through model pruning and can infer epihaplotypes or haplotype-specific methylation calls from phased reads. We use multiple publicly available and newly generated datasets to evaluate the performance of DeepMod2 under varying scenarios. DeepMod2 has comparable performance to Guppy and Dorado, which are the current state-of-the-art methods from Oxford Nanopore Technologies that remain closed-source. Moreover, we show a high correlation (r = 0.96) between reduced representation and whole-genome Nanopore sequencing. In summary, DeepMod2 is an open-source tool that enables fast and accurate DNA methylation detection from whole-genome or adaptive sequencing data on a diverse range of flowcell types.
EHMT2 is the main euchromatic H3K9 methyltransferase. Embryos with zygotic or maternal mutations in the Ehmt2 gene exhibit variable developmental delay. To understand how EHMT2 prevents variable developmental delay, we performed RNA sequencing of mutant and somite stage-matched normal embryos at 8.5-9.5 days of gestation. Using four-way comparisons between delayed and normal embryos, we clarified what it takes to be normal and what it takes to develop. We identified differentially expressed genes, for example Hox genes, that simply reflected the difference in developmental progression between wild type and delayed mutant uterus-mate embryos. By comparing wild type and zygotic mutant embryos along the same developmental window, we detected a role of EHMT2 in suppressing variation in the transcriptional switches. We identified transcription changes where precise switching during development occurred only in the normal but not in the mutant embryo. At the 6-somite stage, gastrulation-specific genes were not precisely switched off in the Ehmt2-/- zygotic mutant embryos, while genes involved in organ growth, connective tissue development, striated muscle development, muscle differentiation, and cartilage development were not precisely switched on. The Ehmt2mat-/+ maternal mutant embryos displayed high transcriptional variation consistent with their variable survival. Variable derepression of transcripts occurred dominantly in the maternally inherited allele. Transcription was normal in the parental haploinsufficient wild type embryos despite their delay, consistent with their good prospects. Global profiling of transposable elements revealed EHMT2 targeted DNA methylation and suppression at LTR repeats, mostly ERVKs. In Ehmt2-/- embryos, transcription over very long distances initiated from such misregulated 'driver' ERVK repeats, encompassing a multitude of misexpressed 'passenger' repeats.
In summary, EHMT2 reduced transcriptional variation of developmental switch genes and developmentally switching repeat elements in six-somite stage embryos. These findings establish EHMT2 as a suppressor of transcriptional and developmental variation at the transition between gastrulation and organ specification.
Vitamin C deficiency is found in patients with cancer and might complicate various therapy paradigms. Here we show how this deficiency may influence the use of DNA methyltransferase inhibitors (DNMTis) for treatment of hematological neoplasias. In vitro, when vitamin C is added at physiological levels to low doses of the DNMTi 5-aza-2′-deoxycytidine (5-aza-CdR), there is a synergistic inhibition of cancer-cell proliferation and increased apoptosis. These effects are associated with enhanced immune signals including increased expression of bidirectionally transcribed endogenous retrovirus (ERV) transcripts, increased cytosolic dsRNA, and activation of an IFN-inducing cellular response. This synergistic effect is likely the result of both passive DNA demethylation by DNMTi and active conversion of 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC) by ten–eleven translocation (TET) enzymes at LTR regions of ERVs, because vitamin C acts as a cofactor for TET proteins. In addition, TET2 knockout reduces the synergy between the two compounds. Furthermore, we show that many patients with hematological neoplasia are markedly vitamin C deficient. Thus, our data suggest that correction of vitamin C deficiency in patients with hematological and other cancers may improve responses to epigenetic therapy with DNMTis.
Ovarian cancer ranks as the most deadly gynecologic cancer, and there is an urgent need to develop more effective therapies. Previous studies have shown that G9A, a histone methyltransferase that catalyzes mono- and dimethylation of histone H3 lysine 9, is highly expressed in ovarian cancer tumors, and its overexpression is associated with poor prognosis. Here we report that pharmacologic inhibition of G9A in ovarian cancer cell lines with high levels of G9A expression induces synergistic antitumor effects when combined with the DNA methylation inhibitor (DNMTi) 5-aza-2'-deoxycytidine (5-aza-CdR). These antitumor effects included upregulation of endogenous retroviruses (ERV), activation of the viral defense response, and induction of cell death, which have been termed "viral mimicry" effects induced by DNMTi. G9Ai treatment further reduced H3K9me2 levels within the long terminal repeat regions of ERV, resulting in further increases of ERV expression and enhancing "viral mimicry" effects. In contrast, G9Ai and 5-aza-CdR were not synergistic in cell lines with low basal G9A levels. Taken together, our results suggest that the synergistic effects of combination treatment with DNMTi and G9Ai may serve as a novel therapeutic strategy for patients with ovarian cancer with high levels of G9A expression.
Dual inhibition of DNA methylation and histone H3 lysine 9 dimethylation by 5-aza-CdR and G9Ai results in synergistic upregulation of ERV and induces an antiviral response, serving as a basis for exploring this novel combination treatment in patients with ovarian cancer.