Plasma consists of DNA released from multiple tissues within the body. Using genome-wide bisulfite sequencing of plasma DNA and deconvolution of the sequencing data with reference to methylation profiles of different tissues, we developed a general approach for studying the major tissue contributors to the circulating DNA pool. We tested this method in pregnant women, patients with hepatocellular carcinoma, and subjects following bone marrow and liver transplantation. In most subjects, white blood cells were the predominant contributors to the circulating DNA pool. The placental contributions in the plasma of pregnant women correlated with the proportional contributions as revealed by fetal-specific genetic markers. The graft-derived contributions to the plasma in the transplant recipients correlated with those determined using donor-specific genetic markers. Patients with hepatocellular carcinoma showed elevated plasma DNA contributions from the liver, which correlated with measurements made using tumor-associated copy number aberrations. In hepatocellular carcinoma patients and in pregnant women exhibiting copy number aberrations in plasma, comparison of methylation deconvolution results using genomic regions with different copy number status pinpointed the tissue type responsible for the aberrations. In a pregnant woman diagnosed as having follicular lymphoma during pregnancy, methylation deconvolution indicated a grossly elevated contribution from B cells into the plasma DNA pool and localized B cells as the origin of the copy number aberrations observed in plasma. This method may serve as a powerful tool for assessing a wide range of physiological and pathological conditions based on the identification of perturbed proportional contributions of different tissues into plasma.
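The deconvolution idea can be illustrated by modelling the plasma methylation level at each marker as a proportional mixture of reference tissue methylation levels, with contributions that are non-negative and sum to 1. The sketch below is only an illustration under stated assumptions: the abstract does not name a solver, so projected gradient descent is swapped in here, and the reference values, tissue labels, and function names (`project_to_simplex`, `deconvolve`) are hypothetical toy choices.

```python
def project_to_simplex(values):
    """Euclidean projection onto {x : x >= 0, sum(x) = 1}."""
    ordered = sorted(values, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        candidate = (cumulative - 1.0) / count
        if value - candidate > 0:
            theta = candidate
    return [max(v - theta, 0.0) for v in values]


def deconvolve(reference, plasma, steps=5000):
    """Estimate tissue fractions by constrained least squares.

    reference[m][t] is the methylation level of marker m in tissue t;
    plasma[m] is the methylation level of marker m observed in plasma.
    """
    n_tissues = len(reference[0])
    # Conservative step size: 1 / (squared Frobenius norm of the reference matrix).
    step = 1.0 / sum(v * v for row in reference for v in row)
    fractions = [1.0 / n_tissues] * n_tissues
    for _ in range(steps):
        residuals = [sum(r * f for r, f in zip(row, fractions)) - observed
                     for row, observed in zip(reference, plasma)]
        gradient = [sum(res * row[t] for res, row in zip(residuals, reference))
                    for t in range(n_tissues)]
        fractions = project_to_simplex(
            [f - step * g for f, g in zip(fractions, gradient)])
    return fractions


# Hypothetical toy data: columns are white blood cells, liver, placenta.
REFERENCE = [
    [0.9, 0.1, 0.2],
    [0.1, 0.8, 0.3],
    [0.2, 0.2, 0.9],
    [0.7, 0.6, 0.1],
    [0.5, 0.1, 0.6],
]
# Plasma values simulated as a 70% / 10% / 20% mixture of the columns above.
PLASMA = [0.68, 0.21, 0.34, 0.57, 0.48]

if __name__ == "__main__":
    print(deconvolve(REFERENCE, PLASMA))
```

With real data the marker set would be far larger and measurement noise would prevent an exact fit; the constraint handling is the part this sketch is meant to show.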
We explored the presence of extrachromosomal circular DNA (eccDNA) in the plasma of pregnant women. Through sequencing following either restriction enzyme or Tn5 transposase treatment, we identified eccDNA molecules in the plasma of pregnant women. These eccDNA molecules showed bimodal size distributions peaking at ∼202 and ∼338 bp with distinct 10-bp periodicity observed throughout the size ranges within both peaks, suggestive of their nucleosomal origin. Also, the predominance of the 338-bp peak of eccDNA indicated that eccDNA had a larger size distribution than linear DNA in human plasma. Moreover, eccDNA molecules of fetal origin were shorter than the maternal eccDNA molecules. Genomic annotation of the overall population of eccDNA molecules revealed a preference of these molecules to be generated from 5′-untranslated regions (5′-UTRs), exonic regions, and CpG island regions. Two sets of trinucleotide repeat motifs flanking the junctional sites of eccDNA supported multiple possible models for eccDNA generation. This work highlights the topologic analysis of plasma DNA, which is an emerging direction for circulating nucleic acid research and applications.
Cell-free DNA (cfDNA) in human plasma is a class of biomarkers with many current and potential future diagnostic applications. Recent studies have shown that cfDNA molecules are not randomly fragmented and possess information related to their tissues of origin. Pathologies causing death of cells from particular tissues result in perturbations in the relative distribution of DNA from the affected tissues. Such tissue-of-origin analysis is particularly useful in the development of liquid biopsies for cancer. It is therefore of value to accurately determine the relative contributions of the tissues to the plasma DNA pool in a simultaneous manner. In this work, we report that in open chromatin regions, cfDNA molecules show characteristic fragmentation patterns reflected by sequencing coverage imbalance and differentially phased fragment end signals. The latter refers to differences in the read densities of sequences corresponding to the orientation of the upstream and downstream ends of cfDNA molecules in relation to the reference genome. Such cfDNA fragmentation patterns preferentially occur in tissue-specific open chromatin regions where the corresponding tissues contributed DNA into the plasma. Quantitative analyses of such signals allow measurement of the relative contributions of various tissues toward the plasma DNA pool. These findings were validated by plasma DNA sequencing data obtained from pregnant women, organ transplantation recipients, and cancer patients. Orientation-aware plasma DNA fragmentation analysis therefore has potential diagnostic applications in noninvasive prenatal testing, organ transplantation monitoring, and cancer liquid biopsy.
Cell-free DNA in human plasma is nonrandomly fragmented and reflects genomewide nucleosomal organization. Previous studies had demonstrated tissue-specific preferred end sites in plasma DNA of pregnant women. In this study, we performed integrative analysis of preferred end sites with the size characteristics of plasma DNA fragments. We mined the preferred end sites in short and long plasma DNA molecules separately and found that these “size-tagged” ends showed improved accuracy in fetal DNA fraction estimation and enhanced noninvasive fetal trisomy 21 testing. Further analysis revealed that the fetal and maternal preferred ends were generated from different locations within the nucleosomal structure. Hence, fetal DNA was frequently cut within the nucleosome core while maternal DNA was mostly cut within the linker region. We further demonstrated that the nucleosome accessibility in placental cells was higher than that for white blood cells, which might explain the difference in the cutting positions and the shortness of fetal DNA in maternal plasma. Interestingly, short and long size-tagged ends were also observable in the plasma of nonpregnant healthy subjects and demonstrated size differences similar to those in the pregnant samples. Because the nonpregnant samples did not contain fetal DNA, the data suggested that the interrelationship of preferred DNA ends, chromatin accessibility, and plasma DNA size profile is likely a general one, extending beyond the context of pregnancy. Plasma DNA fragment end patterns have thus shed light on production mechanisms and show utility in future developments in plasma DNA-based noninvasive molecular diagnostics.
5-Methylcytosine (5mC) is an important type of epigenetic modification. Bisulfite sequencing (BS-seq) has limitations, such as severe DNA degradation. Using single molecule real-time sequencing, we developed a methodology to directly examine 5mC. This approach holistically examined kinetic signals of a DNA polymerase (including interpulse duration and pulse width) and sequence context for every nucleotide within a measurement window, termed the holistic kinetic (HK) model. The measurement window of each analyzed double-stranded DNA molecule comprised 21 nucleotides with a cytosine in a CpG site in the center. We used amplified DNA (unmethylated) and M.SssI-treated DNA (methylated) (M.SssI being a CpG methyltransferase) to train a convolutional neural network. The area under the curve for differentiating methylation states using such samples was up to 0.97. The sensitivity and specificity for genome-wide 5mC detection at single-base resolution reached 90% and 94%, respectively. The HK model was then tested on human-mouse hybrid fragments in which each member of the hybrid had a different methylation status. The model was also tested on human genomic DNA molecules extracted from various biological samples, such as buffy coat, placental, and tumoral tissues. The overall methylation levels deduced by the HK model were well correlated with those by BS-seq (r = 0.99; P < 0.0001) and allowed the measurement of allele-specific methylation patterns in imprinted genes. Taken together, this methodology has provided a system for simultaneous genome-wide genetic and epigenetic analyses.
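The geometry of the measurement window (21 nucleotides with the CpG cytosine at the center) can be made concrete with a small sketch. It covers only the sequence-context part of the window on a single strand; the kinetic signals (interpulse duration and pulse width) and the convolutional neural network are not modelled. The function name `cpg_windows` and the `flank` parameter are hypothetical.

```python
def cpg_windows(sequence, flank=10):
    """Return (position, window) pairs for CpG sites in `sequence`.

    Each window has 2 * flank + 1 nucleotides (21 by default) with the
    CpG cytosine at its center. CpG sites too close to either end of the
    sequence to fill a complete window are skipped.
    """
    sequence = sequence.upper()
    windows = []
    for i in range(flank, len(sequence) - flank):
        if sequence[i] == "C" and sequence[i + 1:i + 2] == "G":
            windows.append((i, sequence[i - flank:i + flank + 1]))
    return windows


if __name__ == "__main__":
    toy = "A" * 10 + "CG" + "T" * 10  # hypothetical toy read
    for position, window in cpg_windows(toy):
        print(position, window, len(window))
```

In a fuller treatment each position in the window would carry its kinetic measurements alongside the base identity, for both strands of the molecule.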
Cell-free fetal DNA is present in the plasma of pregnant women. It consists of short DNA fragments among primarily maternally derived DNA fragments. We sequenced a maternal plasma DNA sample at up to 65-fold genomic coverage. We showed that the entire fetal and maternal genomes were represented in maternal plasma at a constant relative proportion. Plasma DNA molecules showed a predictable fragmentation pattern reminiscent of nuclease-cleaved nucleosomes, with the fetal DNA showing a reduction in a 166-base pair (bp) peak relative to a 143-bp peak, when compared with maternal DNA. We constructed a genome-wide genetic map and determined the mutational status of the fetus from the maternal plasma DNA sequences and from information about the paternal genotype and maternal haplotype. Our study suggests the feasibility of using genome-wide scanning to diagnose fetal genetic disorders prenatally in a noninvasive way.
Noninvasive prenatal testing using fetal DNA in maternal plasma is an actively researched area. The current generation of tests using massively parallel sequencing is based on counting plasma DNA sequences originating from different genomic regions. In this study, we explored a different approach that is based on the use of DNA fragment size as a diagnostic parameter. This approach is dependent on the fact that circulating fetal DNA molecules are generally shorter than the corresponding maternal DNA molecules. First, we performed plasma DNA size analysis using paired-end massively parallel sequencing and microchip-based capillary electrophoresis. We demonstrated that the fetal DNA fraction in maternal plasma could be deduced from the overall size distribution of maternal plasma DNA. The fetal DNA fraction is a critical parameter affecting the accuracy of noninvasive prenatal testing using maternal plasma DNA. Second, we showed that fetal chromosomal aneuploidy could be detected by observing an aberrant proportion of short fragments from an aneuploid chromosome in the paired-end sequencing data. Using this approach, we detected fetal trisomy 21 and trisomy 18 with 100% sensitivity (T21: 36/36; T18: 27/27) and 100% specificity (non-T21: 88/88; non-T18: 97/97). For trisomy 13, the sensitivity and specificity were 95.2% (20/21) and 99% (102/103), respectively. For monosomy X, the sensitivity and specificity were both 100% (10/10 and 8/8). Thus, this study establishes the principle of size-based molecular diagnostics using plasma DNA. This approach has potential applications beyond noninvasive prenatal testing to areas such as oncology and transplantation monitoring.
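The size-based principle can be sketched as comparing the proportion of short fragments on a test chromosome with that on reference chromosomes. The 150-bp cutoff, the function names, and the toy fragment sizes below are assumptions made for illustration; the abstract does not state the cutoff or the exact statistic used.

```python
def short_fraction(sizes, cutoff=150):
    """Proportion of fragments whose size is at or below `cutoff` (bp)."""
    if not sizes:
        raise ValueError("at least one fragment size is required")
    return sum(1 for size in sizes if size <= cutoff) / len(sizes)


def delta_short_fraction(test_sizes, reference_sizes, cutoff=150):
    """Difference in short-fragment proportion: test minus reference.

    Because fetal DNA is generally shorter than maternal DNA, extra fetal
    material from a trisomic chromosome would be expected to push this
    difference upward, and a monosomy would be expected to push it downward.
    """
    return (short_fraction(test_sizes, cutoff)
            - short_fraction(reference_sizes, cutoff))


if __name__ == "__main__":
    # Hypothetical fragment sizes (bp) from paired-end alignments.
    test_chromosome = [140, 145, 150, 166, 170]
    reference_chromosomes = [143, 166, 166, 170, 180]
    print(delta_short_fraction(test_chromosome, reference_chromosomes))
```

In practice the difference would be compared against its distribution in known euploid pregnancies before calling a sample aberrant.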
With the advent of massively parallel sequencing (MPS), DNA analysis can now be performed in a genomewide manner. Recent studies have demonstrated the high precision of MPS for quantifying fetal DNA in maternal plasma. In addition, paired-end sequencing can be used to determine the size of each sequenced DNA fragment. We applied MPS in a high-resolution investigation of the clearance profile of circulating fetal DNA.
Using paired-end MPS, we analyzed serial samples of maternal plasma collected from 13 women after cesarean delivery. We also studied the transrenal excretion of circulating fetal DNA in 3 of these individuals by analyzing serial urine samples collected after delivery.
The clearance of circulating fetal DNA occurred in 2 phases, with different kinetics. The initial rapid phase had a mean half-life of approximately 1 h, whereas the subsequent slow phase had a mean half-life of approximately 13 h. The final disappearance of circulating fetal DNA occurred at about 1 to 2 days postpartum. Although transrenal excretion was involved in the clearance of circulating fetal DNA, it was not the major route. Furthermore, we observed significant changes in the size profiles of circulating maternal DNA after delivery, but we did not observe such changes in circulating fetal DNA.
MPS of maternal plasma and urinary DNA permits high-resolution study of the clearance profile of circulating fetal DNA.
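The two-phase clearance reported above can be pictured with a simple two-compartment decay curve. This is a sketch under assumptions, not the authors' model: a sum of two exponentials is one common way to represent biphasic kinetics, and `rapid_weight` (the share of fetal DNA cleared in the rapid phase) is a hypothetical parameter that the abstract does not report. Only the approximate half-lives of 1 h and 13 h come from the text.

```python
def remaining_fraction(hours, rapid_weight,
                       rapid_half_life=1.0, slow_half_life=13.0):
    """Fraction of the delivery-time fetal DNA still circulating.

    Models clearance as the sum of a rapid and a slow exponential phase.
    `rapid_weight` must lie between 0 and 1 and is an assumed value.
    """
    if not 0.0 <= rapid_weight <= 1.0:
        raise ValueError("rapid_weight must be between 0 and 1")
    rapid = rapid_weight * 0.5 ** (hours / rapid_half_life)
    slow = (1.0 - rapid_weight) * 0.5 ** (hours / slow_half_life)
    return rapid + slow


if __name__ == "__main__":
    # Hypothetical split: 80% of fetal DNA cleared in the rapid phase.
    for hours in (0, 1, 6, 24, 48):
        print(hours, round(remaining_fraction(hours, rapid_weight=0.8), 4))
```

Varying `rapid_weight` shows how strongly the early part of the curve depends on a quantity that would have to be estimated from the serial samples.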
Chromosomal aneuploidy is the major reason why couples opt for prenatal diagnosis. Current methods for definitive diagnosis rely on invasive procedures, such as chorionic villus sampling and amniocentesis, and are associated with a risk of fetal miscarriage. Fetal DNA has been found in maternal plasma but exists as a minor fraction among a high background of maternal DNA. Hence, quantitative perturbations caused by an aneuploid chromosome in the fetal genome to the overall representation of sequences from that chromosome in maternal plasma would be small. Even with highly precise single molecule counting methods such as digital PCR, a large number of DNA molecules and hence maternal plasma volume would need to be analyzed to achieve the necessary analytical precision. Here we reasoned that instead of using approaches that target specific gene loci, the use of a locus-independent method would greatly increase the number of target molecules from the aneuploid chromosome that could be analyzed within the same fixed volume of plasma. Hence, we used massively parallel genomic sequencing to quantify maternal plasma DNA sequences for the noninvasive prenatal detection of fetal trisomy 21. Twenty-eight first and second trimester maternal plasma samples were tested. All 14 trisomy 21 fetuses and 14 euploid fetuses were correctly identified. Massively parallel plasma DNA sequencing represents a new approach that is potentially applicable to all pregnancies for the noninvasive prenatal diagnosis of fetal chromosomal aneuploidies.
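A short calculation shows why the perturbation is small. If a fraction f of plasma DNA is fetal and the fetus carries three copies of a chromosome instead of two, the amount of plasma DNA from that chromosome rises by a factor of 1 + f/2. The sketch below applies that factor to a baseline read fraction; the baseline of 1.3% and the fetal fraction of 10% are hypothetical illustration values, not figures from the abstract.

```python
def expected_trisomic_fraction(euploid_fraction, fetal_fraction):
    """Expected read fraction of a chromosome when the fetus is trisomic.

    euploid_fraction: share of reads from the chromosome in a euploid
        pregnancy (assumed known from reference samples).
    fetal_fraction: share of plasma DNA that is fetal, between 0 and 1.
    """
    # Maternal DNA contributes 2 copies and fetal DNA 3 copies, so the
    # chromosome's dosage scales by (1 - f) + 1.5 * f = 1 + f / 2.
    scaled = euploid_fraction * (1.0 + fetal_fraction / 2.0)
    # Renormalize because the total amount of DNA also grows slightly.
    return scaled / (1.0 - euploid_fraction + scaled)


if __name__ == "__main__":
    baseline = 0.013  # hypothetical euploid read fraction
    trisomic = expected_trisomic_fraction(baseline, fetal_fraction=0.10)
    print(trisomic, trisomic / baseline - 1.0)
```

At a 10% fetal fraction the relative increase is just under 5%, which is why a large number of counted molecules is needed to separate trisomic from euploid samples.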
Massively parallel sequencing of DNA molecules in the plasma of pregnant women has been shown to allow accurate and noninvasive prenatal detection of fetal trisomy 21. However, whether the sequencing approach is as accurate for the noninvasive prenatal diagnosis of trisomy 13 and 18 is unclear due to the lack of data from a large sample set. We studied 392 pregnancies, among which 25 involved a trisomy 13 fetus and 37 involved a trisomy 18 fetus, by massively parallel sequencing. By using our previously reported standard z-score approach, we demonstrated that this approach could identify 36.0% and 73.0% of trisomy 13 and 18 at specificities of 92.4% and 97.2%, respectively. We aimed to improve the detection of trisomy 13 and 18 by using a non-repeat-masked reference human genome instead of a repeat-masked one to increase the number of aligned sequence reads for each sample. We then applied a bioinformatics approach to correct GC content bias in the sequencing data. With these measures, we detected all (25 out of 25) trisomy 13 fetuses at a specificity of 98.9% (261 out of 264 non-trisomy 13 cases), and 91.9% (34 out of 37) of the trisomy 18 fetuses at 98.0% specificity (247 out of 252 non-trisomy 18 cases). These data indicate that with appropriate bioinformatics analysis, noninvasive prenatal diagnosis of trisomy 13 and trisomy 18 by maternal plasma DNA sequencing is achievable.
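A z-score of the kind referred to above can be sketched as the distance of a test sample's chromosomal read fraction from the mean of euploid reference samples, in units of the reference standard deviation. The function name, the reference values, and the cutoff of 3 are assumptions for illustration; the abstract does not give the cutoff, and the GC correction and alignment steps it describes are not modelled.

```python
from statistics import mean, stdev


def chromosome_z_score(test_fraction, reference_fractions):
    """Z-score of a chromosome's read fraction against euploid references."""
    mu = mean(reference_fractions)
    sigma = stdev(reference_fractions)  # sample standard deviation
    if sigma == 0:
        raise ValueError("reference fractions must not all be identical")
    return (test_fraction - mu) / sigma


if __name__ == "__main__":
    # Hypothetical chromosome 13 read fractions from euploid pregnancies.
    euploid_reference = [0.0360, 0.0362, 0.0358, 0.0361, 0.0359]
    cutoff = 3.0  # assumed threshold for calling over-representation
    for label, fraction in (("sample A", 0.0368), ("sample B", 0.0360)):
        z = chromosome_z_score(fraction, euploid_reference)
        print(label, round(z, 2), "flagged" if z > cutoff else "not flagged")
```

Reducing the spread of the reference fractions, for example by correcting GC bias or aligning more reads, increases the z-score of a truly trisomic sample without changing its read fraction, which is the motivation for the bioinformatics measures described in the abstract.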