DNA methylation levels at cytosine-phosphate-guanine (CpG) sites with multimodal distributions among different samples have been reported recently. One possible explanation for such variability is that genetic variants might affect epigenetic variation. One obvious case is that mutations such as single-nucleotide polymorphisms (SNPs) interrupt CpG sites, resulting in different DNA methylation levels for different genotypes. However, the relationship between genetic variations and epigenetic differences has not been studied thoroughly, partially because of the lack of powerful and robust methods to survey genome-wide CpG sites with multimodal methylation level distributions (mmCpGs). In this article, we develop a Gaussian mixture-model clustering (GMMC)-based approach to systematically detect all mmCpGs across the genome based on the GAW20 data set. In total, 3785 and 3847 mmCpGs were identified in the pre- and posttreatment data sets, respectively. Analysis of the results shows that approximately 68% to 70% of mmCpGs detected from unrelated individuals either overlap directly with SNPs or are associated with nearby SNPs, suggesting a strong correlation between SNPs and mmCpGs. Comparison with an existing approach illustrates that our GMMC-based method is more consistent when the number of samples decreases. In conclusion, mmCpGs may reveal important connections between genetics and epigenetics, and they should be carefully identified and evaluated.
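To make the idea of mixture-model detection of multimodal CpG sites concrete, the following is a minimal sketch, not the authors' GMMC implementation. It assumes that a site is summarized by one methylation value per sample, that a one-dimensional Gaussian mixture is fitted by expectation-maximization, and that the number of components is chosen by the Bayesian information criterion (BIC). The function names, the variance floor, the cap of three components, and the simulated data are all illustrative assumptions.

```python
import math
import random


def log_likelihood(x, weights, means, variances):
    """Log-likelihood of the data under a 1D Gaussian mixture."""
    total = 0.0
    for v in x:
        dens = sum(
            w * math.exp(-((v - m) ** 2) / (2 * s)) / math.sqrt(2 * math.pi * s)
            for w, m, s in zip(weights, means, variances)
        )
        total += math.log(max(dens, 1e-300))  # guard against log(0)
    return total


def fit_gmm_1d(x, k, n_iter=100, min_var=1e-4):
    """Fit a k-component 1D Gaussian mixture by EM (assumed settings)."""
    xs = sorted(x)
    n = len(xs)
    # Start the means at evenly spaced quantiles of the data.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(xs) / n
    total_var = sum((v - grand_mean) ** 2 for v in xs) / n
    variances = [max(total_var / k ** 2, min_var)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value.
        resp = []
        for v in xs:
            dens = [
                w * math.exp(-((v - m) ** 2) / (2 * s)) / math.sqrt(2 * math.pi * s)
                for w, m, s in zip(weights, means, variances)
            ]
            total = max(sum(dens), 1e-300)
            resp.append([d / total for d in dens])
        # M-step: update weights, means, and (floored) variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * v for r, v in zip(resp, xs)) / nj
            var_j = sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, xs)) / nj
            variances[j] = max(var_j, min_var)
    return weights, means, variances


def best_n_components(x, max_k=3):
    """Return the number of components (1..max_k) with the lowest BIC."""
    n = len(x)
    best_k, best_bic = 1, math.inf
    for k in range(1, max_k + 1):
        params = fit_gmm_1d(x, k)
        n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
        bic = -2.0 * log_likelihood(x, *params) + n_params * math.log(n)
        if bic < best_bic:
            best_k, best_bic = k, bic
    return best_k


# Simulated (not GAW20) methylation values at one hypothetical site:
# three clusters, as might arise from three genotype groups.
rng = random.Random(0)
site = [rng.gauss(c, 0.03) for c in (0.1, 0.5, 0.9) for _ in range(100)]
print("components selected:", best_n_components(site))
```

Under these assumptions, a site would be flagged as a candidate mmCpG when the selected number of components is 2 or more. The variance floor is there to stop a component from collapsing onto a handful of near-identical values.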
Homozygosity disequilibrium (HD), indicating a nonrandom pattern of sizable runs of homozygosity that deviates from a random allocation of homozygous and heterozygous genotypes in the genome, is an important phenomenon in population genomics and medical genomics. We performed the first genome-wide study investigating the roles of HD in pharmacogenomics and pharmacoepigenomics by analyzing GAW20 data. We inferred whole-genome profiles of homozygosity intensities and performed genome-wide homozygosity association analyses to identify regions of HD associated with triglyceride (TG) response to fenofibrate by using LOHAS (Loss-of-Heterozygosity Analysis Suite) software. The analysis identified a region of HD contained in
at 20p12 to be significantly associated with TG response to fenofibrate. We also examined the common genetic component in TG and methylation responses to fenofibrate. The methylation response to fenofibrate was regarded as a methylation quantitative trait, and our methylation quantitative trait locus analysis identified a
-acting regulation association with marginal significance between the homozygosity intensity of
and the methylation response to fenofibrate. These findings may help delineate the genetic basis of pharmacogenomic and pharmacoepigenomic responses to fenofibrate intervention.
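As an illustration of what a "run of homozygosity" is, the sketch below scans one individual's ordered genotypes and reports stretches of consecutive homozygous calls. It is not the LOHAS algorithm and does not estimate homozygosity intensity; the 0/1/2 genotype coding, the function name, and the minimum run length are assumptions made for the example.

```python
def runs_of_homozygosity(genotypes, min_len=3):
    """Return inclusive (start, end) index pairs of homozygous runs.

    genotypes: minor-allele counts along a chromosome (0 or 2 = homozygous,
    1 = heterozygous). Runs shorter than min_len SNPs are ignored.
    """
    runs = []
    start = None  # index where the current homozygous stretch began
    for i, g in enumerate(genotypes):
        if g != 1:
            if start is None:
                start = i
        else:
            # A heterozygous call ends the current stretch, if any.
            if start is not None and i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    # Close a stretch that reaches the end of the chromosome.
    if start is not None and len(genotypes) - start >= min_len:
        runs.append((start, len(genotypes) - 1))
    return runs


# Hypothetical genotype vector for one individual.
example = [0, 2, 2, 0, 1, 0, 1, 2, 2, 2, 2]
print(runs_of_homozygosity(example))
```

In practice a run would usually be defined in terms of physical length and allowed to tolerate occasional genotyping errors; this sketch omits both for brevity.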
GAW20 provided participants with an opportunity to comprehensively examine genetic and epigenetic variation among related individuals in the context of drug treatment response. GAW20 used data from 188 families (n = 1105) participating in the Genetics of Lipid Lowering Drugs and Diet Network (GOLDN) study (clinicaltrials.gov identifier NCT00083369), which included CD4+ T-cell DNA methylation at 463,995 cytosine-phosphate-guanine (CpG) sites measured before and after a 3-week treatment with fenofibrate, single-nucleotide variation at 906,600 loci, metabolic syndrome components ascertained before and after the drug intervention, and relevant covariates. All GOLDN participants were of European descent, with an average age of 48 years. In addition, approximately half were women and approximately 40% met the diagnostic criteria for metabolic syndrome. Unique advantages of the GAW20 data set included longitudinal (3 weeks apart) measurements of DNA methylation, the opportunity to explore the contributions of both genotype and DNA methylation to the interindividual variability in drug treatment response, and the familial relationships between study participants. The principal disadvantage of the GAW20/GOLDN data was the spurious correlation between batch effects and fenofibrate effects on methylation, which arose because the pre- and posttreatment methylation data were generated and normalized separately, so that any attempt to remove time-dependent technical artifacts would also remove biologically meaningful changes brought on by fenofibrate. Despite this limitation, the GAW20 data set offered informative, multilayered omics data collected in a large population-based study of common disease traits, which resulted in creative approaches to integration and analysis of inherited human variation.
We conducted a genome-wide linkage scan to detect loci that influence the levels of fasting triglycerides in plasma. Fasting triglyceride levels were available at 4 time points (visits), 2 pre- and 2 post-fenofibrate intervention. Multipoint identity-by-descent (MIBD) matrices were derived from genotypes using IBDLD. Variance-component linkage analyses were then conducted using SOLAR (Sequential Oligogenic Linkage Analysis Routines). We found evidence of linkage (logarithm of odds [LOD] ≥ 3) at 5 chromosomal regions with triglyceride levels in plasma. The highest LOD scores were observed for linkage to the estimated genetic value (additive genetic component) of the log-normalized triglyceride levels in plasma. Our results suggest that a chromosome 10 locus at 37 cM (LOD
= 3.01, LOD
= 3.72) influences fasting triglyceride levels in plasma regardless of the fenofibrate intervention, and that loci on chromosomes 1 at 170 cM and 4 at 24 cM cease to affect triglyceride levels when fenofibrate is present, while the regions on chromosomes 6 at 136 to 162 cM and 11 at 39 to 40 cM appear to influence triglyceride levels in response to fenofibrate.
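For readers less familiar with LOD scores, the sketch below shows how a LOD score relates to the log-likelihoods of a linkage model and a null (no-linkage) model, and how a pointwise p-value is commonly obtained in variance-component linkage analysis. It assumes the standard result that, when a single variance component is tested on the boundary of its parameter space, the likelihood-ratio statistic follows a 50:50 mixture of a point mass at zero and a chi-square distribution with 1 degree of freedom. The function names are illustrative and are not SOLAR commands.

```python
import math


def lod_from_loglik(loglik_linkage, loglik_null):
    """LOD score: base-10 log of the likelihood ratio (inputs are natural logs)."""
    return (loglik_linkage - loglik_null) / math.log(10)


def pointwise_p_value(lod):
    """Pointwise p-value for a LOD score under the assumed 50:50 mixture null."""
    if lod <= 0:
        return 1.0
    lrt = 2.0 * math.log(10) * lod  # likelihood-ratio test statistic
    # Upper tail of chi-square(1) is erfc(sqrt(x / 2)); halve it for the mixture.
    return 0.5 * math.erfc(math.sqrt(lrt / 2.0))


for lod in (3.01, 3.72):  # the two LOD values quoted for chromosome 10
    print(lod, pointwise_p_value(lod))
```

These are pointwise values only; genome-wide significance additionally has to account for the number of positions scanned.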
The Genetic Analysis Workshop (GAW) presents an opportunity to collaboratively evaluate methodology relevant to current issues in genetic epidemiology. The GAW20 data combine real clinical trial data with fictitious epigenetic drug response endpoints. Considering the evidence suggesting that networks of interactions between many genes underlie complex phenotypes, we utilize differential methylation status to identify a relevant gene set for enrichment analysis and use this to infer potential biological function underlying drug response. We highlight the pertinence of considering the potential for widespread epistatic interactions in the absence of main effects, and present evidence of epistasis between single-nucleotide polymorphisms (SNPs) on the two RNA demethylases
and
.
Epigenome-wide association studies (EWAS) have traditionally focused on the association test of single epigenetic markers with complex traits. However, it is possible that multiple cytosine-phosphate-guanine (CpG) sites at the same locus could jointly exert their effects on human traits. Therefore, a region-based test that combines multiple markers could be more powerful. We used 2 different region-based tests to investigate the association between changes in DNA methylation and drug response: the median methylation level test (MMLT) and the sequence kernel association test (SKAT). No genes were found to be significantly associated with the drug response (for triglycerides, the false discovery rate ranged from 0.855 to 0.999; for high-density lipoprotein cholesterol, the false discovery rate ranged from 0.584 to 0.915). Further evidence is needed to explore the potential application of gene-level methylation association analysis.
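The two ingredients of a gene-level analysis of this kind can be sketched as follows: collapsing CpG-level methylation changes to a per-gene median, and converting gene-level p-values to false discovery rates. The abstract does not state which FDR procedure was used, so the Benjamini-Hochberg adjustment here is an assumption, as are the function names, the CpG-to-gene mapping, and the toy values.

```python
from collections import defaultdict
from statistics import median


def gene_median_change(cpg_change, cpg_to_gene):
    """Collapse per-CpG methylation changes to one median value per gene."""
    by_gene = defaultdict(list)
    for cpg, change in cpg_change.items():
        by_gene[cpg_to_gene[cpg]].append(change)
    return {gene: median(values) for gene, values in by_gene.items()}


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical CpG identifiers, gene labels, and p-values.
changes = {"cg1": 0.1, "cg2": 0.3, "cg3": -0.2}
mapping = {"cg1": "geneA", "cg2": "geneA", "cg3": "geneB"}
print(gene_median_change(changes, mapping))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```

In an actual analysis, the per-gene medians would be computed per individual and then tested against the drug-response phenotype; only the resulting gene-level p-values would be passed to the adjustment step.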
Genome-wide association studies often collect multiple phenotypes for complex diseases. Multivariate joint analyses have higher power to detect genetic variants compared with the marginal analysis of each phenotype and are also able to identify loci with pleiotropic effects. We extend the unified score-based association test to incorporate family structure, apply different approaches to analyze multiple traits in GAW20 real samples, and compare the results. Through simulation studies, we confirm that the Type I error rate of the pedigree-based unified score association test is appropriately controlled. In marginal analysis of triglyceride levels, we found 1 subgenome-wide significant variant on chromosome 6. Joint analyses identified several suggestive genome-wide significant signals, with the pedigree-based unified score association test yielding the greatest number of significant results.
DNA methylation plays an important role in normal human development and disease. In epigenome-wide association studies (EWAS), a univariate test for association between a phenotype and each cytosine-phosphate-guanine (CpG) site has been widely used. Given the number of CpG sites tested in EWAS, a stringent significance cutoff is required to adjust for multiple testing; in addition, multiple nearby CpG sites may be associated with the phenotype, which is ignored by a univariate test. These two factors may contribute to the power loss of a univariate test. As an alternative, we propose applying an adaptive gene-based test that is powerful in genome-wide association studies (GWAS), called
, to EWAS for simultaneous testing on multiple CpG sites within or near a gene. We show its application to the GAW20 methylation data set.
The GAW20 simulation data set is based upon the companion Genetics of Lipid Lowering Drugs and Diet Network (GOLDN) study fenofibrate clinical trial data set that forms the real data example for GAW20. The simulated data problem consists of 200 simulated replications of what might happen if we were to repeat the GOLDN clinical trial 200 independent times, for these exact same subjects, but using a new fictitious drug (called "genomethate") that has a pharmaco-epigenetic effect on triglyceride response. For each replication, the pre-genomethate values at visits 1 and 2 are constant (ie, pedigree structures, age, sex, all phenotypes, covariates, genome-wide association study (GWAS) genotypes, and visit 2 methylation values), the same as the real GOLDN data across all 200 replications. Only the post-genomethate treatment data (ie, methylation and triglyceride levels for visits 3 and 4) change across the 200 replications. We postulate a growth curve pharmaco-epigenetic response model, in which each patient's response to genomethate treatment is individualized, and is dependent upon their genotype as well as the methylation state for key genes.
An increasing number of studies are focused on the epigenetic regulation of DNA to affect gene expression without modifications to the DNA sequence. Methylation plays an important role in shaping disease traits; however, previous studies were mainly experiment-based, resulting in few reports that measured gene-methylation interaction effects via statistical means. In this study, we applied the data set adaptive W-test to measure gene-methylation interactions. Performance was evaluated by the ability to detect a given set of causal markers in the data set obtained from the GAW20. Results from simulation data analyses showed that the W-test was able to detect most markers. The method was also applied to chromosome 11 of the experimental data set and identified clusters of genes with neuronal and retinal functions, including
,
, and
. Genes from the
family were also identified; these genes are potentially related to the regulation of triglyceride levels. Our results suggest that the W-test could be an efficient and effective method to detect gene-methylation interactions. Furthermore, the identified genes suggest an interesting relationship between lipid levels and the etiology of neurological disorders.