Human aging cannot be fully understood in terms of the constrained genetic setting. Epigenetic drift is an alternative means of explaining age-associated alterations. To address this issue, we performed whole-genome bisulfite sequencing (WGBS) of newborn and centenarian genomes. The centenarian DNA had a lower DNA methylation content and a reduced correlation in the methylation status of neighboring cytosine-phosphate-guanine sites (CpGs) throughout the genome in comparison with the more homogeneously methylated newborn DNA. The more hypomethylated CpGs observed in the centenarian DNA compared with the neonate covered all genomic compartments, such as promoters, exonic, intronic, and intergenic regions. For regulatory regions, the most hypomethylated sequences in the centenarian DNA were present mainly at CpG-poor promoters and in tissue-specific genes, whereas a greater level of DNA methylation was observed in CpG island promoters. We extended the study to a larger cohort of newborn and nonagenarian samples using a 450,000 CpG-site DNA methylation microarray that reinforced the observation of more hypomethylated DNA sequences in the advanced age group. WGBS and 450,000 analyses of middle-age individuals demonstrated DNA methylomes in the crossroad between the newborn and the nonagenarian/centenarian groups. Our study constitutes a unique DNA methylation analysis of the extreme points of human life at a single-nucleotide resolution level.
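The "reduced correlation in the methylation status of neighboring CpGs" can be illustrated with a small sketch. It assumes the quantity of interest is the Pearson correlation between each CpG's methylation level and that of the next CpG along the genome; the abstract does not state the exact metric used, and the function name `neighbor_correlation` and the toy profiles are hypothetical.

```python
import math

def neighbor_correlation(levels):
    """Pearson correlation between each CpG's methylation level and its
    immediate downstream neighbor (an assumed metric, for illustration)."""
    x, y = levels[:-1], levels[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy profiles (not data from the study):
# long homogeneous blocks versus a strictly alternating pattern.
blocky = [1, 1, 1, 1, 0, 0, 0, 0]
alternating = [1, 0, 1, 0, 1, 0, 1, 0]

r_blocky = neighbor_correlation(blocky)
r_alternating = neighbor_correlation(alternating)
print(r_blocky, r_alternating)
```

In this toy setting the blocky profile yields a clearly positive neighbor correlation and the alternating profile a negative one, which is the sense in which a more homogeneous methylome shows higher correlation between adjacent sites.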
Abstract
Motivation
DNA methylation is essential for normal embryogenesis and development in mammals and can be captured at single base pair resolution by whole genome bisulfite sequencing (WGBS). Currently available analysis tools are rapidly becoming outdated, as they lack the functionality and efficiency needed to handle the large amounts of data now commonly generated.
Results
We developed gemBS, a fast high-throughput bioinformatics pipeline specifically designed for large scale BS-Seq analysis that combines a high performance BS-mapper (GEM3) and a variant caller specifically for BS-Seq data (BScall). gemBS provides genotype information and methylation estimates for all genomic cytosines in different contexts (CpG and non-CpG) and a set of quality reports for comprehensive and reproducible analysis. gemBS is highly modular and can be easily automated, while producing robust and accurate results.
Availability and implementation
gemBS is released under the GNU GPLv3+ license. Source code and documentation are freely available from www.statgen.cat/gemBS.
Supplementary information
Supplementary data are available at Bioinformatics online.
An increased level of Lp(a) lipoprotein has been identified as a risk factor for coronary artery disease that is highly heritable. The genetic determinants of the Lp(a) lipoprotein level and their relevance for the risk of coronary disease are incompletely understood.
We used a novel gene chip containing 48,742 single-nucleotide polymorphisms (SNPs) in 2100 candidate genes to test for associations in 3145 case subjects with coronary disease and 3352 control subjects. Replication was tested in three independent populations involving 4846 additional case subjects with coronary disease and 4594 control subjects.
Three chromosomal regions (6q26-27, 9p21, and 1p13) were strongly associated with the risk of coronary disease. The LPA locus on 6q26-27 encoding Lp(a) lipoprotein had the strongest association. We identified a common variant (rs10455872) at the LPA locus with an odds ratio for coronary disease of 1.70 (95% confidence interval [CI], 1.49 to 1.95) and another independent variant (rs3798220) with an odds ratio of 1.92 (95% CI, 1.48 to 2.49). Both variants were strongly associated with an increased level of Lp(a) lipoprotein, a reduced copy number in LPA (which determines the number of kringle IV-type 2 repeats), and a small Lp(a) lipoprotein size. Replication studies confirmed the effects of both variants on the Lp(a) lipoprotein level and the risk of coronary disease. A meta-analysis showed that with a genotype score involving both LPA SNPs, the odds ratios for coronary disease were 1.51 (95% CI, 1.38 to 1.66) for one variant and 2.57 (95% CI, 1.80 to 3.67) for two or more variants. After adjustment for the Lp(a) lipoprotein level, the association between the LPA genotype score and the risk of coronary disease was abolished.
We identified two LPA variants that were strongly associated with both an increased level of Lp(a) lipoprotein and an increased risk of coronary disease. Our findings provide support for a causal role of Lp(a) lipoprotein in coronary disease.
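For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained, the following is a minimal sketch using a 2x2 carrier-by-disease table and the standard log-scale (Woolf) interval. The counts are hypothetical and the function name `odds_ratio_ci` is illustrative; the study's own estimates may well have come from a regression model with covariates, which this sketch does not reproduce.

```python
import math

def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Odds ratio with a Wald confidence interval computed on the log scale."""
    odds_ratio = (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / case_carriers + 1 / case_noncarriers
                          + 1 / control_carriers + 1 / control_noncarriers)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, not taken from the study.
odds_ratio, lower, upper = odds_ratio_ci(200, 800, 100, 900)
print(f"OR = {odds_ratio:.2f} (95% CI, {lower:.2f} to {upper:.2f})")
```

An interval that excludes 1, as in the reported results for both LPA variants, indicates an association unlikely to be explained by sampling variation alone.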
Natural killer (NK) cells are lymphocytes involved in antimicrobial and antitumoral immune responses. Using N-ethyl-N-nitrosourea mutagenesis in mice, we identified a mutant with increased resistance to viral infections because of the presence of hyperresponsive NK cells. Whole-genome sequencing and functional analysis revealed a loss-of-function mutation in the Ncr1 gene encoding the activating receptor NKp46. The down-regulation of NK cell activity by NKp46 was associated with the silencing of the Helios transcription factor in NK cells. NKp46 was critical for the subsequent development of antiviral and antibacterial T cell responses, which suggests that the regulation of NK cell function by NKp46 allows for the optimal development of adaptive immune responses. NKp46 blockade enhanced NK cell reactivity in vivo, which could enable the design of immunostimulation strategies in humans.
The cost of whole-genome bisulfite sequencing (WGBS) remains a bottleneck for many studies and it is therefore imperative to extract as much information as possible from a given dataset. This is particularly important because even at the recommended 30X coverage for reference methylomes, up to 50% of high-resolution features such as differentially methylated positions (DMPs) cannot be called with current methods as determined by saturation analysis. To address this limitation, we have developed a tool that dynamically segments WGBS methylomes into blocks of comethylation (COMETs) from which lost information can be recovered in the form of differentially methylated COMETs (DMCs). Using this tool, we demonstrate recovery of ∼30% of the lost DMP information content as DMCs even at very low (5X) coverage. This constitutes twice the amount that can be recovered using an existing method based on differentially methylated regions (DMRs). In addition, we explored the relationship between COMETs and haplotypes in lymphoblastoid cell lines of African and European origin. Using best fit analysis, we show COMETs to be correlated in a population-specific manner, suggesting that this type of dynamic segmentation may be useful for integrated (epi)genome-wide association studies in the future.
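The idea of segmenting a methylome into blocks of similar methylation can be sketched with a deliberately simple greedy rule: start a new block whenever two adjacent CpGs differ by more than a threshold. This is not the segmentation algorithm of the tool described above, which the abstract does not specify; the function name `segment_comethylation`, the `max_diff` threshold, and the example values are all assumptions made for illustration.

```python
def segment_comethylation(levels, max_diff=0.2):
    """Greedy threshold segmentation (illustrative stand-in only).

    Returns half-open (start, end) index ranges; a new block begins when
    adjacent methylation levels differ by more than max_diff.
    """
    if not levels:
        return []
    blocks = []
    start = 0
    for i in range(1, len(levels)):
        if abs(levels[i] - levels[i - 1]) > max_diff:
            blocks.append((start, i))
            start = i
    blocks.append((start, len(levels)))
    return blocks

# Hypothetical per-CpG methylation levels in genomic order.
levels = [0.9, 0.85, 0.95, 0.2, 0.1, 0.15, 0.8]
blocks = segment_comethylation(levels)
print(blocks)
```

Testing for differences at the level of such blocks pools evidence across neighboring CpGs, which is the intuition behind recovering signal that per-position calls miss at low coverage.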
Background/Aims: In primary biliary cirrhosis (PBC), pathogenesis is influenced by genetic factors that remain poorly elucidated up to now. We investigated the impact of sequence diversity in candidate genes involved in immunity (CTLA-4 and TNFα), in bile formation (10 hepatobiliary transporter genes) and in the adaptive response to cholestasis (three nuclear receptor genes) on the susceptibility and severity of PBC. Methods: A total of 42 Ht SNPs were identified and compared in 258 PBC patients and two independent groups of 286 and 269 healthy controls. All participants were white continental individuals with French ancestry. Results: Ht SNPs of CTLA-4 and TNFα genes were significantly associated with susceptibility to PBC. The progression rate of liver disease under ursodeoxycholic acid (UDCA) therapy was significantly linked to SNPs of TNFα and SLC4A2/anion exchanger 2 (AE2) genes. A multivariate Cox regression analysis including clinical and biochemical parameters showed that the SLC4A2/AE2 variant was an independent prognostic factor. Conclusions: These data point to a primary role of genes encoding regulators of the immune system in the susceptibility to PBC. They also demonstrate that allelic variations in TNFα and SLC4A2/AE2 have a significant impact on the evolutive profile of PBC under UDCA therapy.
The location of a schizophrenia susceptibility locus at chromosome 22q11 has been suggested by genome-wide linkage studies. Additional support was provided by the observation of a higher-than-expected frequency of 22q11 microdeletions in patients with schizophrenia and the demonstration that ≈20-30% of individuals with 22q11 microdeletions develop schizophrenia or schizoaffective disorder in adolescence and adulthood. Analysis of the extent of these microdeletions by using polymorphic markers afforded further refinement of this locus to a region of ≈1.5 Mb. Recently, a high rate of 22q11 microdeletions was also reported for a cohort of 47 patients with Childhood Onset Schizophrenia, a rare and severe form of schizophrenia with onset by age 13. It is therefore likely that this 1.5-Mb region contains one or more genes that predispose to schizophrenia. In three independent samples, we provide evidence for a contribution of the PRODH2/DGCR6 locus in 22q11-associated schizophrenia. We also uncover an unusual pattern of PRODH2 gene variation that mimics the sequence of a linked pseudogene. Several of the pseudogene-like variants we identified result in missense changes at conserved residues and may prevent synthesis of a fully functional enzyme. Our results have implications for understanding the genetic basis of the 22q11-associated psychiatric phenotypes and provide further insights into the genomic instability of this region.
An increased prevalence of microdeletions at the 22q11 locus has been reported in samples of patients with schizophrenia. 22q11 microdeletions represent the highest known genetic risk factor for the development of schizophrenia, second only to that of the monozygotic cotwin of an affected individual or the offspring of two schizophrenic parents. It is therefore clear that a schizophrenia susceptibility locus maps to chromosome 22q11. In light of evidence for suggestive linkage for schizophrenia in this region, we hypothesized that, whereas deletions of chromosome 22q11 may account for only a small proportion of schizophrenia cases in the general population (up to ≈2%), nondeletion variants of individual genes within the 22q11 region may make a larger contribution to susceptibility to schizophrenia in the wider population. By studying a dense collection of markers (average one single nucleotide polymorphism/20 kb over 1.5 Mb) in the vicinity of the 22q11 locus, in both family- and population-based samples, we present here results consistent with this assumption. Moreover, our results are consistent with contribution from more than one gene to the strikingly increased disease risk associated with this locus. Finer-scale haplotype mapping has identified two subregions within the 1.5-Mb locus that are likely to harbor candidate schizophrenia susceptibility genes.
An investigation into fine-scale European population structure was carried out using high-density genetic variation on nearly 6000 individuals originating from across Europe. The individuals were collected as control samples and were genotyped with more than 300 000 SNPs in genome-wide association studies using the Illumina Infinium platform. A major East-West gradient from Russian (Moscow) samples to Spanish samples was identified as the first principal component (PC) of the genetic diversity. The second PC identified a North-South gradient from Norway and Sweden to Romania and Spain. Variation of frequencies at markers in three separate genomic regions, surrounding LCT, HLA and HERC2, were strongly associated with this gradient. The next 18 PCs also accounted for a significant proportion of genetic diversity observed in the sample. We present a method to predict the ethnic origin of samples by comparing the sample genotypes with those from a reference set of samples of known origin. These predictions can be performed using just summary information on the known samples, and individual genotype data are not required. We discuss issues raised by these data and analyses for association studies including the matching of case-only cohorts to appropriate pre-collected control samples for genome-wide association studies.
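One way to predict origin from summary information alone can be sketched as follows. The abstract does not describe the actual method, so this is an assumed stand-in: a generic likelihood classifier that uses only per-population allele frequencies, treats markers as independent, and assumes Hardy-Weinberg proportions. The names `genotype_log_likelihood`, `predict_origin`, `pop_A`, `pop_B`, and all frequencies are hypothetical.

```python
import math

def genotype_log_likelihood(genotypes, allele_freqs):
    """Log-likelihood of genotypes (0, 1 or 2 copies of the counted allele)
    given per-marker allele frequencies, under Hardy-Weinberg proportions."""
    total = 0.0
    for copies, p in zip(genotypes, allele_freqs):
        p = min(max(p, 1e-6), 1 - 1e-6)  # avoid log(0) at fixed markers
        q = 1 - p
        prob = (q * q, 2 * p * q, p * p)[copies]
        total += math.log(prob)
    return total

def predict_origin(genotypes, reference_freqs):
    """Return the reference population with the highest log-likelihood."""
    return max(reference_freqs,
               key=lambda pop: genotype_log_likelihood(genotypes,
                                                       reference_freqs[pop]))

# Hypothetical summary data: allele frequencies at three markers.
reference_freqs = {
    "pop_A": [0.1, 0.8, 0.3],
    "pop_B": [0.7, 0.2, 0.6],
}
sample = [0, 2, 1]
predicted = predict_origin(sample, reference_freqs)
print(predicted)
```

Only the reference allele frequencies enter the calculation, so no individual-level genotypes from the reference set are needed, in line with the property highlighted in the abstract.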