Aims/hypothesis
Type 2 diabetes increases the risk of cardiovascular and renal complications, but early risk prediction could lead to timely intervention and better outcomes. Genetic information can be used to enable early detection of risk.
Methods
We developed a multi-polygenic risk score (multiPRS) that combines ten weighted PRSs (10 wPRS) composed of 598 SNPs associated with main risk factors and outcomes of type 2 diabetes, derived from summary statistics data of genome-wide association studies. The 10 wPRS, the first principal component of ethnicity, sex, age at onset and diabetes duration were included in a single logistic regression model to predict micro- and macrovascular outcomes in 4098 participants in the ADVANCE study and 17,604 individuals with type 2 diabetes in the UK Biobank study.
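As an illustration of the general idea, the following is a minimal sketch of how a weighted PRS is conventionally computed (a sum of effect-allele dosages weighted by GWAS effect sizes) and how several such scores plus covariates could feed a logistic model. It is not the authors' implementation: the function names, the toy dosages, the weights and the coefficients are all hypothetical and chosen only for demonstration.

```python
import numpy as np


def weighted_prs(dosages, weights):
    """Return one weighted PRS per individual.

    dosages: (n_individuals, n_snps) effect-allele counts (0, 1 or 2).
    weights: (n_snps,) per-SNP effect sizes, e.g. GWAS log odds ratios.
    """
    dosages = np.asarray(dosages, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if dosages.ndim != 2 or dosages.shape[1] != weights.shape[0]:
        raise ValueError("dosages must be (n_individuals, n_snps) matching weights")
    # Each score is the dot product of a dosage row with the weight vector.
    return dosages @ weights


def logistic_risk(features, coefficients, intercept):
    """Predicted probability from an already-fitted logistic model."""
    features = np.asarray(features, dtype=float)
    coefficients = np.asarray(coefficients, dtype=float)
    linear = intercept + features @ coefficients
    return 1.0 / (1.0 + np.exp(-linear))


# Toy data (hypothetical): two individuals, three SNPs.
toy_dosages = [[0, 1, 2], [2, 0, 1]]
toy_weights = [0.1, 0.2, -0.3]
toy_scores = weighted_prs(toy_dosages, toy_weights)

# Hypothetical model with two predictors: one wPRS and one covariate (sex).
toy_features = np.column_stack([toy_scores, [0.0, 1.0]])
toy_risk = logistic_risk(toy_features, coefficients=[1.5, 0.4], intercept=-1.0)
```

In a setting like the one described, each of the ten wPRS would be one column of the feature matrix, alongside the ancestry principal component, sex, age at onset and diabetes duration; the coefficients would be estimated from data rather than fixed by hand as in this sketch.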
Results
The model showed a similar predictive performance for cardiovascular and renal complications in different cohorts. It identified the top 30% of ADVANCE participants with a mean of 3.1-fold increased risk of major micro- and macrovascular events (p = 6.3 × 10⁻²¹ and p = 9.6 × 10⁻³¹, respectively) and a 4.4-fold (p = 6.8 × 10⁻³³) higher risk of cardiovascular death. While in ADVANCE overall, combined intensive blood pressure and glucose control decreased cardiovascular death by 24%, the model identified a high-risk group in whom it decreased the mortality rate by 47%, and a low-risk group in whom it had no discernible effect. High-risk individuals had the greatest absolute risk reduction with a number needed to treat of 12 to prevent one cardiovascular death over 5 years.
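For readers unfamiliar with the metric, the number needed to treat (NNT) is the reciprocal of the absolute risk reduction (ARR). The sketch below shows that relationship; the helper names are hypothetical, and the 20% and 10% event rates in the second example are illustrative values, not figures from the study, which reports only the NNT itself.

```python
def number_needed_to_treat(control_risk, treated_risk):
    """NNT = 1 / ARR, where ARR = control_risk - treated_risk."""
    absolute_risk_reduction = control_risk - treated_risk
    if absolute_risk_reduction <= 0:
        raise ValueError("treatment must lower risk for NNT to be defined")
    return 1.0 / absolute_risk_reduction


def implied_absolute_risk_reduction(nnt):
    """Invert the relationship: ARR = 1 / NNT."""
    if nnt <= 0:
        raise ValueError("NNT must be positive")
    return 1.0 / nnt


# An NNT of 12 corresponds to an absolute risk reduction of 1/12.
arr_for_nnt_12 = implied_absolute_risk_reduction(12)

# Illustrative only: a drop in event rate from 20% to 10% gives ARR = 0.10.
example_nnt = number_needed_to_treat(0.20, 0.10)
```

Under this definition, an NNT of 12 over 5 years corresponds to an absolute risk reduction of roughly 8.3 percentage points over the same period.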
Conclusions/interpretation
This novel multiPRS model stratified individuals with type 2 diabetes according to risk of complications and helped to target earlier those who would receive greater benefit from intensive therapy.
Humans have colonized the planet through a series of range expansions, which deeply impacted genetic diversity in newly settled areas and potentially increased the frequency of deleterious mutations on expanding wave fronts. To test this prediction, we studied the genomic diversity of French Canadians who colonized Quebec in the 17th century. We used historical information and records from ∼4000 ascending genealogies to select individuals whose ancestors lived mostly on the colonizing wave front and individuals whose ancestors remained in the core of the settlement. Comparison of exomic diversity reveals that: (i) both new and low-frequency variants are significantly more deleterious in front than in core individuals, (ii) equally deleterious mutations are at higher frequencies in front individuals, and (iii) front individuals are two times more likely to be homozygous for rare very deleterious mutations present in Europeans. These differences have emerged in the past six to nine generations and cannot be explained by differential inbreeding, but are consistent with relaxed selection mainly due to higher rates of genetic drift on the wave front. Demographic inference and modeling of the evolution of rare variants suggest lower effective size on the front, and lead to an estimation of selection coefficients that increase with conservation scores. Even though range expansions have had a relatively limited impact on the overall fitness of French Canadians, they could explain the higher prevalence of recessive genetic diseases in recently settled regions of Quebec.
Summary
Global demand for vegetable oils is increasing at a dramatic rate, while our understanding of the regulation of oil biosynthesis in plants remains limited. To gain insights into the mechanisms that govern oil synthesis and fatty acid (FA) composition in the oil palm fruit, we used a multilevel approach combining gene coexpression analysis, quantification of allele‐specific expression and joint multivariate analysis of transcriptomic and lipid data, in an interspecific backcross population between the African oil palm, Elaeis guineensis, and the American oil palm, Elaeis oleifera, which display contrasting oil contents and FA compositions. The gene coexpression network produced revealed tight transcriptional coordination of fatty acid synthesis (FAS) in the plastid with sugar sensing, plastidial glycolysis, transient starch storage and carbon recapture pathways. It also revealed a concerted regulation, along with FAS, of both the transfer of nascent FA to the endoplasmic reticulum, where triacylglycerol assembly occurs, and of the production of glycerol‐3‐phosphate, which provides the backbone of triacylglycerols. Plastid biogenesis and auxin transport were the two other biological processes most tightly connected to FAS in the network. In addition to WRINKLED1, a transcription factor (TF) known to activate FAS genes, two novel TFs, termed NF‐YB‐1 and ZFP‐1, were found at the core of the FAS module. The saturated FA content of palm oil appeared to vary above all in relation to the level of transcripts of the gene coding for β‐ketoacyl‐acyl carrier protein synthase II. Our findings should facilitate the development of breeding and engineering strategies in this and other oil crops.
Significance Statement
Global demand for vegetable oils is increasing, but our understanding of how oil biosynthesis is regulated remains limited. Here we used an interspecific backcross population of oil palm, transcript coexpression and transcript–metabolite correlations to identify novel enzymes, transcription factors and cellular processes involved in oil biosynthesis. Our findings should facilitate the development of novel breeding and engineering strategies in oil palm and other oil crops.
The mantled floral phenotype of oil palm (Elaeis guineensis) affects somatic embryogenesis-derived individuals and is morphologically similar to mutants defective in the B-class MADS-box genes. This somaclonal variation has previously been shown to be associated with a significant deficit in genome-wide DNA methylation. In order to elucidate the possible role of DNA methylation in the transcriptional regulation of EgDEF1, the APETALA3 ortholog of oil palm, we studied this epigenetic mark within the gene in parallel with transcript accumulation in both normal and mantled developing inflorescences. We also examined the methylation and expression of two neighboring retrotransposons that might interfere with EgDEF1 regulation. We show that the EgDEF1 gene is essentially unmethylated and that its methylation pattern does not change with the floral phenotype whereas expression is dramatically different, ruling out a direct involvement of DNA methylation in the regulation of this gene. Also, we find that both the gypsy element inserted within an intron of the EgDEF1 gene and the copia element located upstream from the promoter are heavily methylated and show little or no expression. Interestingly, we identify a shorter, alternative transcript produced by EgDEF1 and characterize its accumulation with respect to its full-length counterpart. We demonstrate that, depending on the floral phenotype, the respective proportions of these two transcripts change differently during inflorescence development. We discuss the possible phenotypic consequences of this alternative splicing and the new questions it raises in the search for the molecular mechanisms underlying the mantled phenotype in the oil palm.
The explosion of NGS (Next Generation Sequencing) sequence data requires a huge effort in Bioinformatics methods and analyses. The creation of dedicated, robust and reliable pipelines able to handle dozens of samples from raw FASTQ data to relevant biological data is a time-consuming task in all projects relying on NGS. To address this, we created a generic and modular toolbox for developing such pipelines.
TOGGLE (TOolbox for Generic nGs anaLysEs) is a suite of tools for designing pipelines that manage large sets of NGS software and utilities. Moreover, TOGGLE offers an easy way to set the various options of the different software tools throughout the pipelines by using a single basic configuration file, which can be changed for each assay without having to change the code itself. We also describe one implementation of TOGGLE in a complete analysis pipeline designed for SNP discovery for large sets of genomic data, ready to use in different environments (from a single machine to HPC clusters).
TOGGLE speeds up the creation of robust pipelines with reliable log tracking and data flow, for a large range of analyses. Moreover, it enables biologists to concentrate on the biological relevance of results, and to change the experimental conditions easily. The whole code and test data are available at https://github.com/SouthGreenPlatform/TOGGLE.
The oil palm (Elaeis guineensis Jacq.) is a major cultivated crop and the world's largest source of edible vegetable oil. The genus Elaeis comprises two species: E. guineensis, the commercial African oil palm, and E. oleifera, which is used in oil palm genetic breeding. The recent publication of both the African oil palm genome assembly and the first draft sequence of its Latin American relative now allows us to tackle the challenge of understanding the genome composition, structure and evolution of these palm genomes through the annotation of their repeated sequences.
In this study, we identified, annotated and compared Transposable Elements (TE) from the African and Latin American oil palms. In a first step, Transposable Element databases were built through de novo detection in both genome sequences, and the TE content of both genomes was estimated. Putative full-length retrotransposons with Long Terminal Repeats (LTRs) were then identified in the E. guineensis genome for characterization of their structural diversity, copy number and chromosomal distribution. Finally, their relative expression in several tissues was determined through in silico analysis of publicly available transcriptome data.
Our results reveal a congruence in the transpositional history of LTR retrotransposons between E. oleifera and E. guineensis, especially the Sto-4 family. Also, we have identified and described 583 full-length LTR-retrotransposons in the Elaeis guineensis genome. Our work shows that these elements are most likely no longer mobile and that no recent insertion event has occurred. Moreover, the analysis of chromosomal distribution suggests a preferential insertion of Copia elements in gene-rich regions, whereas Gypsy elements appear to be evenly distributed throughout the genome.
Considering the high proportion of LTR retrotransposons in the oil palm genome, our work will contribute to a greater understanding of their impact on genome organization and evolution. Moreover, the knowledge gained from this study constitutes a valuable resource for both the improvement of genome annotation and the investigation of the evolutionary history of palms.
Age-related clonal hematopoiesis (ARCH) is characterized by age-associated accumulation of somatic mutations in hematopoietic stem cells (HSCs) or their pluripotent descendants. HSCs harboring driver mutations will be positively selected and cells carrying these mutations will rise in frequency. While ARCH is a known risk factor for blood malignancies, such as Acute Myeloid Leukemia (AML), why some people who harbor ARCH driver mutations do not progress to AML remains unclear. Here, we model the interaction of positive and negative selection in deeply sequenced blood samples from individuals who subsequently progressed to AML, compared to healthy controls, using deep learning and population genetics. Our modeling allows us to discriminate amongst evolutionary classes with high accuracy and captures signatures of purifying selection in most individuals. Purifying selection, acting on benign or mildly damaging passenger mutations, appears to play a critical role in preventing disease-predisposing clones from rising to dominance and is associated with longer disease-free survival. Through exploring a range of evolutionary models, we show how different classes of selection shape clonal dynamics and health outcomes, thus enabling us to better identify individuals at a high risk of malignancy.
Trait-associated genetic variants affect complex phenotypes primarily via regulatory mechanisms on the transcriptome. To investigate the genetics of gene expression, we performed cis- and trans-expression quantitative trait locus (eQTL) analyses using blood-derived expression from 31,684 individuals through the eQTLGen Consortium. We detected cis-eQTL for 88% of genes, and these were replicable in numerous tissues. Distal trans-eQTL (detected for 37% of 10,317 trait-associated variants tested) showed lower replication rates, partially due to low replication power and confounding by cell type composition. However, replication analyses in single-cell RNA-seq data prioritized intracellular trans-eQTL. Trans-eQTL exerted their effects via several mechanisms, primarily through regulation by transcription factors. Expression of 13% of the genes correlated with polygenic scores for 1,263 phenotypes, pinpointing potential drivers for those traits. In summary, this work represents a large eQTL resource, and its results serve as a starting point for in-depth interpretation of complex phenotypes.
Type 2 diabetes (T2D) is a very common disease in humans. Here we conduct a meta-analysis of genome-wide association studies (GWAS) with ~16 million genetic variants in 62,892 T2D cases and 596,424 controls of European ancestry. We identify 139 common and 4 rare variants associated with T2D, 42 of which (39 common and 3 rare variants) are independent of the known variants. Integration of the gene expression data from blood (n = 14,115 and 2765) with the GWAS results identifies 33 putative functional genes for T2D, 3 of which were targeted by approved drugs. A further integration of DNA methylation (n = 1980) and epigenomic annotation data highlights 3 genes (CAMK1D, TP53INP1, and ATP5G1) with plausible regulatory mechanisms, whereby a genetic variant exerts an effect on T2D through epigenetic regulation of gene expression. Our study uncovers additional loci, proposes putative genetic regulatory mechanisms for T2D, and provides evidence of purifying selection for T2D-associated variants.