Abstract
Context
Although calorie restriction has proven beneficial for weight loss, long-term weight control is variable between individuals.
Objective
To identify biomarkers of successful weight control during a dietary intervention (DI).
Design, Setting, and Participants
Adipose tissue (AT) transcriptomes were compared between 21 obese individuals who either maintained weight loss or regained weight during the DI. Results were validated in 310 individuals from the same study using quantitative reverse transcription polymerase chain reaction, and protein levels of potential circulating biomarkers were measured by enzyme-linked immunosorbent assay.
Intervention
Individuals underwent 8 weeks of low-calorie diet, then 6 months of ad libitum diet.
Outcome Measure
Weight changes at the end of the DI.
Results
We evaluated six genes that had altered expression during DI, encode secreted proteins, and have not previously been implicated in weight control (EGFL6, FSTL3, CRYAB, TNMD, SPARC, IGFBP3), as well as genes for which baseline expression differed between those with good and poor weight control (ASPN, USP53). Changes in plasma concentrations of EGFL6, FSTL3, and CRYAB mirrored AT messenger RNA expression; all decreased during DI in individuals with good weight control. ASPN and USP53 had higher baseline expression in individuals who went on to have good weight control. Expression quantitative trait loci analysis found polymorphisms associated with expression levels of USP53 in AT. A regulatory network was identified in which transforming growth factor β1 (TGF-β1) was responsible for downregulation of certain genes during DI in good controllers. Interestingly, ASPN is a TGF-β1 inhibitor.
Conclusions
We found circulating biomarkers associated with weight control that could influence weight management strategies and genes that may be prognostic for successful weight control.
In adipose tissue, we found three mRNAs encoding circulating markers that differed between good and poor weight controllers 6 months after calorie restriction, and two potentially prognostic mRNAs.
Molecular quantitative trait locus (QTL) analyses are increasingly popular for exploring the genetic architecture of complex traits, but existing studies do not leverage shared regulatory patterns and suffer from a large multiplicity burden, which hampers the detection of weak signals such as trans associations. Here, we present a fully multivariate proteomic QTL (pQTL) analysis performed with our recently proposed Bayesian method LOCUS on data from two clinical cohorts, with plasma protein levels quantified by mass spectrometry and aptamer-based assays. Our two-stage study identifies 136 pQTL associations in the first cohort, of which >80% replicate in the second independent cohort and have significant enrichment with functional genomic elements and disease risk loci. Moreover, 78% of the pQTLs whose protein abundance was quantified by both proteomic techniques are confirmed across assays. Our thorough comparisons with standard univariate QTL mapping on (1) these data and (2) synthetic data emulating the real data show how LOCUS borrows strength across correlated protein levels and markers on a genome-wide scale to effectively increase statistical power. Notably, 15% of the pQTLs uncovered by LOCUS would be missed by the univariate approach, including several trans and pleiotropic hits with successful independent validation. Finally, the analysis of extensive clinical data from the two cohorts indicates that the genetically driven proteins identified by LOCUS are enriched in associations with low-grade inflammation, insulin resistance and dyslipidemia and might therefore act as endophenotypes for metabolic diseases. While considerations on the clinical role of the pQTLs are beyond the scope of our work, these findings generate useful hypotheses to be explored in future research; all results are accessible online from our searchable database.
Thanks to its efficient variational Bayes implementation, LOCUS can jointly analyze thousands of traits and millions of markers. Its applicability goes beyond pQTL studies, opening new perspectives for large-scale genome-wide association and QTL analyses. Diet, Obesity and Genes (DiOGenes) trial registration number: NCT00390637.
Weight loss effectively reduces cardiometabolic health risks among people with overweight and obesity, but inter-individual variability in weight loss maintenance is large. Here we studied whether baseline gene expression in subcutaneous adipose tissue predicts diet-induced weight loss success.
Within the 8-month multicenter dietary intervention study DiOGenes, we classified a low weight-losers (low-WL) group and a high-WL group based on median weight loss percentage (9.9%) from 281 individuals. Using RNA sequencing, we identified the significantly differentially expressed genes between high-WL and low-WL at baseline and their enriched pathways. We used this information together with support vector machines with linear kernel to build classifier models that predict the weight loss classes.
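The median split described above can be sketched as follows. This is only an illustration: the participant IDs and percentages are made up, the function name is hypothetical, and assigning values exactly equal to the median to the low-WL group is an assumption the abstract does not specify.

```python
from statistics import median


def split_by_median(weight_loss_pct):
    """Label each participant 'high-WL' or 'low-WL' relative to the cohort median.

    weight_loss_pct maps a participant ID to the percentage of weight lost.
    Assumption: values equal to the median are placed in the low-WL group.
    """
    cutoff = median(weight_loss_pct.values())
    labels = {
        pid: ("high-WL" if pct > cutoff else "low-WL")
        for pid, pct in weight_loss_pct.items()
    }
    return cutoff, labels


# Hypothetical participants and weight-loss percentages (not study data).
example = {"p1": 6.2, "p2": 9.1, "p3": 10.7, "p4": 13.4}
cutoff, labels = split_by_median(example)
```

The resulting labels would then serve as the two classes for a supervised classifier such as the linear-kernel support vector machine mentioned above.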
Prediction models based on a selection of genes that are associated with the discovered pathways 'lipid metabolism' (max AUC = 0.74, 95% CI 0.62-0.86) and 'response to virus' (max AUC = 0.72, 95% CI 0.61-0.83) predicted the weight-loss classes high-WL/low-WL significantly better than models based on randomly selected genes (p < 0.01). The performance of the models based on 'response to virus' genes is highly dependent on those genes that are also associated with lipid metabolism. Incorporation of baseline clinical factors into these models did not noticeably enhance the model performance in most of the runs. This study demonstrates that baseline adipose tissue gene expression data, together with supervised machine learning, facilitates the characterization of the determinants of successful weight loss.
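For readers less familiar with the AUC figures quoted above, the following minimal sketch computes an AUC directly from its rank-based definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The labels and scores are invented for illustration, and this is not claimed to be the procedure the study used to obtain its AUCs or confidence intervals.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class (e.g. high-WL), 0 for the negative class.
    scores: classifier decision values; higher means 'more positive'.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented example: three positives, three negatives.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)  # 8 of 9 positive/negative pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking, which is why comparing against models built from randomly selected genes is a natural baseline.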
Angiopoietin-like protein 3 (ANGPTL3), a liver-derived protein, plays an important role in lipid and lipoprotein metabolism. Using data available from the DiOGenes study, we assessed the link with clinical improvements (weight, plasma lipid, and insulin levels) and changes in liver markers, alanine aminotransferase, aspartate aminotransferase (AST), adiponectin, fetuin A and B, and cytokeratin 18 (CK-18), upon low-calorie diet (LCD) intervention. We also examined the role of genetic variation in determining the level of circulating ANGPTL3 and the relation between the identified genetic markers and markers of hepatic steatosis.
DiOGenes is a multicenter, controlled dietary intervention where obese participants followed an 8-week LCD (800 kcal/day, using a meal replacement product). Plasma ANGPTL3 and liver markers were measured using the SomaLogic (Boulder, CO) platform. Protein quantitative trait locus (pQTL) analyses assessed the link between more than four million common variants and the level of circulating ANGPTL3 at baseline and changes in levels during the LCD intervention.
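A single-variant pQTL test is commonly framed as a regression of protein level on allele dosage (0, 1, or 2 copies of the effect allele). The sketch below fits that additive model by ordinary least squares for one variant. The abstract does not state which model or covariates were used, so the additive coding, the function name, and the example values are all assumptions made for illustration.

```python
def dosage_effect(dosages, levels):
    """Least-squares slope and intercept of protein level on allele dosage.

    dosages: per-individual allele counts (0, 1, or 2) for one variant.
    levels:  per-individual protein measurements, in the same order.
    Assumes an additive genetic model with no covariates (illustrative only).
    """
    n = len(dosages)
    if n != len(levels) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(dosages) / n
    mean_y = sum(levels) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("variant is monomorphic in this sample")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, levels))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: protein level rises by about 0.5 units per allele copy.
slope, intercept = dosage_effect([0, 0, 1, 1, 2, 2], [1.0, 1.2, 1.5, 1.7, 2.0, 2.2])
```

Repeating such a test over millions of variants is what makes multiple-testing correction necessary when declaring a locus significant.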
Changes in ANGPTL3 during weight loss showed only marginal association with changes in triglycerides (nominal p = 0.02) and insulin (p = 0.04); these results did not remain significant after correcting for multiple testing. However, significant associations (after multiple-testing correction) were observed between changes in ANGPTL3 and AST during weight loss (p = 0.004) and between ANGPTL3 and CK-18 (baseline p = 1.03 × 10^[…], during weight loss p = 1.47 × 10^[…]). Our pQTL study identified two loci significantly associated with changes in ANGPTL3. One of these loci (the […]-[…]-[…] gene cluster) also displayed significant association with changes in CK-18 levels during weight loss (p = 0.007).
We clarify the link between circulating levels of ANGPTL3 and specific markers of liver function. We demonstrate that changes in ANGPTL3 and CK-18 during LCD are under genetic control from […]-acting variants. Our results suggest an extended function of ANGPTL3 in the inflammatory state of liver steatosis and toward liver metabolic processes.
Differences between genomes can be due to single nucleotide variants, translocations, inversions, and copy number variants (CNVs, gain or loss of DNA). The latter can range from sub-microscopic events to complete chromosomal aneuploidies. Small CNVs are often benign but those larger than 500 kb are strongly associated with morbid consequences such as developmental disorders and cancer. Detecting CNVs within and between populations is essential to better understand the plasticity of our genome and to elucidate its possible contribution to disease. Hence there is a need for better-tailored and more robust tools for the detection and genome-wide analyses of CNVs. While a link between a given CNV and a disease may have often been established, the relative CNV contribution to disease progression and impact on drug response is not necessarily understood. In this review we discuss the progress, challenges, and limitations that occur at different stages of CNV analysis from the detection (using DNA microarrays and next-generation sequencing) and identification of recurrent CNVs to the association with phenotypes. We emphasize the importance of germline CNVs and propose strategies to aid clinicians to better interpret structural variations and assess their clinical implications.
Weight loss success depends on the ability to avoid regaining the lost weight over time. This ability varies widely among individuals, and these differences, along with their underlying molecular processes, are diverse and not completely elucidated. Altered plasma metabolite concentrations could partly explain weight loss maintenance mechanisms. In the present work, a systems biology approach has been applied to investigate the potential mechanisms involved in weight loss maintenance within the DiOGenes weight-loss intervention study.
A genome-wide association study identified SNPs associated with plasma glycine levels within the CPS1 (Carbamoyl-Phosphate Synthase 1) gene (rs10206976, p-value = 4.709e-11 and rs12613336, p-value = 1.368e-08). Furthermore, gene expression in the adipose tissue showed that CPS1 expression levels were associated with successful weight maintenance and with several SNPs within CPS1 (cis-eQTL). To contextualize these results, a gene-metabolite interaction network of CPS1 and glycine was built and analyzed, showing functional enrichment in genes involved in lipid metabolism and one carbon pool by folate pathways.
CPS1 is the rate-limiting enzyme for the urea cycle, catalyzing the synthesis of carbamoyl phosphate from ammonia and bicarbonate in the mitochondria. Glycine and CPS1 are connected through the one carbon pool by folate pathway and the urea cycle. Furthermore, glycine could be linked to metabolic health and insulin sensitivity through the betaine osmolyte. These considerations, and the results from the present study, highlight a possible role of CPS1 and related pathways in weight loss maintenance, suggesting that it might be partly genetically determined in humans.
Cancer genomes frequently contain somatic copy number alterations (SCNA) that can significantly perturb the expression level of affected genes and thus disrupt pathways controlling normal growth. In melanoma, many studies have focussed on the copy number and gene expression levels of the BRAF, PTEN and MITF genes, but little has been done to identify new genes using these parameters at the genome-wide scale. Using karyotyping, SNP and CGH arrays, and RNA-seq, we have identified SCNA affecting gene expression ('SCNA-genes') in seven human metastatic melanoma cell lines. We showed that the combination of these techniques is useful to identify candidate genes potentially involved in tumorigenesis. Since few of these alterations were recurrent across our samples, we used a protein network-guided approach to determine whether any pathways were enriched in SCNA-genes in one or more samples. From this unbiased genome-wide analysis, we identified 28 significantly enriched pathway modules. Comparison with two large, independent melanoma SCNA datasets showed less than 10% overlap at the individual gene level, but network-guided analysis revealed 66% shared pathways, including all but three of the pathways identified in our data. Frequently altered pathways included WNT, cadherin signalling, angiogenesis and melanogenesis. Additionally, our results emphasize the potential of the EPHA3 and FRS2 gene products, involved in angiogenesis and migration, as possible therapeutic targets in melanoma. Our study demonstrates the utility of network-guided approaches, for both large and small datasets, to identify pathways recurrently perturbed in cancer.
Large-scale high-throughput studies using microarray technology have established that copy number variation (CNV) throughout the genome is more frequent than previously thought. Such variation is known to play an important role in the presence and development of phenotypes such as HIV-1 infection and Alzheimer's disease. However, methods for analyzing the complex data produced and identifying regions of CNV are still being refined.
We describe the presence of a genome-wide technical artifact, spatial autocorrelation or 'wave', which occurs in a large dataset used to determine the location of CNV across the genome. By removing this artifact we are able to obtain both a more biologically meaningful clustering of the data and an increase in the number of CNVs identified by current calling methods without a major increase in the number of false positives detected. Moreover, removing this artifact is critical for the development of a novel model-based CNV calling algorithm, CNVmix, that uses cross-sample information to identify regions of the genome where CNVs occur. For regions of CNV that are identified by both CNVmix and current methods, we demonstrate that CNVmix is better able to categorize samples into groups that represent copy number gains or losses.
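Spatial autocorrelation of the kind described above can be quantified with a lagged autocorrelation of probe log-ratios ordered by genomic position: independent noise gives values near zero, whereas a slowly varying 'wave' gives clearly positive values at short lags. The sketch below is a generic diagnostic under that assumption, with invented inputs; it is not the artifact-removal procedure or the CNVmix algorithm from the study.

```python
def lag_autocorrelation(values, lag=1):
    """Sample autocorrelation of a position-ordered series at the given lag.

    values: log-ratios (or any signal) ordered along the chromosome.
    Returns a number in [-1, 1]; values well above 0 at small lags suggest
    that neighbouring probes move together, as a 'wave' artifact would cause.
    """
    n = len(values)
    if lag < 1 or lag >= n:
        raise ValueError("lag must be between 1 and len(values) - 1")
    mean = sum(values) / n
    centered = [v - mean for v in values]
    denom = sum(c * c for c in centered)
    if denom == 0:
        raise ValueError("series is constant")
    numer = sum(centered[i] * centered[i + lag] for i in range(n - lag))
    return numer / denom


# Invented examples: a smooth trend versus a strictly alternating signal.
smooth = lag_autocorrelation([0, 1, 2, 3, 4])           # neighbours are similar
alternating = lag_autocorrelation([1, -1, 1, -1, 1, -1])  # neighbours are opposite
```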
Removing artifactual 'waves' (which appear to be a general feature of array comparative genomic hybridization (aCGH) datasets) and using cross-sample information when identifying CNVs enables more biological information to be extracted from aCGH experiments designed to investigate copy number variation in normal individuals.
The use of weighed food diaries in nutritional studies provides a powerful method to quantify food and nutrient intakes. Yet, mapping these records onto food composition tables (FCTs) is a challenging, time-consuming and error-prone process. Experts make this effort manually and no automation has been previously proposed. Our study aimed to assess automated approaches to map food items onto FCTs.
We used food diaries (~170,000 records pertaining to 4,200 unique food items) from the DiOGenes randomized clinical trial. We attempted to map these items onto six FCTs available from the EuroFIR resource. Two approaches were tested: the first was based solely on food name similarity (fuzzy matching). The second used a machine learning approach (C5.0 classifier) combining both fuzzy matching and food energy. We tested mapping food items using their original names and also an English translation. Top matching pairs were reviewed manually to derive performance metrics: precision (the percentage of correctly mapped items) and recall (percentage of mapped items).
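To make the two metrics concrete, here is a small sketch of name-based matching with a score threshold, followed by precision and recall computed with the definitions given above. It uses Python's standard `difflib.SequenceMatcher` as a stand-in similarity measure; the study's own implementation is an R package and may score names differently, and the function names and food names below are hypothetical.

```python
from difflib import SequenceMatcher


def best_match(item, fct_names, threshold=0.5):
    """Return (name, score) for the most similar FCT entry, or None.

    Similarity is SequenceMatcher.ratio() on lower-cased names (a stand-in
    for the fuzzy score in the text). A match is kept only if score > threshold.
    """
    scored = [
        (SequenceMatcher(None, item.lower(), name.lower()).ratio(), name)
        for name in fct_names
    ]
    score, name = max(scored)
    return (name, score) if score > threshold else None


def precision_recall(mapping, truth):
    """Precision = correct / mapped items; recall = mapped / all queried items.

    mapping: diary item -> proposed FCT name, or None if left unmapped.
    truth:   diary item -> FCT name judged correct by manual review.
    """
    mapped = {item: name for item, name in mapping.items() if name is not None}
    recall = len(mapped) / len(mapping)
    correct = sum(1 for item, name in mapped.items() if truth.get(item) == name)
    precision = correct / len(mapped) if mapped else 0.0
    return precision, recall


# Hypothetical FCT entries and diary items.
fct = ["apple, raw", "whole milk"]
hit = best_match("Whole Milk", fct)   # identical after lower-casing
miss = best_match("xyzzy", fct)       # shares no characters with either entry
precision, recall = precision_recall(
    {"a": "A", "b": "X", "c": None}, {"a": "A", "b": "B", "c": "C"}
)
```

Raising the threshold leaves more items unmapped (lower recall) but tends to discard weak, error-prone matches (higher precision), which is the trade-off explored in the results.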
The simpler approach, fuzzy matching, performed very well. Under a relaxed threshold (score > 50%), it remapped 99.49% of the items with a precision of 88.75%. With a slightly more stringent threshold (score > 63%), precision improved significantly to 96.81% while recall remained > 95% (i.e., fewer than 5% of the queried items would not be mapped). The machine learning approach did not lead to any overall improvement over fuzzy matching. However, it could substantially increase the recall rate for food items without any clear equivalent in the FCTs (+7% and +20% when mapping items using their original or English-translated names, respectively). Our approaches have been implemented as R packages and are freely available from GitHub.
This study is the first to provide automated approaches for large-scale food item mapping onto FCTs. We demonstrate that both high precision and recall can be achieved. Our solutions can be used with any FCT and do not require any programming background. These methodologies and findings are useful to any small or large nutritional study (observational as well as interventional).
Genotypes obtained with commercial SNP arrays have been extensively used in many large case-control or population-based cohorts for SNP-based genome-wide association studies for a multitude of traits. Yet, these genotypes capture only a small fraction of the variance of the studied traits. Genomic structural variants (GSV) such as Copy Number Variation (CNV) may account for part of the missing heritability, but their comprehensive detection requires either next-generation arrays or sequencing. Sophisticated algorithms that infer CNVs by combining the intensities from SNP-probes for the two alleles can already be used to extract a partial view of such GSV from existing data sets.
Here we present several advances to facilitate the latter approach. First, we introduce a novel CNV detection method based on a Gaussian Mixture Model. Second, we propose a new algorithm, PCA merge, for combining copy-number profiles from many individuals into consensus regions. We applied both our new methods as well as existing ones to data from 5612 individuals from the CoLaus study who were genotyped on Affymetrix 500K arrays. We developed a number of procedures to evaluate the performance of the different methods. These include comparison with previously published CNVs as well as the use of a replication sample of 239 individuals genotyped with Illumina 550K arrays. We also established a new evaluation procedure that exploits the fact that related individuals are expected to share their CNVs more frequently than randomly selected individuals. The ability to detect both rare and common CNVs provides a valuable resource that will facilitate association studies exploring potential phenotypic associations with CNVs.
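The core idea of a Gaussian mixture model for copy-number calling can be illustrated by its assignment step: each copy-number state is modelled as a Gaussian over probe intensity, and an observation is assigned posterior probabilities for each state via Bayes' rule. The sketch below shows only that step with fixed, invented parameters (weights, means, standard deviations, and state names are all hypothetical); it does not reproduce the detection method introduced in the study, which would also need to estimate these parameters from data.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def state_posteriors(x, components):
    """Posterior probability of each copy-number state given intensity x.

    components: state name -> (mixture weight, mean, sd).
    Assumes x is not so extreme that every density underflows to zero.
    """
    joint = {
        state: weight * gaussian_pdf(x, mean, sd)
        for state, (weight, mean, sd) in components.items()
    }
    total = sum(joint.values())
    return {state: value / total for state, value in joint.items()}


# Invented parameters on a log-ratio scale (illustrative only).
components = {
    "loss": (0.05, -0.5, 0.15),
    "neutral": (0.90, 0.0, 0.15),
    "gain": (0.05, 0.4, 0.15),
}
post_neutral = state_posteriors(0.0, components)
post_loss = state_posteriors(-0.6, components)
post_gain = state_posteriors(0.45, components)
```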
Our new methodologies for CNV detection and their evaluation will help in extracting additional information from the large amount of SNP-genotyping data on various cohorts and use this to explore structural variants and their impact on complex traits.