Recent advances in high-throughput technologies have enabled the profiling of multiple layers of a biological system, including DNA sequence data (genomics), RNA expression levels (transcriptomics), and metabolite levels (metabolomics). This has led to the generation of vast amounts of biological data that can be integrated in so-called multi-omics studies to examine the complex molecular underpinnings of health and disease. Integrative analysis of such datasets is not straightforward and is particularly complicated by the high dimensionality and heterogeneity of the data and by the lack of universal analysis protocols. Previous reviews have discussed various strategies to address the challenges of data integration, elaborating on specific aspects such as network inference or feature selection techniques. In these reviews, the main focus has been on the integration of two omics layers in their relation to a phenotype of interest. In this review, we provide an overview of a typical multi-omics workflow, focusing on integration methods that have the potential to combine metabolomics data with two or more omics. We discuss multiple integration concepts, including data-driven, knowledge-based, simultaneous and step-wise approaches. We highlight the application of these methods in recent multi-omics studies, including large-scale integration efforts aiming at a global depiction of the complex relationships within and between different biological layers without focusing on a particular phenotype.
• Multi-omics studies can unravel the complex molecular underpinnings of diseases.
• Data availability and study aims influence the selection of the integration strategy.
• Knowledge-based integration can enhance the biological interpretability of results.
• Data-driven integration can infer relationships between uncharacterized molecules.
• Network-based, hybrid integration strategies combine the strengths of both.
Late-onset Alzheimer's disease (AD) can, in part, be considered a metabolic disease. Besides age, female sex and APOE ε4 genotype represent strong risk factors for AD that also give rise to large metabolic differences. We systematically investigated group-specific metabolic alterations by conducting stratified association analyses of 139 serum metabolites in 1,517 individuals from the AD Neuroimaging Initiative with AD biomarkers. We observed substantial sex differences in effects of 15 metabolites with partially overlapping differences for APOE ε4 status groups. Several group-specific metabolic alterations were not observed in unstratified analyses using sex and APOE ε4 as covariates. Combined stratification revealed further subgroup-specific metabolic effects limited to APOE ε4+ females. The observed metabolic alterations suggest that females experience greater impairment of mitochondrial energy production than males. Dissecting metabolic heterogeneity in AD pathogenesis can therefore enable grading the biomedical relevance for specific pathways within specific subgroups, guiding the way to personalized medicine.
Gepard provides a user-friendly, interactive application for the quick creation of dotplots. It utilizes suffix arrays to reduce the time complexity of dotplot calculation to Θ(m*log n). A client-server mode, which is a novel feature for dotplot creation software, allows the user to calculate dotplots and color them by functional annotation without any prior downloading of sequence or annotation data.
Availability: Both source codes and executable binaries are available at http://mips.gsf.de/services/analysis/gepard
Contact: krumsiek@in.tum.de
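The suffix-array idea behind the Θ(m*log n) bound can be illustrated with a minimal sketch. This is not Gepard's implementation: the function names, the naive suffix-array construction, and the fixed word length are all assumptions made for illustration. Each of the roughly m words of the first sequence is located in the sorted suffixes of the second sequence by binary search, and every hit becomes one dot.

```python
from bisect import bisect_left, bisect_right


def build_suffix_array(text):
    # Naive construction by sorting all suffixes; adequate for short demo
    # sequences, whereas real tools use much faster construction algorithms.
    return sorted(range(len(text)), key=lambda i: text[i:])


def word_matches(seq_a, seq_b, word_len):
    """Return sorted (x, y) dot coordinates where a word of seq_a occurs in seq_b."""
    suffix_array = build_suffix_array(seq_b)
    # Truncating sorted suffixes to word_len characters keeps them sorted,
    # so each word can be located by binary search.
    prefixes = [seq_b[i:i + word_len] for i in suffix_array]
    hits = []
    for x in range(len(seq_a) - word_len + 1):
        word = seq_a[x:x + word_len]
        lo = bisect_left(prefixes, word)
        hi = bisect_right(prefixes, word)
        hits.extend((x, suffix_array[rank]) for rank in range(lo, hi))
    return sorted(hits)


dots = word_matches("ACGT", "TACGA", 3)
```

In this toy input only the word "ACG" is shared, so a single dot is produced at position 0 of the first sequence and position 1 of the second.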
Interactions between the gut microbial ecosystem and host lipid homeostasis are highly relevant to host physiology and metabolic diseases. We present a comprehensive multi-omics view of the effect of intestinal microbial colonization on hepatic lipid metabolism, integrating transcriptomic, proteomic, phosphoproteomic, and lipidomic analyses of liver and plasma samples from germfree and specific pathogen-free mice. Microbes induce monounsaturated fatty acid generation by stearoyl-CoA desaturase 1 and polyunsaturated fatty acid elongation by fatty acid elongase 5, leading to significant alterations in glycerophospholipid acyl-chain profiles. A composite classification score calculated from the observed alterations in fatty acid profiles in germfree mice clearly differentiates antibiotic-treated mice from untreated controls with high sensitivity. Mechanistic investigations reveal that acetate originating from gut microbial degradation of dietary fiber serves as precursor for hepatic synthesis of C16 and C18 fatty acids and their related glycerophospholipid species that are also released into the circulation.
Intermuscular adipose tissue (IMAT) is negatively related to insulin sensitivity, but a causal role of IMAT in the development of insulin resistance is unknown. IMAT was sampled in humans to test for the ability to induce insulin resistance in vitro and to characterize gene expression to uncover how IMAT may promote skeletal muscle insulin resistance. Human primary muscle cells were incubated with conditioned media from IMAT, visceral (VAT), or subcutaneous adipose tissue (SAT) to evaluate changes in insulin sensitivity. RNAseq analysis was performed on IMAT with gene expression compared with skeletal muscle and SAT, and relationships to insulin sensitivity were determined in men and women spanning a wide range of insulin sensitivity measured by hyperinsulinemic-euglycemic clamp. Conditioned media from IMAT and VAT decreased insulin sensitivity similarly compared with SAT. Multidimensional scaling analysis revealed distinct gene expression patterns in IMAT compared with SAT and muscle. Pathway analysis revealed that IMAT expression of genes in insulin signaling, oxidative phosphorylation, and peroxisomal metabolism related positively to donor insulin sensitivity, whereas expression of macrophage markers, inflammatory cytokines, and secreted extracellular matrix proteins was negatively related to insulin sensitivity. Perilipin 5 gene expression suggested greater IMAT lipolysis in insulin-resistant individuals. Combined, these data show that factors secreted from IMAT modulate muscle insulin sensitivity, possibly via secretion of inflammatory cytokines and extracellular matrix proteins, and by increasing local FFA concentration in humans. These data suggest IMAT may be an important regulator of skeletal muscle insulin sensitivity and could be a novel therapeutic target for skeletal muscle insulin resistance.
Metabolomics is a relatively new high-throughput technology that aims at measuring all endogenous metabolites within a biological sample in an unbiased fashion. The resulting metabolic profiles may be regarded as functional signatures of the physiological state, and have been shown to comprise effects of genetic regulation as well as environmental factors. This potential to connect genotypic to phenotypic information promises new insights and biomarkers for different research fields, including biomedical and pharmaceutical research. In the statistical analysis of metabolomics data, many techniques from other omics fields can be reused. Recently, however, a number of tools specific to metabolomics data have been developed as well. The focus of this mini review will be on recent advancements in the analysis of metabolomics data, especially by utilizing Gaussian graphical models and independent component analysis.
Hematopoiesis is an ideal model system for stem cell biology with advanced experimental access. A systems view on the interactions of core transcription factors is important for understanding differentiation mechanisms and dynamics. In this manuscript, we construct a Boolean network to model myeloid differentiation, specifically from common myeloid progenitors to megakaryocytes, erythrocytes, granulocytes and monocytes. By interpreting the hematopoietic literature and translating experimental evidence into Boolean rules, we implement binary dynamics on the resulting 11-factor regulatory network. Our network contains interesting functional modules and a concatenation of mutual antagonistic pairs. The state space of our model is a hierarchical, acyclic graph, typifying the principles of myeloid differentiation. We observe excellent agreement between the steady states of our model and microarray expression profiles of two different studies. Moreover, perturbations of the network topology correctly reproduce reported knockout phenotypes in silico. We predict previously uncharacterized regulatory interactions and alterations of the differentiation process, and outline reprogramming strategies.
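The notions of Boolean rules, steady states and in silico knockouts can be made concrete with a toy sketch. The three factors X, Y and Z and their rules below are hypothetical and are not taken from the 11-factor network of the paper; they only mimic one mutually antagonistic pair with a downstream target. Synchronous updating is likewise an assumption of this sketch.

```python
from itertools import product

# Hypothetical toy rules (not the published network): X and Y repress each
# other while sustaining themselves, and Z is switched on by X.
RULES = {
    "X": lambda s: s["X"] and not s["Y"],
    "Y": lambda s: s["Y"] and not s["X"],
    "Z": lambda s: s["X"],
}


def step(state, knockout=None):
    """Apply all rules synchronously; a knocked-out factor is forced off."""
    nxt = {name: bool(rule(state)) for name, rule in RULES.items()}
    if knockout is not None:
        nxt[knockout] = False
    return nxt


def steady_states(knockout=None):
    """Enumerate every state that the update rules map onto itself."""
    names = sorted(RULES)
    found = []
    for values in product([False, True], repeat=len(names)):
        state = dict(zip(names, values))
        if step(state, knockout) == state:
            found.append(state)
    return found


wild_type = steady_states()
without_x = steady_states(knockout="X")
```

In this toy system the wild type has three steady states (all off, Y only, and X together with Z), and knocking out X removes the X/Z state, which is the kind of comparison one would make against expression profiles and knockout phenotypes.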
Background
Untargeted mass spectrometry (MS)-based metabolomics data often contain missing values that reduce statistical power and can introduce bias in biomedical studies. However, a systematic assessment of the various sources of missing values and strategies to handle these data has received little attention. Missing data can occur systematically, e.g. from run day-dependent effects due to limits of detection (LOD); or it can be random as, for instance, a consequence of sample preparation.
Methods
We investigated patterns of missing data in an MS-based metabolomics experiment of serum samples from the German KORA F4 cohort (n = 1750). We then evaluated 31 imputation methods in a simulation framework and biologically validated the results by applying all imputation approaches to real metabolomics data. We examined the ability of each method to reconstruct biochemical pathways from data-driven correlation networks, and the ability of the method to increase statistical power while preserving the strength of established metabolic quantitative trait loci.
Results
Run day-dependent LOD-based missing data accounts for most missing values in the metabolomics dataset. Although multiple imputation by chained equations performed well in many scenarios, it is computationally and statistically challenging. K-nearest neighbors (KNN) imputation on observations with variable pre-selection showed robust performance across all evaluation schemes and is computationally more tractable.
Conclusion
Missing data in untargeted MS-based metabolomics data occur for various reasons. Based on our results, we recommend that KNN-based imputation is performed on observations with variable pre-selection since it showed robust results in all evaluation schemes.
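A minimal sketch of KNN imputation on observations with variable pre-selection might look as follows. It is a simplified illustration, not the procedure evaluated in the study: the function names, the correlation threshold `min_abs_corr`, the use of complete samples only as donors, and the unscaled Euclidean distance are all assumptions. For each missing value, only variables correlated with the target variable enter the distance, and the value is filled with the mean over the k nearest donor samples.

```python
import math


def pearson(x, y):
    # Plain Pearson correlation; assumes neither input is constant.
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def knn_impute(rows, k=2, min_abs_corr=0.2):
    """Fill None entries in rows (samples x variables) from similar samples."""
    donors = [r for r in rows if None not in r]  # complete samples only
    columns = list(zip(*donors))
    filled = [list(r) for r in rows]
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            if value is not None:
                continue
            # Pre-selection: observed variables correlated with the target.
            use = [j for j, v in enumerate(row)
                   if v is not None
                   and abs(pearson(columns[c], columns[j])) >= min_abs_corr]
            if not use:
                continue  # nothing to compare on; leave the gap

            def distance(donor):
                return math.sqrt(sum((donor[j] - row[j]) ** 2 for j in use) / len(use))

            nearest = sorted(donors, key=distance)[:k]
            filled[r][c] = sum(d[c] for d in nearest) / len(nearest)
    return filled


demo = [
    [1.0, 2.0, 10.0],
    [2.0, 4.0, 20.0],
    [3.0, 6.0, 30.0],
    [4.0, 8.0, 40.0],
    [2.1, None, 21.0],
]
imputed = knn_impute(demo, k=2)
```

In practice the variables would usually be scaled before computing distances, so that metabolites with large numeric ranges do not dominate the neighbor search.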
With the advent of high-throughput targeted metabolic profiling techniques, the question of how to interpret and analyze the resulting vast amount of data becomes more and more important. In this work we address the reconstruction of metabolic reactions from cross-sectional metabolomics data, that is, without the requirement for time-resolved measurements or specific system perturbations. Previous studies in this area mainly focused on Pearson correlation coefficients, which, however, are generally incapable of distinguishing between direct and indirect metabolic interactions.
In our new approach we propose the application of a Gaussian graphical model (GGM), an undirected probabilistic graphical model estimating the conditional dependence between variables. GGMs are based on partial correlation coefficients, that is pairwise Pearson correlation coefficients conditioned against the correlation with all other metabolites. We first demonstrate the general validity of the method and its advantages over regular correlation networks with computer-simulated reaction systems. Then we estimate a GGM on data from a large human population cohort, covering 1020 fasting blood serum samples with 151 quantified metabolites. The GGM is much sparser than the correlation network, shows a modular structure with respect to metabolite classes, and is stable to the choice of samples in the data set. On the example of human fatty acid metabolism, we demonstrate for the first time that high partial correlation coefficients generally correspond to known metabolic reactions. This feature is evaluated both manually by investigating specific pairs of high-scoring metabolites, and then systematically on a literature-curated model of fatty acid synthesis and degradation. Our method detects many known reactions along with possibly novel pathway interactions, representing candidates for further experimental examination.
In summary, we demonstrate strong signatures of intracellular pathways in blood serum data, and provide a valuable tool for the unbiased reconstruction of metabolic reactions from large-scale metabolomics data sets.
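The difference between a Pearson correlation and a partial correlation can be illustrated on a simulated three-metabolite chain a → b → c. The sketch below is an assumption-laden toy, not the analysis from the paper: the data are random numbers, the names are hypothetical, and it uses the first-order partial correlation formula, which for exactly three variables coincides with the full-order partial correlation a GGM is based on.

```python
import math
import random


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after conditioning on z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Simulated chain: b depends on a, and c depends on b only.
rng = random.Random(0)
a = [rng.gauss(0, 1) for _ in range(5000)]
b = [v + rng.gauss(0, 1) for v in a]
c = [v + rng.gauss(0, 1) for v in b]

indirect_pearson = pearson(a, c)          # clearly non-zero despite no direct link
indirect_partial = partial_corr(a, c, b)  # close to zero once b is accounted for
direct_partial = partial_corr(a, b, c)    # stays clearly positive
```

With more than three metabolites, the same quantities are typically obtained for all pairs at once from the inverse of the covariance matrix rather than from the pairwise formula used here.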
Depression constitutes a leading cause of disability worldwide. Despite extensive research on its interaction with psychobiological factors, associated pathways are far from being elucidated. Metabolomics, assessing the final products of complex biochemical reactions, has emerged as a valuable tool for exploring molecular pathways. We conducted a metabolome-wide association analysis to investigate the link between the serum metabolome and depressed mood (DM) in 1411 participants of the KORA (Cooperative Health Research in the Augsburg Region) F4 study (discovery cohort). Serum metabolomics data comprised 353 unique metabolites measured by Metabolon. We identified 72 (5.1%) KORA participants with DM. Linear regression tests were conducted modeling each metabolite value by DM status, adjusted for age, sex, body-mass index, antihypertensive, cardiovascular, antidiabetic, and thyroid gland hormone drugs, corticoids and antidepressants. Sensitivity analyses were performed in subcohorts stratified for sex, suicidal ideation, and use of antidepressants. We replicated our results in an independent sample of 968 participants of the SHIP-Trend (Study of Health in Pomerania) study including 52 (5.4%) individuals with DM (replication cohort). We found significantly lower laurylcarnitine levels in KORA F4 participants with DM after multiple testing correction according to Benjamini/Hochberg. This finding was replicated in the independent SHIP-Trend study. Laurylcarnitine remained significantly associated (p value < 0.05) with depression in samples stratified for sex, suicidal ideation, and antidepressant medication. Decreased blood laurylcarnitine levels in depressed individuals may point to impaired fatty acid oxidation and/or mitochondrial function in depressive disorders, possibly representing a novel therapeutic target.
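The Benjamini/Hochberg correction applied across the per-metabolite tests can be sketched in a few lines. The function name and the example p-values below are made up for illustration and are not results from the study; the sketch only shows how raw p-values from many regressions are converted into adjusted values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini/Hochberg-adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-values get smaller.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw p-values for four hypothetical metabolites.
raw = [0.01, 0.04, 0.03, 0.5]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
```

Here only the first hypothetical metabolite stays below 0.05 after adjustment, even though three raw p-values were below that threshold.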