Gas chromatography-mass spectrometry (GC-MS)-based metabolomics is ideal for identifying and quantitating small-molecule metabolites (<650 Da), including small acids, alcohols, hydroxyl acids, amino acids, sugars, fatty acids, sterols, catecholamines, drugs, and toxins, often using chemical derivatization to make these compounds sufficiently volatile for gas chromatography. This unit shows how GC-MS-based metabolomics allows integration of targeted assays for absolute quantification of specific metabolites with untargeted metabolomics to discover novel compounds. Complemented by database annotations using large spectral libraries and validated standard operating procedures, GC-MS can identify and semiquantify over 200 compounds from human body fluids (e.g., plasma, urine, or stool) per study. Deconvolution software enables detection of more than 300 additional unidentified signals that can be annotated through accurate mass instruments with appropriate data processing workflows, similar to untargeted profiling using liquid chromatography-mass spectrometry. GC-MS is a mature technology that uses not only classic detectors (quadrupole) but also target mass spectrometers (triple quadrupole) and accurate mass instruments (quadrupole-time of flight). This unit covers sample preparation from mammalian samples, data acquisition, quality control, and data processing.
Blood chemicals are routinely measured in clinical or preclinical research studies to diagnose diseases, assess risks in epidemiological research, or use metabolomic phenotyping in response to treatments. A vast volume of blood-related literature is available via the PubMed database for data mining.
We aimed to generate a comprehensive blood exposome database of endogenous and exogenous chemicals associated with the mammalian circulating system through text mining and database fusion.
Using NCBI resources, we retrieved PubMed abstracts, PubChem chemical synonyms, and PMC supplementary tables. We then employed text mining and PubChem crowdsourcing to associate phrases relating to blood with PubChem chemicals. False positives were removed by a phrase pattern and a compound exclusion list.
A query to identify blood-related publications in the PubMed database yielded 1.1 million papers. Matching a total of 15 million synonyms from 6.5 million relevant PubChem chemicals against all blood-related publications yielded 37,514 chemicals and 851,999 publication records. Mapping PubChem compound identifiers to the PubMed database yielded 49,940 unique chemicals linked to 676,643 papers. Analysis of open-access metabolomics papers related to blood phrases in the PMC database yielded 4,039 unique compounds and 204 papers. Consolidating these three approaches summed up to a total of 41,474 achiral structures that were linked to 65,957 PubChem CIDs and to over 878,966 PubMed articles. We mapped these compounds to 50 databases such as those covering metabolites and pathways, governmental and toxicological databases, pharmacology resources, and bioassay repositories. In comparison, HMDB, the Human Metabolome Database, links 1,075 compounds to blood-related primary publications.
This new Blood Exposome Database can be used for prioritizing chemicals for systematic reviews, developing target assays in exposome research, identifying compounds in untargeted mass spectrometry, and biological interpretation in metabolomics data. The database is available at http://bloodexposome.org. https://doi.org/10.1289/EHP4713.
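The synonym-matching step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the blood phrase list, the placeholder chemical identifiers, and the function names `mentions` and `link_chemicals` are all assumptions made for illustration, and the exclusion list here stands in for the false-positive filters mentioned in the abstract.

```python
import re

# Assumed phrase list for this sketch; the abstract does not enumerate the phrases used.
BLOOD_PHRASES = ("blood", "plasma", "serum")


def mentions(text, phrase):
    """Return True if `phrase` occurs in `text` as a whole word (case-insensitive)."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def link_chemicals(abstracts, synonym_to_id, excluded_ids=frozenset()):
    """Map chemical identifiers to the blood-related papers that mention a synonym.

    abstracts: dict of paper id -> abstract text (hypothetical input shape)
    synonym_to_id: dict of synonym -> chemical identifier (placeholder ids)
    excluded_ids: identifiers dropped as likely false positives
    """
    links = {}
    for paper_id, text in abstracts.items():
        # Keep only papers that contain at least one blood-related phrase.
        if not any(mentions(text, phrase) for phrase in BLOOD_PHRASES):
            continue
        for synonym, chem_id in synonym_to_id.items():
            if chem_id in excluded_ids:
                continue
            if mentions(text, synonym):
                links.setdefault(chem_id, set()).add(paper_id)
    return links
```

With three toy abstracts, a paper that mentions caffeine but no blood phrase is skipped, and a compound on the exclusion list never appears in the result.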
Metabolomics answers a fundamental question in biology: How does metabolism respond to genetic, environmental or phenotypic perturbations? Combining several metabolomics assays can yield datasets for more than 800 structurally identified metabolites. However, biological interpretations of metabolic regulation in these datasets are hindered by inherent limits of pathway enrichment statistics. We have developed ChemRICH, a statistical enrichment approach that is based on chemical similarity rather than sparse biochemical knowledge annotations. ChemRICH utilizes structure similarity and chemical ontologies to map all known metabolites and name metabolic modules. Unlike pathway mapping, this strategy yields study-specific, non-overlapping sets of all identified metabolites. Subsequent enrichment statistics are superior to pathway enrichments because ChemRICH sets have a self-contained size where p-values do not rely on the size of a background database. We demonstrate ChemRICH's efficiency on a public metabolomics data set discerning the development of type 1 diabetes in a non-obese diabetic mouse model. ChemRICH is available at www.chemrich.fiehnlab.ucdavis.edu.
Gas chromatography coupled to mass spectrometry (GC-MS) is one of the most frequently used tools for profiling primary metabolites. Instruments are mature enough to run large sequences of samples; novel advancements increase the breadth of compounds that can be analyzed, and improved algorithms and databases are employed to capture and utilize biologically relevant information. Around half the published reports on metabolite profiling by GC-MS focus on biological problems rather than on methodological advances. Applications span from comprehensive analysis of volatiles to assessment of metabolic fluxes for bioengineering. Method improvements emphasize extraction procedures, evaluations of quality control of GC-MS in comparison to other techniques, and approaches to data processing. Two major challenges remain: rapid annotation of unknown peaks, and integration of biological background knowledge to aid data interpretation.
Metabolites are the end products of cellular regulatory processes, and their levels can be regarded as the ultimate response of biological systems to genetic or environmental changes. In parallel to the terms 'transcriptome' and 'proteome', the set of metabolites synthesized by a biological system constitutes its 'metabolome'. Yet, unlike other functional genomics approaches, the unbiased simultaneous identification and quantification of plant metabolomes has been largely neglected. Until recently, most analyses were restricted to profiling selected classes of compounds, or to fingerprinting metabolic changes without sufficient analytical resolution to determine metabolite levels and identities individually. As a prerequisite for metabolomic analysis, careful consideration must be given to the methods employed for tissue extraction, sample preparation, data acquisition, and data mining. In this review, the differences among metabolite target analysis, metabolite profiling, and metabolic fingerprinting are clarified, and terms are defined. Current approaches are examined, and potential applications are summarized with a special emphasis on data mining and mathematical modelling of metabolism.
• State of the art in mass spectrometry (MS)-fragmentation-based identification.
• Differentiation between MSn trees and fragmentation process trees.
• Analytical aspects include data acquisition, time requirements, and problems.
• Topics include data processing, software, open access versus commercial libraries.
• Highlights of recent MSn-tree studies.
Identification of unknown metabolites is the bottleneck in advancing metabolomics, leaving interpretation of metabolomics results ambiguous. The chemical diversity of metabolism is vast, making structure identification arduous and time consuming. Currently, comprehensive analysis of mass spectra in metabolomics is limited to library matching, but tandem mass spectral libraries are small compared to the large number of compounds found in the biosphere, including xenobiotics. Resolving this bottleneck requires richer data acquisition and better computational tools. Multi-stage mass spectrometry (MSn) trees show promise to aid in this regard. Fragmentation trees explore the fragmentation process, generate fragmentation rules and aid in sub-structure identification, while mass spectral trees delineate the dependencies in multi-stage MS of collision-induced dissociations. This review covers advancements over the past 10 years as a tool for metabolite identification, including algorithms, software and databases used to build and to implement fragmentation trees and mass spectral annotations.
Structure elucidation of unknown small molecules by mass spectrometry is a challenge despite advances in instrumentation. The first crucial step is to obtain correct elemental compositions. In order to automatically constrain the thousands of possible candidate structures, rules need to be developed to select the most likely and chemically correct molecular formulas.
An algorithm for filtering molecular formulas is derived from seven heuristic rules: (1) restrictions for the number of elements, (2) LEWIS and SENIOR chemical rules, (3) isotopic patterns, (4) hydrogen/carbon ratios, (5) element ratio of nitrogen, oxygen, phosphorus, and sulphur versus carbon, (6) element ratio probabilities and (7) presence of trimethylsilylated compounds. Formulas are ranked according to their isotopic patterns and subsequently constrained by presence in public chemical databases. The seven rules were developed on 68,237 existing molecular formulas and were validated in four experiments. First, 432,968 formulas covering five million PubChem database entries were checked for consistency. Only 0.6% of these compounds did not pass all rules. Next, the rules were shown to effectively reduce the complement of all eight billion theoretically possible C, H, N, S, O, P-formulas up to 2000 Da to only 623 million most probable elemental compositions. Thirdly, 6,000 pharmaceutical, toxic and natural compounds were selected from DrugBank, TSCA and DNP databases. The correct formulas were retrieved as top hit at 80-99% probability when assuming data acquisition with complete resolution of unique compounds and 5% absolute isotope ratio deviation and 3 ppm mass accuracy. Last, some exemplary compounds were analyzed by Fourier transform ion cyclotron resonance mass spectrometry and by gas chromatography-time of flight mass spectrometry. In each case, the correct formula was ranked as top hit when combining the seven rules with database queries.
The seven rules enable an automatic exclusion of molecular formulas which are either wrong or which contain unlikely high or low numbers of elements. The correct molecular formula is assigned with a probability of 98% if the formula exists in a compound database. For truly novel compounds that are not present in databases, the correct formula is found in the first three hits with a probability of 65-81%. Corresponding software and supplemental data are available for download from the authors' website.
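As a rough illustration of rules (4) and (5), the sketch below checks element-to-carbon ratios of a candidate formula against fixed limits. It covers only these two rules and is not the authors' software. The numeric limits in `RATIO_LIMITS` are illustrative assumptions (the abstract states no thresholds), and the helper names `parse_formula` and `passes_ratio_rules` are hypothetical.

```python
import re

# Illustrative element/carbon ratio limits, assumed for this sketch only.
RATIO_LIMITS = {
    "H": (0.2, 3.1),
    "N": (0.0, 1.3),
    "O": (0.0, 1.2),
    "P": (0.0, 0.3),
    "S": (0.0, 0.8),
}


def parse_formula(formula):
    """Parse a flat formula such as 'C6H12O6' into a dict of element counts.

    Parentheses and charges are not supported in this sketch.
    """
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def passes_ratio_rules(formula):
    """Return True if every element/carbon ratio lies within RATIO_LIMITS."""
    counts = parse_formula(formula)
    carbon = counts.get("C", 0)
    if carbon == 0:
        # Ratios versus carbon are undefined without carbon; reject in this sketch.
        return False
    for element, (low, high) in RATIO_LIMITS.items():
        ratio = counts.get(element, 0) / carbon
        if not (low <= ratio <= high):
            return False
    return True
```

Under these assumed limits, a formula such as C6H12O6 passes, while a formula with eight hydrogens on a single carbon is rejected. In a fuller workflow the surviving formulas would still need the remaining rules and the isotopic-pattern ranking described above.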
Liquid chromatography–mass spectrometry (LC–MS) methods are most often used for untargeted metabolomics and lipidomics. However, methods have not been standardized as accepted “best practice” documents, and reports lack harmonization with respect to quantitative data that enable interstudy comparisons. Researchers use a wide variety of high-resolution mass spectrometers under different operating conditions, and it is unclear if results would yield different biological conclusions depending on the instrument performance. To this end, we used 126 identical human plasma samples and 29 quality control samples from a nutritional intervention study. We investigated lipidomic data acquisitions across nine different MS instruments (1 single TOF, 1 Q/orbital ion trap, and 7 QTOF instruments). Sample preparations, chromatography conditions, and data processing methods were kept identical. Single-point internal standard calibrations were used to estimate absolute concentrations for 307 unique lipids identified by accurate mass, MS/MS spectral match, and retention times. Quantitative results were highly comparable between the LC–MS platforms tested. Using partial least-squares discriminant analysis (PLS-DA) to compare results between platforms, a 92% overlap for the most discriminating lipids based on variable importance in projection (VIP) scores was achieved for all lipids that were detected by at least two instrument platforms. Importantly, even the relative positions of individual samples on the PLS-DA projections were identical. The key for success in harmonizing results was to avoid ion saturation by carefully evaluating linear dynamic ranges using serial dilutions and adjusting the resuspension volume and/or injection volume before running actual study samples.
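A single-point internal standard calibration, in its simplest form, scales the known concentration of a spiked standard by the ratio of analyte to standard peak response. The sketch below assumes equal response factors for analyte and standard and a response within the linear dynamic range; the function name and the example numbers are hypothetical and not taken from the study.

```python
def estimate_concentration(analyte_area, istd_area, istd_concentration):
    """Estimate analyte concentration from one internal standard.

    Assumes the analyte and the internal standard respond equally and that
    both peaks lie within the linear dynamic range (no ion saturation).
    The result carries the same unit as `istd_concentration`.
    """
    if istd_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    return analyte_area / istd_area * istd_concentration
```

For example, an analyte peak half the size of a standard spiked at 10 units per volume would be estimated at 5 units per volume. This is also why the abstract stresses avoiding ion saturation: once either peak leaves the linear range, the area ratio no longer tracks the concentration ratio.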
Metabolomic studies are targeted at identifying and quantifying all metabolites in a given biological context. Among the tools used for metabolomic research, mass spectrometry is one of the most powerful tools. However, metabolomics by mass spectrometry always reveals a high number of unknown compounds which complicate in-depth mechanistic or biochemical understanding. In principle, mass spectrometry can be utilized within strategies of de novo structure elucidation of small molecules, starting with the computation of the elemental composition of an unknown metabolite using accurate masses with errors <5 ppm (parts per million). However, even with very high mass accuracy (<1 ppm) many chemically possible formulae are obtained in higher mass regions. In automatic routines an additional orthogonal filter therefore needs to be applied in order to reduce the number of potential elemental compositions. This report demonstrates the necessity of isotope abundance information by mathematical confirmation of the concept.
High mass accuracy (<1 ppm) alone is not enough to exclude enough candidates with complex elemental compositions (C, H, N, S, O, P, and potentially F, Cl, Br and Si). Use of isotopic abundance patterns as a single further constraint removes >95% of false candidates. This orthogonal filter can condense several thousand candidates down to only a small number of molecular formulas. Example calculations for 10, 5, 3, 1 and 0.1 ppm mass accuracy are given. Corresponding software scripts can be downloaded from http://fiehnlab.ucdavis.edu. A comparison of eight chemical databases revealed that PubChem and the Dictionary of Natural Products can be recommended for automatic queries using molecular formulae.
More than 1.6 million molecular formulae in the range 0-500 Da were generated in an exhaustive manner under strict observation of mathematical and chemical rules. Assuming that ion species are fully resolved (either by chromatography or by high resolution mass spectrometry), we conclude that a mass spectrometer capable of 3 ppm mass accuracy and 2% error for isotopic abundance patterns outperforms mass spectrometers with less than 1 ppm mass accuracy or even hypothetical mass spectrometers with 0.1 ppm mass accuracy that do not include isotope information in the calculation of molecular formulae.
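The idea of combining a mass-accuracy window with an isotope-abundance filter can be sketched as follows. This is an assumption-laden illustration, not the authors' scripts: candidates are given as precomputed tuples, the isotope tolerance is interpreted as absolute percentage points of the M+1 relative abundance, and all names and numbers are made up.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def filter_candidates(candidates, measured_mz, measured_m1, ppm_tol=3.0, iso_tol=2.0):
    """Keep candidates that match both the mass and the M+1 abundance.

    candidates: iterable of (name, theoretical_mz, theoretical_m1_percent)
    measured_m1: observed M+1 abundance in percent of the monoisotopic peak
    iso_tol: allowed absolute deviation in percentage points (an assumption)
    """
    kept = []
    for name, theoretical_mz, theoretical_m1 in candidates:
        mass_ok = abs(ppm_error(measured_mz, theoretical_mz)) <= ppm_tol
        isotope_ok = abs(measured_m1 - theoretical_m1) <= iso_tol
        if mass_ok and isotope_ok:
            kept.append(name)
    return kept
```

In this toy setting, two candidates that both fall inside a 3 ppm window can still be separated by their M+1 abundance, which mirrors the argument above that isotope information acts as an orthogonal filter.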