Abstract
The Human Metabolome Database or HMDB (https://hmdb.ca) has been providing comprehensive reference information about human metabolites and their associated biological, physiological and chemical properties since 2007. Over the past 15 years, the HMDB has grown and evolved significantly to meet the needs of the metabolomics community and respond to continuing changes in internet and computing technology. This year's update, HMDB 5.0, brings a number of important improvements and upgrades to the database. These should make the HMDB more useful and more appealing to a larger cross-section of users. In particular, these improvements include: (i) a significant increase in the number of metabolite entries (from 114 100 to 217 920 compounds); (ii) enhancements to the quality and depth of metabolite descriptions; (iii) the addition of new structure, spectral and pathway visualization tools; (iv) the inclusion of many new and much more accurately predicted spectral data sets, including predicted NMR spectra, more accurately predicted MS spectra, predicted retention indices and predicted collision cross section data; and (v) enhancements to the HMDB’s search functions to facilitate better compound identification. Many other minor improvements and updates to the content, the interface, and general performance of the HMDB website have also been made. Overall, we believe these upgrades and updates should greatly enhance the HMDB’s ease of use and its potential applications not only in human metabolomics but also in exposomics, lipidomics, nutritional science, biochemistry and clinical chemistry.
Abstract
BioTransformer 3.0 (https://biotransformer.ca) is a freely available web server that supports accurate, rapid and comprehensive in silico metabolism prediction. It combines machine learning approaches with a rule-based system to predict small-molecule metabolism in human tissues, the human gut as well as the external environment (soil and water microbiota). Simply stated, BioTransformer takes a molecular structure as input (SMILES or SDF) and outputs an interactively sortable table of the predicted metabolites or transformation products (SMILES, PNG images) along with the enzymes that are predicted to be responsible for those reactions and richly annotated downloadable files (CSV and JSON). The entire process typically takes less than a minute. Previous versions of BioTransformer focused exclusively on predicting the metabolism of xenobiotics (such as plant natural products, drugs, cosmetics and other synthetic compounds) using a limited number of pre-defined steps and somewhat limited rule-based methods. BioTransformer 3.0 uses much more sophisticated methods and incorporates new databases, new constraints and new prediction modules to not only more accurately predict the metabolic transformation products of exogenous xenobiotics but also the transformation products of endogenous metabolites, such as amino acids, peptides, carbohydrates, organic acids, and lipids. BioTransformer 3.0 can also support customized sequential combinations of these transformations along with multiple iterations to simulate multi-step human biotransformation events. Performance tests indicate that BioTransformer 3.0 is 40–50% more accurate, far less prone to combinatorial ‘explosions’ and much more comprehensive in terms of metabolite coverage/capabilities than previous versions of BioTransformer.
Graphical Abstract
Synopsis of BioTransformer 3.0 functions.
Abstract
PathBank (www.pathbank.org) is a new, comprehensive, visually rich pathway database containing more than 110 000 machine-readable pathways found in 10 model organisms (Homo sapiens, Bos taurus, Rattus norvegicus, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, Saccharomyces cerevisiae, Escherichia coli and Pseudomonas aeruginosa). PathBank aims to provide a pathway for every protein and a map for every metabolite. This resource is designed specifically to support pathway elucidation and pathway discovery in transcriptomics, proteomics, metabolomics and systems biology. It provides detailed, fully searchable, hyperlinked diagrams of metabolic, metabolite signaling, protein signaling, disease, drug and physiological pathways. All PathBank pathways include information on the relevant organs, organelles, subcellular compartments, cofactors, molecular locations, chemical structures and protein quaternary structures. Each small molecule is hyperlinked to the rich data contained in public chemical databases such as HMDB or DrugBank and each protein or enzyme complex is hyperlinked to UniProt. All PathBank pathways are accompanied by references and detailed descriptions which provide an overview of the pathway, condition or processes depicted in each diagram. Every PathBank pathway is downloadable in several machine-readable and image formats including BioPAX, SBML, PWML, SBGN, RXN, PNG and SVG. PathBank also supports community annotations and submissions through the web-based PathWhiz pathway illustrator. The vast majority of PathBank's pathways (>95%) are not found in any other public pathway database.
Abstract
The CFM-ID 4.0 web server (https://cfmid.wishartlab.com) is an online tool for predicting, annotating and interpreting tandem mass (MS/MS) spectra of small molecules. It is specifically designed to assist researchers pursuing studies in metabolomics, exposomics and analytical chemistry. More specifically, CFM-ID 4.0 supports the: 1) prediction of electrospray ionization quadrupole time-of-flight tandem mass spectra (ESI-QTOF-MS/MS) for small molecules over multiple collision energies (10 eV, 20 eV, and 40 eV); 2) annotation of ESI-QTOF-MS/MS spectra given the structure of the compound; and 3) identification of a small molecule that generated a given ESI-QTOF-MS/MS spectrum at one or more collision energies. The CFM-ID 4.0 web server makes use of a substantially improved MS fragmentation algorithm, a much larger database of experimental and in silico predicted MS/MS spectra and improved scoring methods to offer more accurate MS/MS spectral prediction and MS/MS-based compound identification. Compared to earlier versions of CFM-ID, this new version has an MS/MS spectral prediction performance that is ∼22% better and a compound identification accuracy that is ∼35% better on a standard (CASMI 2016) testing dataset. CFM-ID 4.0 also features a neutral loss function that allows users to identify similar or substituent compounds where no match can be found using CFM-ID’s regular MS/MS-to-compound identification utility. Finally, the CFM-ID 4.0 web server now offers a much more refined user interface that is easier to use, supports molecular formula identification (from MS/MS data), provides more interactively viewable data (including proposed fragment ion structures) and displays MS mirror plots for comparing predicted with observed MS/MS spectra. These improvements should make CFM-ID 4.0 much more useful to the community and should make small molecule identification much easier, faster, and more accurate.
Graphical Abstract
Illustration of the two main functions supported by CFM-ID 4.0: predicting MS/MS spectra from chemical structures (top) and predicting chemical structures from MS/MS spectra (bottom).
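The comparison of predicted with observed MS/MS spectra mentioned in the CFM-ID abstract can be illustrated with a simple binned cosine similarity. This is only a minimal sketch under stated assumptions: the abstract does not describe CFM-ID's scoring methods, so the function names, the 0.01 m/z bin width and the peak lists below are all hypothetical and chosen purely for illustration.

```python
import math


def bin_spectrum(peaks, bin_width=0.01):
    """Sum the intensities of (m/z, intensity) peaks that share an m/z bin."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)  # integer bin index
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(spec_a, spec_b, bin_width=0.01):
    """Cosine similarity of two peak lists after binning (0.0 if either is empty)."""
    a = bin_spectrum(spec_a, bin_width)
    b = bin_spectrum(spec_b, bin_width)
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical peak lists (made-up values, not taken from any database).
predicted = [(85.03, 40.0), (127.04, 100.0), (145.05, 60.0)]
observed = [(85.03, 35.0), (127.04, 100.0), (163.06, 20.0)]

print(f"similarity: {cosine_similarity(predicted, observed):.3f}")
```

A score near 1 means the two spectra share most of their intensity in the same m/z bins; a score of 0 means no shared bins. Fixed-width binning is the simplest choice here, and a tolerance-based peak matcher would be a reasonable alternative.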
Abstract
The Natural Products Magnetic Resonance Database (NP-MRD) is a comprehensive, freely available electronic resource for the deposition, distribution, searching and retrieval of nuclear magnetic resonance (NMR) data on natural products, metabolites and other biologically derived chemicals. NMR spectroscopy has long been viewed as the ‘gold standard’ for the structure determination of novel natural products and novel metabolites. NMR is also widely used in natural product dereplication and the characterization of biofluid mixtures (metabolomics). All of these NMR applications require large collections of high quality, well-annotated, referential NMR spectra of pure compounds. Unfortunately, referential NMR spectral collections for natural products are quite limited. It is because of the critical need for dedicated, open access natural product NMR resources that the NP-MRD was funded by the National Institutes of Health (NIH). Since its launch in 2020, the NP-MRD has grown quickly to become the world's largest repository for NMR data on natural products and other biological substances. It currently contains both structural and NMR data for nearly 41,000 natural product compounds from >7400 different living species. All structural, spectroscopic and descriptive data in the NP-MRD is interactively viewable, searchable and fully downloadable in multiple formats. Extensive hyperlinks to other databases of relevance are also provided. The NP-MRD also supports community deposition of NMR assignments and NMR spectra (1D and 2D) of natural products and related meta-data. The deposition system performs extensive data enrichment, automated data format conversion and spectral/assignment evaluation. Details of these database features, how they are implemented and plans for future upgrades are also provided. The NP-MRD is available at https://np-mrd.org.
Maternal pathological conditions such as infections and chronic diseases, along with unexpected events during labor, can lead to life-threatening perinatal outcomes. These outcomes can have irreversible consequences throughout an individual's entire life. Urinary metabolomics can provide valuable insights into early physiological adaptations in healthy newborns, as well as metabolic disturbances in premature infants or infants with birth complications. In the present study, we measured 180 metabolites and metabolite ratios in the urine of 13 healthy (hospital-discharged) and 38 critically ill newborns (admitted to the neonatal intensive care unit (NICU)). We used an in-house-developed targeted tandem mass spectrometry (MS/MS)-based metabolomic assay (TMIC Mega) combining liquid chromatography (LC-MS/MS) and flow injection analysis (FIA-MS/MS) to quantitatively analyze up to 26 classes of compounds. Average urinary concentrations (and ranges) for 167 different metabolites from 38 critically ill NICU newborns during their first 24 h of life were determined. Similar sets of urinary values were determined for the 13 healthy newborns. These reference data have been uploaded to the Human Metabolome Database. Urinary concentrations and ranges of 37 metabolites are reported for the first time for newborns. Significant differences were found in the urinary levels of 44 metabolites between healthy newborns and those admitted to the NICU. Metabolites such as acylcarnitines, amino acids and derivatives, biogenic amines, sugars, and organic acids are dysregulated in newborns with bronchopulmonary dysplasia (BPD), asphyxia, or newborns exposed to SARS-CoV-2 during the intrauterine period. Urine can serve as a valuable source of information for understanding metabolic alterations associated with life-threatening perinatal outcomes.
Acylcarnitines are fatty acid metabolites that play important roles in many cellular energy metabolism pathways. They have historically been used as important diagnostic markers for inborn errors of fatty acid oxidation and are being intensively studied as markers of energy metabolism, deficits in mitochondrial and peroxisomal β-oxidation activity, insulin resistance, and physical activity. Acylcarnitines are increasingly being identified as important indicators in metabolic studies of many diseases, including metabolic disorders, cardiovascular diseases, diabetes, depression, neurologic disorders, and certain cancers. The US Food and Drug Administration-approved drug L-carnitine, along with short-chain acylcarnitines (acetylcarnitine and propionylcarnitine), is now widely used as a dietary supplement. In light of their growing importance, we have undertaken an extensive review of acylcarnitines and provided a detailed description of their identity, nomenclature, classification, biochemistry, pathophysiology, supplementary use, potential drug targets, and clinical trials. We also summarize these updates in the Human Metabolome Database, which now includes information on the structures, chemical formulae, chemical/spectral properties, descriptions, and pathways for 1240 acylcarnitines. This work lays a solid foundation for identifying, characterizing, and understanding acylcarnitines in human biosamples. We also discuss the emerging opportunities for using acylcarnitines as biomarkers and as dietary interventions or supplements for many wide-ranging indications. The opportunity to identify new drug targets involved in controlling acylcarnitine levels is also discussed.
Significance Statement: This review provides a comprehensive overview of acylcarnitines, including their nomenclature, structure and biochemistry, and use as disease biomarkers and pharmaceutical agents. We present updated information contained in the Human Metabolome Database website as well as substantial mapping of the known biochemical pathways associated with acylcarnitines, thereby providing a strong foundation for further clarification of their physiological roles.
Abstract
MarkerDB is a freely available electronic database that attempts to consolidate information on all known clinical and a selected set of pre-clinical molecular biomarkers into a single resource. The database includes four major types of molecular biomarkers (chemical, protein, DNA genetic and karyotypic) and four biomarker categories (diagnostic, predictive, prognostic and exposure). MarkerDB provides information such as: biomarker names and synonyms, associated conditions or pathologies, detailed disease descriptions, detailed biomarker descriptions, biomarker specificity, sensitivity and ROC curves, standard reference values (for protein and chemical markers), variants (for SNP or genetic markers), sequence information (for genetic and protein markers), molecular structures (for protein and chemical markers), tissue or biofluid sources (for protein and chemical markers), chromosomal location and structure (for genetic and karyotype markers), clinical approval status and relevant literature references. Users can browse the data by conditions, condition categories, biomarker types, biomarker categories or search by sequence similarity through the advanced search function. Currently, the database contains 142 protein biomarkers, 1089 chemical biomarkers, 154 karyotype biomarkers and 26 374 genetic markers. These are categorized into 25 560 diagnostic biomarkers, 102 prognostic biomarkers, 265 exposure biomarkers and 6746 predictive biomarkers or biomarker panels. Collectively, these markers can be used to detect, monitor or predict 670 specific human conditions which are grouped into 27 broad condition categories. MarkerDB is available at https://markerdb.ca.
Abstract
PHASTEST (PHAge Search Tool with Enhanced Sequence Translation) is the successor to the PHAST and PHASTER prophage finding web servers. PHASTEST is designed to support the rapid identification, annotation and visualization of prophage sequences within bacterial genomes and plasmids. PHASTEST also supports rapid annotation and interactive visualization of all other genes (protein coding regions, tRNA/tmRNA/rRNA sequences) in bacterial genomes. Given that bacterial genome sequencing has become so routine, the need for fast tools to comprehensively annotate bacterial genomes has become progressively more important. PHASTEST not only offers faster and more accurate prophage annotations than its predecessors, it also provides more complete whole genome annotations and much improved genome visualization capabilities. In standardized tests, we found that PHASTEST is 31% faster and 2–3% more accurate in prophage identification than PHASTER. Specifically, PHASTEST can process a typical bacterial genome in 3.2 min (raw sequence) or in 1.3 min when given a pre-annotated GenBank file. Improvements in PHASTEST’s ability to annotate bacterial genomes now make it a particularly powerful tool for whole genome annotation. In addition, PHASTEST now offers a much more modern and responsive visualization interface that allows users to generate, edit, annotate and interactively visualize (via zooming, rotating, dragging, panning, resetting), colourful, publication quality genome maps. PHASTEST continues to offer popular options such as an API for programmatic queries, a Docker image for local installations, support for multiple (metagenomic) queries and the ability to perform automated look-ups against thousands of previously PHAST-annotated bacterial genomes. PHASTEST is available online at https://phastest.ca.
Graphical Abstract
GCMS-ID (Gas Chromatography Mass Spectrometry compound IDentifier) is a webserver designed to enable the identification of compounds from GC-MS experiments. GC-MS instruments produce both electron impact mass spectra (EI-MS) and retention index (RI) data for as few as one, to as many as hundreds of different compounds. Matching the measured EI-MS, RI or EI-MS + RI data to experimentally collected EI-MS and/or RI reference libraries allows facile compound identification. However, the number of available experimental RI and EI-MS reference spectra, especially for metabolomics or exposomics-related studies, is disappointingly small. Using machine learning to accurately predict the EI-MS spectra and/or RIs for millions of metabolomics and/or exposomics-relevant compounds could (partially) solve this spectral matching problem. This computational approach to compound identification is called in silico metabolomics. GCMS-ID brings this concept of in silico metabolomics closer to reality by intelligently integrating two of our previously published webservers: CFM-EI and RIpred. CFM-EI is an EI-MS spectral prediction webserver, and RIpred is a Kovats RI prediction webserver. We have found that GCMS-ID can accurately identify compounds from experimental RI, EI-MS or RI + EI-MS data through matching to its own large library of >1 million predicted RI/EI-MS values generated for metabolomics/exposomics-relevant compounds. GCMS-ID can also predict the RI or EI-MS spectrum from a user-submitted structure or annotate a user-submitted EI-MS spectrum. GCMS-ID is freely available at https://gcms-id.ca/.
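The RI + EI-MS matching idea described in the GCMS-ID abstract can be sketched as a two-stage search: first keep only library entries whose retention index lies within a tolerance window of the query, then rank the survivors by spectral similarity. This is a hedged illustration, not GCMS-ID's actual algorithm, which the abstract does not specify. The function names, the 20-unit RI tolerance, the cosine scoring and the three library entries are all assumptions made up for the example.

```python
import math


def cosine(spec_a, spec_b):
    """Cosine similarity of two spectra given as {nominal m/z: intensity} dicts."""
    dot = sum(inten * spec_b.get(mz, 0.0) for mz, inten in spec_a.items())
    norm_a = math.sqrt(sum(i * i for i in spec_a.values()))
    norm_b = math.sqrt(sum(i * i for i in spec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query_ri, query_spectrum, library, ri_tolerance=20.0):
    """Filter library entries by RI window, then sort by spectral similarity."""
    hits = []
    for entry in library:
        if abs(entry["ri"] - query_ri) > ri_tolerance:
            continue  # retention index too far from the query
        score = cosine(query_spectrum, entry["spectrum"])
        hits.append((score, entry["name"]))
    return sorted(hits, reverse=True)  # best score first


# Hypothetical library entries with made-up RI values and spectra.
library = [
    {"name": "compound_A", "ri": 1200.0, "spectrum": {43: 100.0, 57: 80.0, 71: 40.0}},
    {"name": "compound_B", "ri": 1210.0, "spectrum": {43: 20.0, 65: 30.0, 91: 100.0}},
    {"name": "compound_C", "ri": 1500.0, "spectrum": {43: 100.0, 57: 80.0, 71: 40.0}},
]

query_spectrum = {43: 95.0, 57: 85.0, 71: 35.0}
for score, name in rank_candidates(1205.0, query_spectrum, library):
    print(f"{name}: {score:.3f}")
```

In this toy library, compound_C has the same spectrum as compound_A but is excluded because its retention index falls outside the window, which shows why combining RI with EI-MS data can separate candidates that spectra alone cannot.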