Abstract
The Human Metabolome Database or HMDB (https://hmdb.ca) has been providing comprehensive reference information about human metabolites and their associated biological, physiological and chemical properties since 2007. Over the past 15 years, the HMDB has grown and evolved significantly to meet the needs of the metabolomics community and respond to continuing changes in internet and computing technology. This year's update, HMDB 5.0, brings a number of important improvements and upgrades to the database. These should make the HMDB more useful and more appealing to a larger cross-section of users. In particular, these improvements include: (i) a significant increase in the number of metabolite entries (from 114 100 to 217 920 compounds); (ii) enhancements to the quality and depth of metabolite descriptions; (iii) the addition of new structure, spectral and pathway visualization tools; (iv) the inclusion of many new and much more accurately predicted spectral data sets, including predicted NMR spectra, more accurately predicted MS spectra, predicted retention indices and predicted collision cross section data and (v) enhancements to the HMDB’s search functions to facilitate better compound identification. Many other minor improvements and updates to the content, the interface, and general performance of the HMDB website have also been made. Overall, we believe these upgrades and updates should greatly enhance the HMDB’s ease of use and its potential applications not only in human metabolomics but also in exposomics, lipidomics, nutritional science, biochemistry and clinical chemistry.
Abstract
BioTransformer 3.0 (https://biotransformer.ca) is a freely available web server that supports accurate, rapid and comprehensive in silico metabolism prediction. It combines machine learning approaches with a rule-based system to predict small-molecule metabolism in human tissues, the human gut as well as the external environment (soil and water microbiota). Simply stated, BioTransformer takes a molecular structure as input (SMILES or SDF) and outputs an interactively sortable table of the predicted metabolites or transformation products (SMILES, PNG images) along with the enzymes that are predicted to be responsible for those reactions and richly annotated downloadable files (CSV and JSON). The entire process typically takes less than a minute. Previous versions of BioTransformer focused exclusively on predicting the metabolism of xenobiotics (such as plant natural products, drugs, cosmetics and other synthetic compounds) using a limited number of pre-defined steps and somewhat limited rule-based methods. BioTransformer 3.0 uses much more sophisticated methods and incorporates new databases, new constraints and new prediction modules to not only more accurately predict the metabolic transformation products of exogenous xenobiotics but also the transformation products of endogenous metabolites, such as amino acids, peptides, carbohydrates, organic acids, and lipids. BioTransformer 3.0 can also support customized sequential combinations of these transformations along with multiple iterations to simulate multi-step human biotransformation events. Performance tests indicate that BioTransformer 3.0 is 40–50% more accurate, far less prone to combinatorial ‘explosions’ and much more comprehensive in terms of metabolite coverage/capabilities than previous versions of BioTransformer.
Graphical Abstract
Synopsis of BioTransformer 3.0 functions.
Abstract
The CFM-ID 4.0 web server (https://cfmid.wishartlab.com) is an online tool for predicting, annotating and interpreting tandem mass (MS/MS) spectra of small molecules. It is specifically designed to assist researchers pursuing studies in metabolomics, exposomics and analytical chemistry. More specifically, CFM-ID 4.0 supports: 1) the prediction of electrospray ionization quadrupole time-of-flight tandem mass spectra (ESI-QTOF-MS/MS) for small molecules over multiple collision energies (10 eV, 20 eV, and 40 eV); 2) the annotation of ESI-QTOF-MS/MS spectra given the structure of the compound; and 3) the identification of a small molecule that generated a given ESI-QTOF-MS/MS spectrum at one or more collision energies. The CFM-ID 4.0 web server makes use of a substantially improved MS fragmentation algorithm, a much larger database of experimental and in silico predicted MS/MS spectra and improved scoring methods to offer more accurate MS/MS spectral prediction and MS/MS-based compound identification. Compared to earlier versions of CFM-ID, this new version has an MS/MS spectral prediction performance that is ∼22% better and a compound identification accuracy that is ∼35% better on a standard (CASMI 2016) testing dataset. CFM-ID 4.0 also features a neutral loss function that allows users to identify similar or substituent compounds where no match can be found using CFM-ID’s regular MS/MS-to-compound identification utility. Finally, the CFM-ID 4.0 web server now offers a much more refined user interface that is easier to use, supports molecular formula identification (from MS/MS data), provides more interactively viewable data (including proposed fragment ion structures) and displays MS mirror plots for comparing predicted with observed MS/MS spectra. These improvements should make CFM-ID 4.0 much more useful to the community and should make small molecule identification much easier, faster, and more accurate.
Graphical Abstract
Illustration of the two main functions supported by CFM-ID 4.0: predicting MS/MS spectra from chemical structures (top) and predicting chemical structures from MS/MS spectra (bottom).
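The neutral loss idea mentioned in the CFM-ID 4.0 abstract can be illustrated with a small sketch. A neutral loss is the difference between the precursor m/z and a fragment m/z; two compounds that share a substructure tend to share some of these differences even when their fragment masses differ. The code below is only an illustration under stated assumptions: the function names, the 0.01 tolerance and all m/z values are hypothetical and are not taken from CFM-ID.

```python
def neutral_losses(precursor_mz, fragment_mzs):
    """Return sorted neutral losses (precursor m/z minus fragment m/z)."""
    return sorted(
        round(precursor_mz - mz, 4)
        for mz in fragment_mzs
        if mz < precursor_mz  # ignore peaks at or above the precursor
    )


def shared_losses(losses_a, losses_b, tol=0.01):
    """Losses from spectrum A that also occur in spectrum B within `tol`."""
    return [
        la for la in losses_a
        if any(abs(la - lb) <= tol for lb in losses_b)
    ]


# Made-up spectra for two hypothetical, structurally related compounds.
losses_query = neutral_losses(200.0, [182.0, 156.0])
losses_reference = neutral_losses(230.0, [212.0, 186.0, 100.0])

common = shared_losses(losses_query, losses_reference)
print(losses_query)      # losses of the query spectrum
print(losses_reference)  # losses of the reference spectrum
print(common)            # losses present in both
```

Here the fragment masses of the two spectra do not overlap at all, yet both show losses of 18 and 44, which is the kind of signal a neutral loss search can exploit when a direct spectral match fails.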
Surface defects are one of the most important factors affecting the quality of steel plates. It is of great importance to detect such defects through online surface inspection systems, whose ability to identify defects comes from learning from training samples. Extreme Learning Machine (ELM) is a fast machine learning algorithm with a high accuracy of identification. ELM builds its hidden-layer matrix from randomly initialized parameters, and different parameters usually result in different performance. To solve this problem, an improved ELM algorithm combined with a Genetic Algorithm was proposed and applied for the surface defect identification of hot rolled steel plates. The output matrix of the ELM’s hidden layers was treated as a chromosome, and some novel iteration rules were added. The algorithm was tested with 1675 samples of hot rolled steel plates, including pockmarks, chaps, scars, longitudinal cracks, longitudinal scratches, scales, transverse cracks, transverse scratches, and roll marks. The results showed that the highest identification accuracies for the training and the testing set obtained by the G-ELM (Genetic Extreme Learning Machine) algorithm were 98.46% and 94.30%, respectively, which were about 5% higher than those obtained by the ELM algorithm.
In silico metabolism prediction is a cheminformatic task of autonomously predicting the set of metabolic byproducts produced from a specified molecule and a set of enzymes or reactions. Here, we describe a novel machine learned in silico cytochrome P450 (CYP450) metabolism prediction suite, called CyProduct, that accurately predicts metabolic byproducts for a specified molecule and a human CYP450 isoform. It includes three modules: (1) CypReact, a tool that predicts if the query compound reacts with a given CYP450 enzyme, (2) CypBoM, a tool that accurately predicts the “bond site” of the reaction (i.e., which specific bonds within the query molecule react with the CYP isoform), and (3) MetaboGen, a tool that generates the metabolic byproducts based on CypBoM’s bond-site prediction. CyProduct predicts metabolic biotransformation products for each of the nine most important human CYP450 enzymes. CypBoM uses an important new concept called “bond of metabolism” (BoM), which extends the traditional “site of metabolism” (SoM) by specifying the information about the set of chemical bonds that is modified or formed in a metabolic reaction (rather than the specific atom). We created a BoM database for 1845 CYP450-mediated Phase I reactions, then used this to train the CypBoM Predictor to predict the reactive bond locations on substrate molecules. CypBoM Predictor’s cross-validated Jaccard score for reactive bond prediction ranged from 0.380 to 0.452 over the nine CYP450 enzymes. Over variants of a test set of 68 known CYP450 substrates and 30 nonreactants, CyProduct outperformed the other packages, including ADMET Predictor, BioTransformer, and GLORY, by an average of 200% (with respect to Jaccard score) in terms of predicting metabolites. The CyProduct suite and the data sets are freely available at https://bitbucket.org/wishartlab/cyproduct/src/master/.
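The Jaccard score used to evaluate reactive bond prediction is the size of the intersection of the predicted and true sets divided by the size of their union. The sketch below shows that calculation for sets of bonds. It assumes, purely for illustration, that a bond is represented as an unordered pair of atom indices; the `bond` helper, the atom indices and the convention for two empty sets are hypothetical and do not come from CyProduct.

```python
def jaccard_score(predicted, actual):
    """Jaccard index |A ∩ B| / |A ∪ B| between two collections of bonds."""
    predicted, actual = set(predicted), set(actual)
    union = predicted | actual
    if not union:
        # Assumed convention: two empty sets count as perfect agreement.
        return 1.0
    return len(predicted & actual) / len(union)


def bond(atom_a, atom_b):
    """Hypothetical bond representation: an unordered pair of atom indices."""
    return frozenset((atom_a, atom_b))


# Made-up example: three predicted reactive bonds, two true reactive bonds.
predicted_boms = {bond(1, 2), bond(4, 5), bond(7, 8)}
actual_boms = {bond(2, 1), bond(4, 5)}

score = jaccard_score(predicted_boms, actual_boms)
print(round(score, 3))  # 2 shared bonds out of 3 distinct bonds
```

Because the score penalizes both missed bonds and spurious ones, a predictor that over-predicts reactive bonds is not rewarded for doing so.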
Abstract
This paper examines the importance of institutional contexts in cross‐border buyout exit success. After tracking 2639 cross‐border buyout investments during 1998–2007 in 38 countries and regions as of 2016, I find that the higher the institutional quality of the country where the portfolio company is located, the higher the probability of a successful exit via IPO or M&A. The larger the institutional distance between the portfolio company country and the private equity (PE) firm country, the lower the exit success, while PE firms' international experience, industrial experience, and reputation help improve exit success. Further, their industrial experience and the establishment of a local office mitigate the adverse effects of institutional distance.
In the field of metabolomics, mass spectrometry (MS) is the method most commonly used for identifying and annotating metabolites. As this typically involves matching a given MS spectrum against an experimentally acquired reference spectral library, this approach is limited by the coverage and size of such libraries (which typically number in the thousands). These experimental libraries can be greatly extended by predicting the MS spectra of known chemical structures (which number in the millions) to create computational reference spectral libraries. To facilitate the generation of predicted spectral reference libraries, we developed CFM-ID, a computer program that can accurately predict the ESI-MS/MS spectrum for a given compound structure. CFM-ID is one of the best-performing methods for compound-to-mass-spectrum prediction and also one of the top tools for in silico mass-spectrum-to-compound identification. This work improves CFM-ID’s ability to predict ESI-MS/MS spectra from compounds by (1) learning parameters from features based on the molecular topology, (2) adding a new approach to ring cleavage that models such cleavage as a sequence of simple chemical bond dissociations, and (3) expanding its hand-written rule-based predictor to cover more chemical classes, including acylcarnitines, acylcholines, flavonols, flavones, flavanones, and flavonoid glycosides. We demonstrate that this new version of CFM-ID (version 4.0) is significantly more accurate than previous CFM-ID versions in terms of both ESI-MS/MS spectral prediction and compound identification. CFM-ID 4.0 is available at http://cfmid4.wishartlab.com/ as a web server and Docker images can be downloaded at https://hub.docker.com/r/wishartlab/cfmid.
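Matching a query spectrum against a reference library, whether experimental or predicted, can be sketched as ranking library entries by a similarity score. The example below uses a plain binned cosine similarity, which is a common generic choice and is not claimed to be the scoring function used by CFM-ID. The function names, the 0.01 bin width, the candidate names and all peak values are assumptions made up for illustration.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=0.01):
    """Sum intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[round(mz / bin_width)] += intensity
    return binned


def cosine_similarity(spec_a, spec_b, bin_width=0.01):
    """Cosine of the angle between two binned intensity vectors (0 to 1)."""
    a = bin_spectrum(spec_a, bin_width)
    b = bin_spectrum(spec_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query, library):
    """Return (name, score) pairs sorted from best to worst match."""
    scores = {name: cosine_similarity(query, spec) for name, spec in library.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up query spectrum and a two-entry hypothetical reference library.
query = [(60.08, 30.0), (85.03, 100.0), (144.10, 55.0)]
library = {
    "candidate_A": [(60.08, 30.0), (85.03, 100.0), (144.10, 55.0)],
    "candidate_B": [(60.08, 10.0), (103.04, 100.0), (162.11, 40.0)],
}

ranking = rank_candidates(query, library)
for name, score in ranking:
    print(name, round(score, 3))
```

The same ranking loop applies whether the library spectra were measured or predicted in silico, which is why extending the library with predicted spectra directly extends the set of compounds that can be identified.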
One year after identifying the first case of the 2019 coronavirus disease (COVID-19) in Canada, federal and provincial governments are still struggling to manage the pandemic. Provincial governments across Canada have experimented with widely varying policies in order to limit the burden of COVID-19. However, to date, the effectiveness of these policies has been difficult to ascertain. This is partly due to the lack of a publicly available, high-quality dataset on COVID-19 interventions and outcomes for Canada. The present paper provides a dataset containing important, Canadian-specific data that is known to affect COVID-19 outcomes, including sociodemographic, climatic, mobility and health system related information for all 10 Canadian provinces and their health regions. This dataset also includes longitudinal data on the daily number of COVID-19 cases, deaths, and the constantly changing intervention policies that have been implemented by each province in an attempt to control the pandemic.
We exploit the public enforcement of the anti-corruption campaign across China to identify a causal role of political corruption in corporate takeover flows through a difference-in-differences (DID) analysis. We find that a reduction in corruption increases cross-region takeover activities by 40% and that deal volume more than doubles. Further analyses reveal that treatment effects are more evident for non-SOEs, politically unconnected acquirers, and acquirers that are less corrupt ex ante. We also show that the impact of the anti-corruption campaign is more pronounced in segmented cities where corruption practices are more entrenched. The reduction in corruption leads to higher bidder returns, improves post-acquisition performance, and markedly strengthens local economic development. The evidence indicates that the anti-corruption campaign was effective in attracting inbound corporate investments and supporting economic growth.
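The logic of a difference-in-differences estimate can be shown with the textbook two-group, two-period case: the change in the treated group minus the change in the control group. The sketch below is only that canonical calculation; it is not the paper's specification, which the abstract does not describe in detail. The function name and all deal counts are hypothetical.

```python
from statistics import fmean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Canonical 2x2 DID: (treated change) minus (control change) in group means."""
    treated_change = fmean(treated_post) - fmean(treated_pre)
    control_change = fmean(control_post) - fmean(control_pre)
    return treated_change - control_change


# Made-up takeover counts per region, before and after a policy shock.
treated_pre = [10.0, 12.0]
treated_post = [15.0, 17.0]
control_pre = [9.0, 11.0]
control_post = [10.0, 12.0]

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(effect)  # treated rose by 5, control rose by 1
```

Subtracting the control group's change removes the trend common to both groups, so the remaining difference is attributed to the treatment under the parallel-trends assumption.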
We examine whether equity carve-outs (ECOs) lead to improvements in the functioning of the internal capital markets (ICM) of diversified firms. Divestitures, including spin-offs, sell-offs, and equity carve-outs, can be employed by firms to improve allocative efficiency. Equity carve-outs, unlike other forms of divestiture, leave the parent's ICM largely intact but provide the opportunity to enhance internal and external corporate governance mechanisms that can improve the parent's ICM. Using a US sample of 354 equity carve-outs completed between 1980 and 2013, we find that the allocative efficiency of parents is augmented significantly following transaction completion. This increase in allocative efficiency is driven by improvements in both the external and internal governance characteristics of parent companies, consistent with the expectation that motivates equity carve-outs.