Although models have been developed for predicting severity of COVID-19 from the medical history of patients, simplified models with good accuracy could be more practical. In this study, we examined the utility of simpler models for estimating the risk of hospitalization of patients with COVID-19, and the mortality of these patients, based on demographic characteristics (sex, age, race, median household income based on zip code) and smoking status of 12,347 patients who tested positive at Mass General Brigham centers. The corresponding electronic records were queried (02/26/2020 to 07/14/2020) to construct derivation and validation cohorts. The derivation cohort was used to fit generalized linear models for estimating the risk of hospitalization within 30 days of COVID-19 diagnosis and of mortality within approximately 3 months for the hospitalized patients. In the validation cohort, the models yielded c-statistics of 0.77 (95% CI 0.73-0.80) for hospitalization and 0.84 (95% CI 0.74-0.94) for mortality among hospitalized patients. Higher risk was associated with older age, male sex, Black ethnicity, lower socioeconomic status, and current/past smoking status. The models can be applied to predict the absolute risks of hospitalization and mortality, and could aid in individualizing decision making when the detailed medical history of patients is not readily available.
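The c-statistic reported above is the probability that a randomly chosen patient who had the event received a higher predicted risk than a randomly chosen patient who did not. The following is a minimal sketch of that calculation, not the authors' code: the function name `c_statistic` and the toy risks and outcomes are assumptions made for illustration only.

```python
from itertools import product


def c_statistic(risks, outcomes):
    """Concordance (c-statistic) for binary outcomes.

    risks:    predicted probabilities, one per patient
    outcomes: 1 if the event occurred, 0 otherwise
    Tied risks within an (event, non-event) pair count as half concordant.
    """
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for event_risk, non_event_risk in product(events, non_events):
        if event_risk > non_event_risk:
            concordant += 1.0
        elif event_risk == non_event_risk:
            concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Toy data (hypothetical, not from the study).
toy_risks = [0.9, 0.8, 0.4, 0.3, 0.2]
toy_outcomes = [1, 0, 1, 0, 0]
print(round(c_statistic(toy_risks, toy_outcomes), 3))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of events from non-events. The pairwise loop is quadratic in cohort size, so a rank-based formulation would be preferable for large cohorts.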
Randomized controlled trials (RCTs) with placebo are the gold standard for determining the efficacy of novel pharmaceutical treatments. Since their inception over 75 years ago, researchers have amassed a large body of underutilized data on outcomes in the placebo control arms of these trials. Although rare disease indications have used these historical placebo data as synthetic controls to reduce the burden on patients and accelerate drug discovery, broad use of historical controls is in its infancy. Large‐scale historical placebo data could be leveraged to benefit both drug developers and patients if warehoused and made more available to guide trial design and analysis. Here, we examine challenges in utilizing historical controls related to heterogeneity in trial design, outcome ascertainment, patient characteristics, and unmeasured pharmacogenomic effects. We then discuss the advantages and disadvantages of current approaches and propose a path forward to broader use of historical controls in RCTs.
For any given level of overall adiposity, individuals vary considerably in fat distribution. The inherited basis of fat distribution in the general population is not fully understood. Here, we study up to 38,965 UK Biobank participants with MRI-derived visceral (VAT), abdominal subcutaneous (ASAT), and gluteofemoral (GFAT) adipose tissue volumes. Because these fat depot volumes are highly correlated with BMI, we additionally study six local adiposity traits: VAT adjusted for BMI and height (VATadj), ASATadj, GFATadj, VAT/ASAT, VAT/GFAT, and ASAT/GFAT. We identify 250 independent common variants (39 newly-identified) associated with at least one trait, with many associations more pronounced in female participants. Rare variant association studies extend prior evidence for PDE3B as an important modulator of fat distribution. Local adiposity traits (1) highlight depot-specific genetic architecture and (2) enable construction of depot-specific polygenic scores that have divergent associations with type 2 diabetes and coronary artery disease. These results, using MRI-derived, BMI-independent measures of local adiposity, confirm fat distribution as a highly heritable trait with important implications for cardiometabolic health outcomes.
Various methods for understanding the structural and dynamic properties of proteins rely on the analysis of their NMR chemical shifts. These methods require the initial assignment of NMR signals to particular atoms in the sequence of the protein, a step that can be very time-consuming. The probabilistic interaction network of evidence (PINE) algorithm for automated assignment of backbone and side chain chemical shifts utilizes a Bayesian probabilistic network model that analyzes sequence data and peak lists from multiple NMR experiments. PINE, which is one of the most popular and reliable automated chemical shift assignment algorithms, has been available to the protein NMR community for longer than a decade. We announce here a new web server version of PINE, called Integrative PINE (I-PINE), which supports more types of NMR experiments than PINE (including three-dimensional nuclear Overhauser enhancement and four-dimensional J-coupling experiments) along with more comprehensive visualization of chemical shift based analysis of protein structure and dynamics. The I-PINE server is freely accessible at http://i-pine.nmrfam.wisc.edu. Help pages and tutorial including browser capability are available at: http://i-pine.nmrfam.wisc.edu/instruction.html. Sample data that can be used for testing the web server are available at: http://i-pine.nmrfam.wisc.edu/examples.html.
Recent technological advances may lead to the development of small-scale quantum computers that are capable of solving problems that cannot be tackled with classical computers. A limited number of algorithms have been proposed and their relevance to real-world problems is a subject of active investigation. Analysis of many-body quantum systems is particularly challenging for classical computers due to the exponential scaling of the Hilbert space dimension with the number of particles. Hence, solving the problems relevant to chemistry and condensed-matter physics is expected to be the first successful application of quantum computers. In this Article, we propose another class of problems from the quantum realm that can be solved efficiently on quantum computers: model inference for nuclear magnetic resonance (NMR) spectroscopy, which is important for biological and medical research. Our results are based on three interconnected studies. First, we use methods from classical machine learning to analyse a dataset of NMR spectra of small molecules. We perform stochastic neighbourhood embedding and identify clusters of spectra, and demonstrate that these clusters are correlated with the covalent structure of the molecules. Second, we propose a simple and efficient method, aided by a quantum simulator, to extract the NMR spectrum of any hypothetical molecule described by a parametric Heisenberg model. Third, we propose a simple variational Bayesian inference procedure for estimating the Hamiltonian parameters of experimentally relevant NMR spectra.
Currently available quantum hardware is limited by noise, so practical implementations often involve a combination with classical approaches. Sels et al. identify a promising application for such a quantum–classic hybrid approach, namely inferring molecular structure from NMR spectra, by employing a range of machine learning tools in combination with a quantum simulator.
The growth of the biological nuclear magnetic resonance (NMR) field and the development of new experimental technology have mandated the revision and enlargement of the NMR-STAR ontology used to represent experiments, spectral and derived data, and supporting metadata. We present here a brief description of the NMR-STAR ontology and software tools for manipulating NMR-STAR data files, editing the files, extracting selected data, and creating data visualizations. Detailed information on these is accessible from the links provided.
We conduct a large-scale meta-analysis of heart failure genome-wide association studies (GWAS) consisting of over 90,000 heart failure cases and more than 1 million control individuals of European ancestry to uncover novel genetic determinants for heart failure. Using the GWAS results and blood protein quantitative loci, we perform Mendelian randomization and colocalization analyses on human proteins to provide putative causal evidence for the role of druggable proteins in the genesis of heart failure. We identify 39 genome-wide significant heart failure risk variants, of which 18 are previously unreported. Using a combination of Mendelian randomization proteomics and genetic cis-only colocalization analyses, we identify 10 additional putatively causal genes for heart failure. Findings from GWAS and Mendelian randomization-proteomics identify seven proteins (CAMK2D, PRKD1, PRKD3, MAPK3, TNFSF12, APOC3 and NAE1) as potential targets for interventions to be used in primary prevention of heart failure.
Even though NMR has found countless applications in the field of small molecule characterization, there is no standard file format available for the NMR data relevant to structure characterization of small molecules. A new format is therefore introduced to associate the NMR parameters extracted from 1D and 2D spectra of organic compounds to the proposed chemical structure. These NMR parameters, which we shall call NMReDATA (for nuclear magnetic resonance extracted data), include chemical shift values, signal integrals, intensities, multiplicities, scalar coupling constants, lists of 2D correlations, relaxation times, and diffusion rates. The file format is an extension of the existing Structure Data Format, which is compatible with the commonly used MOL format. The association of an NMReDATA file with the raw and spectral data from which it originates constitutes an NMR record. This format is easily readable by humans and computers and provides a simple and efficient way for disseminating results of structural chemistry investigations, allowing automatic verification of published results, and for assisting the constitution of highly needed open‐source structural databases.
A format for the data extracted from a set of NMR spectra (chemical shifts, coupling constants, 2D correlations, etc.) will make it easier to report, compare, verify, validate, share, and archive NMR data relevant to structure determination.
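Because the format extends the Structure Data Format (SDF), extracted NMR parameters can travel as tagged data items after the molecule block of a record. The sketch below reads the generic SDF data-item layout (a `> <TAG>` header line, value lines, a blank line, and `$$$$` closing the record). It is an illustration under stated assumptions, not an NMReDATA parser: the function `parse_sdf_data_items`, the tag names `EXAMPLE_SOLVENT` and `EXAMPLE_SHIFTS`, and all values in the sample record are hypothetical placeholders rather than tags or data defined by the format.

```python
def parse_sdf_data_items(record_text):
    """Return {tag: value} for the data items of a single SDF record.

    Lines before the first "> <TAG>" header (the molecule block) are skipped.
    """
    items = {}
    tag = None
    value_lines = []
    for line in record_text.splitlines():
        if line.startswith("$$$$"):  # end-of-record delimiter
            break
        if line.startswith(">") and "<" in line and ">" in line[1:]:
            # Data-item header such as "> <TAG>"; store any unfinished item.
            if tag is not None:
                items[tag] = "\n".join(value_lines)
            tag = line[line.index("<") + 1:line.rindex(">")]
            value_lines = []
        elif tag is not None:
            if line.strip() == "":  # a blank line terminates the item
                items[tag] = "\n".join(value_lines)
                tag, value_lines = None, []
            else:
                value_lines.append(line)
    if tag is not None:  # item not followed by a blank line
        items[tag] = "\n".join(value_lines)
    return items


# Hypothetical record: stub molecule block plus made-up tags and values.
sample_record = """placeholder molecule
  hypothetical

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <EXAMPLE_SOLVENT>
CDCl3

> <EXAMPLE_SHIFTS>
a, 1.25
b, 3.71

$$$$
"""

for name, value in parse_sdf_data_items(sample_record).items():
    print(name, "->", value.splitlines())
```

Keeping the parameters as plain tagged text is what makes such a file readable by both humans and programs; a real reader would additionally need the tag vocabulary and value syntax from the format's specification.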
The chemical composition of saccharide complexes underlies their biomedical activities as biomarkers for cardiometabolic disease, various types of cancer, and other conditions. However, because these molecules may undergo major structural modifications, distinguishing between compounds of saccharide and non-saccharide origin becomes a challenging computational problem that hinders the aggregation of information about their bioactive moieties. We have developed an algorithm and software package called "Cheminformatics Tool for Probabilistic Identification of Carbohydrates" (CTPIC) that analyzes the covalent structure of a compound to yield a probabilistic measure for distinguishing saccharides and saccharide-derivatives from non-saccharides. CTPIC analysis of the RCSB Ligand Expo (a database of small molecules found to bind proteins in the Protein Data Bank) led to a substantial increase in the number of ligands characterized as saccharides. CTPIC analysis of the Protein Data Bank identified 7.7% of the proteins as saccharide-binding. CTPIC is freely available as a web service at http://ctpic.nmrfam.wisc.edu.
Identification of discrepant data in aggregated databases is a key step in data curation and remediation. We have applied the ALATIS approach, which is based on the International Chemical Identifier (InChI) model, to the full PubChem Compound database to generate unique and reproducible compound and atom identifiers for all entries for which three-dimensional structures were available. This exercise also served to identify entries with discrepancies between structures and chemical formulas or InChI strings. The use of unique compound identifiers and atom nomenclature should support more rigorous links between small-molecule databases, including those containing atom-specific information of the type available from crystallography and spectroscopy. The comprehensive results from this analysis are publicly available through our web server at http://alatis.nmrfam.wisc.edu/.