Small molecules are usually compared by their chemical structure, but there is no unified analytic framework for representing and comparing their biological activity. We present the Chemical Checker (CC), which provides processed, harmonized and integrated bioactivity data on ~800,000 small molecules. The CC divides data into five levels of increasing complexity, from the chemical properties of compounds to their clinical outcomes. In between, it includes targets, off-targets, networks and cell-level information, such as omics data, growth inhibition and morphology. Bioactivity data are expressed in a vector format, extending the concept of chemical similarity to similarity between bioactivity signatures. We show how CC signatures can aid drug discovery tasks, including target identification and library characterization. We also demonstrate the discovery of compounds that reverse and mimic biological signatures of disease models and genetic perturbations in cases that could not be addressed using chemical information alone. Overall, the CC signatures facilitate the conversion of bioactivity data to a format that is readily amenable to machine learning methods.
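To make the idea of similarity between bioactivity signatures concrete, the following minimal sketch ranks a small library of vectors against a query vector. It assumes cosine similarity as the metric and uses made-up four-dimensional signatures; the abstract does not specify the metric, and the function names (`cosine_similarity`, `rank_by_similarity`) and compound labels are hypothetical, not part of the CC.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("signatures must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, library):
    """Return (name, similarity) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query, sig)) for name, sig in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy signatures with invented values (assumption: not real CC data).
library = {
    "compound_a": [0.9, 0.1, -0.3, 0.4],
    "compound_b": [-0.8, 0.0, 0.5, -0.2],
    "compound_c": [0.7, 0.2, -0.1, 0.5],
}
query = [1.0, 0.0, -0.2, 0.5]

ranking = rank_by_similarity(query, library)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Under this assumed metric, compounds at the top of the ranking would be candidates that mimic the query signature, while strongly negative scores would point to candidates that reverse it.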
Chemical descriptors encode the physicochemical and structural properties of small molecules, and they are at the core of chemoinformatics. The broad release of bioactivity data has prompted enriched representations of compounds, reaching beyond chemical structures and capturing their known biological properties. Unfortunately, bioactivity descriptors are not available for most small molecules, which limits their applicability to a few thousand well-characterized compounds. Here we present a collection of deep neural networks able to infer bioactivity signatures for any compound of interest, even when little or no experimental information is available for it. Our signaturizers relate to bioactivities of 25 different types (including target profiles, cellular response and clinical outcomes) and can be used as drop-in replacements for chemical descriptors in day-to-day chemoinformatics tasks. Indeed, we illustrate how inferred bioactivity signatures are useful to navigate the chemical space in a biologically relevant manner, unveiling higher-order organization in natural product collections, and to enrich mostly uncharacterized chemical libraries for activity against the drug-orphan target Snail1. Moreover, we implement a battery of signature-activity relationship (SigAR) models and show a substantial improvement in performance, with respect to chemistry-based classifiers, across a series of biophysics and physiology activity prediction benchmarks.
In the era of systems biology, multi-target pharmacological strategies hold promise for tackling disease-related networks. In this regard, drug promiscuity may be leveraged to interfere with multiple receptors: the so-called polypharmacology of drugs can be anticipated by analyzing the similarity of binding sites across the proteome. Here, we perform a pairwise comparison of 90,000 putative binding pockets detected in 3,700 proteins, and find that 23,000 pairs of proteins have at least one similar cavity that could, in principle, accommodate similar ligands. By inspecting these pairs, we demonstrate how the detection of similar binding sites expands the space of opportunities for the rational design of drug polypharmacology. Finally, we illustrate how to leverage these opportunities in protein-protein interaction networks related to several therapeutic classes and tumor types, and in a genome-scale metabolic model of leukemia.
A reverse pH gradient is a hallmark of cancer metabolism, manifested by extracellular acidosis and intracellular alkalization. While consequences of extracellular acidosis are known, the roles of intracellular alkalization are incompletely understood. By reconstructing and integrating enzymatic pH-dependent activity profiles into cell-specific genome-scale metabolic models, we develop a computational methodology that explores how intracellular pH (pHi) can modulate metabolism. We show that in silico, alkaline pHi maximizes cancer cell proliferation coupled to increased glycolysis and adaptation to hypoxia (i.e., the Warburg effect), whereas acidic pHi disables these adaptations and compromises tumor cell growth. We then systematically identify metabolic targets (GAPDH and GPI) with predicted amplified anti-cancer effects at acidic pHi, forming a novel therapeutic strategy. Experimental testing of this strategy in breast cancer cells reveals that it is particularly effective against aggressive phenotypes. Hence, this study suggests essential roles of pHi in cancer metabolism and provides a conceptual and computational framework for exploring pHi roles in other biomedical domains.
Biological data is accumulating at an unprecedented rate, escalating the role of data‐driven methods in computational drug discovery. This scenario is favored by recent advances in machine learning algorithms, which are optimized for huge datasets and consistently beat the predictive performance of previous art, rapidly approaching human expert reasoning. The urge to couple biological data to cutting‐edge machine learning has spurred developments in data integration and knowledge representation, especially in the form of heterogeneous, multiplex and semantically‐rich biological networks. Today, thanks to the propitious rise in knowledge embedding techniques, these large and complex biological networks can be converted to a vector format that suits the majority of machine learning implementations. Here, we explain why this can be particularly transformative for drug discovery where, for decades, customary chemoinformatics methods have employed vector descriptors of compound structures as the standard input of their prediction tasks. A common vector format to represent biology and chemistry may push biological information into most of the existing steps of the drug discovery pipeline, boosting the accuracy of predictions and uncovering connections between small molecules and other biological entities such as targets or diseases.
This article is categorized under:
Computer and Information Science > Databases and Expert Systems
Computer and Information Science > Chemoinformatics
Embedding of large and heterogeneous biological networks.
Introduction
In the current “test and treat” era, HIV programmes are increasingly focusing resources on linkage to care and same‐day antiretroviral therapy (ART) initiation to meet UNAIDS 95‐95‐95 targets. After observing sub‐optimal treatment indicators in health facilities supported by the Centre for Infectious Disease Research in Zambia (CIDRZ), we piloted a “linkage assessment” tool in facility‐based HIV testing settings to uncover barriers to same‐day linkage to care and ART initiation among newly identified people living with HIV (PLHIV) and to guide HIV programme quality improvement efforts.
Methods
The one‐page, structured linkage assessment tool was developed to capture patient‐reported barriers to same‐day linkage and ART initiation using three empirically supported categories of barriers: social, personal and structural. The tool was implemented in three health facilities, two urban and one rural, in Lusaka, Zambia from 1 November 2017 to 31 January 2018, and administered to all newly identified PLHIV declining same‐day linkage and ART. Individuals selected as many reasons as relevant. We used mixed‐effects logistic regression modelling to evaluate predictors of citing specific barriers to same‐day linkage and ART, and Fisher’s Exact tests to assess differences in barrier citation by socio‐demographics and HIV testing entry point.
Results
A total of 1278 people tested HIV positive, of whom 126 (9.9%) declined same‐day linkage and ART, reporting a median of three barriers per respondent. Of these 126, 71.4% were female. Females declining same‐day ART were younger, on average (median 28.5 years, interquartile range (IQR): 21 to 37 years), than males (median 34.5 years, IQR: 26 to 44 years). The most commonly reported barrier category was structural, “clinics were too crowded” (n = 33), followed by a social reason, “friends and family will condemn me” (n = 30). The frequency of citing personal barriers differed significantly across HIV testing point (χ² p = 0.03). Significant predictors for citing ≥1 barrier to same‐day ART were >50 years of age (OR: 12.59, 95% CI: 6.00 to 26.41) and testing at a rural facility (OR: 9.92, 95% CI: 4.98 to 19.79).
Conclusions
Given differences observed in barriers to same‐day ART initiation reported across sex, age, testing point, and facility type, new, tailored counselling and linkage to care approaches are needed, which should be rigorously evaluated in routine programme settings.
Efforts to compile the phenotypic effects of drugs and environmental chemicals offer the opportunity to adopt a chemo-centric view of human health that does not require detailed mechanistic information. Here we consider thousands of chemicals and analyse the relationship of their structures with adverse and therapeutic responses. Our study includes molecules related to the aetiology of 934 health-threatening conditions and used to treat 835 diseases. We first identify chemical moieties that could be independently associated with each phenotypic effect. Using these fragments, we build accurate predictors for approximately 400 clinical phenotypes, finding many privileged and liable structures. Finally, we connect two diseases if they relate to similar chemical structures. The resulting networks of human conditions are able to predict disease comorbidities, as well as identifying potential drug side effects and opportunities for drug repositioning, and show a remarkable coincidence with clinical observations.
We present here a new approach for the systematic identification of functionally relevant conformations in proteins. Our fully automated pipeline, based on discrete molecular dynamics enriched with coevolutionary information, is able to capture alternative conformational states in 76% of the proteins studied, providing key atomic details for understanding their function and mechanism of action. We also demonstrate that, given its sampling speed, our method is well suited to explore structural transitions in a high-throughput manner, and can be used to determine functional conformational transitions at the entire proteome level.
• Automated prediction of alternative conformations in proteins
• Systematically explores the correlation between coevolution and dynamics
• Emphasizes the need for improving coevolution contact detection methods
Protein flexibility is as important as structure to determine biological function. Sfriso et al. present a new approach, based on discrete molecular dynamics simulations guided by coevolutionary information, for the systematic identification of functional conformations in proteins. The strategy is able to capture alternative conformational states of varying complexity.
Abstract
Biomedical data is accumulating at a fast pace, and integrating it into a unified framework, so that multiple views of a given biological event can be considered simultaneously, is a major challenge. Here we present the Bioteque, a resource of unprecedented size and scope that contains pre-calculated biomedical descriptors derived from a gigantic knowledge graph, displaying more than 450 thousand biological entities and 30 million relationships between them. The Bioteque integrates, harmonizes, and formats data collected from over 150 data sources, including 12 biological entities (e.g., genes, diseases, drugs) linked by 67 types of associations (e.g., ‘drug treats disease’, ‘gene interacts with gene’). We show how Bioteque descriptors facilitate the assessment of high-throughput protein-protein interactome data, the prediction of drug response and new repurposing opportunities, and demonstrate that they can be used off-the-shelf in downstream machine learning tasks without loss of performance with respect to using original data. The Bioteque thus offers a thoroughly processed, tractable, and highly optimized assembly of the biomedical knowledge available in the public domain.
While alternative splicing is known to diversify the functional characteristics of some genes, the extent to which protein isoforms globally contribute to functional complexity on a proteomic scale remains unknown. To address this systematically, we cloned full-length open reading frames of alternatively spliced transcripts for a large number of human genes and used protein-protein interaction profiling to functionally compare hundreds of protein isoform pairs. The majority of isoform pairs share less than 50% of their interactions. In the global context of interactome network maps, alternative isoforms tend to behave like distinct proteins rather than minor variants of each other. Interaction partners specific to alternative isoforms tend to be expressed in a highly tissue-specific manner and belong to distinct functional modules. Our strategy, applicable to other functional characteristics, reveals a widespread expansion of protein interaction capabilities through alternative splicing and suggests that many alternative “isoforms” are functionally divergent (i.e., “functional alloforms”).
• Alternative splicing can produce isoforms with vastly different interaction profiles
• These differences can be as great as those between proteins encoded by different genes
• Isoform-specific partners exhibit distinct expression and functional characteristics
Alternatively spliced isoforms of proteins exhibit strikingly different interaction profiles and thus, in the context of global interactome networks, appear to behave as if encoded by distinct genes rather than as minor variants of each other.
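The statement that most isoform pairs share less than 50% of their interactions can be illustrated with a small calculation. The sketch below assumes that the shared fraction is measured as the Jaccard index (shared partners divided by all partners of either isoform); the abstract does not state the exact definition, and the function name and partner labels are hypothetical.

```python
def shared_interaction_fraction(partners_a, partners_b):
    """Jaccard index of two interaction-partner sets (assumed definition)."""
    a, b = set(partners_a), set(partners_b)
    union = a | b
    if not union:
        raise ValueError("at least one isoform must have an interaction partner")
    return len(a & b) / len(union)


# Invented partner lists for two isoforms of one gene (assumption: toy data).
isoform_1 = {"P1", "P2", "P3", "P4"}
isoform_2 = {"P3", "P4", "P5", "P6", "P7"}

fraction = shared_interaction_fraction(isoform_1, isoform_2)
specific_to_1 = isoform_1 - isoform_2  # partners seen only with isoform 1
specific_to_2 = isoform_2 - isoform_1  # partners seen only with isoform 2

print(f"shared fraction: {fraction:.2f}")
print(f"isoform-specific partners: {sorted(specific_to_1)} / {sorted(specific_to_2)}")
```

In this toy case two of seven partners are shared, so the pair would fall in the "less than 50%" group; the isoform-specific sets are the ones whose expression and functional modules would then be compared.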