The applications of modern artificial intelligence (AI) algorithms within the field of aging research offer tremendous opportunities. Aging is an almost universal unifying feature possessed by all living organisms, tissues, and cells. Modern deep learning techniques used to develop age predictors offer new possibilities for formerly incompatible dynamic and static data types. AI biomarkers of aging enable a holistic view of biological processes and allow for novel methods for building causal models by extracting the most important features and identifying biological targets and mechanisms. Recent developments in generative adversarial networks (GANs) and reinforcement learning (RL) permit the generation of diverse synthetic molecular and patient data, identification of novel biological targets, and generation of novel molecular compounds with desired properties and geroprotectors. These novel techniques can be combined into a unified, seamless end-to-end biomarker development, target identification, drug discovery and real-world evidence pipeline that may help accelerate and improve pharmaceutical research and development practices. Modern AI is therefore expected to contribute to the credibility and prominence of longevity biotechnology in the healthcare and pharmaceutical industry, and to the convergence of countless areas of research.
Telomere length, gene expression, blood chemical parameters and DNA-methylation status all undergo age-associated changes, which can be measured and used to predict chronological age with a varying error rate. This review focuses on technical approaches used to construct aging clocks and compares their efficiency.
• Biohorology is the science of measuring the passage of time in living systems.
• Today there are dozens of aging clocks based on such biomarkers of aging as DNAm, gene expression and metabolomic profiles.
• DNAm clocks are the most popular so far, but they have a number of frequently overlooked technical drawbacks.
• Deep learning methods can be used to develop aging clocks using data types previously deemed too complicated.
• Deep learning methods can also be used to extend aging clocks’ functionality beyond age prediction.
The aging process results in multiple traceable footprints, which can be quantified and used to estimate an organism's age. Examples of such aging biomarkers include epigenetic changes, telomere attrition, and alterations in gene expression and metabolite concentrations. More than a dozen aging clocks use molecular features to predict an organism's age, each of them utilizing different data types and training procedures. Here, we offer a detailed comparison of existing mouse and human aging clocks, discuss their technological limitations and the underlying machine learning algorithms. We also discuss promising future directions of research in biohorology — the science of measuring the passage of time in living systems. Overall, we expect deep learning, deep neural networks and generative approaches to be the next power tools in this timely and actively developing field.
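To make the idea of an aging clock concrete, the following is a minimal sketch of a regression model that maps molecular features to chronological age. It is an illustration only: the abstract does not prescribe an algorithm, ridge regression is used here as a simple stand-in for the penalized linear models common in this area, and the data, feature count, and function names (`fit_ridge_clock`, `predict_age`) are all hypothetical.

```python
import numpy as np


def fit_ridge_clock(features, ages, alpha=1.0):
    """Fit a ridge-regression 'clock'; returns (weights, intercept).

    Features and ages are centered so the intercept is not penalized.
    """
    features = np.asarray(features, dtype=float)
    ages = np.asarray(ages, dtype=float)
    feature_mean = features.mean(axis=0)
    age_mean = ages.mean()
    centered = features - feature_mean
    n_features = features.shape[1]
    # Closed-form ridge solution: (X^T X + alpha * I) w = X^T y
    weights = np.linalg.solve(
        centered.T @ centered + alpha * np.eye(n_features),
        centered.T @ (ages - age_mean),
    )
    intercept = age_mean - feature_mean @ weights
    return weights, intercept


def predict_age(features, weights, intercept):
    """Predict age from features with a fitted linear clock."""
    return np.asarray(features, dtype=float) @ weights + intercept


# Synthetic stand-in data (assumption): 200 samples, 20 molecular features.
rng = np.random.default_rng(0)
n_samples, n_features = 200, 20
true_weights = rng.normal(size=n_features)
X = rng.normal(size=(n_samples, n_features))
age = 50.0 + 2.0 * (X @ true_weights) + rng.normal(scale=3.0, size=n_samples)

# Hold out the last 50 samples to estimate the clock's error on unseen data.
X_train, X_test = X[:150], X[150:]
age_train, age_test = age[:150], age[150:]

weights, intercept = fit_ridge_clock(X_train, age_train, alpha=1.0)
test_mae = float(np.mean(np.abs(predict_age(X_test, weights, intercept) - age_test)))
print(f"held-out MAE: {test_mae:.2f} years")
```

Real clocks differ mainly in the data type used as features and in the training procedure, which is the comparison the review sets out to make.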
The application of artificial intelligence (AI) has been considered a revolutionary change in drug discovery and development. In 2020, the AlphaFold computer program predicted protein structures for the whole human genome, which has been considered a remarkable breakthrough in both AI applications and structural biology. Despite the varying confidence levels, these predicted structures could still significantly contribute to structure-based drug design of novel targets, especially the ones with no or limited structural information. In this work, we successfully applied AlphaFold to our end-to-end AI-powered drug discovery engines, including a biocomputational platform PandaOmics and a generative chemistry platform Chemistry42. A novel hit molecule against a novel target without an experimental structure was identified, starting from target selection towards hit identification, in a cost- and time-efficient manner. PandaOmics provided the protein of interest for the treatment of hepatocellular carcinoma (HCC) and Chemistry42 generated the molecules based on the structure predicted by AlphaFold, and the selected molecules were synthesized and tested in biological assays. Through this approach, we identified a small molecule hit compound for cyclin-dependent kinase 20 (CDK20) with a binding constant Kd value of 9.2 ± 0.5 μM (n = 3) within 30 days from target selection and after synthesizing only 7 compounds. Based on the available data, a second round of AI-powered compound generation was conducted and through this, a more potent hit molecule, ISM042-2-048, was discovered with an average Kd value of 566.7 ± 256.2 nM (n = 3). Compound ISM042-2-048 also showed good CDK20 inhibitory activity with an IC50 value of 33.4 ± 22.6 nM (n = 3). In addition, ISM042-2-048 demonstrated selective anti-proliferation activity in an HCC cell line with CDK20 overexpression, Huh7, with an IC50 of 208.7 ± 3.3 nM, compared to a counter screen cell line HEK293 (IC50 = 1706.7 ± 670.0 nM). This work is the first demonstration of applying AlphaFold to the hit identification process in drug discovery.
A novel CDK20 small molecule inhibitor discovered by artificial intelligence based on an AlphaFold-predicted structure demonstrates the first application of AlphaFold in hit identification for efficient drug discovery.
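The potency figures quoted above can be compared with simple ratios. The sketch below uses only the mean values from the abstract and ignores the reported uncertainties; the fold-change and selectivity ratios it prints are derived here for illustration and are not stated in the source. The helper name `fold_change` is hypothetical.

```python
# Mean values copied from the abstract, all expressed in nM.
KD_FIRST_HIT_NM = 9.2 * 1000.0   # first-round hit: 9.2 uM
KD_ISM042_2_048_NM = 566.7       # second-round hit ISM042-2-048
IC50_HUH7_NM = 208.7             # HCC cell line with CDK20 overexpression
IC50_HEK293_NM = 1706.7          # counter screen cell line


def fold_change(reference_nm, compared_nm):
    """Ratio of two concentrations; values above 1 mean the second is lower."""
    if reference_nm <= 0 or compared_nm <= 0:
        raise ValueError("concentrations must be positive")
    return reference_nm / compared_nm


# How much tighter the second-round hit binds than the first-round hit.
kd_improvement = fold_change(KD_FIRST_HIT_NM, KD_ISM042_2_048_NM)

# Ratio of counter-screen IC50 to target-cell-line IC50 (a crude selectivity window).
selectivity_window = fold_change(IC50_HEK293_NM, IC50_HUH7_NM)

print(f"Kd improvement between rounds: {kd_improvement:.1f}-fold")
print(f"HEK293 / Huh7 IC50 ratio: {selectivity_window:.1f}-fold")
```

Because the HEK293 IC50 carries a large standard deviation (± 670.0 nM), the selectivity ratio computed from means should be read as a rough indication only.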
The human gut microbiome is a complex ecosystem that both affects and is affected by its host status. Previous metagenomic analyses of gut microflora revealed associations between specific microbes and host age. Nonetheless, there was no reliable way to tell a host's age based on the gut community composition. Here we developed a method of predicting hosts' age based on microflora taxonomic profiles using a cross-study dataset and deep learning. Our best model has an architecture of a deep neural network that achieves a mean absolute error of 5.91 years when tested on external data. We further advance a procedure for inferring the role of particular microbes during human aging and defining them as potential aging biomarkers. The described intestinal clock represents a unique quantitative model of gut microflora aging and provides a starting point for building host aging and gut community succession into a single narrative.
• DNNs are the most appropriate model to predict host age from gut microflora profiles
• Our DNN models reach MAE of 5.9 years in independent verification
• Feature importance analysis gives a starting point for anti-aging intervention design
Microbiology; Microbiome; Bioinformatics; Applied Computing in Medical Science; Artificial Intelligence; Deep Learning; Aging; Biogerontology; Aging Clock
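The mean absolute error (MAE) reported for the gut microbiome clock is the average absolute gap between predicted and chronological age. A minimal sketch of the calculation follows; the ages below are invented for illustration and are not data from the study.

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average of |true - predicted| over all hosts, in the same units as the ages."""
    if len(true_ages) != len(predicted_ages):
        raise ValueError("inputs must have the same length")
    if not true_ages:
        raise ValueError("inputs must not be empty")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


# Hypothetical chronological ages and model predictions (years).
true_ages = [25.0, 40.0, 63.0, 71.0]
predicted_ages = [31.0, 36.5, 58.0, 80.0]

mae = mean_absolute_error(true_ages, predicted_ages)
print(f"MAE: {mae:.3f} years")
```

An MAE of about 5.9 years therefore means that, on average, a prediction on external data misses the host's chronological age by just under six years in either direction.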
Multiple cancer types have limited targeted therapeutic options, in part due to incomplete understanding of the molecular processes underlying tumorigenesis and significant intra- and inter-tumor heterogeneity. Identification of novel molecular biomarkers stratifying cancer patients with different survival outcomes may provide new opportunities for target discovery and subsequent development of tailored therapies. Here, we applied the artificial intelligence-driven PandaOmics platform ( https://pandaomics.com/ ) to explore gene expression changes in rare DNA repair-deficient disorders and identify novel cancer targets. Our analysis revealed that CEP135, a scaffolding protein associated with early centriole biogenesis, is commonly downregulated in DNA repair diseases with high cancer predisposition. Further screening of survival data in 33 cancers available in the TCGA database identified sarcoma as a cancer type where lower survival was significantly associated with high CEP135 expression. Stratification of cancer patients based on CEP135 expression enabled us to examine therapeutic targets that could be used for the improvement of existing therapies against sarcoma. The latter was based on application of the PandaOmics target-ID algorithm coupled with in vitro studies that revealed polo-like kinase 1 (PLK1) as a potential therapeutic candidate in sarcoma patients with high CEP135 levels and poor survival. While further target validation is required, this study demonstrated the potential of in silico-based studies for rapid biomarker discovery and target characterization.
Coronavirus disease 2019 (COVID-19) is an acute infection of the respiratory tract that emerged in December 2019 in Wuhan, China. It was quickly established that both the symptoms and the disease severity may vary from one case to another and several strains of SARS-CoV-2 have been identified. To gain a better understanding of the wide variety of SARS-CoV-2 strains and their associated symptoms, thousands of SARS-CoV-2 genomes have been sequenced in dozens of countries. In this article, we introduce COVIDomic, a multi-omics online platform designed to facilitate the analysis and interpretation of the large amount of health data collected from patients with COVID-19. The COVIDomic platform provides a comprehensive set of bioinformatic tools for the multi-modal metatranscriptomic data analysis of COVID-19 patients to determine the origin of the coronavirus strain and the expected severity of the disease. An integrative analytical workflow, which includes microbial pathogens community analysis, COVID-19 genetic epidemiology and patient stratification, makes it possible to analyze the presence of the most common microbial organisms, their antibiotic resistance, the severity of the infection and the set of the most probable geographical locations from which the studied strain could have originated. The online platform integrates a user-friendly interface which allows easy visualization of the results. We envision this tool will not only have immediate implications for management of the ongoing COVID-19 pandemic, but will also improve our readiness to respond to other infectious outbreaks.
Deep generative adversarial networks (GANs) are an emerging technology in drug discovery and biomarker development. In our recent work, we demonstrated a proof-of-concept of implementing a deep generative adversarial autoencoder (AAE) to identify new molecular fingerprints with predefined anticancer properties. Another popular generative model is the variational autoencoder (VAE), which is based on deep neural architectures. In this work, we developed an advanced AAE model for molecular feature extraction problems, and demonstrated its advantages compared to VAE in terms of (a) adjustability in generating molecular fingerprints; (b) capacity of processing very large molecular data sets; and (c) efficiency in unsupervised pretraining for regression models. Our results suggest that the proposed AAE model significantly enhances the capacity and efficiency of development of new molecules with specific anticancer properties using deep generative models.
The significance of automated drug design using virtual generative models has steadily grown in recent years. While deep learning-driven solutions have received growing attention, only a few modern AI-assisted generative chemistry platforms have demonstrated the ability to produce valuable structures. At the same time, virtual fragment-based drug design, which was previously less popular due to the high computational costs, has become more attractive with the development of new chemoinformatic techniques and powerful computing technologies.
We developed Quantum-assisted Fragment-based Automated Structure Generator (QFASG), a fully automated algorithm designed to construct ligands for a target protein using a library of molecular fragments. QFASG was applied to generating new structures of CAMKK2 and ATM inhibitors.
New low-micromolar inhibitors of CAMKK2 and ATM were designed using the algorithm.
These findings highlight the algorithm's potential in designing primary hits for further optimization and showcase the capabilities of QFASG as an effective tool in this field.
In this article, we propose the deep neural network Adversarial Threshold Neural Computer (ATNC). The ATNC model is intended for the de novo design of novel small-molecule organic structures. The model is based on generative adversarial network architecture and reinforcement learning. ATNC uses a Differentiable Neural Computer as a generator and has a new specific block, called adversarial threshold (AT). AT acts as a filter between the agent (generator) and the environment (discriminator + objective reward functions). Furthermore, to generate more diverse molecules we introduce a new objective reward function named Internal Diversity Clustering (IDC). In this work, ATNC is tested and compared with the ORGANIC model. Both models were trained on the SMILES string representation of the molecules, using four objective functions (internal similarity, Muegge druglikeness filter, presence or absence of sp3-rich fragments, and IDC). The SMILES representations of 15K druglike molecules from the ChemDiv collection were used as a training data set. For the different functions, ATNC outperforms ORGANIC. Combined with the IDC, ATNC generates 72% of valid and 77% of unique SMILES strings, while ORGANIC generates only 7% of valid and 86% of unique SMILES strings. For each set of molecules generated by ATNC and ORGANIC, we analyzed distributions of four molecular descriptors (number of atoms, molecular weight, logP, and TPSA) and calculated five chemical statistical features (internal diversity, number of unique heterocycles, number of clusters, number of singletons, and number of compounds that have not been passed through medicinal chemistry filters). Analysis of key molecular descriptors and chemical statistical features demonstrated that the molecules generated by ATNC exhibited better druglikeness properties. We also performed in vitro validation of the molecules generated by ATNC; results indicated that ATNC is an effective method for producing hit compounds.
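The valid and unique percentages quoted for ATNC and ORGANIC can be illustrated with a small sketch. Several assumptions are made here, since the abstract does not define the denominators: validity is taken as the fraction of all generated strings that parse, and uniqueness as the fraction of distinct strings among the valid ones. The checker `toy_is_valid` is a deliberately crude stand-in that only tests bracket balance; a real pipeline would use a cheminformatics parser (for example, RDKit's `Chem.MolFromSmiles`, which returns `None` for strings it cannot parse) and would compare canonical forms rather than raw strings.

```python
def toy_is_valid(smiles):
    """Crude placeholder check: non-empty with balanced () and [] only.

    This is NOT a SMILES parser; it stands in for one in this sketch.
    """
    closing_to_opening = {")": "(", "]": "["}
    stack = []
    for char in smiles:
        if char in "([":
            stack.append(char)
        elif char in closing_to_opening:
            if not stack or stack.pop() != closing_to_opening[char]:
                return False
    return bool(smiles) and not stack


def validity_and_uniqueness(generated, is_valid):
    """Return (valid fraction of all strings, unique fraction of valid strings)."""
    generated = list(generated)
    if not generated:
        raise ValueError("no generated strings to evaluate")
    valid = [s for s in generated if is_valid(s)]
    validity = len(valid) / len(generated)
    uniqueness = len(set(valid)) / len(valid) if valid else 0.0
    return validity, uniqueness


# Hypothetical generator output: some duplicates and two malformed strings.
samples = ["CCO", "CCO", "c1ccccc1", "CC(=O)O", "CC(C", "C[N+", "CCN", "CCO"]

validity, uniqueness = validity_and_uniqueness(samples, toy_is_valid)
print(f"valid: {validity:.0%}, unique among valid: {uniqueness:.0%}")
```

The two numbers trade off against each other: a generator can score high on uniqueness while producing few valid strings, which matches the pattern the abstract reports for ORGANIC (7% valid, 86% unique).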
Drug discovery and development is a notoriously risky process with high failure rates at every stage, including disease modeling, target discovery, hit discovery, lead optimization, preclinical development, human safety, and efficacy studies. Accurate prediction of clinical trial outcomes may help significantly improve the efficiency of this process by prioritizing therapeutic programs that are more likely to succeed in clinical trials and ultimately benefit patients. Here, we describe inClinico, a transformer‐based artificial intelligence software platform designed to predict the outcome of phase II clinical trials. The platform combines an ensemble of clinical trial outcome prediction engines that leverage generative artificial intelligence and multimodal data, including omics, text, clinical trial design, and small molecule properties. inClinico was validated in retrospective, quasi‐prospective, and prospective validation studies internally and with pharmaceutical companies and financial institutions. The platform achieved 0.88 receiver operating characteristic area under the curve in predicting the phase II to phase III transition on a quasi‐prospective validation dataset. The first prospective predictions were made and placed on date‐stamped preprint servers in 2016. To validate our model in a real‐world setting, we published forecasted outcomes for several phase II clinical trials, achieving 79% accuracy for the trials that have read out. We also present an investment application of inClinico using a date‐stamped virtual trading portfolio demonstrating a 35% 9‐month return on investment.
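The two evaluation figures quoted above, area under the ROC curve and accuracy, measure different things: the first scores how well predicted probabilities rank successful trials above failed ones, the second counts correct calls at a fixed threshold. The sketch below computes both on invented data. The labels, scores, and the 0.5 threshold are assumptions for illustration and do not come from inClinico.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of trials whose thresholded prediction matches the outcome."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical outcomes (1 = trial advanced, 0 = trial failed) and model scores.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.3, 0.6, 0.4, 0.5, 0.2, 0.8, 0.1]

auc = roc_auc(labels, scores)
acc = accuracy(labels, scores)
print(f"ROC AUC: {auc:.4f}, accuracy at 0.5: {acc:.2f}")
```

The pairwise loop is quadratic in the number of trials, which is fine for a sketch; library implementations typically use a rank-based computation instead.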