Ligand- and structure-based drug design approaches complement phenotypic and target screens, respectively, and are the two major frameworks for guiding early-stage drug discovery efforts. Since the beginning of this century, the advent of the genomic era has presented researchers with a myriad of high-throughput biological data (parts lists and their interaction networks) to address efficacy and toxicity, augmenting the traditional ligand- and structure-based approaches. This data-rich era has also presented us with challenges related to integrating and analyzing these multi-platform and multi-dimensional datasets and translating them into viable hypotheses. Hence, in the present paper, we review these existing approaches to drug discovery research and argue the case for a new systems biology-based approach. We present the basic principles and the foundational arguments and underlying assumptions of systems biology-based approaches to drug design. Also discussed are systems biology data types (key entities, their attributes and their relationships with each other, and data models/representations), software and tools used for both retrospective and prospective analysis, and the hypotheses that can be inferred. In addition, we summarize the strengths and limitations of some existing resources for a systems biology-based drug discovery paradigm (Open TG-GATEs, DrugMatrix, CMap and LINCS).
The prediction of binding poses and affinities is an area of active interest in computer-aided drug design (CADD). Given the documented limitations of both ligand- and structure-based approaches, we employed an integrated approach and developed a rapid protocol for binding mode and affinity predictions. This workflow was applied to the three protein targets of the Community Structure–Activity Resource-2014 (CSAR-2014) exercise: Factor Xa (FXa), Spleen Tyrosine Kinase (SYK), and tRNA (guanine-N(1))-methyltransferase (TrmD). Our docking and scoring workflow incorporates compound clustering and ligand- and protein structure-based pharmacophore modeling, followed by local docking, minimization, and scoring. While the former part of the protocol ensures high-quality ligand alignments and mapping, the subsequent minimization and scoring provide the predicted binding modes and affinities. We made blind predictions of docking pose for 1, 5, and 14 ligands docked into 1, 2, and 12 crystal structures of FXa, SYK, and TrmD, respectively. The resulting 174 poses were compared with cocrystallized structures (1, 5, and 14 complexes) made available at the end of CSAR. Our predicted poses deviated from the experimentally determined structures by a mean root-mean-square deviation (RMSD) of 3.4 Å. Further, we were able to classify high- and low-affinity ligands with area under the curve (AUC) values of 0.47, 0.60, and 0.69 for FXa, SYK, and TrmD, respectively, indicating the validity of our approach in at least two of the three systems. Detailed critical analysis of the results and of the CSAR methodology ranking procedures suggested that a straightforward application of our workflow has limitations, as some of the performance measures do not reflect the actual utility of pose and affinity predictions in the biological context of individual systems.
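As an illustration of the pose-accuracy measure quoted above, the following is a minimal sketch of a heavy-atom RMSD between a predicted and a cocrystallized ligand pose. It assumes that both poses list the same atoms in the same order, that they are already in the receptor's coordinate frame (no superposition), and that no symmetry correction is needed. The function name `pose_rmsd` and the coordinates are hypothetical and are not taken from the CSAR-2014 data or from the authors' workflow.

```python
import math


def pose_rmsd(predicted, reference):
    """Return the RMSD between two poses given as lists of (x, y, z) tuples.

    Assumption: atoms are matched by position in the list; no alignment or
    symmetry correction is performed.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("poses must have the same, non-zero number of atoms")
    squared = sum(
        (p - r) ** 2
        for atom_p, atom_r in zip(predicted, reference)
        for p, r in zip(atom_p, atom_r)
    )
    return math.sqrt(squared / len(predicted))


# Hypothetical three-atom ligand; the predicted pose is shifted 2 Å along z.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
predicted = [(0.0, 0.0, 2.0), (1.5, 0.0, 2.0), (1.5, 1.5, 2.0)]

rmsd = pose_rmsd(predicted, reference)
print(f"RMSD = {rmsd:.2f} Å")
```

A mean RMSD over many poses, as reported in the abstract, would simply average this quantity over all ligand and structure pairs.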
DNA-binding proteins (DBPs) perform diverse biological functions ranging from transcription to pathogen sensing. Machine learning methods can not only identify DBPs de novo but also provide insights into their DNA-recognition dynamics. However, it remains unclear whether available methods that can accurately predict DNA-binding sites in known DBPs can also identify novel DBPs. Moreover, sequence information is blind to the cellular- and disease-specific contexts of DBP activities, whereas the under-utilized knowledge from public gene expression data offers great promise. To address these issues, we have developed novel methods for predicting DBPs by integrating sequence and gene expression-derived features and applied them to explore human, mouse and Arabidopsis proteomes. While our sequence-based models outperformed the gene expression-based ones, some proteins with weaker DBP-like sequence features were correctly predicted by gene expression-based features, suggesting that these proteins acquire a tangible DBP functionality in a conducive gene expression environment. Analysis of motif enrichment among the co-expressed genes of the top 100 candidate DBPs from hitherto unannotated genes provides further avenues to explore their functional associations.
The D3R 2015 grand drug design challenge provided a set of blinded challenges for evaluating the applicability of our protocols for pose and affinity prediction. In the present study, we report the application of two different strategies for the two D3R protein targets HSP90 and MAP4K4. HSP90 is a well-studied target system with numerous co-crystal structures and SAR data. Furthermore, the D3R HSP90 test compounds showed high structural similarity to existing HSP90 inhibitors in BindingDB. Thus, we adopted an integrated docking and scoring approach involving a combination of both pharmacophoric and heavy atom similarity alignments, local minimization and quantitative structure-activity relationship modeling. This resulted in reasonable prediction of pose, with root mean square deviation (RMSD) values of 1.75 Å for mean pose 1, 1.417 Å for the mean best pose and 1.85 Å for the mean of all poses, and of affinity (ROC AUC = 0.702 at 7.5 pIC50 cut-off and R = 0.45 for 180 compounds). The second protein, MAP4K4, represents a novel system with limited SAR and co-crystal structure data and little structural similarity of the D3R MAP4K4 test compounds to known MAP4K4 ligands. For this system, we implemented an exhaustive pose and affinity prediction protocol involving docking and scoring using the PLANTS software, which considers side chain flexibility, together with protein–ligand fingerprint analysis to assist in pose prioritization. This protocol, though it fared poorly in pose prediction (with RMSD values of 4.346 Å for mean pose 1, 4.69 Å for the mean best pose and 4.75 Å for the mean of all poses), produced reasonable affinity prediction (AUC = 0.728 at 7.5 pIC50 cut-off and R = 0.67 for 18 compounds, ranked 1st among 80 submissions).
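The two affinity metrics reported above can be sketched as follows: compounds are labeled active when their measured pIC50 reaches a cut-off, the ROC AUC is the probability that a randomly chosen active is ranked above a randomly chosen inactive, and R is the Pearson correlation between measured and predicted values. The pIC50 values below are hypothetical, and treating the 7.5 cut-off as inclusive (`>=`) is an assumption; the abstract does not specify this detail.

```python
import math


def roc_auc(scores, labels):
    """ROC AUC via pairwise comparison of actives and inactives (ties count 0.5)."""
    pos = [s for s, is_active in zip(scores, labels) if is_active]
    neg = [s for s, is_active in zip(scores, labels) if not is_active]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive compound")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical measured and predicted pIC50 values for five compounds.
measured_pic50 = [8.2, 7.9, 7.1, 6.5, 6.0]
predicted_pic50 = [7.8, 7.0, 7.4, 6.2, 6.4]

CUTOFF = 7.5  # activity threshold; inclusive comparison is an assumption
labels = [value >= CUTOFF for value in measured_pic50]

auc = roc_auc(predicted_pic50, labels)
r = pearson_r(measured_pic50, predicted_pic50)
print(f"ROC AUC = {auc:.3f}, Pearson R = {r:.3f}")
```

The pairwise formulation is quadratic in the number of compounds, which is acceptable for challenge-sized sets such as the 18 to 180 compounds mentioned above.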
Drug metabolism determines the fate of a drug when it enters the human body and is a critical factor in defining its absorption, distribution, metabolism, excretion and toxicity (ADMET) characteristics. Among the various drug-metabolizing enzymes, cytochrome P450s (CYP450) constitute an important protein family that, aside from functioning in xenobiotic metabolism, is also responsible for a diverse array of other roles encompassing steroid and cholesterol biosynthesis, fatty acid metabolism, calcium homeostasis, neuroendocrine functions and growth regulation. Although CYP450 typically converts xenobiotics into safe metabolites, there are some situations in which the metabolite is more toxic than its parent molecule. Computational modeling has been instrumental in CYP450 research by rationalizing the nature of the binding event (i.e. inhibition or induction of CYP450s) or the metabolic stability of query compounds of interest. A plethora of computational approaches, encompassing ligand-, structure- and systems-based approaches, have been utilized to model CYP450-ligand interactions. This review provides a brief background on the CYP450 family (i.e. its roles, advantages and disadvantages as well as its modulators) and then discusses the various computational approaches that have been used to model CYP450-ligand interactions. Particular focus is given to the use of quantitative structure-activity relationship (QSAR) and more recent proteochemometric modeling studies. Finally, a perspective on the current state of the art and future trends of the field is also provided.
Neuroblastomas are pediatric, extracranial malignancies with poor survival outcomes owing to their resilience to current aggressive treatment regimens, including first-line chemotherapy with cisplatin (CDDP). Metabolic deregulation supports tumor cell survival in drug-treated conditions. However, the metabolic pathways underlying cisplatin resistance remain poorly studied in neuroblastoma. Our metabolomics analysis revealed that cisplatin-insensitive cells alter their metabolism; in particular, amino acid metabolism was upregulated in cisplatin-insensitive cells compared with the cisplatin-sensitive neuroblastoma cell line. A significant increase in amino acid levels in cisplatin-insensitive cells led us to hypothesize that the mechanisms upregulating intracellular amino acid pools facilitate insensitivity in neuroblastoma. We hereby report that amino acid depletion reduces cell survival and cisplatin insensitivity in neuroblastoma cells. Since cells regulate their amino acid levels through processes such as autophagy, we evaluated the effects of hydroxychloroquine (HCQ), a terminal autophagy inhibitor, on the survival and amino acid metabolism of cisplatin-insensitive neuroblastoma cells. Our results demonstrate that combining HCQ with CDDP abrogated amino acid metabolism in cisplatin-insensitive cells and sensitized neuroblastoma cells to sub-lethal doses of cisplatin. Our results suggest that targeting amino acid replenishing mechanisms could be considered as a potential approach in developing combination therapies for treating neuroblastomas.
Neuroblastoma (NB) is an enigmatic pediatric cancer and among the deadliest to treat. The major obstacles to effective immunotherapy in NB are defective immune cells and the immune evasion tactics deployed by the tumor cells and the stromal microenvironment. Nervous system development during embryonic and pediatric stages is critically mediated by non-coding RNAs such as microRNAs (miR). Hence, we explored the role of miRs in the anti-tumor immune response via a range of data-driven workflows and in vitro and in vivo experiments. Using the TARGET NB patient dataset (n=249), we applied robust bioinformatic workflows incorporating differential expression, co-expression and survival analyses, visualized with heatmaps and box plots. We initially demonstrated the role of miR-15a-5p (miR-15a) and miR-15b-5p (miR-15b) as tumor suppressors, followed by their negative association with stromal cell percentages and a statistically significant negative regulation of T and natural killer (NK) cell signature genes, especially CD274 (PD-L1), in stromal-low patient subsets. The NB phase-specific expression of the miR-15a/miR-15b-PD-L1 axis was further corroborated using the PDX (n=24) dataset. We demonstrated miR-15a/miR-15b-mediated degradation of PD-L1 mRNA through its interaction with the 3'-untranslated region and the RNA-induced silencing complex using sequence-specific luciferase activity and Ago2 RNA immunoprecipitation assays. In addition, we established miR-15a/miR-15b-induced CD8+ T and NK cell activation and cytotoxicity against NB in vitro. Moreover, injection of murine cells expressing miR-15a reduced tumor size and tumor vasculature and enhanced the activation and infiltration of CD8+ T and NK cells into the tumors in vivo. We further established that blocking the surface PD-L1 using an anti-PD-L1 antibody rescued miR-15a/miR-15b-induced CD8+ T and NK cell-mediated anti-tumor responses.
These findings demonstrate that miR-15a and miR-15b induce an anti-tumor immune response by targeting PD-L1 in NB.
Neuroblastoma is one of the deadliest pediatric cancers and shows resistance to therapy due to a diminished anti-tumor immune response. Higher PD-L1 and reduced miR-15a and miR-15b levels in patients are further associated with poor survival. Our findings demonstrate that miR-15a and miR-15b induce an anti-tumor immune response by targeting PD-L1 in neuroblastoma.
Searching a broader range of chemical space is important for drug discovery. Different methods of computer-aided drug discovery (CADD) are known to propose compounds in different chemical spaces as hit molecules for the same target protein. This study aimed to use multiple CADD methods through open innovation to achieve a level of hit molecule diversity that is not achievable with any single method. We held a compound proposal contest, in which multiple research groups participated and predicted inhibitors of tyrosine-protein kinase Yes. This showed whether collective knowledge based on individual approaches helped to obtain hit compounds from a broad range of chemical space and whether the contest-based approach was effective.
Lung cancer is the leading cause of cancer deaths in the world. The most common type of lung cancer is lung adenocarcinoma (AC). The genetic mechanisms underlying the early stages and progression of lung AC are poorly understood. There is currently no clinically applicable gene test for early diagnosis or for assessing AC aggressiveness. Among the major reasons for the lack of reliable diagnostic biomarkers are the extraordinary heterogeneity of the cancer cells, complex and poorly understood interactions of the AC cells with adjacent tissue and the immune system, gene variation across patient cohorts, measurement variability, small sample sizes and sub-optimal analytical methods. We suggest that gene expression profiling of the primary tumours and adjacent tissues (PT-AT), handled with a rational statistical and bioinformatics strategy of biomarker prediction and validation, could provide significant progress in the identification of clinical biomarkers of AC. To minimise sample-to-sample variability, repeated multivariate measurements in the same object (organ or tissue, e.g. PT-AT in lung) across patients should be designed, but prediction and validation on the genome scale with a small sample size is a great methodological challenge.
To analyse PT-AT relationships efficiently in the statistical modelling, we propose an Extreme Class Discrimination (ECD) feature selection method that identifies a sub-set of the most discriminative variables (e.g. expressed genes). Our method consists of a paired Cross-normalization (CN) step followed by a modified sign Wilcoxon test with multivariate adjustment carried out for each variable. Using an Affymetrix U133A microarray paired dataset of 27 AC patients, we reviewed the global reprogramming of the transcriptome in human lung AC tissue versus normal lung tissue, which is associated with about 2,300 genes discriminating the tissues with 100% accuracy. Cluster analysis applied to these genes resulted in four distinct gene groups, which we classified as associated with (i) up-regulated genes in the mitotic cell cycle in lung AC, (ii) silenced/suppressed genes specific for normal lung tissue, (iii) cell communication and cell motility and (iv) immune system features. The genes related to mutagenesis, specific lung cancers, early stage of AC development, tumour aggressiveness and metabolic pathway alterations and adaptations of cancer cells are strongly enriched in the AC PT-AT discriminative gene set. Two AC diagnostic biomarkers, SPP1 and CENPA, were successfully validated on an RT-PCR tissue array. The ECD method was systematically compared with several alternative methods and showed better performance; it was also validated by comparison of the predicted gene set with a literature meta-signature.
We developed a method that identifies and selects highly discriminative variables from high-dimensional data spaces of potential biomarkers based on a statistical analysis of paired samples when the number of samples is small. This method provides superior selection in comparison to conventional methods and can be widely used in different applications. Our method revealed at least 2,300 patho-biologically essential genes associated with the global transcriptional reprogramming of human lung epithelium cells and lung AC aggressiveness. This gene set includes many previously published AC biomarkers reflecting inherent disease complexity and specifies the mechanisms of carcinogenesis in lung AC. SPP1, CENPA and many other PT-AT discriminative genes could be considered as prospective diagnostic and prognostic biomarkers of lung AC.
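The general idea of paired, per-gene feature selection described above can be illustrated with a plain Wilcoxon signed-rank statistic computed on tumour versus adjacent-tissue pairs. This is only a sketch under stated assumptions: it is not the ECD method, since it omits the cross-normalization step, the modification of the test and the multivariate adjustment. The gene names, expression values, the `signed_rank_z` helper and the threshold of 2.0 are all hypothetical, and the normal approximation used here is crude for very small samples.

```python
import math


def signed_rank_z(tumour, adjacent):
    """Normal-approximation z score of the Wilcoxon signed-rank statistic.

    Assumptions: samples are paired by position; zero differences are dropped;
    tied absolute differences receive average ranks; the variance is not
    tie-corrected.
    """
    diffs = [t - a for t, a in zip(tumour, adjacent) if t != a]
    n = len(diffs)
    if n == 0:
        return 0.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(rank for rank, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - mean) / sd


# Hypothetical log-expression values for six paired samples per gene:
# (primary tumour values, adjacent tissue values).
expression = {
    "GENE_UP": ([9.1, 8.7, 9.4, 8.9, 9.0, 9.6], [6.0, 6.2, 6.1, 6.4, 5.9, 6.3]),
    "GENE_FLAT": ([5.0, 5.2, 4.9, 5.1, 5.3, 4.8], [5.1, 5.0, 5.0, 5.2, 5.1, 4.9]),
}

Z_THRESHOLD = 2.0  # arbitrary illustrative cut-off
selected = [
    gene
    for gene, (tumour, adjacent) in expression.items()
    if abs(signed_rank_z(tumour, adjacent)) >= Z_THRESHOLD
]
print(selected)
```

On a genome-scale dataset, a multiple-testing correction would normally be applied to the per-gene statistics before selecting genes; that step is left out of this sketch.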
In today's world of high-throughput in silico screening, the development of virtual screening methodologies to prioritize small molecules as new chemical entities (NCEs) for synthesis is of current interest. Among several approaches to virtual screening, structure-based virtual screening has been considered the most effective. However, the problems associated with the ranking of potential solutions in terms of scoring functions remain one of the major bottlenecks in structure-based virtual screening technology. It has been suggested that scoring functions may be used as filters for distinguishing binders from nonbinders instead of accurately predicting their binding free energies. Subsequently, several improvements have been made in this area, which include the use of multiple rather than single scoring functions and the application of either consensus or multivariate statistical methods or both to improve the discrimination between binders and nonbinders. In view of this, the discriminative ability (distinguishing binders from nonbinders) of binary QSAR models derived using LUDI and MOE scoring functions has been compared with the models derived by Jacobsson et al. on five data sets, viz. estrogen receptor α mimics (ERα_mimics), estrogen receptor α toxins (ERα_toxins), matrix metalloprotease 3 inhibitors (MMP-3), factor Xa inhibitors (fXa), and acetylcholinesterase inhibitors (AChE). The overall analyses reveal that binary QSAR is comparable to the PLS discriminant analysis, rule-based, and Bayesian classification methods used by Jacobsson et al. Further, the scoring functions implemented in LUDI and MOE can score a wide range of protein-ligand interactions and are comparable to the scoring functions implemented in ICM and Cscore. Thus, the binary QSAR models derived using LUDI and MOE scoring functions may be useful as a preliminary screening layer in a multilayered virtual screening paradigm.