An attractive drug target to combat COVID‐19 is the main protease (Mpro) because of its key role in the viral life cycle by processing the polyproteins translated from the viral RNA. Studying the crystal structures of the protease is important to enhance our understanding of its mechanism of action at atomic‐level resolution, and consequently may provide crucial structural insights for structure‐based drug discovery. In the current study, we report a comparative structural analysis of the Mpro substrate binding site for both apo and holo forms to identify key interacting residues and conserved water molecules during the ligand‐binding process. It is shown that in addition to the catalytic dyad residues (His41 and Cys145), the oxyanion hole residues (Asn142–Ser144) and residues His164–Glu166 form essential parts of the substrate‐binding pocket of the protease in the binding process. Furthermore, we address the issue of the substrate‐binding pocket flexibility and show that two adjacent, highly flexible loops in the Mpro structures (residues Thr45–Met49 and Arg188–Ala191) can regulate the binding cavity’s accessibility for different ligand sizes. Moreover, we discuss in detail the various structural and functional roles of several important conserved and mobile water molecules within and around the binding site in the proper enzymatic function of Mpro. We also present a new docking protocol in the framework of the ensemble docking strategy. The performance of the docking protocol has been evaluated in predicting ligand binding pose and affinity ranking for two popular docking programs: AutoDock4 and AutoDock Vina. Our docking results suggest that the top‐ranked poses of the most populated clusters obtained by AutoDock Vina are the most important representative docking runs and perform very well in estimating experimental binding poses and affinity ranking.
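The selection rule described at the end of this abstract (take the top‐ranked pose of the most populated cluster) can be sketched as follows. This is only an illustration and not the authors' protocol: the dictionary keys `receptor`, `cluster`, and `score`, the tie-breaking rule, and the toy values are all assumptions, and the clustering itself is taken as already done by the docking program.

```python
from collections import Counter


def representative_pose(poses):
    """Return the best-scoring pose of the most populated cluster.

    `poses` is a list of dicts with hypothetical keys:
      'receptor' - label of the receptor conformer used in the run
      'cluster'  - cluster id assigned to the pose for that receptor
      'score'    - docking score in kcal/mol (lower is better)
    """
    if not poses:
        raise ValueError("no poses given")

    def key_of(pose):
        return (pose["receptor"], pose["cluster"])

    # Count how many docking runs fell into each (receptor, cluster) pair.
    counts = Counter(key_of(p) for p in poses)

    def best_score(key):
        return min(p["score"] for p in poses if key_of(p) == key)

    # Most populated cluster first; ties go to the cluster with the lowest score.
    top = min(counts, key=lambda k: (-counts[k], best_score(k)))
    members = [p for p in poses if key_of(p) == top]
    return min(members, key=lambda p: p["score"])


# Toy data (made-up scores for two hypothetical receptor conformers).
poses = [
    {"receptor": "apo", "cluster": 0, "score": -7.1},
    {"receptor": "apo", "cluster": 0, "score": -6.8},
    {"receptor": "apo", "cluster": 1, "score": -8.0},
    {"receptor": "holo", "cluster": 0, "score": -7.5},
    {"receptor": "holo", "cluster": 0, "score": -7.9},
    {"receptor": "holo", "cluster": 0, "score": -7.2},
]
best = representative_pose(poses)
```

In this toy set the single best score (-8.0) belongs to a cluster with only one member, so the rule passes over it in favour of the three-member cluster, which is the behaviour the abstract describes.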
Despite considerable advances obtained by applying machine learning approaches in protein-ligand affinity predictions, the incorporation of receptor flexibility has remained an important bottleneck. While ensemble docking has been used widely as a solution to this problem, the optimum choice of receptor conformations is still an open question considering the issues related to the computational cost and false positive pose predictions. Here, a combination of ensemble learning and ensemble docking is suggested to rank different conformations of the target protein in light of their importance for the final accuracy of the model. Available X-ray structures of cyclin-dependent kinase 2 (CDK2) in complex with different ligands are used as an initial receptor ensemble, and its redundancy is removed through a graph-based redundancy removal, which is shown to be more efficient and less subjective than clustering-based representative selection methods. A set of ligands with available experimental affinity are docked to this nonredundant receptor ensemble, and the energetic features of the best scored poses are used in an ensemble learning procedure based on the random forest method. The importance of receptors is obtained through feature selection measures, and it is shown that a few of the most important conformations are sufficient to reach 1 kcal/mol accuracy in affinity prediction with considerable improvement of the early enrichment power of the models compared to the different ensemble docking strategies without learning. A clear strategy has been provided in which machine learning selects the most important experimental conformers of the receptor among a large set of protein-ligand complexes while simultaneously maintaining the final accuracy of affinity predictions at the highest level possible for available data. Our results could be informative for future attempts to design receptor-specific docking-rescoring strategies.
Background
Helicobacter pylori is considered a true human pathogen for which rising drug resistance is a serious global concern. The present study aimed to reconstruct a genome‐scale metabolic model (GSMM) to decipher the metabolic capability of H. pylori strains in response to clarithromycin and rifampicin and to identify novel drug targets.
Materials and Methods
The iIT341 model of H. pylori was updated based on genome annotation data and biochemical knowledge from literature and databases. Context‐specific models were generated by integrating the transcriptomic data of clarithromycin and rifampicin resistance into the model. Flux balance analysis was employed to identify essential genes in each strain, which were further prioritized based on non‐homology to human proteins, virulence factor analysis, druggability, and broad‐spectrum analysis. Additionally, metabolic differences between sensitive and resistant strains were investigated based on flux variability analysis and pathway enrichment analysis of transcriptomic data.
Results
The reconstructed GSMM was named HpM485. Pathway enrichment and flux variability analyses demonstrated reduced activity in the ribosomal pathway in both clarithromycin‐ and rifampicin‐resistant strains. Also, a significant decrease was detected in the activity of metabolic pathways of the clarithromycin‐resistant strain. Moreover, 23 and 16 essential genes were exclusively detected in clarithromycin‐ and rifampicin‐resistant strains, respectively. Based on prioritization analysis, cyclopropane fatty acid synthase and phosphoenolpyruvate synthase were identified as putative drug targets in clarithromycin‐ and rifampicin‐resistant strains, respectively.
Conclusions
We present a robust and reliable metabolic model of H. pylori. This model can predict novel drug targets to combat drug resistance and explore the metabolic capability of H. pylori in various conditions.
Molecular docking simulation is a key computational tool in modern drug discovery research whose predictive performance strongly depends on the employed scoring functions. Many recent studies have shown that the application of machine learning algorithms in the development of scoring functions has led to a significant improvement in docking performance. In this work, we introduce a new machine learning (ML) based scoring function called ET‐Score, which employs the distance‐weighted interatomic contacts between atom type pairs of the ligand and the protein for featurizing protein−ligand complexes and the Extremely Randomized Trees algorithm for the training process. The performance of ET‐Score is compared with some successful ML‐based scoring functions and several popular classical scoring functions on the PDBbind 2016v core set. It is shown that our ET‐Score model (with Pearson's correlation of 0.827 and RMSE of 1.332) achieves very good performance in comparison with most of the ML‐based scoring functions and all classical scoring functions despite its extremely low computational cost. The ET‐Score code is freely available on the web at https://github.com/miladrayka/ET_Score.
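A minimal sketch of a distance‐weighted contact featurization of the kind this abstract describes is given below. It is not the ET‐Score implementation: the inverse‐distance weight, the 12 Å cutoff, the use of plain element symbols as atom types, and the function name `contact_features` are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def contact_features(ligand_atoms, protein_atoms, cutoff=12.0, exponent=1):
    """Sum 1/d**exponent over ligand-protein atom pairs within `cutoff` angstroms.

    Atoms are (element, (x, y, z)) tuples. The result maps
    (ligand element, protein element) to the accumulated weight.
    The weighting scheme and cutoff are assumed, not taken from the paper.
    """
    features = defaultdict(float)
    for lig_elem, lig_xyz in ligand_atoms:
        for prot_elem, prot_xyz in protein_atoms:
            d = math.dist(lig_xyz, prot_xyz)
            # Skip coincident atoms (d == 0) and pairs beyond the cutoff.
            if 0.0 < d <= cutoff:
                features[(lig_elem, prot_elem)] += 1.0 / d ** exponent
    return dict(features)


# Toy coordinates (made up): two ligand atoms, three protein atoms.
ligand = [("C", (0.0, 0.0, 0.0)), ("O", (1.0, 0.0, 0.0))]
protein = [
    ("N", (0.0, 2.0, 0.0)),
    ("C", (0.0, 0.0, 4.0)),
    ("C", (50.0, 0.0, 0.0)),  # far away, contributes nothing
]
feats = contact_features(ligand, protein)
```

A fixed-length vector built from such pair sums (one slot per atom type pair) could then be passed to an extremely randomized trees regressor, for example `ExtraTreesRegressor` in scikit-learn, which is one available implementation of the algorithm named in the abstract.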
Antimicrobial peptides (AMPs) are promising tools to fight against ever-growing antibiotic resistance. However, despite many advantages, their toxicity to mammalian cells is a critical obstacle in clinical application and needs to be addressed. In this study, by using an up-to-date dataset, a machine learning model has been trained successfully to predict the toxicity of antimicrobial peptides. A comprehensive set of physico-chemical and linguistic-based features, both local and global in nature, underwent feature selection to identify the key properties behind the toxicity of antimicrobial peptides. After feature selection, the hybrid model showed the best performance, with a recall of 0.876 and an F1 score of 0.849. The obtained model can be useful in extracting AMPs with low toxicity from AMP libraries in clinical applications. In addition, the presence of several local properties among the final selected features, including the positions of strand-forming and hydrophobic residues, shows that these properties are critical determinants of peptide properties and should be considered in developing models for activity prediction of peptides. The executable code is available at https://git.io/JRZaT.
In this study, we use some modified semiempirical quantum mechanics (SQM) methods for improving the molecular docking process. To this end, the three popular SQM Hamiltonians, PM6, PM6‐D3H4X, and PM7, are employed for geometry optimization of some binding modes of ligands docked into the human cyclin‐dependent kinase 2 (CDK2) by two widely used docking tools, AutoDock and AutoDock Vina. The results were analyzed with two different evaluation metrics: the symmetry‐corrected heavy‐atom RMSD and the fraction of recovered ligand‐protein contacts. It is shown that the fraction of recovered contacts is the more useful measure of the similarity between two structures when interacting with a protein. It was also found that AutoDock is more successful than AutoDock Vina in producing the correct ligand poses (RMSD ≤ 2.0 Å) and ranking of the poses. It is also demonstrated that the ligand optimization at the SQM level improves the docking results and the SQM structures have a significantly better fit to the observed crystal structures. Finally, the SQM optimizations reduce the number of close contacts in the docking poses and successfully remove most of the clashes or bad contacts between ligand and protein.
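The two evaluation metrics named in this abstract can be illustrated with a short sketch. Several simplifications are assumed here and are not taken from the paper: the RMSD below is a plain RMSD over pre-matched atoms with no symmetry correction, a contact is defined as any ligand-protein atom pair within a 4.0 Å cutoff, and the function names and coordinates are hypothetical.

```python
import math


def plain_rmsd(pose, reference):
    """RMSD over atoms matched by index (no symmetry correction, no fitting)."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(pose, reference))
    return math.sqrt(squared / len(pose))


def contacts(ligand, protein, cutoff=4.0):
    """Set of (ligand index, protein index) pairs closer than `cutoff` angstroms."""
    return {
        (i, j)
        for i, lig_xyz in enumerate(ligand)
        for j, prot_xyz in enumerate(protein)
        if math.dist(lig_xyz, prot_xyz) <= cutoff
    }


def fraction_recovered(pose, reference, protein, cutoff=4.0):
    """Fraction of the reference pose's contacts that the docked pose reproduces."""
    ref_contacts = contacts(reference, protein, cutoff)
    if not ref_contacts:
        raise ValueError("reference pose makes no contacts at this cutoff")
    shared = ref_contacts & contacts(pose, protein, cutoff)
    return len(shared) / len(ref_contacts)


# Toy system (made-up coordinates): two protein atoms, a two-atom ligand.
protein = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
crystal = [(3.0, 0.0, 0.0), (7.0, 0.0, 0.0)]
docked = [(3.0, 1.0, 0.0), (7.0, 6.0, 0.0)]

rmsd = plain_rmsd(docked, crystal)
recovered = fraction_recovered(docked, crystal, protein)
```

In the toy case the first ligand atom keeps its contact while the second loses it, so half of the reference contacts are recovered even though the RMSD is well above the 2.0 Å threshold quoted in the abstract.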
Spermatogenesis is a complex process of cellular division and differentiation that begins with spermatogonial stem cells and leads to functional spermatozoa production. However, many of the molecular mechanisms underlying this process remain unclear. Single-cell RNA sequencing (scRNA-seq) is used to sequence the entire transcriptome at the single-cell level to assess cell-to-cell variability. In this study, more than 33,000 testicular cells from different scRNA-seq datasets with normal spermatogenesis were integrated to identify single-cell heterogeneity on a more comprehensive scale. Clustering, cell type assignment, differentially expressed gene analysis, and pseudotime analysis characterized 5 spermatogonia, 4 spermatocyte, and 4 spermatid cell types during the spermatogenesis process. The UTF1 and ID4 genes were introduced as the most specific markers that can differentiate two undifferentiated spermatogonial stem cell subpopulations. The C7orf61 and TNP genes can differentiate two round spermatid subpopulations. The topological analysis of the weighted gene co-expression network along with the integrated scRNA-seq data revealed some bridge genes between the main stages of spermatogenesis, such as DNAJC5B, C1orf194, HSP90AB1, BST2, EEF1A1, CRISP2, PTMS, NFKBIA, CDKN3, and HLA-DRA. The importance of these key genes is supported by their reported roles in male infertility in previous studies. This integrated scRNA-seq analysis of spermatogenic cells offers novel insights into cell-to-cell heterogeneity and suggests a list of key players with a pivotal role in male infertility from the fertile spermatogenesis datasets. These key functional genes can be introduced as candidates for filtering and prioritizing genotype-to-phenotype association in male infertility.
Obstructive azoospermia (OA), defined as an obstruction in any region of the male genital tract, accounts for 40% of all azoospermia cases. Of all OA cases, ~30% are thought to have a genetic origin; however, the underlying genetic etiology of the majority of these cases remains unknown. To address this, we took a family-based whole-exome sequencing approach to identify causal variants of OA in a multiplex family with epididymal obstruction. A novel gain-of-function missense variant in CLDN2 (c.481G>C; p.Gly161Arg) was found to co-segregate with the phenotype, consistent with the X-linked inheritance pattern observed in the pedigree. To assess the pathogenicity of this variant, the wild-type and mutant protein structures were modeled and their potential for strand formation in multimeric form was assessed and compared. The results showed that dimeric and tetrameric arrangements of Claudin-2 were not only reduced, but were also significantly altered by this single residue change. We, therefore, envisage that this amino acid change likely forms a polymeric discontinuous strand, which may lead to the disruption of tight junctions among epithelial cells. This missense variant is thus likely to be responsible for the disruption of the blood-epididymis barrier, causing dislodged epithelial cells to clog the genital tract, hence causing OA. This study not only sheds light on the underlying pathobiology of OA, but also provides a basis for more efficient diagnosis in the clinical setting.
The assembly of the amyloid-β peptide (Aβ) into toxic oligomers and fibrils is associated with Alzheimer’s disease and dementia. Therefore, disrupting amyloid assembly by direct targeting of the Aβ monomeric form with small molecules or antibodies is a promising therapeutic strategy. However, given the dynamic nature of Aβ, standard computational tools cannot be easily applied for high-throughput structure-based virtual screening in drug discovery projects. In the current study, we propose a computational pipeline, in the framework of the ensemble docking strategy, to identify catechins’ binding sites in monomeric Aβ42. It is shown that both hydrophobic aromatic interactions and hydrogen bonding are crucial for the binding of catechins to Aβ42. Additionally, it has been found that all the studied ligands, especially EGCG, can act as potent inhibitors against amyloid aggregation by blocking the central hydrophobic region of Aβ. Our findings are evaluated and confirmed with multi-microsecond MD simulations. Finally, it is suggested that our proposed pipeline, with low computational cost in comparison with MD simulations, is a suitable approach for the virtual screening of ligand libraries against Aβ.
The human protein disulfide isomerase (hPDI) is an essential four-domain multifunctional enzyme. As a result of disulfide shuffling in its terminal domains, hPDI exists in two oxidation states with different conformational preferences, which are important for substrate binding and functional activities. Here, we address the redox-dependent conformational dynamics of hPDI through molecular dynamics (MD) simulations. Collective domain motions are identified by the principal component analysis of MD trajectories, and redox-dependent opening-closing structure variations are highlighted on projected free energy landscapes. Then, important structural features that exhibit considerable differences in the dynamics of the redox states are extracted by statistical machine learning methods. Mapping the structural variations to time series of residue interaction networks also provides a holistic representation of the dynamical redox differences. With an emphasis on persistent, long-lasting interactions, an approach is proposed that compiles these time series of networks into a single dynamic residue interaction network (DRIN). Differential comparison of the DRINs in the oxidized and reduced states reveals chains of residue interactions that represent potential allosteric paths between the catalytic and ligand binding sites of hPDI.
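One simple way to compile a time series of residue interaction networks into a single network that emphasizes persistent interactions is to keep only the edges present in a large fraction of frames. The sketch below illustrates that idea and a differential comparison between two states. It is an assumption-laden illustration rather than the DRIN construction used in the study: the persistence threshold, the function names, and the residue numbers are all hypothetical.

```python
from collections import Counter


def persistent_network(frames, min_persistence=0.75):
    """Collapse per-frame edge sets into one network of persistent edges.

    `frames` is a list of sets of residue pairs, one set per MD frame.
    Returns a dict mapping each kept edge to the fraction of frames
    in which it appears. The threshold value is an assumption.
    """
    if not frames:
        raise ValueError("no frames given")
    counts = Counter()
    for edges in frames:
        # Treat edges as undirected and count each at most once per frame.
        counts.update({tuple(sorted(edge)) for edge in edges})
    n_frames = len(frames)
    return {
        edge: count / n_frames
        for edge, count in counts.items()
        if count / n_frames >= min_persistence
    }


def differential_edges(net_a, net_b):
    """Edges found only in net_a and edges found only in net_b."""
    return set(net_a) - set(net_b), set(net_b) - set(net_a)


# Toy trajectories (made-up residue numbers), four frames per state.
oxidized_frames = [
    {(53, 56), (56, 397)},
    {(53, 56), (397, 56)},
    {(53, 56), (56, 397), (100, 200)},
    {(53, 56)},
]
reduced_frames = [
    {(53, 56)},
    {(53, 56), (56, 397)},
    {(53, 56)},
    {(53, 56)},
]

oxidized_net = persistent_network(oxidized_frames)
reduced_net = persistent_network(reduced_frames)
only_oxidized, only_reduced = differential_edges(oxidized_net, reduced_net)
```

Chains of such state-specific edges, followed from one site to another, are the kind of candidate paths the abstract refers to; tracing them would need a graph search on top of this sketch.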