Computational techniques have been applied in the drug discovery pipeline since the 1980s. Given the limited computational resources of the time, the first molecular modeling strategies relied on a rigid view of the ligand-target binding process. Over the years, the evolution of hardware technologies has gradually made it possible to simulate the dynamic nature of the binding event. In this work, we present an overview of the evolution of structure-based drug discovery techniques in the study of the ligand-target recognition phenomenon, going from static molecular docking toward enhanced molecular dynamics strategies.
The dynamic behavior of a protein is essential for its functionality. Here, Doucet et al. demonstrate how the evolutionary analysis of conformational pathways within a protein family serves to identify common core scaffolds that accommodate branch-specific functional regions controlled by flexibility switches, offering a model for evolutionary-dynamics-based protein design.
We describe the testing and release of AutoDock4 and the accompanying graphical user interface AutoDockTools. AutoDock4 incorporates limited flexibility in the receptor. Several tests are reported here, including a redocking experiment with 188 diverse ligand-protein complexes and a cross-docking experiment using flexible sidechains in 87 HIV protease complexes. We also report its utility in analysis of covalently bound ligands, using both a grid-based docking method and a modification of the flexible sidechain technique.
• Protein flexibility plays a crucial role in biological function.
• MEDUSA is a web-server for prediction of protein flexibility from sequence.
• It uses a deep convolutional network to assign flexibility class for each residue.
• MEDUSA provides binary, three-class and five-class predictions.
• It provides important insights about the protein function mechanism.
Information on protein flexibility is essential to understand crucial molecular mechanisms such as protein stability, interactions with other molecules and protein functions in general. The B-factor obtained in X-ray crystallography experiments is the most common flexibility descriptor available for the majority of resolved protein structures. Since the gap between the number of resolved protein structures and available protein sequences is continuously growing, it is important to provide computational tools for protein flexibility prediction from amino acid sequence. In the current study, we report a Deep Learning based protein flexibility prediction tool, MEDUSA (https://www.dsimb.inserm.fr/MEDUSA). MEDUSA uses evolutionary information extracted from homologous protein sequences and amino acid physico-chemical properties as input for a convolutional neural network to assign a flexibility class to each protein sequence position. Trained on a non-redundant dataset of X-ray structures, MEDUSA provides flexibility prediction in two, three and five classes. MEDUSA is freely available as a web-server providing a clear visualization of the prediction results as well as a standalone utility (https://github.com/DSIMB/medusa). Analysis of the MEDUSA output allows a user to identify the potentially highly deformable protein regions and general dynamic properties of the protein.
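To make the idea of per-residue flexibility classes concrete, the sketch below shows one plausible way that crystallographic B-factors could be turned into discrete class labels. It assumes per-chain z-score normalization and hypothetical class thresholds; the function names and cut-off values are illustrative assumptions and are not taken from MEDUSA, which predicts such classes from sequence with a convolutional network.

```python
from statistics import mean, pstdev


def normalize_bfactors(bfactors):
    """Z-score normalize the B-factors of one chain (assumed scheme)."""
    mu = mean(bfactors)
    sigma = pstdev(bfactors)
    if sigma == 0:
        # A chain with identical B-factors carries no relative information.
        return [0.0 for _ in bfactors]
    return [(b - mu) / sigma for b in bfactors]


def flexibility_classes(bfactors, thresholds=(0.0,)):
    """Assign each residue a class index from 0 (most rigid) upward.

    The class index is the number of thresholds the normalized B-factor
    exceeds, so k thresholds yield k + 1 classes. The default threshold
    is a hypothetical choice for a binary rigid/flexible split.
    """
    cuts = sorted(thresholds)
    return [sum(z > c for c in cuts) for z in normalize_bfactors(bfactors)]


# Toy B-factors (in square angstroms) for a six-residue fragment.
toy_bfactors = [12.0, 15.0, 14.0, 30.0, 45.0, 18.0]
binary_labels = flexibility_classes(toy_bfactors)
three_class_labels = flexibility_classes(toy_bfactors, thresholds=(-0.5, 0.5))
```

Passing two or four thresholds would give the three-class and five-class labelings in the same way; the appropriate cut-offs depend on the dataset and are not specified here.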
• Polymerase α-Primase is highly flexible.
• PRIM2C flexibility is essential for RNA primer length regulation.
• Weak affinity of PRIM1 RNA catalytic subunit drives dissociation during RNA priming.
• Flexibility and dissociation contribute to RNA primer length counting and handover to Pol α.
DNA replication in eukaryotes relies on the synthesis of a ∼30-nucleotide RNA/DNA primer strand through the dual action of the heterotetrameric polymerase α-primase (pol-prim) enzyme. Synthesis of the 7-10-nucleotide RNA primer is regulated by the C-terminal domain of the primase regulatory subunit (PRIM2C) and is followed by intramolecular handoff of the primer to pol α for extension by ∼20 nucleotides of DNA. Here, we provide evidence that RNA primer synthesis is governed by a combination of the high affinity and flexible linkage of the PRIM2C domain and the surprisingly low affinity of the primase catalytic domain (PRIM1) for substrate. Using a combination of small angle X-ray scattering and electron microscopy, we found significant variability in the organization of PRIM2C and PRIM1 in the absence and presence of substrate, and that the population of structures with both PRIM2C and PRIM1 in a configuration aligned for synthesis is low. Crosslinking was used to visualize the orientation of PRIM2C and PRIM1 when engaged by substrate as observed by electron microscopy. Microscale thermophoresis was used to measure substrate affinities for a series of pol-prim constructs, which showed that the PRIM1 catalytic domain does not bind the template or emergent RNA-primed templates with appreciable affinity. Together, these findings support a model of RNA primer synthesis in which generation of the nascent RNA strand and handoff of the RNA-primed template from primase to polymerase α is mediated by the high degree of inter-domain flexibility of pol-prim, the ready dissociation of PRIM1 from its substrate, and the much higher affinity of the POLA1cat domain of polymerase α for full-length RNA-primed templates.
• Aap undergoes Zn2+-dependent assembly into amyloid fibrils that stabilize biofilms.
• Point mutants were analyzed for their ability to assemble and aggregate into fibrils.
• B-repeat monomers from Aap are highly extended, mostly rigid rods.
• Data-driven dimer and tetramer models suggest a mechanism for amyloidogenesis.
• Challenges inherent in SAXS analysis of highly elongated structures are discussed.
Staphylococcus epidermidis is a commensal bacterium on human skin that is also the leading cause of medical device-related infections. The accumulation-associated protein (Aap) from S. epidermidis is a critical factor for infection via its ability to mediate biofilm formation. The B-repeat superdomain of Aap is composed of 5 to 17 Zn2+-binding B-repeats, which undergo rapid, reversible assembly to form dimer and tetramer species. The tetramer can then undergo a conformational change and nucleate highly stable functional amyloid fibrils. In this study, multiple techniques including analytical ultracentrifugation (AUC) and small-angle X-ray scattering (SAXS) are used to probe a panel of B-repeat mutant constructs that assemble to distinct oligomeric states to define the structural characteristics of B-repeat dimer and tetramer species. The B-repeat region from Aap forms an extremely elongated conformation that presents several challenges for standard SAXS analyses. Specialized approaches, such as cross-sectional analyses, allowed for in-depth interpretation of data, while explicit-solvent calculations via WAXSiS allowed for accurate evaluation of atomistic models. The resulting models suggest mechanisms by which Aap functional amyloid fibrils form, illuminating an important contributing factor to recurrent staphylococcal infections.
Advances in sequencing techniques and statistical methods have made it possible not only to predict sequences of ancestral proteins but also to identify thousands of mutations in the human exome, some of which are disease associated. These developments have motivated numerous theories and raised many questions regarding the fundamental principles behind protein evolution, which have been traditionally investigated horizontally using the tip of the phylogenetic tree through comparative studies of extant proteins within a family. In this article, we review a vertical comparison of the modern and resurrected ancestral proteins. We focus mainly on the dynamical properties responsible for a protein's ability to adapt new functions in response to environmental changes. Using the Dynamic Flexibility Index and the Dynamic Coupling Index to quantify the relative flexibility and dynamic coupling at a site-specific, single-amino-acid level, we provide evidence that the migration of hinges, which are often functionally critical rigid sites, is a mechanism through which proteins can rapidly evolve. Additionally, we show that disease-associated mutations in proteins often result in flexibility changes even at positions distal from mutational sites, particularly in the modulation of active site dynamics.
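A site-specific flexibility index of this kind can be illustrated with a perturbation-response calculation on an elastic network. The sketch below is a minimal version under stated assumptions: it uses an anisotropic network model built on Cα coordinates, a 10 Å contact cutoff, a uniform spring constant, and unit forces along the three Cartesian axes only. The function names, parameter values, and direction sampling are illustrative choices, not the published protocol.

```python
import numpy as np


def anm_hessian(coords, cutoff=10.0, gamma=1.0):
    """Build the 3N x 3N anisotropic network model Hessian.

    Residue pairs closer than `cutoff` are joined by a spring of
    stiffness `gamma`. Coordinates are assumed to be distinct points.
    """
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            diff = coords[j] - coords[i]
            dist2 = float(diff @ diff)
            if dist2 > cutoff ** 2:
                continue
            block = -gamma * np.outer(diff, diff) / dist2
            hessian[3 * i:3 * i + 3, 3 * j:3 * j + 3] = block
            hessian[3 * j:3 * j + 3, 3 * i:3 * i + 3] = block
            hessian[3 * i:3 * i + 3, 3 * i:3 * i + 3] -= block
            hessian[3 * j:3 * j + 3, 3 * j:3 * j + 3] -= block
    return hessian


def dynamic_flexibility_index(coords, cutoff=10.0, gamma=1.0):
    """Relative displacement response of each residue to perturbations.

    A unit force is applied at every residue in turn; the displacement
    magnitude at residue i is summed over all perturbed residues and
    directions, then normalized so that the index sums to one.
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    # The pseudo-inverse discards the zero-frequency rigid-body modes.
    covariance = np.linalg.pinv(anm_hessian(coords, cutoff, gamma), rcond=1e-8)
    cov4 = covariance.reshape(n, 3, n, 3)  # indices: [i, axis, j, axis]
    response = np.zeros(n)
    for force in np.eye(3):
        # Displacement of residue i (3-vector) when residue j is pushed.
        disp = np.einsum("iajb,b->iaj", cov4, force)
        response += np.linalg.norm(disp, axis=1).sum(axis=1)
    return response / response.sum()


# Toy input: twelve C-alpha positions on an idealized helix.
angles = np.deg2rad(100.0) * np.arange(12)
helix = np.column_stack(
    [2.3 * np.cos(angles), 2.3 * np.sin(angles), 1.5 * np.arange(12)]
)
dfi = dynamic_flexibility_index(helix)
```

In this normalization, low values mark sites that respond little to perturbations elsewhere in the structure, which is the sense in which hinges are described above as rigid sites.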
We propose a pipeline that combines AlphaFold2 (AF2) and crosslinking mass spectrometry (XL-MS) to model the structure of proteins with multiple conformations. The pipeline consists of two main steps: ensemble generation using AF2 and conformer selection using XL-MS data. For conformer selection, we developed two scores, the monolink probability score (MP) and the crosslink probability score (XLP), both of which are based on residue depth from the protein surface. We benchmarked MP and XLP on a large dataset of decoy protein structures and showed that our scores outperform previously developed scores. We then tested our methodology on three proteins having an open and closed conformation in the Protein Data Bank: Complement component 3 (C3), luciferase, and glutamine-binding periplasmic protein, first generating ensembles using AF2, which were then screened for the open and closed conformations using experimental XL-MS data. In five out of six cases, the most accurate model within the AF2 ensembles (or a conformation within 1 Å of this model) was identified using crosslinks, as assessed through the XLP score. In the remaining case, only the monolinks (assessed through the MP score) successfully identified the open conformation of glutamine-binding periplasmic protein, and these results were further improved by including the “occupancy” of the monolinks. This serves as a compelling proof-of-concept for the effectiveness of monolinks. In contrast, the AF2 assessment score was only able to identify the most accurate conformation in two out of six cases. Our results highlight the complementarity of AF2 with experimental methods like XL-MS, with the MP and XLP scores providing reliable metrics to assess the quality of the predicted models. The MP and XLP scoring functions mentioned above are available at https://gitlab.com/topf-lab/xlms-tools.
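The conformer-selection step can be pictured with a much simpler criterion than the MP and XLP scores: rank each model in an ensemble by the fraction of observed crosslinks whose Cα-Cα distance falls under a cutoff. The sketch below assumes a 30 Å cutoff and a toy data layout (residue number mapped to Cα coordinates); both are illustrative assumptions. It is a distance-based stand-in and does not reproduce the residue-depth-based scoring of xlms-tools.

```python
import math


def crosslink_satisfaction(ca_coords, crosslinks, max_dist=30.0):
    """Fraction of crosslinks whose C-alpha distance is within max_dist.

    ca_coords maps residue number -> (x, y, z) in angstroms;
    crosslinks is a list of (residue_i, residue_j) pairs.
    """
    if not crosslinks:
        return 0.0
    satisfied = sum(
        1
        for res_i, res_j in crosslinks
        if math.dist(ca_coords[res_i], ca_coords[res_j]) <= max_dist
    )
    return satisfied / len(crosslinks)


def select_conformer(ensemble, crosslinks, max_dist=30.0):
    """Return the best-scoring model name and the score of every model."""
    scores = {
        name: crosslink_satisfaction(coords, crosslinks, max_dist)
        for name, coords in ensemble.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy ensemble: two hypothetical models differing in one domain position.
toy_ensemble = {
    "closed": {10: (0.0, 0.0, 0.0), 120: (18.0, 0.0, 0.0), 200: (0.0, 20.0, 0.0)},
    "open": {10: (0.0, 0.0, 0.0), 120: (42.0, 0.0, 0.0), 200: (0.0, 20.0, 0.0)},
}
toy_crosslinks = [(10, 120), (10, 200), (120, 200)]
best_model, model_scores = select_conformer(toy_ensemble, toy_crosslinks)
```

With crosslinks measured under a given condition, the top-ranked model is the one most consistent with that condition's data; a depth-aware score would additionally account for whether the linked residues are accessible at the protein surface.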
• Multiple conformations of proteins can be predicted with AlphaFold2.
• Crosslinks and monolinks can be used to select the relevant protein conformation.
• Tools for data analysis are available at https://gitlab.com/topf-lab/xlms-tools.
Many proteins can be described by more than one structure, each of which has a functional significance. Crosslinks and monolinks from crosslinking mass spectrometry (XL-MS) can help probe a protein's conformation under specific physiological conditions. We present a pipeline to predict a protein's specific conformation. We employ a combination of ensemble generation using AlphaFold2 and ensemble scoring using condition-specific XL-MS data. Both crosslinks and monolinks were found to be useful in finding the correct protein conformation.
Molecular dynamics (MD) simulations of proteins reveal the existence of many transient surface pockets; however, the factors determining what small subset of these represent druggable or functionally relevant ligand binding sites, called “cryptic sites,” are not understood. Here, we examine multiple X-ray structures for a set of proteins with validated cryptic sites, using the computational hot spot identification tool FTMap. The results show that cryptic sites in ligand-free structures generally have a strong binding energy hot spot very close by. As expected, regions around cryptic sites exhibit above-average flexibility, and close to 50% of the proteins studied here have unbound structures that could accommodate the ligand without clashes. Nevertheless, the strong hot spot neighboring each cryptic site is almost always exploited by the bound ligand, suggesting that binding may frequently involve an induced fit component. We additionally evaluated the structural basis for cryptic site formation, by comparing unbound to bound structures. Cryptic sites are most frequently occluded in the unbound structure by intrusion of loops (22.5%), side chains (19.4%), or in some cases entire helices (5.4%), but motions that create sites that are too open can also eliminate pockets (19.4%). The flexibility of cryptic sites frequently leads to missing side chains or loops (12%) that are particularly evident in low resolution crystal structures. An interesting observation is that cryptic sites formed solely by the movement of side chains, or of backbone segments with fewer than five residues, result only in low affinity binding sites with limited use for drug discovery.
The analysis of hydrogen deuterium exchange by mass spectrometry as a function of temperature and mutation has emerged as a generic and efficient tool for the spatial resolution of protein networks that are proposed to function in the thermal activation of catalysis. In this work, we extend temperature-dependent hydrogen deuterium exchange from apo-enzyme structures to protein–ligand complexes. Using adenosine deaminase as a prototype, we compared the impacts of a substrate analog (1-deaza-adenosine) and a very tight-binding inhibitor/transition state analog (pentostatin) at single and multiple temperatures. At a single temperature, we observed different hydrogen deuterium exchange-mass spectrometry properties for the two ligands, as expected from their 10^6-fold differences in strength of binding. By contrast, analogous patterns for temperature-dependent hydrogen deuterium exchange mass spectrometry emerge in the presence of both 1-deaza-adenosine and pentostatin, indicating similar impacts of either ligand on the enthalpic barriers for local protein unfolding. We extended temperature-dependent hydrogen deuterium exchange to a function-altering mutant of adenosine deaminase in the presence of pentostatin and revealed a protein thermal network that is highly similar to that previously reported for the apo-enzyme (19936-19949). Finally, we discuss the differential impacts of pentostatin binding on overall protein flexibility versus site-specific thermal transfer pathways in the context of models for substrate-induced changes to a distributed protein conformational landscape that act in synergy with embedded protein thermal networks to achieve efficient catalysis.