Proteins are increasingly used in basic and applied biomedical research. Many proteins, however, are only marginally stable and can be expressed in limited amounts, thus hampering research and applications. Research has revealed the thermodynamic, cellular, and evolutionary principles and mechanisms that underlie marginal stability. With this growing understanding, computational stability design methods have advanced over the past two decades starting from methods that selectively addressed only some aspects of marginal stability. Current methods are more general and, by combining phylogenetic analysis with atomistic design, have shown drastic improvements in solubility, thermal stability, and aggregation resistance while maintaining the protein's primary molecular activity. Stability design is opening the way to rational engineering of improved enzymes, therapeutics, and vaccines and to the application of protein design methodology to large proteins and molecular activities that have proven challenging in the past.
Upon heterologous overexpression, many proteins misfold or aggregate, thus resulting in low functional yields. Human acetylcholinesterase (hAChE), an enzyme mediating synaptic transmission, is a typical case of a human protein that necessitates mammalian systems to obtain functional expression. We developed a computational strategy and designed an AChE variant bearing 51 mutations that improved core packing, surface polarity, and backbone rigidity. This variant expressed at ∼2,000-fold higher levels in E. coli compared to wild-type hAChE and exhibited 20°C higher thermostability with no change in enzymatic properties or in the active-site configuration as determined by crystallography. To demonstrate broad utility, we similarly designed four other human and bacterial proteins. Testing at most three designs per protein, we obtained enhanced stability and/or higher yields of soluble and active protein in E. coli. Our algorithm requires only a 3D structure and several dozen sequences of naturally occurring homologs, and is available at http://pross.weizmann.ac.il.
• A new computational method is used to stabilize five recalcitrant proteins
• Designed variants show higher expression and stability with unmodified function
• A designed human acetylcholinesterase variant expresses solubly in bacteria
• The method is fully automated and implemented on a webserver
Heterologous expression of proteins and their mutants often results in misfolding and aggregation. Goldenzweig et al. (2016) developed an automated algorithm for protein stabilization requiring minimal experimental testing; for instance, the five tested variants of human acetylcholinesterase showed ≥100-fold higher soluble bacterial expression and higher melting temperatures than wild-type.
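The combination of phylogenetic and atomistic information described above can be illustrated with a minimal sketch. It assumes each candidate mutation already carries a conservation score derived from the homolog alignment and a computed energy change from the 3D structure; the `Candidate` type, the `filter_candidates` function, the cutoff values, and the example numbers are all hypothetical and are not taken from the published algorithm.

```python
from typing import Dict, List, NamedTuple


class Candidate(NamedTuple):
    """One candidate point mutation (all values here are illustrative)."""
    position: int      # residue number in the structure
    residue: str       # proposed amino acid (one-letter code)
    pssm_score: float  # hypothetical conservation score from the homolog alignment
    ddg: float         # hypothetical computed energy change; negative = stabilizing


def filter_candidates(
    candidates: List[Candidate],
    min_pssm: float = 0.0,   # assumed cutoff: keep residues seen favorably in homologs
    max_ddg: float = -0.5,   # assumed cutoff: keep mutations predicted to stabilize
) -> Dict[int, List[str]]:
    """Keep mutations that pass both the phylogenetic and the energy filter."""
    allowed: Dict[int, List[str]] = {}
    for c in candidates:
        if c.pssm_score >= min_pssm and c.ddg <= max_ddg:
            allowed.setdefault(c.position, []).append(c.residue)
    return allowed


candidates = [
    Candidate(12, "L", 2.0, -1.3),   # passes both filters
    Candidate(12, "W", -3.0, -2.0),  # stabilizing, but rare among homologs
    Candidate(47, "K", 1.0, 0.4),    # common among homologs, but destabilizing
    Candidate(88, "P", 0.0, -0.6),   # passes both filters
]
allowed = filter_candidates(candidates)
print(allowed)
```

The point of requiring both filters is that either source of information alone admits mutations the other would reject; a subsequent combinatorial design step over the surviving mutations is not shown here.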
The folding of natural biopolymers into unique three-dimensional structures that determine their function is remarkable considering the vast number of alternative states and requires a large gap in the energy of the functional state compared to the many alternatives. This Perspective explores the implications of this energy gap for computing the structures of naturally occurring biopolymers, designing proteins with new structures and functions, and optimally integrating experiment and computation in these endeavors. Possible parallels between the generation of functional molecules in computational design and natural evolution are highlighted.
Antibodies developed for research and clinical applications may exhibit suboptimal stability, expressibility, or affinity. Existing optimization strategies focus on surface mutations, whereas natural affinity maturation also introduces mutations in the antibody core, simultaneously improving stability and affinity. To systematically map the mutational tolerance of an antibody variable fragment (Fv), we performed yeast display and applied deep mutational scanning to an anti-lysozyme antibody and found that many of the affinity-enhancing mutations clustered at the variable light-heavy chain interface, within the antibody core. Rosetta design combined enhancing mutations, yielding a variant with tenfold higher affinity and substantially improved stability. To make this approach broadly accessible, we developed AbLIFT, an automated web server that designs multipoint core mutations to improve contacts between specific Fv light and heavy chains (http://AbLIFT.weizmann.ac.il). We applied AbLIFT to two unrelated antibodies targeting the human antigens VEGF and QSOX1. Strikingly, the designs improved stability, affinity, and expression yields. The results provide proof-of-principle for bypassing laborious cycles of antibody engineering through automated computational affinity and stability design.
Macromolecular modeling and design are increasingly useful in basic research, biotechnology, and teaching. However, the absence of a user-friendly modeling framework that provides access to a wide range of modeling capabilities is hampering the wider adoption of computational methods by non-experts. RosettaScripts is an XML-like language for specifying modeling tasks in the Rosetta framework. RosettaScripts provides access to protocol-level functionalities, such as rigid-body docking and sequence redesign, and allows fast testing and deployment of complex protocols without need for modifying or recompiling the underlying C++ code. We illustrate these capabilities with RosettaScripts protocols for the stabilization of proteins, the generation of computationally constrained libraries for experimental selection of higher-affinity binding proteins, loop remodeling, small-molecule ligand docking, design of ligand-binding proteins, and specificity redesign in DNA-binding proteins.
• Stability-threshold effects and biomolecular epistasis limit protein design.
• A deeper understanding of these limitations led to successful design methods.
• New design methods enable effective protein optimization without experimental iterations.
• Some design methods are available through web servers enabling non-expert users.
Our ability to design new or improved biomolecular activities depends on understanding the sequence-function relationships in proteins. The large size and fold complexity of most proteins, however, obscure these relationships, and protein-optimization methods continue to rely on laborious experimental iterations. Recently, a deeper understanding of the roles of stability-threshold effects and biomolecular epistasis in proteins has led to the development of hybrid methods that combine phylogenetic analysis with atomistic design calculations. These methods enable reliable and even single-step optimization of protein stability, expressibility, and activity in proteins that were considered outside the scope of computational design. Furthermore, ancestral-sequence reconstruction produces insights on missing links in the evolution of enzymes and binders that may be used in protein design. Through the combination of phylogenetic and atomistic calculations, the long-standing goal of general computational methods that can be universally applied to study and optimize proteins finally seems within reach.
• New deep learning methods provide atomically accurate protein structure predictions.
• The new predictors filter out coevolution-based couplings not due to direct contacts.
• The new predictors yield one structure for allosteric proteins with multiple states.
• Future modeling approaches that address protein dynamics and allostery are proposed.
Recent progress in structure-prediction methods that rely on deep learning suggests that the atomic structure of almost any protein may soon be predictable directly from its amino acid sequence. This much-awaited revolution was driven by substantial improvements in the reliability of methods for inferring the spatial distances between amino acid pairs from an analysis of homologous sequences. Improved reliability has been accompanied, however, by a reduced ability to detect amino acid relationships that are not due to direct spatial contacts, such as those that arise from protein dynamics or allostery. Given the central importance of dynamics and allostery to protein activity, we argue that an important future advance would extend modeling beyond predicting a single static structure. Here, we briefly review some of the developments that have led to the remarkable recent achievement in structure prediction and speculate what methods and sources of information may be leveraged in the future to develop a modeling framework that addresses protein dynamics and allostery.
Substantial improvements in enzyme activity demand multiple mutations at spatially proximal positions in the active site. Such mutations, however, often exhibit unpredictable epistatic (non-additive) effects on activity. Here we describe FuncLib, an automated method for designing multipoint mutations at enzyme active sites using phylogenetic analysis and Rosetta design calculations. We applied FuncLib to two unrelated enzymes, a phosphotriesterase and an acetyl-CoA synthetase. All designs were active, and most showed activity profiles that significantly differed from the wild-type and from one another. Several dozen designs with only 3–6 active-site mutations exhibited 10- to 4,000-fold higher efficiencies with a range of alternative substrates, including hydrolysis of the toxic organophosphate nerve agents soman and cyclosarin and synthesis of butyryl-CoA. FuncLib is implemented as a web server (http://FuncLib.weizmann.ac.il); it circumvents iterative, high-throughput experimental screens and opens the way to designing highly efficient and diverse catalytic repertoires.
• FuncLib is a new method that designs diverse multipoint mutants in enzyme active sites
• Designs are efficient and functionally diverse, bypassing high-throughput screening
• Designs exhibit up to 4 orders of magnitude improvement in several activities
• FuncLib is implemented as a web-server (http://funclib.weizmann.ac.il)
Khersonsky et al. present FuncLib, an automated method for designing catalytic repertoires using phylogenetic analysis and Rosetta design calculations. FuncLib resulted in efficient enzymes, including new hydrolases with the potential to treat nerve agent poisoning.
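The idea of designing multipoint active-site mutants can be sketched as a simple enumeration. The example assumes that a set of tolerated residues per active-site position is already available (for instance from phylogenetic and energy filtering) and lists every combination carrying 3 to 6 mutations, the range reported for the designs above. The `enumerate_multipoint` function, the positions, and the residue choices are hypothetical, and the energy-based ranking of the enumerated designs is omitted.

```python
from itertools import product
from typing import Dict, List


def enumerate_multipoint(
    wild_type: Dict[int, str],
    allowed: Dict[int, List[str]],
    min_mut: int = 3,
    max_mut: int = 6,
) -> List[Dict[int, str]]:
    """List all combinations of allowed residues with min_mut..max_mut mutations."""
    positions = sorted(allowed)
    # At each position the choices are the wild-type residue plus the allowed mutations.
    choices = [
        [wild_type[p]] + [r for r in allowed[p] if r != wild_type[p]]
        for p in positions
    ]
    designs = []
    for combo in product(*choices):
        # Record only the positions that differ from the wild type.
        mutations = {p: r for p, r in zip(positions, combo) if r != wild_type[p]}
        if min_mut <= len(mutations) <= max_mut:
            designs.append(mutations)
    return designs


# Hypothetical active-site positions and tolerated residues.
wild_type = {60: "H", 106: "I", 233: "D", 271: "L"}
allowed = {60: ["N"], 106: ["L", "V"], 233: ["E"], 271: ["F", "Y"]}

designs = enumerate_multipoint(wild_type, allowed)
print(len(designs))
print(designs[0])
```

Even with a handful of positions the number of combinations grows multiplicatively, which is why a computational ranking step is needed before any design is tested experimentally.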
Lassa virus (LASV) is a human pathogen, causing substantial morbidity and mortality. Similar to other Arenaviridae, it presents a class-I spike complex on its surface that facilitates cell entry. The virus's cellular receptor is matriglycan, a linear carbohydrate that is present on α-dystroglycan, but the molecular mechanism that LASV uses to recognize this glycan is unknown. In addition, LASV and other arenaviruses have a unique signal peptide that forms an integral and functionally important part of the mature spike; yet the structure, function and topology of the signal peptide in the membrane remain uncertain. Here we solve the structure of a complete native LASV spike complex, finding that the signal peptide crosses the membrane once and that its amino terminus is located in the extracellular region. Together with a double-sided domain-switching mechanism, the signal peptide helps to stabilize the spike complex in its native conformation. This structure reveals that the LASV spike complex is preloaded with matriglycan, suggesting the mechanism of binding and rationalizing receptor recognition by α-dystroglycan-tropic arenaviruses. This discovery further informs us about the mechanism of viral egress and may facilitate the rational design of novel therapeutics that exploit this binding site.
We describe a general computational method for designing proteins that bind a surface patch of interest on a target macromolecule. Favorable interactions between disembodied amino acid residues and the target surface are identified and used to anchor de novo designed interfaces. The method was used to design proteins that bind a conserved surface patch on the stem of the influenza hemagglutinin (HA) from the 1918 H1N1 pandemic virus. After affinity maturation, two of the designed proteins, HB36 and HB80, bind H1 and H5 HAs with low nanomolar affinity. Further, HB80 inhibits the HA fusogenic conformational changes induced at low pH. The crystal structure of HB36 in complex with 1918/H1 HA revealed that the actual binding interface is nearly identical to that in the computational design model. Such designed binding proteins may be useful for both diagnostics and therapeutics.