As methods for analysis of biomolecular structure and dynamics using nuclear magnetic resonance spectroscopy (NMR) continue to advance, the resulting 3D structures, chemical shifts, and other NMR data are broadly impacting biology, chemistry, and medicine. Structure model assessment is a critical area of NMR methods development, and is an essential component of the process of making these structures accessible and useful to the wider scientific community. For these reasons, the Worldwide Protein Data Bank (wwPDB) has convened an NMR Validation Task Force (NMR-VTF) to work with wwPDB partners in developing metrics and policies for biomolecular NMR data harvesting, structure representation, and structure quality assessment. This paper summarizes the recommendations of the NMR-VTF, and lays the groundwork for future work in developing standards and metrics for biomolecular NMR structure quality assessment.
• Well-developed methods can form a basis for standardized NMR structure assessment
• Tools for validation of X-ray crystal structures are appropriate for NMR structures
• NMR structure validation is generally applicable only to “well-defined” regions
• Further research is required to address key issues of NMR structure validation
As analyses of biomolecular structure and dynamics by nuclear magnetic resonance (NMR) spectroscopy continue to advance, the resulting structures have a broad impact. Montelione et al. summarize recommendations for developing metrics and policies for biomolecular NMR structure representation and validation.
There has been considerable recent progress in protein structure prediction using deep neural networks to predict inter-residue distances from amino acid sequences. Here we investigate whether the information captured by such networks is sufficiently rich to generate new folded proteins with sequences unrelated to those of the naturally occurring proteins used in training the models. We generate random amino acid sequences, and input them into the trRosetta structure prediction network to predict starting residue-residue distance maps, which, as expected, are quite featureless. We then carry out Monte Carlo sampling in amino acid sequence space, optimizing the contrast (Kullback-Leibler divergence) between the inter-residue distance distributions predicted by the network and background distributions averaged over all proteins. Optimization from different random starting points resulted in novel proteins spanning a wide range of sequences and predicted structures. We obtained synthetic genes encoding 129 of the network-'hallucinated' sequences, and expressed and purified the proteins in Escherichia coli; 27 of the proteins yielded monodisperse species with circular dichroism spectra consistent with the hallucinated structures. We determined the three-dimensional structures of three of the hallucinated proteins, two by X-ray crystallography and one by NMR, and these closely matched the hallucinated models. Thus, deep networks trained to predict native protein structures from their sequences can be inverted to design new proteins, and such networks and methods should contribute alongside traditional physics-based models to the de novo design of proteins with new functions.
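The sampling loop described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: `toy_predictor` is a hypothetical stand-in for the trRosetta network, the flat `BACKGROUND` distribution, the four distance bins, and the Metropolis temperature are all assumptions chosen only to make the example self-contained.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_BINS = 4  # hypothetical number of distance bins
BACKGROUND = [1.0 / N_BINS] * N_BINS  # assumed flat background distribution


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def toy_predictor(seq):
    """Hypothetical stand-in for a network: one distance distribution per residue pair."""
    dists = []
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            a, b = AMINO_ACIDS.index(seq[i]), AMINO_ACIDS.index(seq[j])
            peak = (a + b) % N_BINS            # bin the toy model favours
            sharpness = 0.5 * (a % 5 + b % 5)  # arbitrary sequence dependence
            logits = [sharpness if k == peak else 0.0 for k in range(N_BINS)]
            z = sum(math.exp(x) for x in logits)
            dists.append([math.exp(x) / z for x in logits])
    return dists


def contrast(seq):
    """Mean KL divergence between predicted and background distributions."""
    dists = toy_predictor(seq)
    return sum(kl_divergence(d, BACKGROUND) for d in dists) / len(dists)


def hallucinate(length=12, steps=200, temperature=0.01, seed=0):
    """Monte Carlo search in sequence space that increases the contrast."""
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    score = start_score = contrast(seq)
    best_seq, best_score = list(seq), score
    for _ in range(steps):
        trial = list(seq)
        trial[rng.randrange(length)] = rng.choice(AMINO_ACIDS)  # point mutation
        trial_score = contrast(trial)
        # Metropolis criterion: always accept improvements, sometimes accept losses.
        accept = trial_score >= score or rng.random() < math.exp(
            (trial_score - score) / temperature
        )
        if accept:
            seq, score = trial, trial_score
            if score > best_score:
                best_seq, best_score = list(seq), score
    return "".join(best_seq), best_score, start_score
```

Calling `hallucinate()` returns the best sequence found together with its final and starting contrast, so the gain over the random starting sequence can be inspected directly. In the real setting the predictor would be a trained network and the background would be averaged over known proteins, as the abstract states.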
Unlike random heteropolymers, natural proteins fold into unique ordered structures. Understanding how these are encoded in amino-acid sequences is complicated by energetically unfavourable non-ideal features (for example kinked α-helices, bulged β-strands, strained loops and buried polar groups) that arise in proteins from evolutionary selection for biological function or from neutral drift. Here we describe an approach to designing ideal protein structures stabilized by completely consistent local and non-local interactions. The approach is based on a set of rules relating secondary structure patterns to protein tertiary motifs, which make possible the design of funnel-shaped protein folding energy landscapes leading into the target folded state. Guided by these rules, we designed sequences predicted to fold into ideal protein structures consisting of α-helices, β-strands and minimal loops. Designs for five different topologies were found to be monomeric and very stable and to adopt structures in solution nearly identical to the computational models. These results illuminate how the folding funnels of natural proteins arise and provide the foundation for engineering a new generation of functional proteins free from natural evolution.
Degeneracy in the genetic code, which enables a single protein to be encoded by a multitude of synonymous gene sequences, has an important role in regulating protein expression, but substantial uncertainty exists concerning the details of this phenomenon. Here we analyse the sequence features influencing protein expression levels in 6,348 experiments using bacteriophage T7 polymerase to synthesize messenger RNA in Escherichia coli. Logistic regression yields a new codon-influence metric that correlates only weakly with genomic codon-usage frequency, but strongly with global physiological protein concentrations and also mRNA concentrations and lifetimes in vivo. Overall, the codon content influences protein expression more strongly than mRNA-folding parameters, although the latter dominate in the initial ~16 codons. Genes redesigned based on our analyses are transcribed with unaltered efficiency but translated with higher efficiency in vitro. The less efficiently translated native sequences show greatly reduced mRNA levels in vivo. Our results suggest that codon content modulates a kinetic competition between protein elongation and mRNA degradation that is a central feature of the physiology and also possibly the regulation of translation in E. coli.
Biomolecules exhibit dynamic behavior that single-state models of their structures cannot fully capture. We review some recent advances for investigating multiple conformations of biomolecules, including experimental methods, molecular dynamics simulations, and machine learning. We also address the challenges associated with representing single- and multiple-state models in data archives, with a particular focus on NMR structures. Establishing standardized representations and annotations will facilitate effective communication and understanding of these complex models to the broader scientific community.
• Improved methods have advanced multi-conformational structural modeling.
• Two or more multiple-state conformations often best describe a protein structure.
• Single-state representation depicts local model uncertainty on one representative conformer.
• Consistent data structures are needed for archiving multiple-state models.
Membrane-traversing peptides offer opportunities for targeting intracellular proteins and oral delivery. Despite progress in understanding the mechanisms underlying membrane traversal in natural cell-permeable peptides, there are still several challenges to designing membrane-traversing peptides with diverse shapes and sizes. Conformational flexibility appears to be a key determinant of membrane permeability of large macrocycles. We review recent developments in the design and validation of chameleonic cyclic peptides, which can switch between alternative conformations to enable improved permeability through cell membranes, while still maintaining reasonable solubility and exposed polar functional groups for target protein binding. Finally, we discuss the principles, strategies, and practical considerations for rational design, discovery, and validation of permeable chameleonic peptides.
The de novo design of protein-protein interfaces is a stringent test of our understanding of the principles underlying protein-protein interactions and would enable unique approaches to biological and medical challenges. Here we describe a motif-based method to computationally design protein-protein complexes with native-like interface composition and interaction density. Using this method we designed a pair of proteins, Prb and Pdar, that heterodimerize with a Kd of 130 nM, 1000-fold tighter than any previously designed de novo protein-protein complex. Directed evolution identified two point mutations that improve affinity to 180 pM. Crystal structures of an affinity-matured complex reveal binding is entirely through the designed interface residues. Surprisingly, in the in vitro evolved complex one of the partners is rotated 180° relative to the original design model, yet still maintains the central computationally designed hotspot interaction and preserves the character of many peripheral interactions. This work demonstrates that high-affinity protein interfaces can be created by designing complementary interaction surfaces on two noninteracting partners and underscores remaining challenges.
► We present a computational method to design de novo protein-protein complexes
► We used this method to design a synthetic protein pair that binds with high affinity
► With directed evolution we improved binding affinity several orders of magnitude
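To put the two dissociation constants quoted in the abstract above (130 nM for the designed pair, 180 pM after directed evolution) on a common scale, the short Python sketch below computes the fold improvement and the corresponding change in standard binding free energy using ΔG° = RT ln(Kd / 1 M). The temperature of 298.15 K is an assumption, not a value from the source, and the function names are illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_improvement(kd_before_molar, kd_after_molar):
    """Ratio of dissociation constants; values above 1 mean tighter binding."""
    return kd_before_molar / kd_after_molar


def binding_free_energy(kd_molar, temperature_kelvin=298.15):
    """Standard binding free energy in kcal/mol, relative to a 1 M standard state."""
    # Temperature default is an assumption; the abstract does not state one.
    return R_KCAL * temperature_kelvin * math.log(kd_molar)


kd_design = 130e-9    # 130 nM, from the abstract
kd_evolved = 180e-12  # 180 pM, from the abstract

fold = fold_improvement(kd_design, kd_evolved)
ddg = binding_free_energy(kd_design) - binding_free_energy(kd_evolved)
print(f"fold improvement: {fold:.0f}")
print(f"free energy gain: {ddg:.2f} kcal/mol")
```

Under these assumptions the two point mutations correspond to a roughly 720-fold tighter complex, or a little under 4 kcal/mol of additional binding free energy.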