There has been considerable recent progress in protein structure prediction using deep neural networks to predict inter-residue distances from amino acid sequences. Here we investigate whether the information captured by such networks is sufficiently rich to generate new folded proteins with sequences unrelated to those of the naturally occurring proteins used in training the models. We generate random amino acid sequences, and input them into the trRosetta structure prediction network to predict starting residue-residue distance maps, which, as expected, are quite featureless. We then carry out Monte Carlo sampling in amino acid sequence space, optimizing the contrast (Kullback-Leibler divergence) between the inter-residue distance distributions predicted by the network and background distributions averaged over all proteins. Optimization from different random starting points resulted in novel proteins spanning a wide range of sequences and predicted structures. We obtained synthetic genes encoding 129 of the network-'hallucinated' sequences, and expressed and purified the proteins in Escherichia coli; 27 of the proteins yielded monodisperse species with circular dichroism spectra consistent with the hallucinated structures. We determined the three-dimensional structures of three of the hallucinated proteins, two by X-ray crystallography and one by NMR, and these closely matched the hallucinated models. Thus, deep networks trained to predict native protein structures from their sequences can be inverted to design new proteins, and such networks and methods should contribute alongside traditional physics-based models to the de novo design of proteins with new functions.
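The sampling loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `toy_predict` is a hypothetical stand-in for the trRosetta network, the four-bin distance distributions and uniform background are invented for the example, and the Metropolis acceptance rule with a fixed `temperature` is an assumption (the abstract says only "Monte Carlo sampling").

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AILMFVW")  # used only by the toy predictor below


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) for two discrete distributions over the same distance bins."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)


def mean_contrast(predicted, background):
    """Average KL divergence over all residue pairs."""
    pairs = list(zip(predicted, background))
    return sum(kl_divergence(p, q) for p, q in pairs) / len(pairs)


def toy_predict(seq):
    """HYPOTHETICAL stand-in for a structure prediction network.

    Returns one 4-bin distance distribution per residue pair (i < j):
    peaked when both residues are hydrophobic, featureless otherwise.
    """
    peaked, flat = [0.85, 0.05, 0.05, 0.05], [0.25] * 4
    return [
        peaked if seq[i] in HYDROPHOBIC and seq[j] in HYDROPHOBIC else flat
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
    ]


def hallucinate(predict, background, length, steps, temperature=0.1, seed=0):
    """Monte Carlo search in sequence space that favors high contrast."""
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    score = mean_contrast(predict(seq), background)
    for _ in range(steps):
        trial = list(seq)
        trial[rng.randrange(length)] = rng.choice(AMINO_ACIDS)  # single-site mutation
        trial_score = mean_contrast(predict(trial), background)
        delta = trial_score - score
        # Metropolis criterion (assumed): always accept improvements,
        # accept worse moves with probability exp(delta / temperature).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            seq, score = trial, trial_score
    return "".join(seq), score


# Uniform background over the 8 * 7 / 2 residue pairs of an 8-residue sequence.
background = [[0.25] * 4 for _ in range(8 * 7 // 2)]
sequence, contrast = hallucinate(toy_predict, background, length=8, steps=200)
```

With a real network in place of `toy_predict`, the background would be the distance distributions averaged over all proteins, as the abstract states, rather than the uniform placeholder used here.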
Biomolecules exhibit dynamic behavior that single-state models of their structures cannot fully capture. We review some recent advances for investigating multiple conformations of biomolecules, including experimental methods, molecular dynamics simulations, and machine learning. We also address the challenges associated with representing single- and multiple-state models in data archives, with a particular focus on NMR structures. Establishing standardized representations and annotations will facilitate effective communication and understanding of these complex models to the broader scientific community.
•Improved methods have advanced multi-conformational structural modeling.
•Two or more multiple-state conformations often best describe a protein structure.
•Single-state representation depicts local model uncertainty on one representative conformer.
•Consistent data structures are needed for archiving multiple-state models.
Membrane-traversing peptides offer opportunities for targeting intracellular proteins and oral delivery. Despite progress in understanding the mechanisms underlying membrane traversal in natural cell-permeable peptides, there are still several challenges to designing membrane-traversing peptides with diverse shapes and sizes. Conformational flexibility appears to be a key determinant of membrane permeability of large macrocycles. We review recent developments in the design and validation of chameleonic cyclic peptides, which can switch between alternative conformations to enable improved permeability through cell membranes, while still maintaining reasonable solubility and exposed polar functional groups for target protein binding. Finally, we discuss the principles, strategies, and practical considerations for rational design, discovery, and validation of permeable chameleonic peptides.
Conventional protein structure determination from nuclear magnetic resonance (NMR) data relies heavily on side-chain proton-to-proton distances. The necessary side-chain resonance assignment, however, is labor intensive and prone to error. Here we show that structures can be accurately determined without NMR information on the side chains for proteins up to 25 kilodaltons by incorporating backbone chemical shifts, residual dipolar couplings, and amide proton distances into the Rosetta protein structure modeling methodology. These data, which are too sparse for conventional methods, serve only to guide conformational search toward the lowest-energy conformations in the folding landscape; the details of the computed models are determined by the physical chemistry implicit in the Rosetta all-atom energy function. The new method is not hindered by the deuteration required to suppress nuclear relaxation processes for proteins greater than 15 kilodaltons and should enable routine NMR structure determination for larger proteins.
Effective control of COVID-19 requires antivirals directed against SARS-CoV-2. We assessed 10 hepatitis C virus (HCV) protease-inhibitor drugs as potential SARS-CoV-2 antivirals. There is a striking structural similarity of the substrate binding clefts of SARS-CoV-2 main protease (Mpro) and HCV NS3/4A protease. Virtual docking experiments show that these HCV drugs can potentially bind into the Mpro substrate-binding cleft. We show that seven HCV drugs inhibit both SARS-CoV-2 Mpro protease activity and SARS-CoV-2 virus replication in Vero and/or human cells. However, their Mpro-inhibiting activities did not correlate with their antiviral activities. This conundrum is resolved by demonstrating that four HCV protease inhibitor drugs, simeprevir, vaniprevir, paritaprevir, and grazoprevir, inhibit the SARS-CoV-2 papain-like protease (PLpro). HCV drugs that inhibit PLpro synergize with the viral polymerase inhibitor remdesivir to inhibit virus replication, increasing remdesivir’s antiviral activity as much as 10-fold, while those that only inhibit Mpro do not synergize with remdesivir.
•Several HCV protease-inhibitor drugs inhibit SARS-CoV-2 Mpro and/or PLpro
•These HCV drugs also inhibit SARS-CoV-2 replication in Vero and/or human cells
•HCV drugs that inhibit PLpro synergize with remdesivir to inhibit SARS-CoV-2
•HCV drugs that selectively inhibit Mpro are not synergistic with remdesivir
Bafna et al. report that several available hepatitis C virus drugs inhibit the SARS-CoV-2 Mpro and/or PLpro proteases and SARS-CoV-2 replication in cell culture. The four HCV drugs that inhibit PLpro enzyme activity also synergize with remdesivir to inhibit virus replication, increasing the antiviral activity of remdesivir and HCV drugs.
While structural symmetry is a prevailing feature of homo-oligomeric proteins, asymmetry provides unique mechanistic opportunities. We present the crystal structure of full-length TRAP1, the mitochondrial Hsp90 molecular chaperone, in a catalytically active closed state. The TRAP1 homodimer adopts a distinct, asymmetric conformation, where one protomer is reconfigured via a helix swap at the middle:C-terminal domain (MD:CTD) interface. This interface plays a critical role in client binding. Solution methods validate the asymmetry and show extension to Hsp90 homologs. Point mutations that disrupt unique contacts at each MD:CTD interface reduce catalytic activity and substrate binding and demonstrate that each protomer needs access to both conformations. Crystallographic data on a dimeric NTD:MD fragment suggest that asymmetry arises from strain induced by simultaneous NTD and CTD dimerization. The observed asymmetry provides the potential for an additional step in the ATPase cycle, allowing sequential ATP hydrolysis steps to drive both client remodeling and client release.
•Crystal structure of the TRAP1 homodimer reveals an asymmetric closed state
•SAXS and DEER validate asymmetry in solution and conservation across Hsp90 homologs
•MD:CTD interfaces are functional for chaperone activity and stabilize asymmetry
•Changes in asymmetry result in rearrangement of client binding residues
Hsp90 is critical to many signaling pathways and functions through affecting the state of its “clients.” Lavery et al. present crystal structures of the mitochondrial homolog in a catalytically active conformation, revealing an asymmetric state. They show that this state is functional for chaperone activity and propose a model that utilizes structural asymmetry for client remodeling.
We use computational design coupled with experimental characterization to systematically investigate the design principles for macrocycle membrane permeability and oral bioavailability. We designed 184 macrocycles of 6–12 residues with a wide range of predicted structures containing noncanonical backbone modifications and experimentally determined structures of 35; 29 are very close to the computational models. With such control, we show that membrane permeability can be systematically achieved by ensuring all amide (NH) groups are engaged in internal hydrogen bonding interactions. 84 designs over the 6–12 residue size range cross membranes with an apparent permeability greater than 1 × 10⁻⁶ cm/s. Designs with exposed NH groups can be made membrane permeable through the design of an alternative isoenergetic fully hydrogen-bonded state favored in the lipid membrane. The ability to robustly design membrane-permeable and orally bioavailable peptides with high structural accuracy should contribute to the next generation of designed macrocycle therapeutics.
•Computational design of diverse permeable macrocycles beyond the “rule-of-five” space
•X-ray and NMR structures of designed macrocycles match their computational models
•Designed macrocycles are permeable in vitro and orally bioavailable in vivo
•Designed chameleonic peptides show solvent-dependent conformational switching
An investigation of the design principles of macrocyclic peptide membrane permeability and oral bioavailability enables the generation of synthetic macrocycles that fold into the predicted conformation, can cross membranes, and even adopt different conformations depending on polar versus nonpolar contexts.
Recent advances in molecular modeling using deep learning have the potential to revolutionize the field of structural biology. In particular, AlphaFold has been observed to provide models of protein structures with accuracies rivaling medium-resolution X-ray crystal structures, and with excellent atomic coordinate matches to experimental protein NMR and cryo-electron microscopy structures. Here we assess the hypothesis that AlphaFold models of small, relatively rigid proteins have accuracies (based on comparison against experimental data) similar to experimental solution NMR structures. We selected six representative small proteins with structures determined by both NMR and X-ray crystallography, and modeled each of them using AlphaFold. Using several structure validation tools integrated under the Protein Structure Validation Software suite (PSVS), we then assessed how well these models fit to experimental NMR data, including NOESY peak lists (RPF-DP scores), comparisons between predicted rigidity and chemical shift data (ANSURR scores), and ¹⁵N-¹H residual dipolar coupling data (RDC Q factors). Remarkably, the fits to NMR data for the protein structure models predicted with AlphaFold are generally similar to, or better than, those for the corresponding experimental NMR or X-ray crystal structures. Similar conclusions were reached in comparing AlphaFold2 predictions and NMR structures for three targets from the Critical Assessment of Protein Structure Prediction (CASP). These results contradict the widely held misperception that AlphaFold cannot accurately model solution NMR structures. They also document the value of PSVS for model vs. data assessment of protein NMR structures, and the potential for using AlphaFold models for guiding analysis of experimental NMR data and more generally in structural biology.
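For readers unfamiliar with RDC Q factors, the sketch below shows one commonly used definition: the root-mean-square difference between observed and back-calculated couplings, divided by the root-mean-square of the observed couplings. This is an illustration only; the abstract does not state which normalization PSVS applies, other conventions exist, and the function name `rdc_q_factor` is hypothetical.

```python
import math


def rdc_q_factor(observed, calculated):
    """Q = sqrt(sum((D_obs - D_calc)^2) / sum(D_obs^2)); lower means better agreement."""
    if len(observed) != len(calculated) or not observed:
        raise ValueError("need two non-empty lists of equal length")
    denominator = sum(o ** 2 for o in observed)
    if denominator == 0:
        raise ValueError("observed couplings are all zero")
    numerator = sum((o - c) ** 2 for o, c in zip(observed, calculated))
    return math.sqrt(numerator / denominator)


# Illustrative values in Hz (not taken from the study).
q_example = rdc_q_factor([10.0, -10.0], [9.0, -9.0])
```

A model that reproduces every coupling exactly gives Q = 0, and a model that predicts zero for every coupling gives Q = 1 under this definition.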
Biomolecular structure analysis from experimental NMR studies generally relies on restraints derived from a combination of experimental and knowledge-based data. A challenge for the structural biology community has been a lack of standards for representing these restraints, preventing the establishment of uniform methods of model-vs-data structure validation against restraints and limiting interoperability between restraint-based structure modeling programs. The NEF and NMR-STAR formats provide a standardized approach for representing commonly used NMR restraints. Using these restraint formats, a standardized validation system for assessing structural models of biopolymers against restraints has been developed and implemented in the wwPDB OneDep data deposition-validation-biocuration system. The resulting wwPDB restraint violation report provides a model vs. data assessment of biomolecule structures determined using distance and dihedral restraints, with extensions to other restraint types currently being implemented. These tools are useful for assessing NMR models, as well as for assessing biomolecular structure predictions based on distance restraints.
•The wwPDB validation report has been expanded to include restraint analysis
•NMR-STAR and NEF serve as the standardized format for NMR data at wwPDB
•Standardized restraint format facilitates model versus data assessment
•More comprehensive and better assessment of biomolecular NMR structures
Baskaran et al. outline the rationale for model-versus-data assessment of NMR structures by wwPDB, along with a summary of validation tools for NMR distance and dihedral-angle restraints implemented in the wwPDB validation pipeline. These tools enhance the assessment of biomolecular NMR structure quality, benefiting users of the wwPDB archive.
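The basic idea of checking a model against a distance restraint can be sketched as below. This is a simplified illustration under stated assumptions, not the wwPDB implementation: it handles a single unambiguous restraint between two atoms in one model, with hypothetical lower and upper bounds in ångströms, and ignores ensemble averaging and ambiguous restraints. The function name `distance_violation` is invented for the example.

```python
import math


def distance_violation(coord_a, coord_b, lower, upper):
    """Return how far the inter-atomic distance falls outside [lower, upper], else 0.0."""
    distance = math.dist(coord_a, coord_b)  # Euclidean distance, Python 3.8+
    if distance > upper:
        return distance - upper
    if distance < lower:
        return lower - distance
    return 0.0


# Two atoms 5.0 Å apart, checked against an assumed 1.8-4.5 Å restraint.
violation = distance_violation((0.0, 0.0, 0.0), (3.0, 4.0, 0.0), lower=1.8, upper=4.5)
```

A validation report would typically tabulate such violations over all restraints and all models in an ensemble; that aggregation is omitted here.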
The P19 genotype belongs to the PII genogroup of group A rotaviruses (RVs). However, unlike the other PII RVs, which mainly infect humans, P19 RVs commonly infect animals (pigs), making P19 unique for the study of RV diversity and host ranges. Through in vitro binding assays and saturation transfer difference (STD) nuclear magnetic resonance (NMR), we found that P19 could bind mucin cores 2, 4, and 6, as well as type 1 histo-blood group antigens (HBGAs). The common sequences of these glycans serve as minimal binding units, while additional residues, such as the A, B, H, and Lewis epitopes of the type 1 HBGAs, can further define the binding outcomes and therefore likely the host ranges for P19 RVs. This complex binding property of P19 is shared with the other three PII RVs (P4, P6, and P8) in that all of them recognized the type 1 HBGA precursor, although P4 and P8, but not P6, also bind to mucin cores. Moreover, while essential for P4 and P8 binding, the addition of the Lewis epitope blocked P6 and P19 binding to type 1 HBGAs. Chemical-shift NMR of P19 VP8* identified a ligand binding interface that has shifted away from the known RV P-genotype binding sites but is conserved among all PII RVs and two PI RVs (P10 and P12), suggesting an evolutionary connection among these human and animal RVs. Taken together, these data are important for hypotheses on potential mechanisms for RV diversity, host ranges, and cross-species transmission.
In this study, we found that our P19 strain and other PII RVs recognize mucin cores and the type 1 HBGA precursors as the minimal functional units and that additional saccharides adjacent to these units can alter binding outcomes and thereby possibly host ranges. These data may help to explain why some PII RVs, such as P6 and P19, commonly infect animals but rarely humans, while others, such as the P4 and P8 RVs, mainly infect humans and are predominant over other P genotypes. Elucidation of the molecular bases for strain-specific host ranges and cross-species transmission of these human and animal RVs is important to understand RV epidemiology and disease burden, which may impact development of control and prevention strategies against RV gastroenteritis.