Large proteins with multiple domains are thought to fold cotranslationally to minimize interdomain misfolding. Once folded, domains interact with each other through the formation of extensive interfaces that are important for protein stability and function. However, multidomain protein folding and the energetics of domain interactions remain poorly understood. In elongation factor G (EF-G), a highly conserved protein composed of 5 domains, the 2 N-terminal domains form a stably structured unit cotranslationally. Using single-molecule optical tweezers, we have defined the steps leading to fully folded EF-G. We find that the central domain III of EF-G is highly dynamic and does not fold upon emerging from the ribosome. Surprisingly, a large interface with the N-terminal domains does not contribute to the stability of domain III. Instead, it requires interactions with its folded C-terminal neighbors to be stably structured. Because of the directionality of protein synthesis, this energetic dependency of domain III on its C-terminal neighbors disrupts cotranslational folding and imposes a posttranslational mechanism on the folding of the C-terminal part of EF-G. As a consequence, unfolded domains accumulate during synthesis, leading to the extensive population of misfolded species that interfere with productive folding. Domain III flexibility enables large-scale conformational transitions that are part of the EF-G functional cycle during ribosome translocation. Our results suggest that energetic tuning of domain stabilities, which is likely crucial for EF-G function, complicates the folding of this large multidomain protein.
Intrinsically disordered proteins (IDPs) present a functional paradox because they lack stable tertiary structure, but nonetheless play a central role in signaling, utilizing a process known as allostery. Historically, allostery in structured proteins has been interpreted in terms of propagated structural changes that are induced by effector binding. Thus, it is not clear how IDPs, lacking such well-defined structures, can allosterically affect function. Here, we show a mechanism by which an IDP can allosterically control function by simultaneously tuning transcriptional activation and repression, using a novel strategy that relies on the principle of 'energetic frustration'. We demonstrate that human glucocorticoid receptor tunes this signaling in vivo by producing translational isoforms differing only in the length of the disordered region, which modulates the degree of frustration. We expect this frustration-based model of allostery will prove to be generally important in explaining signaling in other IDPs.
Folding of individual domains in large proteins during translation helps to avoid otherwise prevalent inter-domain misfolding. How folding intermediates observed in vitro for the majority of proteins relate to co-translational folding remains unclear. Combining in vivo and single-molecule experiments, we followed the co-translational folding of the G-domain, encompassing the first 293 amino acids of elongation factor G. Surprisingly, the domain remains unfolded until it is fully synthesized, without collapsing into molten globule-like states or forming stable intermediates. Upon fully emerging from the ribosome, the G-domain transitions to its stable native structure via folding intermediates. Our results suggest a strictly sequential folding pathway initiating from the C-terminus. Folding and synthesis thus proceed in opposite directions. The folding mechanism is likely imposed by the final structure and might have evolved to ensure efficient, timely folding of a highly abundant and essential protein.
The risk of cardiovascular disease (CVD) is a serious health threat worldwide. Using machine learning methods to predict CVD risk helps to identify high-risk patients and enables timely intervention. In this study, we propose XGBH, a CVD risk prediction model based on key contributing features. To improve the generalisation of the model, we added retrospective data from 14,832 CVD patients in Shanxi, China, to the Kaggle dataset. The XGBH risk prediction model was validated to be more accurate (AUC = 0.81) than the baseline risk score (AUC = 0.65), and its accuracy for CVD risk prediction improved with the inclusion of the conventional biometric variable BMI. To increase the clinical applicability of the model, we also designed a simpler diagnostic model that requires only three characteristics from the patient (age, systolic blood pressure, and whether cholesterol is normal), enabling early intervention in high-risk patients with only a slight reduction in accuracy (AUC = 0.79). Ultimately, a CVD risk score model with few features and high accuracy will be established based on the main contributing features. Further prospective studies, as well as studies in other populations, are needed to assess the actual clinical effectiveness of the XGBH risk prediction model.
Cardiovascular disease (CVD) risk prediction is of great significance for disease diagnosis and treatment, especially for early intervention, which has a direct impact on preventing and reducing adverse outcomes. In this paper, we collected clinical indicators and outcomes of 14,832 patients with cardiovascular disease in Shanxi, China, and proposed a cardiovascular disease risk prediction model, XGBH, based on key contributing characteristics to perform risk scoring of patients’ clinical outcomes. The XGBH risk prediction model had high accuracy, with a significant improvement compared to the baseline risk score (AUC = 0.80 vs. AUC = 0.65). We also found that adding conventional biometric variables further improved the accuracy of the model’s CVD risk prediction. Finally, we designed a simpler model that quantifies disease risk from only three questions answered by the patient, with only a modest reduction in accuracy (AUC = 0.79), while still providing a valid risk assessment for CVD. Overall, our models may allow early-stage intervention in high-risk patients, as well as a cost-effective screening approach. Further prospective studies and studies in other populations are needed to assess the actual clinical effect of XGBH risk prediction models.
Multi-domain proteins, containing several structural units within a single polypeptide, constitute a large fraction of all proteomes. Co-translational folding is assumed to simplify the conformational search problem for large proteins, but the events leading to correctly folded, functional structures remain poorly characterized. Similarly, how the ribosome and molecular chaperones promote efficient folding remains obscure. Using optical tweezers, we have dissected early folding events of nascent elongation factor G, a multi-domain protein that requires chaperones for folding. The ribosome and the chaperone trigger factor reduce inter-domain misfolding, permitting folding of the N-terminal G-domain. Successful completion of this step is a crucial prerequisite for folding of the next domain. Unexpectedly, co-translational folding does not proceed unidirectionally; emerging unfolded polypeptide can denature an already-folded domain. Trigger factor, but not the ribosome, protects against denaturation. The chaperone thus serves a previously unappreciated function, helping multi-domain proteins overcome inherent challenges during co-translational folding.
• How the ribosome modulates nascent chain folding switches during elongation
• Sequential domain-wise folding reduces misfolding
• Co-translational folding can be reversed by an unexpected unfolding pathway
• Protection of folded domains is an unanticipated chaperone function
Liu et al. show that domain-wise folding of nascent proteins can be reversed by denaturing interactions with emerging polypeptide. The chaperone trigger factor blocks denaturation and, together with the ribosome, reduces misfolding. The chaperone thus serves a dual function in promoting efficient folding of multi-domain proteins.
All cellular proteins are synthesized by the ribosome, an intricate molecular machine that translates the information of protein coding genes into the amino acid alphabet. The linear polypeptides synthesized by the ribosome must generally fold into specific three-dimensional structures to become biologically active. Folding has long been recognized to begin before synthesis is complete. Recently, biochemical and biophysical studies have shed light onto how the ribosome shapes the folding pathways of nascent proteins. Here, we discuss recent progress that is beginning to define the role of the ribosome in the folding of newly synthesized polypeptides.
• Folding is a crucial step in the biogenesis of functional proteins.
• Interactions with the ribosome guide nascent polypeptide folding.
• Recent experimental work is beginning to shed light on mechanisms by which the ribosome modulates protein folding.
Protein-protein interactions play critical roles in biology, but the structures of many eukaryotic protein complexes are unknown, and there are likely many interactions not yet identified. We take advantage of advances in proteome-wide amino acid coevolution analysis and deep-learning–based structure modeling to systematically identify and build accurate models of core eukaryotic protein complexes within the Saccharomyces cerevisiae proteome. We use a combination of RoseTTAFold and AlphaFold to screen through paired multiple sequence alignments for 8.3 million pairs of yeast proteins, identify 1505 likely to interact, and build structure models for 106 previously unidentified assemblies and 806 that have not been structurally characterized. These complexes, which have as many as five subunits, play roles in almost all key processes in eukaryotic cells and provide broad insights into biological function.
Large proteins composed of multiple domains are abundant in all proteomes, but their folding and structural dynamics remain poorly understood. Multi‐domain proteins are thought to fold sequentially as they emerge from the ribosome, which minimizes interdomain misfolding. Once folded, domains interact with each other through the formation of extensive interfaces that are important for protein stability and function. In elongation factor G (EF‐G), a highly conserved protein composed of five domains, the two N‐terminal domains form a stably structured unit co‐translationally. Using single‐molecule optical tweezers, we have defined the steps leading to fully folded EF‐G. We find that the central domain III of EF‐G is highly dynamic and does not fold upon emerging from the ribosome. Surprisingly, a large interface with the N‐terminal domains does not contribute to the stability of domain III. Instead, it requires interactions with its folded C‐terminal neighbors to be stably structured. Because of the directionality of protein synthesis, this energetic dependency of domain III on its C‐terminal neighbors disrupts co‐translational folding and imposes a post‐translational mechanism on the folding of the C‐terminal part of EF‐G. As a consequence, unfolded domains accumulate during synthesis, leading to the extensive population of misfolded species that interfere with productive folding. Domain III flexibility enables large‐scale conformational transitions that are part of the EF‐G functional cycle during ribosome translocation. Our results suggest that energetic tuning of domain stabilities, which is likely crucial for EF‐G function, complicates the folding of this large multi‐domain protein. EF‐G thus provides an example of how distinct biological ends – robust folding and functionally important flexibility – come into conflict during protein biogenesis.
Support or Funding Information
This work was supported by a grant from the National Institutes of Health (5R01GM121567).
Figure 1. Coupled synthesis and folding of the multi‐domain protein EF‐G. Domain III (green) is highly dynamic in the absence of fully folded domains IV (blue) and V (purple). As a consequence, co‐translational folding is interrupted. Accumulation of unfolded polypeptide results in the formation of misfolded species, sidetracking the molecule into non‐productive states and slowing down folding.