Glycopeptide-based analysis is used to inform researchers about the glycans on one or more proteins. The method's key attractive feature is its ability to link glycosylation information to exact locations (glycosylation sites) on proteins. Numerous applications for glycopeptide analysis are known, and several examples are described herein. The techniques used to characterize glycopeptides are still emerging, and recently, research focused on facilitating aspects of glycopeptide analysis has advanced significantly in the areas of sample preparation, MS fragmentation, and automation of data analysis. These recent developments, described herein, provide the foundation for the growth of glycopeptide analysis as a blossoming field.
One unifying challenge when classifying biological samples with mass spectrometry data is overcoming the obstacle of sample-to-sample variability so that differences between groups, such as between a healthy set and a disease set, can be identified. Similarly, when the same sample is re-analyzed under identical conditions, instrument signals can fluctuate by more than 10%. This signal inconsistency imposes difficulties in identifying subtle differences across a set of samples, and it weakens the mass spectrometrist’s ability to effectively leverage data in domains as diverse as proteomics, metabolomics, glycomics, and imaging. We selected challenging data sets in the fields of glycomics, mass spectrometry imaging, and bacterial typing to study the problem of within-group signal variability and adapted a 30-year-old statistical approach to address the problem. The solution, the “local-balanced model,” relies on using balanced subsets of training data to classify test samples. This analysis strategy was assessed on ESI-MS data of IgG-based glycopeptides, MALDI-MS imaging data of endogenous lipids, and MALDI-MS data of bacterial proteins. Two preliminary examples on non-mass spectrometry data sets are also included to show the potential generality of the method outside the field of MS analysis. We demonstrate that this approach is superior to simple normalization methods, generalizable to multiple mass spectrometry domains, and potentially appropriate in fields as diverse as physics and satellite imaging. In some cases, improvements in classification can be dramatic, with accuracy escalating from 60% with normalization alone to over 90% with the additional development described herein.
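The idea of classifying each test sample from a balanced subset of the training data can be illustrated with a minimal sketch. The abstract does not specify how the subsets are built or which classifier is applied to them, so everything below is an assumption made for illustration: the function name `local_balanced_predict`, the choice of the `k` nearest training samples per class as the balanced subset, and the nearest-local-centroid decision rule are hypothetical and are not the authors' implementation.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def local_balanced_predict(train_x, train_y, test_sample, k=3):
    """Classify one test sample from a balanced local subset of training data.

    Assumption (not from the source): the subset holds the k training
    samples of each class that lie nearest to the test sample, and the
    predicted label is the class whose local centroid is closest.
    """
    by_class = defaultdict(list)
    for features, label in zip(train_x, train_y):
        by_class[label].append(features)

    # Cap k at the smallest class size so every class contributes equally.
    k = min(k, min(len(members) for members in by_class.values()))

    best_label, best_dist = None, math.inf
    for label, members in by_class.items():
        nearest = sorted(members, key=lambda m: euclidean(m, test_sample))[:k]
        centroid = [sum(column) / len(nearest) for column in zip(*nearest)]
        dist = euclidean(centroid, test_sample)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Toy two-feature data; the values are invented purely for demonstration.
train_x = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],
           [3.0, 3.1], [2.9, 3.3], [3.2, 2.8]]
train_y = ["healthy", "healthy", "healthy",
           "disease", "disease", "disease"]
print(local_balanced_predict(train_x, train_y, [1.0, 0.9], k=2))
```

Because the subset is rebuilt for every test sample, a class that is over-represented in the full training set cannot dominate the local decision, which is the property the balanced-subset description points to.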
Glycosylation analysis of viral glycoproteins contributes significantly to vaccine design and development. Among other benefits, glycosylation analysis allows vaccine developers to assess the impact of construct design or producer cell line choices for vaccine production, and it is a key measure by which glycoproteins that are produced for use in vaccination can be compared to their native viral forms. Because many viral glycoproteins are multiply glycosylated, glycopeptide analysis is a preferable approach for mapping the glycans, yet the analysis of glycopeptide data can be cumbersome and requires the expertise of an experienced analyst. In recent years, a commercial software product, Byonic, has been implemented in several instances to facilitate glycopeptide analysis on viral glycoproteins and other glycoproteomics data sets, and the purpose of the study herein is to determine the strengths and limitations of using this software, particularly in cases relevant to vaccine development. The glycopeptides from a recombinantly expressed trimeric S glycoprotein of the SARS-CoV-2 virus were first analyzed using an expert-based analysis strategy; subsequently, analysis of the same data set was completed using Byonic. Careful assessment of instances where the two methods produced different results revealed that the glycopeptide assignments from Byonic contained more false positives than true positives, even when the data were assessed using a 1% false discovery rate. The work herein provides a roadmap for removing the spurious assignments that Byonic generates, and it provides an assessment of the opportunity cost of relying on automated assignments for glycopeptide data sets from viral glycoproteins.
The SARS-CoV-2 coronavirus, the etiologic agent of COVID-19, uses its spike (S) glycoprotein anchored in the viral membrane to enter host cells. The S glycoprotein is the major target for neutralizing antibodies elicited by natural infection and by vaccines. Approximately 35% of the SARS-CoV-2 S glycoprotein consists of carbohydrate, which can influence virus infectivity and susceptibility to antibody inhibition. We found that virus-like particles produced by coexpression of SARS-CoV-2 S, M, E, and N proteins contained spike glycoproteins that were extensively modified by complex carbohydrates. We used a fucose-selective lectin to purify the Golgi-modified fraction of a wild-type SARS-CoV-2 S glycoprotein trimer and determined its glycosylation and disulfide bond profile. Compared with soluble or solubilized S glycoproteins modified to prevent proteolytic cleavage and to retain a prefusion conformation, more of the wild-type S glycoprotein N-linked glycans are processed to complex forms. Even Asn 234, a significant percentage of which is decorated by high-mannose glycans on other characterized S trimer preparations, is predominantly modified in the Golgi compartment by processed glycans. Three incompletely occupied sites of O-linked glycosylation were detected. Viruses pseudotyped with natural variants of the serine/threonine residues implicated in O-linked glycosylation were generally infectious and exhibited sensitivity to neutralization by soluble ACE2 and convalescent antisera comparable to that of the wild-type virus. Unlike other natural cysteine variants, a Cys15Phe (C15F) mutant retained partial, but unstable, infectivity. These findings enhance our understanding of the Golgi processing of the native SARS-CoV-2 S glycoprotein carbohydrates and could assist the design of interventions.
The SARS-CoV-2 coronavirus, which causes COVID-19, uses its spike glycoprotein to enter host cells. The viral spike glycoprotein is the main target of host neutralizing antibodies that help to control SARS-CoV-2 infection and are important for the protection provided by vaccines. The SARS-CoV-2 spike glycoprotein consists of a trimer of two subunits covered with a coat of carbohydrates (sugars). Here, we describe the disulfide bonds that assist the SARS-CoV-2 spike glycoprotein to assume the correct shape and the composition of the sugar moieties on the glycoprotein surface. We also evaluate the consequences of natural virus variation in O-linked sugar addition and in the cysteine residues involved in disulfide bond formation. This information can expedite the improvement of vaccines and therapies for COVID-19.
Large language models (LLMs) have the ability to generate text by stringing together words from their extensive training data. The leading AI text generation tool built on LLMs, ChatGPT, has quickly grown a vast user base since its release, but the domains in which it is being heavily leveraged are not yet known to the public. To understand how generative AI is reshaping print media and the extent to which it is being implemented already, methods to distinguish human-generated text from that generated by AI are required. Since college students have been early adopters of ChatGPT, we sought to study the presence of generative AI in newspaper articles written by collegiate journalists. To achieve this objective, an accurate AI detection model is needed. Herein, we analyzed university newspaper articles from different universities to determine whether ChatGPT was used to write or edit the news articles. We developed a detection model using classical machine learning and used the model to detect AI usage in the news articles. The detection model achieved 93% accuracy on the training data and performed similarly on the test set, demonstrating effectiveness in AI detection above existing state-of-the-art detection tools. Finally, the model was applied to the task of searching for generative AI usage in 2023, and we found that ChatGPT was not used to any appreciable measure to write or revise university news articles at the schools we studied.
The purpose of this review is to provide those interested in glycosylation analysis with the most updated information on the availability of automated tools for MS characterization of N-linked and O-linked glycosylation types. Specifically, this review describes software tools that facilitate elucidation of glycosylation from MS data on the basis of mass alone, as well as software designed to speed the interpretation of glycan and glycopeptide fragmentation from MS/MS data. This review focuses equally on software designed to interpret the composition of released glycans and on tools to characterize N-linked and O-linked glycopeptides. Several websites have been compiled and described that will be helpful to the reader who is interested in further exploring the described tools.
The human immunodeficiency virus type 1 (HIV-1) envelope glycoprotein (Env) trimer, which consists of the gp120 and gp41 subunits, is the focus of multiple strategies for vaccine development. Extensive Env glycosylation provides HIV-1 with protection from the immune system, yet the glycans are also essential components of binding epitopes for numerous broadly neutralizing antibodies. Recent studies have shown that when Env is isolated from virions, its glycosylation profile differs significantly from that of soluble forms of Env (gp120 or gp140) predominantly used in vaccine discovery research. Here we show that exogenous membrane-anchored Envs, which can be produced in large quantities in mammalian cells, also display a virion-like glycan profile, where the glycoprotein is extensively decorated with high-mannose glycans. Additionally, because we characterized the glycosylation with a high-fidelity profiling method, glycopeptide analysis, an unprecedented level of molecular detail regarding membrane Env glycosylation and its heterogeneity is presented. Each glycosylation site was characterized individually, with about 500 glycoforms characterized per Env protein. While many of the sites contain exclusively high-mannose glycans, others retain complex glycans, resulting in a glycan profile that cannot currently be mimicked on soluble gp120 or gp140 preparations. These site-level studies are important for understanding antibody-glycan interactions on native Env trimers. Additionally, we report a newly observed O-linked glycosylation site, T606, and we show that the full O-linked glycosylation profile of membrane-associated Env is similar to that of soluble gp140. These findings provide new insight into Env glycosylation and clarify key molecular-level differences between membrane-anchored Env and soluble gp140.
A vaccine that protects against human immunodeficiency virus type 1 (HIV-1) infection should elicit antibodies that bind to the surface envelope glycoproteins on the membrane of the virus. The envelope glycoproteins have an extensive coat of carbohydrates (glycans), some of which are recognized by virus-neutralizing antibodies and some of which protect the virus from neutralizing antibodies. We found that the HIV-1 membrane envelope glycoproteins have a unique pattern of carbohydrates, with many high-mannose glycans and also, in some places, complex glycans. This pattern was very different from the carbohydrate profile seen for a more easily produced soluble version of the envelope glycoprotein. Our results provide a detailed characterization of the glycans on the natural membrane envelope glycoproteins of HIV-1, a carbohydrate profile that would be desirable to mimic with a vaccine.