Treatment with broadly neutralizing antibodies (bNAbs) has proven effective against HIV-1 infections in humanized mice, non-human primates, and humans. Due to the high mutation rate of HIV-1, resistance testing of the patient's viral strains to the bNAbs is still inevitable. So far, bNAb resistance can only be tested in expensive and time-consuming neutralization experiments. Here, we introduce well-performing computational models that predict the neutralization response of HIV-1 to bNAbs given only the envelope sequence of the virus. Using non-linear support vector machines based on a string kernel, the models learned even the important binding sites of bNAbs with more complex epitopes, i.e., the bNAbs targeting the CD4 binding site, thereby demonstrating the biological relevance of the models. To increase the interpretability of the models, we additionally provide a new kind of motif logo for each query sequence, visualizing those residues of the test sequence that influenced the prediction outcome the most. Moreover, we predicted the neutralization sensitivity of around 34,000 HIV-1 samples from different time points to a broad range of bNAbs, enabling the first analysis of HIV resistance to bNAbs on a global scale. The analysis showed for many of the bNAbs a trend towards antibody resistance over time, which had previously only been discovered for a small non-representative subset of the global HIV-1 population.
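To make the idea of a string kernel on envelope sequences concrete, the following is a minimal sketch of a normalized k-mer spectrum kernel. It is an illustration only: the abstract does not say which string kernel was used, so the spectrum kernel, the value of k, and the toy sequences are all assumptions.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq: str, k: int) -> Counter:
    """Count all overlapping k-mers in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a: str, b: str, k: int = 3) -> float:
    """Normalized spectrum kernel: cosine similarity of k-mer count vectors.

    This is one simple string kernel, chosen for illustration; it is not
    necessarily the kernel used in the study.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    dot = sum(n * cb[kmer] for kmer, n in ca.items())
    norm = sqrt(sum(n * n for n in ca.values()) * sum(n * n for n in cb.values()))
    return dot / norm if norm else 0.0


# Toy amino-acid fragments (made up, not real envelope sequences).
seqs = ["NNTRKSIRIGPGQ", "NNTRKSIHIGPGR", "CTRPGNNTRRSIR"]

# Gram matrix of pairwise similarities; a kernel SVM would be trained on this.
gram = [[spectrum_kernel(s, t) for t in seqs] for s in seqs]
for row in gram:
    print([round(v, 3) for v in row])
```

A precomputed Gram matrix of this kind is what a kernel SVM consumes, so the classifier never needs an explicit fixed-length feature vector for each sequence.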
Knowing the three-dimensional (3D) structure of the chromatin is important for obtaining a complete picture of the regulatory landscape. Changes in the 3D structure have been implicated in diseases. While there exist approaches that attempt to predict long-range chromatin interactions, they focus only on interactions between specific genomic regions (promoters and enhancers), neglecting other possibilities, for instance, the so-called structural interactions involving intervening chromatin.
We present a method that can be trained on 5C data using the genetic sequence of the candidate loci to predict potential genome-wide interaction partners of a particular locus of interest. We have built locus-specific support vector machine (SVM)-based predictors using the oligomer distance histograms (ODH) representation. The method shows good performance with a mean test AUC (area under the receiver operating characteristic (ROC) curve) of 0.7 or higher for various regions across cell lines GM12878, K562 and HeLa-S3. In cases where any locus did not have sufficient candidate interaction partners for model training, we employed multitask learning to share knowledge between models of different loci. In this scenario, across the three cell lines, the method attained an average performance increase of 0.09 in the AUC. Performance evaluation of the models trained on 5C data regarding prediction on an independent high-resolution Hi-C dataset (which is a rather hard problem) shows 0.56 AUC, on average. Additionally, we have developed new, intuitive visualization methods that enable interpretation of sequence signals that contributed towards prediction of locus-specific interaction partners. The analysis of these sequence signals suggests a potential general role of short tandem repeat sequences in genome organization.
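As a rough illustration of the oligomer distance histogram (ODH) idea, the sketch below counts how often one oligomer is followed by another at a given distance. The function name, the choice of k, the maximum distance, and the inclusion of distance 0 are assumptions made for this example, not details taken from the abstract.

```python
from collections import Counter


def odh_features(seq: str, k: int = 1, max_dist: int = 4) -> Counter:
    """Count (oligomer_a, oligomer_b, distance) triples in a sequence.

    A triple (a, b, d) is counted whenever k-mer a starts at position p
    and k-mer b starts at position p + d, for 0 <= d <= max_dist.
    """
    last_start = len(seq) - k
    feats = Counter()
    for p in range(last_start + 1):
        for d in range(max_dist + 1):
            q = p + d
            if q > last_start:
                break  # second oligomer would run past the end of the sequence
            feats[(seq[p:p + k], seq[q:q + k], d)] += 1
    return feats


# A short tandem repeat produces periodic peaks in the histogram.
repeat = odh_features("ACACACAC")
print([repeat[("A", "A", d)] for d in range(5)])
```

For the repeat `ACACACAC`, the counts for the pair (A, A) are non-zero only at even distances, which is the kind of periodic signal that could make tandem repeats stand out to a classifier using this representation.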
We demonstrated how our approach can 1) provide insights into sequence features of locus-specific interaction partners, and 2) also identify their cell-line specificity. That our models deem short tandem repeat sequences as discriminative for prediction of potential interaction partners, suggests that they could play a larger role in genome organization. Thus, our approach can (a) be beneficial to broadly understand, at the sequence-level, chromatin interactions and higher-order structures like (meta-) topologically associating domains (TADs); (b) study regions omitted from existing prediction approaches using various information sources (e.g., epigenetic information); and
We present the AIMe registry, a community-driven reporting platform for AI in biomedicine. It aims to enhance the accessibility, reproducibility and usability of biomedical AI models, and allows future revisions by the community.
Missense variants in genes encoding ion channels are associated with a spectrum of severe diseases. Variant effects on biophysical function correlate with clinical features and can be categorized as gain- or loss-of-function. This information enables a timely diagnosis, facilitates precision therapy, and guides prognosis. Functional characterization presents a bottleneck in translational medicine. Machine learning models may be able to rapidly generate supporting evidence by predicting variant functional effects. Here, we describe a multi-task multi-kernel learning framework capable of harmonizing functional results and structural information with clinical phenotypes. This novel approach extends the human phenotype ontology towards kernel-based supervised machine learning. Our gain- or loss-of-function classifier achieves high performance (mean accuracy 0.853 SD 0.016, mean AU-ROC 0.912 SD 0.025), outperforming both conventional baseline and state-of-the-art methods. Performance is robust across different phenotypic similarity measures and largely insensitive to phenotypic noise or sparsity. Localized multi-kernel learning offered biological insight and interpretability by highlighting channels with implicit genotype-phenotype correlations or latent task similarity for downstream analysis.
Understanding antibody-based SARS-CoV-2 immunity is critical for overcoming the COVID-19 pandemic and informing vaccination strategies. We evaluated SARS-CoV-2 antibody dynamics over 10 months in 963 individuals who predominantly experienced mild COVID-19. Investigating 2,146 samples, we initially detected SARS-CoV-2 antibodies in 94.4% of individuals, with 82% and 79% exhibiting serum and IgG neutralization, respectively. Approximately 3% of individuals demonstrated exceptional SARS-CoV-2 neutralization, with these “elite neutralizers” also possessing SARS-CoV-1 cross-neutralizing IgG. Multivariate statistical modeling revealed age, symptomatic infection, disease severity, and gender as key factors predicting SARS-CoV-2-neutralizing activity. A loss of reactivity to the virus spike protein was observed in 13% of individuals 10 months after infection. Neutralizing activity had half-lives of 14.7 weeks in serum versus 31.4 weeks in purified IgG, indicating a rather long-term IgG antibody response. Our results demonstrate a broad spectrum in the initial SARS-CoV-2-neutralizing antibody response, with sustained antibodies in most individuals for 10 months after mild COVID-19.
•Broad variation in neutralizing antibodies in SARS-CoV-2-convalescent individuals
•∼3% of individuals showed a potent antibody response with SARS-CoV-1 cross-reactivity
•Older age, symptoms, and severe disease predict higher SARS-CoV-2 neutralization
•Serum and IgG neutralization half-lives were 14.7 and 31.4 weeks, respectively
Vanshylla et al. report longitudinal antibody kinetics in a mainly mild COVID-19 convalescent cohort of 963 individuals. There is broad variation in the initial response with older age and disease severity predicting higher SARS-CoV-2 neutralizing activity. Neutralizing IgG antibodies are detectable for up to 10 months in the majority of individuals.
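To give a feel for what the two reported half-lives imply, the sketch below computes the fraction of neutralizing activity remaining over time. It assumes simple single-phase exponential decay and treats 10 months as roughly 43 weeks; both are simplifying assumptions for illustration, and the decay model actually fitted in the study may differ.

```python
HALF_LIFE_SERUM_WEEKS = 14.7  # reported half-life of serum neutralization
HALF_LIFE_IGG_WEEKS = 31.4    # reported half-life of purified IgG neutralization


def fraction_remaining(weeks: float, half_life_weeks: float) -> float:
    """Fraction of initial activity left after `weeks`, assuming exponential decay."""
    return 0.5 ** (weeks / half_life_weeks)


# Compare both compartments at a few time points (43 weeks is about 10 months).
for weeks in (0, 12, 24, 43):
    serum = fraction_remaining(weeks, HALF_LIFE_SERUM_WEEKS)
    igg = fraction_remaining(weeks, HALF_LIFE_IGG_WEEKS)
    print(f"week {weeks:2d}: serum {serum:.2f}, IgG {igg:.2f}")
```

Under this assumption, the IgG curve stays above the serum curve at every positive time point, which matches the abstract's reading of the longer IgG half-life as a more durable response.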
The identification and isolation of highly infectious SARS-CoV-2-infected individuals is an important public health strategy. Rapid antigen detection tests (RADT) are promising tools for large-scale screenings due to timely results and feasibility for on-site testing. Nonetheless, the diagnostic performance of RADT in detecting infectious individuals is not yet fully determined. In this study, RT-qPCR and virus culture of RT-qPCR-positive samples were used to evaluate and compare the performance of the Standard Q COVID-19 Ag test in detecting SARS-CoV-2-infected and possibly infectious individuals. To this end, two combined oro- and nasopharyngeal swabs were collected at a routine SARS-CoV-2 diagnostic center. A total of 2,028 samples were tested, and 118 virus cultures were inoculated. SARS-CoV-2 infection was detected in 210 samples by RT-qPCR, representing a positive rate of 10.36%. The Standard Q COVID-19 Ag test yielded a positive result in 92 (4.54%) samples resulting in an overall sensitivity and specificity of 42.86 and 99.89%, respectively. For adjusted Ct values of <20 (n = 14), <25 (n = 57), and <30 (n = 88), the RADT reached sensitivities of 100, 98.25, and 88.64%, respectively. All 29 culture-positive samples were detected by the RADT. Although the overall sensitivity was low, the Standard Q COVID-19 Ag test reliably detected patients with high RNA loads. In addition, negative RADT results fully corresponded with the lack of viral cultivability in Vero E6 cells. These results indicate that RADT can be a valuable tool for the detection of individuals with high RNA loads that are likely to transmit SARS-CoV-2.
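The reported sensitivity and specificity can be checked with a short calculation. The abstract gives the totals (2,028 samples, 210 RT-qPCR positives, 92 RADT positives) but not the individual confusion-matrix cells, so the true-positive count of 90 and the per-threshold detected counts of 14, 56, and 78 below are inferred from the reported percentages and should be read as assumptions.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of reference-positive samples that the test also calls positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of reference-negative samples that the test also calls negative."""
    return tn / (tn + fp)


total, pcr_pos, radt_pos = 2028, 210, 92  # figures stated in the abstract
tp = 90                       # inferred: 90 / 210 rounds to 42.86%
fp = radt_pos - tp            # RADT-positive but RT-qPCR-negative
fn = pcr_pos - tp             # RT-qPCR-positive but missed by the RADT
tn = total - pcr_pos - fp     # negative by both methods

print(f"sensitivity: {100 * sensitivity(tp, fn):.2f}%")
print(f"specificity: {100 * specificity(tn, fp):.2f}%")

# Per-threshold sensitivities; detected counts are inferred, group sizes are reported.
for threshold, detected, n in ((20, 14, 14), (25, 56, 57), (30, 78, 88)):
    print(f"threshold <{threshold}: {100 * detected / n:.2f}%")
```

With these inferred counts, the calculation reproduces every percentage quoted in the abstract, which suggests that only two RADT-positive samples were RT-qPCR-negative.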
The mechanisms triggering the human immunodeficiency virus type 1 (HIV-1) to switch the coreceptor usage from CCR5 to CXCR4 during the course of infection are not entirely understood. While low CD4+ T cell counts are associated with CXCR4 usage, a predominance of CXCR4 usage with still high CD4+ T cell counts remains puzzling. Here, we explore the hypothesis that viral adaptation to the human leukocyte antigen (HLA) complex, especially to the HLA class II alleles, contributes to the coreceptor switch. To this end, we sequence the viral gag and env protein with corresponding HLA class I and II alleles of a new cohort of 312 treatment-naive, subtype C, chronically-infected HIV-1 patients from South Africa. To estimate HLA adaptation, we develop a novel computational approach using Bayesian generalized linear mixed models (GLMMs). Our model can consider the entire HLA repertoire without being restricted to pre-learned HLA polymorphisms. In addition, we correct for phylogenetic relatedness of the viruses within the model itself to account for founder effects. Using our model, we observe that CXCR4-using variants are more adapted than CCR5-using variants (p-value = 1.34e-2). Additionally, adapted CCR5-using variants have a significantly lower predicted false positive rate (FPR) by the geno2pheno[coreceptor] tool compared to the non-adapted CCR5-using variants (p-value = 2.21e-2), where a low FPR is associated with CXCR4 usage. Consequently, estimating HLA adaptation can be an asset in predicting not only coreceptor usage, but also an approaching coreceptor switch in CCR5-using variants. We propose the usage of Bayesian GLMMs for modeling virus-host adaptation in general.
Mass spectrometry is an essential analytical technique for high-throughput analysis in proteomics and metabolomics. The development of new separation techniques, precise mass analyzers and experimental protocols is a very active field of research. This leads to more complex experimental setups yielding ever increasing amounts of data. Consequently, analysis of the data is currently often the bottleneck for experimental studies. Although software tools for many data analysis tasks are available today, they are often hard to combine with each other or not flexible enough to allow for rapid prototyping of a new analysis workflow.
We present OpenMS, a software framework for rapid application development in mass spectrometry. OpenMS has been designed to be portable, easy-to-use and robust while offering a rich functionality ranging from basic data structures to sophisticated algorithms for data analysis. This has already been demonstrated in several studies.
OpenMS is available under the GNU Lesser General Public License (LGPL) from the project website at http://www.openms.de.
Immunization through repeated direct venous inoculation of Plasmodium falciparum (Pf) sporozoites (PfSPZ) under chloroquine chemoprophylaxis, using the PfSPZ Chemoprophylaxis Vaccine (PfSPZ-CVac), induces high-level protection against controlled human malaria infection (CHMI). Humoral and cellular immunity contribute to vaccine efficacy but only limited information about the implicated Pf-specific antigens is available. Here, we examined Pf-specific antibody profiles, measured by protein arrays representing the full Pf proteome, of 40 placebo- and PfSPZ-immunized malaria-naïve volunteers from an earlier published PfSPZ-CVac dose-escalation trial. For this purpose, we both utilized and adapted supervised machine learning methods to identify predictive antibody profiles at two different time points: after immunization and before CHMI. We developed an adapted multitask support vector machine (SVM) approach and compared it to standard methods, i.e., single-task SVM, regularized logistic regression and random forests. Our results show that the multitask SVM approach improved the classification performance to discriminate the protection status based on the underlying antibody profiles while combining time- and dose-dependent data in the prediction model. Additionally, we developed the new fEature diStance exPlainabilitY (ESPY) method to quantify the impact of single antigens on the non-linear multitask SVM model and make it more interpretable. In conclusion, our multitask SVM model outperforms the studied standard approaches with regard to classification performance. Moreover, with our new explanation method ESPY, we were able to interpret the impact of Pf-specific antigen antibody responses that predict sterile protective immunity against CHMI after immunization. The identified Pf-specific antigens may contribute to a better understanding of immunity against human malaria and may foster vaccine development.
Due to the high mutation rate of human immunodeficiency virus (HIV), drug-resistant variants emerge frequently. Therefore, researchers are constantly searching for new ways to attack the virus. One new class of anti-HIV drugs is the class of coreceptor antagonists that block cell entry by occupying a coreceptor on CD4 cells. This type of drug only affects the subset of HIVs that use the inhibited coreceptor. A good prediction of whether the viral population inside a patient is susceptible to the treatment is hence very important for therapy decisions and a prerequisite to administering the respective drug. The first prediction models were based on data from Sanger sequencing of the V3 loop of HIV. Recently, a method based on next-generation sequencing (NGS) data was introduced that predicts labels for each read separately and decides on the patient label through a percentage threshold for the resistant viral minority.
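The read-level approach described above can be sketched in a few lines: each read receives its own label, and the patient is called resistant once the share of resistant reads reaches a cutoff. The function name and the 2% default cutoff are hypothetical choices for illustration; the abstract does not state the threshold that the earlier method used.

```python
from typing import Sequence


def patient_label(read_is_resistant: Sequence[bool], min_fraction: float = 0.02) -> str:
    """Aggregate per-read predictions into one patient-level label.

    `min_fraction` is a hypothetical cutoff for the resistant viral minority.
    """
    if not read_is_resistant:
        raise ValueError("at least one read is required")
    fraction = sum(read_is_resistant) / len(read_is_resistant)
    return "resistant" if fraction >= min_fraction else "susceptible"


# 3 of 100 reads predicted resistant: above the 2% cutoff.
print(patient_label([True] * 3 + [False] * 97))
# 1 of 100 reads predicted resistant: below the 2% cutoff.
print(patient_label([True] * 1 + [False] * 99))
```

A rule of this kind treats every read independently, so the patient label depends on a single cutoff rather than on the joint information carried by all reads.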
We model the prediction problem on the patient level, taking the information of all reads from NGS data jointly into account. This enables us to improve prediction performance for NGS data, but we can also use the trained model to improve predictions based on Sanger sequencing data. Laboratories without NGS capabilities can therefore also benefit from the improvements. Furthermore, we show which amino acids at which positions are important for prediction success, giving clues on how the interaction mechanism between the V3 loop and the particular coreceptors might be influenced.
A webserver is available at http://coreceptor.bioinf.mpi-inf.mpg.de.
nico.pfeifer@mpi-inf.mpg.de.