Abstract
Motivation
The V3 loop of the gp120 glycoprotein of the Human Immunodeficiency Virus 1 (HIV-1) is considered to be responsible for viral coreceptor tropism. gp120 interacts with the CD4 receptor of the host cell and subsequently V3 binds either CCR5 or CXCR4. Because the CCR5 coreceptor is targeted by entry inhibitors, a reliable prediction of the coreceptor usage of HIV-1 is of great interest for antiretroviral therapy. Although several methods for the prediction of coreceptor tropism are available, almost all of them have been developed based on subtype B sequences only, and several studies have shown that the prediction for non-B sequences, in particular subtype A sequences, is less reliable. Thus, the aim of the current study was to develop a reliable prediction model for subtype A viruses.
Results
Our new model SCOTCH is based on a stacking approach of classifier ensembles and shows a significantly better performance for subtype A sequences compared to other available models. In particular for low false positive rates (between 0.05 and 0.2, i.e. the range recommended in the German and European guidelines for tropism prediction), SCOTCH shows significantly better prediction performance in terms of partial area under the curve and diagnostic odds ratio compared to existing tools, and thus can be used to reliably predict coreceptor tropism for subtype A sequences.
Availability and implementation
SCOTCH can be downloaded/accessed at http://www.heiderlab.de.
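The two evaluation measures named in the Results above can be illustrated with a minimal sketch. It is not the SCOTCH evaluation code. The function names, the toy data, and the choice to report the raw (unstandardized) partial area are assumptions made here for illustration. The partial area under the ROC curve is integrated over false positive rates from 0.05 to 0.2, and the diagnostic odds ratio is computed as (TP × TN) / (FP × FN).

```python
from typing import List, Sequence, Tuple


def roc_points(labels: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return ROC points (FPR, TPR), sweeping the threshold from high to low."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")
    tp = fp = 0
    points = [(0.0, 0.0)]
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume all examples tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def partial_auc(points: Sequence[Tuple[float, float]],
                fpr_lo: float = 0.05, fpr_hi: float = 0.2) -> float:
    """Trapezoidal area under the ROC curve restricted to fpr_lo <= FPR <= fpr_hi."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x1 <= x0:
            continue  # vertical step: no width, so no area
        a, b = max(x0, fpr_lo), min(x1, fpr_hi)
        if b <= a:
            continue  # segment lies outside the FPR window
        slope = (y1 - y0) / (x1 - x0)
        ya = y0 + slope * (a - x0)
        yb = y0 + slope * (b - x0)
        area += 0.5 * (ya + yb) * (b - a)
    return area


def diagnostic_odds_ratio(tp: int, fp: int, fn: int, tn: int) -> float:
    """DOR = (TP * TN) / (FP * FN); undefined if FP or FN is zero."""
    if fp == 0 or fn == 0:
        raise ValueError("DOR is undefined when FP or FN is zero")
    return (tp * tn) / (fp * fn)


# Toy data (hypothetical): 1 = positive class, 0 = negative class.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.2, 0.1]
toy_pauc = partial_auc(roc_points(toy_labels, toy_scores))
```

For a perfectly separating classifier the partial area equals the width of the window (0.15), which is the upper bound of the raw measure on this range.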
Today a broad range of antiretroviral drug regimens is available for the successful suppression of virus replication in people infected with human immunodeficiency virus (HIV). However, there still remains an obstacle in therapy: the high mutation rate of HIV under drug pressure leads to resistant variants that cause failure of permanent and effective treatment. Resistance testing is therefore indispensable for administering appropriate antiviral drugs to infected patients.
With current high-throughput sequencing technologies, computational models have recently become an important aid in drug resistance prediction and can guide the choice of medical treatment. Several machine learning algorithms, e.g. support-vector machines and random forests, as well as statistical methods have already been applied to genotypic data and structural information to predict drug resistance.
In this review, we provide an overview of existing approaches in computational drug resistance prediction in HIV. We further highlight the challenges and limitations of current methods, e.g. time complexity and prediction of non-B subtypes.
Moreover, we give a perspective on multi-label and multi-instance classification techniques that potentially tackle the problem of cross-resistances among drugs.
Multi-label classification has recently gained great attention in diverse fields of research, e.g., in biomedical applications such as protein function prediction or drug resistance testing in HIV. In this context, the concept of Classifier Chains has been shown to improve prediction accuracy, especially when applied as Ensemble Classifier Chains. However, these techniques lack computational efficiency when applied to large amounts of data, e.g., derived from next-generation sequencing experiments. By adapting algorithms for the use of graphics processing units, computational efficiency can be greatly improved due to parallelization of computations.
Here, we provide a parallelized and optimized graphics processing unit implementation (eccCL) of Classifier Chains and Ensemble Classifier Chains. In addition to the OpenCL implementation, we provide an R package with an easy-to-use R interface for parallelized graphics processing unit usage.
eccCL is a handy implementation of Classifier Chains on GPUs, which is able to process over 25,000 instances per second, and thus can be used efficiently in high-throughput experiments. The software is available at http://www.heiderlab.de .
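The idea behind Classifier Chains can be sketched on the CPU in a few lines. This is not the eccCL code and does not use OpenCL. The class names, the toy 1-nearest-neighbour base learner, and the simple majority vote are assumptions chosen to keep the example self-contained. Each binary classifier in the chain sees the input features plus the labels earlier in the chain, and at prediction time earlier predictions are fed forward. Ensemble variants are commonly described as also training each chain on a random subset of the data, which is omitted here.

```python
from typing import List, Sequence


class NearestNeighbor:
    """Toy 1-nearest-neighbour binary classifier (stand-in base learner)."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> "NearestNeighbor":
        self.X = [list(row) for row in X]
        self.y = list(y)
        return self

    def predict_one(self, x: Sequence[float]) -> int:
        dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in self.X]
        return self.y[dists.index(min(dists))]


class ClassifierChain:
    """One binary model per label; label indices in `order` define the chain."""

    def __init__(self, order: Sequence[int]):
        self.order = list(order)
        self.models: List[NearestNeighbor] = []

    def fit(self, X: Sequence[Sequence[float]], Y: Sequence[Sequence[int]]) -> "ClassifierChain":
        self.models = []
        for pos, label in enumerate(self.order):
            earlier = self.order[:pos]
            # Augment the features with the true labels that precede this one.
            X_aug = [list(x) + [float(y[j]) for j in earlier] for x, y in zip(X, Y)]
            target = [y[label] for y in Y]
            self.models.append(NearestNeighbor().fit(X_aug, target))
        return self

    def predict_one(self, x: Sequence[float]) -> List[int]:
        feats = list(x)
        out = {}
        for label, model in zip(self.order, self.models):
            pred = model.predict_one(feats)
            out[label] = pred
            feats.append(float(pred))  # feed the prediction to later links
        return [out[j] for j in range(len(self.order))]


def ensemble_predict(chains: Sequence[ClassifierChain], x: Sequence[float]) -> List[int]:
    """Majority vote per label over several chains with different orders."""
    votes = [chain.predict_one(x) for chain in chains]
    n = len(chains)
    return [1 if 2 * sum(v[j] for v in votes) >= n else 0 for j in range(len(votes[0]))]


# Toy data (hypothetical): two features, two labels.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
Y = [[0, 0], [0, 1], [1, 0], [1, 1]]
chains = [ClassifierChain(order).fit(X, Y) for order in ([0, 1], [1, 0])]
prediction = ensemble_predict(chains, [0.0, 1.0])
```

Because every chain and every instance can be evaluated independently, this structure lends itself to the kind of parallelization on graphics processing units described above.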
Recent genome-wide association studies (GWAS) have confirmed known risk mutations for venous thromboembolism (VTE) and identified a number of novel susceptibility loci in adults. Here we present a GWAS in 212 nuclear families with pediatric VTE followed by targeted next-generation sequencing (NGS) to identify causative mutations contributing to the association. Three single nucleotide polymorphisms (SNPs) exceeded the threshold for genome-wide significance as determined by permutation testing using 100 000 bootstrap permutations (P < 10−5). These SNPs reside in a region on chromosome 6q13 comprising the genes small ARF GAP1 (SMAP1), an ARF6 guanosine triphosphatase-activating protein that functions in clathrin-dependent endocytosis, and β-1,3-glucuronyltransferase 2 (B3GAT2), a member of the human natural killer 1 carbohydrate pathway. Rs1304029 and rs2748331 are associated with pediatric VTE with unpermuted/permuted values of P = 1.42 × 10−6/2.0 × 10−6 and P = 6.11 × 10−6/1.8 × 10−5, respectively. Rs2748331 was replicated (P = .00719) in an independent study sample coming from our GWAS on pediatric thromboembolic stroke (combined P = 7.88 × 10−7). Subsequent targeted NGS in 24 discordant sibling pairs identified 17 nonsynonymous coding variants, of which 1 located in SMAP1 and 3 in RIMS1, a member of the RIM family of active zone proteins, are predicted as damaging by Protein Variation Effect Analyzer (PROVEAN) and/or Sorting Intolerant From Tolerant (SIFT) scores. Three SNPs narrowly missed statistical significance in the transmission-disequilibrium test in the full cohort (rs112439957: P = .08326, SMAP1; rs767118962: P = .08326, RIMS1; and rs41265501: P = .05778, RIMS1). Taken together, our data provide compelling evidence for SMAP1, B3GAT2, and RIMS1 as novel susceptibility loci for pediatric VTE and warrant future functional studies to unravel the underlying molecular mechanisms leading to VTE.
• Our study identified a region on chromosome 6 comprising the genes SMAP1, B3GAT2, and RIMS1 as a novel susceptibility locus for pediatric VTE.
• Nonsynonymous variants in SMAP1 and RIMS1 are predicted as deleterious and may influence vesicle processing in blood cells.
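For readers unfamiliar with the transmission-disequilibrium test mentioned above, the following sketch shows the textbook form of the statistic. It is not the analysis pipeline of the study. The function name and the counts are hypothetical. Among heterozygous parents, b counts transmissions of the allele of interest to the affected child and c counts non-transmissions. The statistic (b − c)² / (b + c) is compared against a chi-square distribution with one degree of freedom.

```python
import math
from typing import Tuple


def tdt(transmitted: int, untransmitted: int) -> Tuple[float, float]:
    """Return (chi-square statistic, asymptotic p-value) for the TDT."""
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Survival function of a chi-square with 1 degree of freedom:
    # P(Z^2 > chi2) = erfc(sqrt(chi2 / 2)) for standard normal Z.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, not taken from the study.
chi2, p_value = tdt(transmitted=30, untransmitted=15)
```

With small numbers of informative transmissions the asymptotic p-value is coarse, which is one reason exact or permutation-based p-values are often reported alongside it.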
Reactivation of fetal gene expression patterns has been implicated in common cardiac diseases in adult life including left ventricular (LV) hypertrophy (LVH) in arterial hypertension. Thus, increased wall stress and neurohumoral activation are discussed to induce the return to expression of fetal genes after birth in LVH. We therefore aimed to identify novel potential candidates for LVH by analyzing fetal-adult cardiac gene expression in a genetic rat model of hypertension, i.e. the stroke-prone spontaneously hypertensive rat (SHRSP). To this end we performed genome-wide transcriptome analysis in SHRSP to identify differences in expression patterns between day 20 of fetal development (E20) and adult animals in week 14 in comparison to a normotensive rat strain with contrasting low LV mass, i.e. Fischer (F344). A total of 15232 probes were detected as expressed in LV tissue obtained from rats at E20 and week 14 (p < 0.05) and subsequently screened for differential expression. We identified 24 genes with SHRSP-specific up-regulation and 21 genes with down-regulation as compared to F344. Further bioinformatic analysis presented Efcab6 as a new candidate for LVH that showed differential expression during development only in the hypertensive SHRSP rat (logFC = 2.41, p < 0.001) and was expressed at significantly higher levels in adult SHRSP rats compared with adult F344 (+ 76%) and adult normotensive Wistar-Kyoto rats (+ 82%). Thus, it represents an interesting new target for further functional analyses and the elucidation of mechanisms leading to LVH. Here we report a new approach to identify candidate genes for cardiac hypertrophy by combining the analysis of gene expression differences between strains with a contrasting cardiac phenotype with a comparison of fetal-adult cardiac expression patterns.
The distribution of human disease-associated mutations is not random across the human genome. Despite the fact that natural selection continually removes disease-associated mutations, an enrichment of these variants can be observed in regions of low recombination. There are a number of mechanisms by which such a clustering could occur, including genetic perturbations or demographic effects within different populations. Recent genome-wide association studies (GWAS) suggest that single nucleotide polymorphisms (SNPs) associated with complex disease traits are not randomly distributed throughout the genome, but tend to cluster in regions of low recombination.
Here we investigated whether deleterious mutations have accumulated in regions of low recombination due to the impact of recent positive selection and genetic hitchhiking. Using publicly available data on common complex diseases and population demography, we observed an enrichment of hitchhiked disease associations in conserved gene clusters subject to selection pressure. Evolutionary analysis revealed that these conserved gene clusters arose by multiple concerted rearrangement events across the vertebrate lineage. We observed distinct clustering of disease-associated SNPs in evolutionarily rearranged regions of low recombination and high gene density, which harbor genes involved in immunity, that is, the interleukin cluster on 5q31 or RhoA on 3p21.
Our results suggest that multiple lineage-specific rearrangements led to a physical clustering of functionally related and linked genes exhibiting an enrichment of susceptibility loci for complex traits. This implies that, besides recent evolutionary adaptations, other evolutionary dynamics have played a role in the formation of linked gene clusters associated with complex disease traits.
Drug resistance testing is mandatory for successful antiretroviral therapy in patients infected with human immunodeficiency virus (HIV). The emergence of resistance against antiretroviral agents remains the major obstacle to inhibiting viral replication and thus to controlling infection. Due to its high mutation rate, the virus is able to adapt rapidly under drug pressure, leading to the evolution of resistant variants and finally to therapy failure.
We developed a web service for drug resistance prediction of commonly used drugs in antiretroviral therapy, i.e., protease inhibitors (PIs), reverse transcriptase inhibitors (NRTIs and NNRTIs), and integrase inhibitors (INIs), but also for the novel drug class of maturation inhibitors. Furthermore, co-receptor tropism (CCR5 or CXCR4) can be predicted as well, which is essential for treatment with entry inhibitors, such as Maraviroc. Currently, SHIVA provides 24 prediction models for several drug classes. SHIVA can be used with single RNA/DNA or amino acid sequences, but also with large amounts of next-generation sequencing data, and allows simultaneous prediction for a user-specified selection of drugs. Prediction results are provided as clinical reports which are sent via email to the user.
SHIVA represents a novel high-performing alternative to hitherto developed drug resistance testing approaches and is able to process data derived from next-generation sequencing technologies. SHIVA is publicly available via a user-friendly web interface.
Maturation inhibitors such as Bevirimat are a new class of antiretroviral drugs that hamper the cleavage of HIV-1 proteins into their functional active forms. They bind to these preproteins and inhibit their cleavage by the HIV-1 protease, resulting in non-functional virus particles. Nevertheless, there exist mutations in this region leading to resistance against Bevirimat. Highly specific and accurate tools to predict resistance to maturation inhibitors can help to identify patients who might benefit from the usage of these new drugs.
We tested several methods to improve Bevirimat resistance prediction in HIV-1. It turned out that combining structural and sequence-based information in classifier ensembles led to accurate and reliable predictions. Moreover, we were able to identify the most crucial regions for Bevirimat resistance computationally, which are in line with experimental results from other studies.
Our analysis demonstrated the use of machine learning techniques to predict HIV-1 resistance against maturation inhibitors such as Bevirimat. New maturation inhibitors are already under development and might enlarge the arsenal of antiretroviral drugs in the future. Thus, accurate prediction tools are very useful to enable a personalized therapy.
The alarmins myeloid-related protein (MRP)8 and MRP14 are the most prevalent cytoplasmic proteins in phagocytes. When released from activated or necrotic phagocytes, extracellular MRP8/MRP14 promote inflammation in many diseases, including infections, allergies, autoimmune diseases, rheumatoid arthritis, and inflammatory bowel disease. The involvement of TLR4 and the multiligand receptor for advanced glycation end products as receptors during MRP8-mediated effects on inflammation remains controversial. By comparative bioinformatic analysis of genome-wide response patterns of human monocytes to MRP8, endotoxins, and various cytokines, we have developed a model in which TLR4 is the dominant receptor for MRP8-mediated phagocyte activation. The relevance of the TLR4 signaling pathway was experimentally validated using human and murine models of TLR4- and receptor for advanced glycation end products-dependent signaling. Furthermore, our systems biology approach has uncovered an antiapoptotic role for MRP8 in monocytes, which was corroborated by independent functional experiments. Our data confirm the primary importance of the TLR4/MRP8 axis in the activation of human monocytes, representing a novel and attractive target for modulation of the overwhelming innate immune response.
Antiretroviral treatment of Human Immunodeficiency Virus type-1 (HIV-1) infections with CCR5-antagonists requires the co-receptor usage prediction of viral strains. Currently available tools are mostly designed based on subtype B strains and thus are in general not applicable to non-B subtypes. However, HIV-1 infections caused by subtype B only account for approximately 11% of infections worldwide. We evaluated the performance of several sequence-based algorithms for co-receptor usage prediction employed on subtype A V3 sequences including circulating recombinant forms (CRFs) and subtype C strains. We further analysed sequence profiles of gp120 regions of subtype A, B and C to explore functional relationships to entry phenotypes. Our analyses clearly demonstrate that state-of-the-art algorithms are not useful for predicting co-receptor tropism of subtype A and its CRFs. Sequence profile analysis of gp120 revealed molecular variability in subtype A viruses. In particular, the V2 loop region could be associated with co-receptor tropism, which might indicate a unique pattern that determines co-receptor tropism in subtype A strains compared to subtype B and C strains. Thus, our study demonstrates that there is a need for the development of novel algorithms facilitating tropism prediction of HIV-1 subtype A to improve effective antiretroviral treatment in patients.