• Exoteric introduction of deep learning and its usage in bioinformatics.
• Concrete and representative examples of using deep learning in bioinformatics.
• Solutions and suggestions for handling common issues when using deep learning.
• Thorough survey of the commonly used deep learning models for various data types.
Deep learning, which is especially formidable in handling big data, has achieved great success in various fields, including bioinformatics. With the advances of the big data era in biology, it is foreseeable that deep learning will become increasingly important in the field and will be incorporated into the vast majority of analysis pipelines. In this review, we provide both an exoteric introduction to deep learning and concrete examples and implementations of its representative applications in bioinformatics. We start from the recent achievements of deep learning in the bioinformatics field, pointing out the problems that are well suited to deep learning. We then introduce deep learning in an easy-to-understand fashion, from shallow neural networks to legendary convolutional neural networks, legendary recurrent neural networks, graph neural networks, generative adversarial networks, variational autoencoders, and the most recent state-of-the-art architectures. Next, we provide eight examples, covering five bioinformatics research directions and all four kinds of data type, with the implementations written in TensorFlow and Keras. Finally, we discuss the common issues, such as overfitting and interpretability, that users will encounter when adopting deep learning methods and provide corresponding suggestions. The implementations are freely available at https://github.com/lykaust15/Deep_learning_examples.
• Spatial patterns of soil heavy metals in a Chinese hickory plantation region were investigated.
• The soils were contaminated by Cd and Cu.
• Moran’s I and geostatistics were applied to reveal spatial variation of heavy metals in soils.
• The main sources of pollution were related to mining and fertilizer application.
Chinese hickory (Carya cathayensis Sarg.) is a famous woody nut in China. Currently, there is increasing concern about soil heavy metal pollution in Chinese hickory plantations. In this study, a total of 188 soil samples were collected from Lin’an city, a typical Chinese hickory production area. The results showed that the average background concentrations of Cd, Cu, Zn, Pb, Ni and Cr in soils were 0.37, 40.76, 87.61, 30.10, 28.33 and 56.57 mg kg−1, respectively. The Cd and Pb concentrations in Chinese hickory plantation soils had high coefficients of variation of 186.49% and 95.42%, respectively. Compared with the background values in Zhejiang province, the heavy metals were enriched in the plantation soils. The accumulation ratio followed the order of Cu > Cd > Pb > Ni > Zn > Cr. Part of the study area was seriously contaminated by Cd and Cu, as 31.38% of soil Cd and Cu samples exceeded the second grade standardized value of Environmental Quality Standards of Soils (EQSS). The soils in our study area generally reached a moderate ecological risk, as the average RI value reached 103. Moran’s I and Kriging interpolation results revealed that all the studied heavy metals in soils had clear spatial distribution patterns. The high values of soil heavy metals (except Pb) were mainly distributed in the localities closest to the mine area. The apportionment of heavy metal pollution sources showed that Cu, Ni and Cr in soils were mainly related to mining activities, Pb was closely related to fertilizer application, and Cd and Zn were related to both. The heavy metals in soils may pose a potential threat to the local ecosystem and human health. Our results implied that soil heavy metals should be tested, and risk-based management should be performed, in economic plantations in China. These maps could provide useful information for forestry and environmental management.
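The spatial autocorrelation statistic named above, global Moran’s I, can be illustrated with a minimal sketch. It assumes an inverse-distance spatial weight matrix, which is only one common choice; the abstract does not state which weighting scheme was used. The function names and the sample coordinates and concentrations are hypothetical and are not data from the study.

```python
from math import hypot


def inverse_distance_weights(coords):
    """Build w[i][j] = 1 / distance(i, j) with a zero diagonal.

    Assumption: inverse-distance weighting; sample locations must be distinct.
    """
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                dx = coords[i][0] - coords[j][0]
                dy = coords[i][1] - coords[j][1]
                w[i][j] = 1.0 / hypot(dx, dy)
    return w


def morans_i(values, weights):
    """Global Moran's I: (n / S0) * sum_ij w_ij d_i d_j / sum_i d_i^2,
    where d_i is the deviation of value i from the mean and S0 = sum_ij w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    s0 = sum(weights[i][j] for i, j in pairs)
    cross = sum(weights[i][j] * dev[i] * dev[j] for i, j in pairs)
    total = sum(d * d for d in dev)
    return (n / s0) * (cross / total)


# Hypothetical sample locations (km) and Cd concentrations (mg/kg).
sample_coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
sample_cd = [0.9, 0.7, 0.3, 0.2]
print(morans_i(sample_cd, inverse_distance_weights(sample_coords)))
```

Values above the expectation of -1/(n-1) suggest that similar concentrations cluster in space, while values below it suggest dispersion. A significance test (for example by permutation) would be needed before drawing conclusions.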
Streptococcus pneumoniae (S. pneumoniae) is an opportunistic pathogen that causes pneumonia, meningitis and bacteremia in humans and animals. Pneumolysin (PLY), a major pore-forming toxin that is important for S. pneumoniae pathogenicity, is a promising target for the development of anti-infective agents. Ephedra sinica granules (ESG) is one of the oldest medical preparations with multiple biological activities (such as a divergent wind and cold effect); however, the detailed mechanism remains unknown. In this study, we found that ESG treatment significantly inhibited the oligomerization of PLY and then reduced the activity of PLY without affecting S. pneumoniae growth and PLY production. In a PLY and A549 cell co-incubation system, the addition of ESG resulted in significant protection against PLY-mediated cell injury. Furthermore, S. pneumoniae-infected mice showed decreased mortality, and alleviated tissue damage and inflammatory reactions following treatment with ESG. Our results indicate that ESG is a potential candidate treatment for S. pneumoniae infection that targets PLY. This finding partially elucidates the mechanism of the Chinese herbal formula ESG in the treatment of pneumococcal disease.
Kernel selection is a fundamental problem of kernel-based learning algorithms. In this paper, we propose an approximate approach to automatic kernel selection for regression from the perspective of kernel matrix approximation. We first introduce multilevel circulant matrices into automatic kernel selection, and develop two approximate kernel selection algorithms by exploiting the computational virtues of multilevel circulant matrices. The complexity of the proposed algorithms is quasi-linear in the number of data points. Then, we prove an approximation error bound to measure the effect of the approximation in kernel matrices by multilevel circulant matrices on the hypothesis and further show that the approximate hypothesis produced with multilevel circulant matrices converges to the accurate hypothesis produced with kernel matrices. Experimental evaluations on benchmark datasets demonstrate the effectiveness of approximate kernel selection.
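The computational virtue referred to above comes from the fact that a circulant matrix is diagonalized by the discrete Fourier transform, so a regularized linear system can be solved by elementwise division in the frequency domain. The sketch below illustrates this for an ordinary (one-level) circulant matrix only; it is not the paper's multilevel algorithm, and the function names and the `ridge` parameter are hypothetical. A naive O(n^2) DFT is used to keep the example dependency-free; replacing it with an FFT is what gives the O(n log n) cost.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (swap in an FFT for O(n log n))."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def circulant_matvec(first_col, v):
    """Multiply the circulant matrix C[i][j] = first_col[(i - j) % n] by v."""
    n = len(first_col)
    return [sum(first_col[(i - j) % n] * v[j] for j in range(n))
            for i in range(n)]


def circulant_solve(first_col, y, ridge):
    """Solve (C + ridge * I) a = y using the eigenvalues of C,
    which are the DFT of its first column."""
    eigenvalues = dft(first_col)
    y_hat = dft(y)
    a_hat = [yh / (ev + ridge) for yh, ev in zip(y_hat, eigenvalues)]
    return [val.real for val in dft(a_hat, inverse=True)]


# Hypothetical symmetric circulant "kernel" column and regression targets.
kernel_column = [2.0, 0.5, 0.0, 0.5]
targets = [1.0, 2.0, 3.0, 4.0]
alpha = circulant_solve(kernel_column, targets, ridge=0.1)
print(alpha)
```

The solve requires every eigenvalue plus `ridge` to be nonzero, which holds for a positive semidefinite circulant approximation with a positive ridge term.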
microRNA (miRNA) is a short RNA (~ 22 nt) that regulates gene expression at the posttranscriptional level. Aberrant miRNA expression could affect the targeted mRNAs involved in cancer-related signaling pathways. We conduct clustering analysis of miRNA and mRNA using expression data from the Cancer Genome Atlas (TCGA). We combine the Hungarian algorithm and the blossom algorithm from graph theory. Data analysis is done using the programming languages R and Python.
We first quantify edge-weights of the miRNA-mRNA pairs by combining their expression correlation coefficient in tumor (T_CC) and correlation coefficient in normal (N_CC). We then introduce a bipartite graph partition procedure to identify cluster candidates. Specifically, we propose six weight formulas to quantify the change of miRNA-mRNA expression T_CC relative to N_CC, and apply the traditional hierarchical clustering to subjectively evaluate the different weight formulas of miRNA-mRNA pairs. Among these six different weight formulas, we choose the optimal one, which we define as the integrated mean value weights, to represent the connections between miRNA and mRNAs. Then the Hungarian algorithm and the blossom algorithm are employed on the miRNA-mRNA bipartite graph to passively determine the clusters. The combination of the Hungarian and blossom algorithms is dubbed the maximum weighted merger method (MWMM).
MWMM identifies clusters of different sizes that meet the mathematical criterion that internal connections inside a cluster are relatively denser than external connections outside the cluster and the biological criterion that the intra-cluster Gene Ontology (GO) term similarities are larger than the inter-cluster GO term similarities. MWMM is developed using breast invasive carcinoma (BRCA) as the training data set, but can also be applied to data sets for other cancer types. MWMM shows an advantage in GO term similarity in most cancer types when compared to other algorithms.
miRNAs and mRNAs that are likely to be affected by common underlying causal factors in cancer can be clustered by the MWMM approach; they may serve as candidate biomarkers for different cancer types and provide clues for targets of precision medicine in cancer treatment.
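The edge-weighting and matching steps described above can be sketched on a toy bipartite graph. The weight used here, the absolute difference between T_CC and N_CC, is a placeholder assumption and is not one of the six formulas from the paper, which the abstract does not spell out. The matching is found by brute force over all assignments; the Hungarian algorithm solves the same maximum-weight assignment problem in polynomial time. All function names and correlation values are hypothetical.

```python
from itertools import permutations


def edge_weight(t_cc, n_cc):
    """Placeholder weight (assumption): change in correlation, tumor vs normal."""
    return abs(t_cc - n_cc)


def build_weights(t_matrix, n_matrix):
    """weights[i][j] for miRNA i and mRNA j from two correlation matrices."""
    return [[edge_weight(t, n) for t, n in zip(t_row, n_row)]
            for t_row, n_row in zip(t_matrix, n_matrix)]


def max_weight_assignment(weights):
    """Brute-force maximum-weight bipartite matching.

    Each row (miRNA) is paired with a distinct column (mRNA); requires
    rows <= columns. Only practical for very small graphs.
    """
    n_rows, n_cols = len(weights), len(weights[0])
    best_total, best_cols = float("-inf"), None
    for cols in permutations(range(n_cols), n_rows):
        total = sum(weights[i][c] for i, c in enumerate(cols))
        if total > best_total:
            best_total, best_cols = total, cols
    return best_total, list(enumerate(best_cols))


# Hypothetical correlation coefficients for 2 miRNAs x 3 mRNAs.
tumor_cc = [[-0.7, 0.1, 0.0], [0.2, -0.5, 0.1]]
normal_cc = [[0.1, 0.0, 0.1], [0.1, 0.1, 0.0]]
total, pairs = max_weight_assignment(build_weights(tumor_cc, normal_cc))
print(total, pairs)
```

This covers only the bipartite assignment step. The blossom algorithm, which handles matching in general (non-bipartite) graphs, and the merging of matched pairs into clusters are not shown.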
In this randomized phase II clinical trial, we evaluated the effectiveness of adding the TLR agonists, poly-ICLC or resiquimod, to autologous tumor lysate-pulsed dendritic cell (ATL-DC) vaccination in patients with newly-diagnosed or recurrent WHO Grade III-IV malignant gliomas. The primary endpoints were to assess the most effective combination of vaccine and adjuvant in order to enhance the immune potency, along with safety. The combination of ATL-DC vaccination and TLR agonist was safe and found to enhance systemic immune responses, as indicated by increased interferon gene expression and changes in immune cell activation. Specifically, PD-1 expression increases on CD4+ T-cells, while CD38 and CD39 expression are reduced on CD8+ T cells, alongside an increase in monocytes. Poly-ICLC treatment amplifies the induction of interferon-induced genes in monocytes and T lymphocytes. Patients who exhibit higher interferon response gene expression demonstrate prolonged survival and delayed disease progression. These findings suggest that combining ATL-DC with poly-ICLC can induce a polarized interferon response in circulating monocytes and CD8+ T cells, which may represent an important blood biomarker for immunotherapy in this patient population. Trial Registration: ClinicalTrials.gov Identifier: NCT01204684.
Increased T cell infiltration and interferon gamma (IFNγ) pathway activation are seen in tumors of melanoma patients who respond to ICI (immune checkpoint inhibitor) or MAPK pathway inhibitor (MAPKi) therapies. Yet, the rate of durable tumor control after ICI is almost twice that of MAPKi, suggesting that additional mechanisms may be present in patients responding to ICI therapy that are beneficial for anti-tumor immunity.
We used transcriptional analysis and clinical outcomes from patients treated with ICI or MAPKi therapies to delineate immune mechanisms driving tumor response.
We discovered that response to ICI is associated with CXCL13-driven recruitment of CXCR5+ B cells with significantly higher clonal diversity than MAPKi. Our data indicate that CXCL13 production was increased in human peripheral blood mononuclear cells by anti-PD1, but not MAPKi, treatment. Higher B cell infiltration and B cell receptor (BCR) diversity allow presentation of diverse tumor antigens by B cells, resulting in activation of follicular helper CD4 T cells (Tfh) and tumor reactive CD8 T cells after ICI therapy. Higher BCR diversity and IFNγ pathway score post-ICI are associated with significantly longer patient survival compared to those with either one or none.
Response to ICI, but not to MAPKi, depends on the recruitment of CXCR5+ B cells into the tumor microenvironment and their productive tumor antigen presentation to follicular helper and cytotoxic, tumor reactive T cells. Our study highlights the potential of CXCL13 and B cell based strategies to enhance the rate of durable response in melanoma patients treated with ICI.
In comparison with responses in recurrent glioblastoma (rGBM), the intracranial response of brain metastases (BrM) to immune checkpoint blockade (ICB) is less well studied. Here, we present an integrated single-cell RNA-Seq (scRNA-Seq) study of 19 ICB-naive and 9 ICB-treated BrM samples from our own and published data sets. We compared them with our previously published scRNA-Seq data from rGBM and found that ICB led to more prominent T cell infiltration into BrM than rGBM. These BrM-infiltrating T cells exhibited a tumor-specific phenotype and displayed greater activated/exhausted features. We also used multiplex immunofluorescence and spatial transcriptomics to reveal that ICB reduced a distinct CD206+ macrophage population in the perivascular space, which may modulate T cell entry into BrM. Furthermore, we identified a subset of progenitor exhausted T cells that correlated with longer overall survival in BrM patients. Our study provides a comprehensive immune cellular landscape of ICB's effect on metastatic brain tumors and offers insights into potential strategies for improving ICB efficacy for brain tumor patients.
Short video hot spot classification is a fundamental method to grasp the focus of consumers and improve the effectiveness of video marketing. The limitations of traditional short text classification are sparse content as well as inconspicuous feature extraction. To solve these problems, this paper proposes a short video hot spot classification model combining latent Dirichlet allocation (LDA) feature fusion and improved bi-directional long short-term memory (BiLSTM), namely the LDA-BiLSTM-self-attention (LBSA) model, to carry out hot spot classification of Carya cathayensis walnut short video review data on the TikTok platform. First, the LDA topic model was used to expand the topic features of the Word2Vec word vector, and the fused features were input into the BiLSTM model to learn the text features. Afterwards, the self-attention mechanism was employed to assign different weights to the output information of BiLSTM according to its importance, to enhance the precision of feature extraction and complete the hot spot classification of review data. Experimental results show that the precision of the proposed LBSA model reached 91.52%, which is significantly improved compared with the traditional model in terms of precision and F1 value.
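Two of the steps described above, fusing topic features with word vectors and re-weighting sequence outputs with self-attention, can be sketched in plain Python. This is a simplified illustration under stated assumptions: fusion is taken to be concatenation, and the self-attention is scaled dot-product attention with identity query, key and value projections, whereas a trained model would learn these projections. The BiLSTM itself is omitted, and all names and vectors are hypothetical.

```python
import math


def fuse(word_vector, topic_vector):
    """Assumed fusion: concatenate a Word2Vec-style vector with LDA topic weights."""
    return list(word_vector) + list(topic_vector)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attention(states):
    """Scaled dot-product self-attention with identity projections.

    Each output vector is a weighted average of all input vectors, with
    weights given by the softmax of scaled pairwise dot products.
    """
    dim = len(states[0])
    outputs = []
    for query in states:
        weights = softmax([dot(query, key) / math.sqrt(dim) for key in states])
        outputs.append([sum(w * value[c] for w, value in zip(weights, states))
                        for c in range(dim)])
    return outputs


# Hypothetical 2-d word vectors and 2-topic LDA distributions for three tokens.
tokens = [fuse([0.2, 0.9], [0.7, 0.3]),
          fuse([0.8, 0.1], [0.1, 0.9]),
          fuse([0.3, 0.8], [0.6, 0.4])]
for row in self_attention(tokens):
    print([round(v, 3) for v in row])
```

In the full pipeline the attention layer would operate on BiLSTM hidden states rather than directly on the fused inputs, and its output would feed a classification layer.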