Feature selection approaches based on mutual information can be roughly categorized into two groups. The first group minimizes the redundancy among features. The second group maximizes the new classification information that a feature provides for the selected subset. A critical issue is that large new information does not imply little redundancy, and vice versa. Features with large new information but high redundancy may be selected by the second group, and features with low redundancy but little relevance to the classes may be highly scored by the first group. Existing approaches fail to balance the importance of both terms. As such, a new information term denoted as Independent Classification Information is proposed in this paper. It assembles the newly provided information and the preserved information negatively correlated with the redundant information. Redundancy and new information are properly unified and equally treated in the new term. This strategy helps find predictive features that provide large new information and little redundancy. Moreover, independent classification information is proved to be a loose upper bound of the total classification information of the feature subset. Its maximization is conducive to achieving high global discriminative performance. Comprehensive experiments demonstrate the effectiveness of the new approach.
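The two quantities contrasted in the abstract above, redundancy between features and new classification information given an already selected feature, can be illustrated with a small sketch. It is not the Independent Classification Information term itself, whose exact form is not given here. It assumes discrete features and uses the standard definitions I(X;Y) = H(X) + H(Y) - H(X,Y) and I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z). The function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Joint Shannon entropy in bits of one or more discrete columns."""
    joint = list(zip(*columns))
    n = len(joint)
    return -sum((c / n) * log2(c / n) for c in Counter(joint).values())


def mutual_information(x, y):
    """I(X;Y): used here as the redundancy between two features."""
    return entropy(x) + entropy(y) - entropy(x, y)


def conditional_mutual_information(x, y, z):
    """I(X;Y|Z): information X gives about Y beyond what Z already gives."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


# Hypothetical toy data: the class label c is the XOR of features f and s.
f = [0, 0, 1, 1]
s = [0, 1, 0, 1]
c = [0, 1, 1, 0]

# Case 1: s is already selected, f is the candidate.
redundancy_xor = mutual_information(f, s)                    # 0 bits
new_info_xor = conditional_mutual_information(f, c, s)       # 1 bit

# Case 2: the selected feature is an exact copy of the candidate f.
f_copy = list(f)
redundancy_copy = mutual_information(f, f_copy)              # 1 bit
new_info_copy = conditional_mutual_information(f, c, f_copy) # 0 bits
```

In the first case the candidate is not redundant and adds one bit of class information. In the second it is fully redundant and adds nothing. A criterion that looks at only one of the two quantities cannot tell apart the intermediate cases that lie between these extremes.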
Abstract
Motivation
Exploring potential drug–target interactions (DTIs) is a key step in drug discovery and repurposing. In recent years, predicting probable DTIs through computational methods has gradually become a research hotspot. However, most previous studies failed to judiciously take into account the consistency between the chemical properties of a drug and its functions. Changes in these relationships may have a severely negative effect on the prediction of DTIs.
Results
We propose an autoencoder-based method, AEFS, under spatial consistency constraints to predict DTIs. A heterogeneous network is established to integrate the information of drugs, proteins and diseases. The original drug features are projected to an embedding (protein) space by a multi-layer encoder, and further projected into the label (disease) space by a decoder. In this process, the clinical information of drugs is introduced to assist the DTI prediction. By maintaining the distribution of drug correlation in the original feature, embedding and label spaces, AEFS keeps the consistency between the chemical properties and functions of drugs. Experimental comparisons indicate that AEFS is more robust to imbalanced data and achieves significantly superior performance in DTI prediction. Case studies further confirm its ability to mine latent DTIs.
Availability and implementation
The code of AEFS is available at https://github.com/JackieSun818/AEFS.
Supplementary information
Supplementary data are available at Bioinformatics online.
Several studies have shown that low expression of epoxide hydrolase 1 (EPHX1) is closely associated with various human cancers, including hepatocellular carcinoma (HCC). This study aimed to explore the potential mechanism of EPHX1 silencing and revealed a novel regulatory pathway in the pathogenesis of HCC. In this study, micro ribonucleic acid (miR)‐184 was predicted and experimentally validated to be a regulator of EPHX1, and its expression was negatively correlated with the messenger RNA (mRNA) levels of EPHX1 in primary tumors. Elevation of EPHX1 suppressed cell proliferation and migration as well as cell cycle progression, and induced apoptosis, while downregulation of miR‐184 exhibited the opposite effect on cellular processes. Moreover, LINC00205 interacted with miR‐184 and was markedly downregulated in tumors. The effects of the miR‐184 inhibitor on cell proliferation, apoptosis, and migration were partly reversed by transfection with LINC00205 small interfering RNAs. In addition, LINC00205 acted as a molecular sponge to positively regulate the mRNA and protein levels of EPHX1 via regulating miR‐184. The tumorigenicity of HCC cells was enhanced by LINC00205 shRNA but diminished by overexpression of EPHX1 in vivo. Clinically, EPHX1 expression in patients with HCC was markedly downregulated. Taken together, the results of this study suggest that, as a competing endogenous RNA, LINC00205 may regulate EPHX1 by inhibiting miR‐184 in the progression of HCC and that targeting the LINC00205/miR‐184/EPHX1 axis may provide a treatment strategy for patients.
We demonstrated that decreased epoxide hydrolase 1 (EPHX1) expression was a common event underlying hepatocellular carcinoma (HCC), indicating that EPHX1 may exert a crucial effect on the progression of HCC. LINC00205 acts as a tumor suppressor, suppressing the progression of HCC by modulating the miR‐184/EPHX1 pathway.
MicroRNA (miRNA or miR) has been shown to play an important role in the initiation and development of many different cancers. Here, we demonstrated down‐regulated expression of miR‐27a‐3p in hepatocellular carcinoma (HCC) tissues in comparison with that in adjacent normal liver tissues based on the TCGA database. Cell viability and apoptosis were measured by CCK‐8 and flow cytometry assays. Cell invasion and migration were measured by Transwell and wound healing assays. The effect of miR‐27a‐3p on DUSP16 expression was evaluated by luciferase and western blot assays. Up‐regulation of miR‐27a‐3p by transfection with miR‐27a‐3p mimics markedly attenuated SMMC‐7721 and HepG2 cell viability, invasion, and migration. Moreover, we found that dual specificity phosphatase 16 (DUSP16), also known as mitogen‐activated protein kinase phosphatase 7 (MKP‐7), is a target of miR‐27a‐3p. DUSP16 expression was markedly decreased by miR‐27a‐3p at both the transcriptional and protein levels in both SMMC‐7721 and HepG2 cells. DUSP16 expression in HCC tissues was up‐regulated in comparison with that in adjacent liver tissues based on the TCGA database. Overexpression of DUSP16 significantly reversed the changes in cell viability, invasion and migration that resulted from miR‐27a‐3p up‐regulation in SMMC‐7721 and HepG2 cells. Our findings contribute to the current understanding of the functions of miR‐27a‐3p and suggest a mechanism by which miR‐27a‐3p plays an anti‐tumor role in the development of HCC by targeting DUSP16.
miR‐27a‐3p induces cell apoptosis and inhibits cell invasion and migration of HCC. miR‐27a‐3p negatively regulates DUSP16 expression. DUSP16 overexpression reverses the miR‐27a‐3p‐induced apoptosis and the decrease in invasion and migration of HCC cells.
Elm (Ulmus) has a long history of use as a high-quality heavy hardwood famous for its resistance to drought, cold, and salt. It grows in temperate, warm temperate, and subtropical regions. This is the first report of Ulmaceae chloroplast genomes by de novo sequencing. The Ulmus chloroplast genomes exhibited a typical quadripartite structure with two single-copy regions (the large single-copy, LSC, and small single-copy, SSC, sections) separated by a pair of inverted repeats (IRs). The lengths of the chloroplast genomes from five Ulmus species ranged from 158,953 to 159,453 bp, with the largest observed in Ulmus davidiana and the smallest in Ulmus laciniata. The genomes contained 137-145 protein-coding genes, of which Ulmus davidiana var. japonica and U. davidiana had the most and U. pumila had the fewest. The five Ulmus species exhibited different evolutionary routes, as some genes had been lost. In total, 18 genes contained introns, 13 of which (trnL-TAA+, trnL-TAA-, rpoC1-, rpl2-, ndhA-, ycf1, rps12-, rps12+, trnA-TGC+, trnA-TGC-, trnV-TAC-, trnI-GAT+, and trnI-GAT) were shared among all five species. The intron of ycf1 was the longest (5,675 bp) while that of trnF-AAA was the shortest (53 bp). All Ulmus species except U. davidiana exhibited the same degree of amplification in the IR region. To determine the phylogenetic positions of the Ulmus species, we performed phylogenetic analyses using common protein-coding genes in chloroplast sequences of 42 other species published in NCBI. The clustering results showed that the closest plants to Ulmaceae were Moraceae and Cannabaceae, followed by Rosaceae. Ulmaceae and Moraceae both belonged to Urticales, and the chloroplast genome clustering results were consistent with their traditional taxonomy. The results strongly supported the position of Ulmaceae as a member of the order Urticales. In addition, we found a potential error in the traditional taxonomies of U. davidiana and U. davidiana var. japonica, which should be confirmed with a further analysis of their nuclear genomes. This study is the first report on Ulmus chloroplast genomes, which has significance for understanding photosynthesis, evolution, and chloroplast transgenic engineering.
The partially randomized extended Kaczmarz method is effective for solving large, sparse, overdetermined and inconsistent linear systems. In this paper, we propose a modified variant of this method and give a tight upper bound for its convergence rate. Moreover, we verify the efficiency of the proposed method by numerical experiments.
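For orientation, the kind of iteration this abstract builds on can be sketched with the classical randomized extended Kaczmarz scheme for inconsistent systems. This is neither the partially randomized method nor the modified variant proposed in the paper, whose details are not given here. The function name, default iteration count, and the small test system are hypothetical, and the sketch assumes a dense matrix stored as a list of rows.

```python
import random


def randomized_extended_kaczmarz(A, b, iters=2000, seed=0):
    """Classical REK sketch: approximates the least-squares solution of A x ~ b."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    # Squared Euclidean norms of rows and columns, used as sampling weights.
    row_norms = [sum(v * v for v in row) for row in A]
    col_norms = [sum(A[i][j] ** 2 for i in range(m)) for j in range(n)]
    x = [0.0] * n
    z = [float(v) for v in b]  # z tends to the part of b outside range(A)
    for _ in range(iters):
        # Column step: remove from z its component along a random column j.
        j = rng.choices(range(n), weights=col_norms)[0]
        coef = sum(A[i][j] * z[i] for i in range(m)) / col_norms[j]
        for i in range(m):
            z[i] -= coef * A[i][j]
        # Row step: Kaczmarz projection for the corrected right-hand side b - z.
        i = rng.choices(range(m), weights=row_norms)[0]
        resid = b[i] - z[i] - sum(A[i][k] * x[k] for k in range(n))
        step = resid / row_norms[i]
        for k in range(n):
            x[k] += step * A[i][k]
    return x


# Hypothetical inconsistent 3x2 system; its least-squares solution is [4/3, 7/3].
A_demo = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b_demo = [1.0, 2.0, 4.0]
x_demo = randomized_extended_kaczmarz(A_demo, b_demo)
```

The auxiliary vector z is what lets the iteration cope with inconsistency: a plain randomized Kaczmarz iteration on the same system would keep oscillating around the least-squares solution instead of settling on it.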
The higher-order modified Korteweg–de Vries (mKdV) equation with a constant background is investigated based on the Riemann–Hilbert problem (RHP). From the derivation of the RHP, the one-soliton solution (oSS) and simple breather solution (sBS) of the higher-order mKdV equation are obtained for the first time. In addition, the dynamic behavior of the oSS and sBS, which has not been studied in published works, is further discussed through graphs with appropriately selected parameters.
Unsupervised feature selection has gained considerable attention for extracting valuable features from unlabeled datasets. Existing approaches typically rely on sparse mapping matrices to preserve local neighborhood structures. However, this strategy favors large-weight features, potentially overlooking smaller yet valuable ones and distorting data distribution and feature structure. Besides, some methods focus on local structure information, failing to explore global information. To address these limitations, we introduce an exponential weighting mechanism to induce a rational feature distribution and explore data structure in the feature subspace. Specifically, we propose a unified framework incorporating local structure learning and exponentially weighted sparse regression for optimal feature combinations, preserving global and local information. Experimental results demonstrate the superiority of our approach over existing unsupervised feature selection methods.
• We introduce an exponential weighting mechanism to adjust the feature weight distribution.
• We learn feature weights in the feature subspace to preserve sample distribution and feature structure.
• We propose a UFS framework that captures local neighborhood structure and global discriminative information.
Background
Laryngeal squamous cell carcinoma (LSCC) is a highly aggressive type of head and neck squamous cell carcinoma; the genes involved in the development of LSCC still need exploration.
Methods
We downloaded expression profiles of 96 LSCC patients (85 in advanced stage and 11 in early stage) from TCGA-HNSC. Functional enrichment and protein-protein interaction analyses of genes in significant modules were conducted. Univariate and multivariate Cox regression analyses were performed to explore potential prognostic biomarkers for LSCC. The expression levels of genes at different stages were compared and visualized via boxplots. Immune infiltration was examined by the CIBERSORTx web-based tool and depicted with ggplot2. Gene set enrichment analysis (GSEA) was utilized to analyze functional enrichment terms and pathways. Immunohistochemical staining (IHC) was used to verify the expression of genes in the LSCC samples.
Results
We identified 25 modules via weighted gene co-expression network analysis (WGCNA), including 3 modules significantly related to tumor stages of LSCC. UIMC1, NPM1, and DCTN4 in the module 'cyan', TARS in the module 'darkorange', and COPB2 and RYK in the module 'lightyellow' were significantly related to overall survival. The expression of COPB2, DCTN4, RYK, TARS, and UIMC1 was associated with changes in the fraction of immune cells in LSCC patients; two genes, COPB2 and RYK, were differentially expressed across tumor stages of LSCC. Finally, COPB2 and RYK showed high expression in tumor tissues of advanced LSCC patients.
Conclusions
Our study provides a potential perspective for analyzing the progression of LSCC and exploring prognostic genes.
Keywords: LSCC, WGCNA, Immune infiltration, Tumor stages, GSEA