Identifying essential genes in a given organism is important for research on their fundamental roles in organism survival. Furthermore, if possible, uncovering the links between core functions or pathways with these essential genes will further help us obtain deep insight into the key roles of these genes. In this study, we investigated the essential and non-essential genes reported in a previous study and extracted gene ontology (GO) terms and biological pathways that are important for the determination of essential genes. Through the enrichment theory of GO and KEGG pathways, we encoded each essential/non-essential gene into a vector in which each component represented the relationship between the gene and one GO term or KEGG pathway. To analyze these relationships, the maximum relevance minimum redundancy (mRMR) method was adopted. Then, incremental feature selection (IFS) and a support vector machine (SVM) were employed to extract important GO terms and KEGG pathways. A prediction model was built simultaneously using the extracted GO terms and KEGG pathways, which yielded nearly perfect performance, with a Matthews correlation coefficient of 0.951, for distinguishing essential and non-essential genes. To fully investigate the key factors influencing the fundamental roles of essential genes, the 21 most important GO terms and three KEGG pathways were analyzed in detail. In addition, several genes that our prediction model predicted to be essential are provided in this study. We suggest that this study provides more functional and pathway information on the essential genes and provides a new way to investigate related problems.
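The Matthews correlation coefficient (MCC) quoted above summarizes a binary confusion matrix in a single value between -1 and 1. The following is a minimal sketch of the standard formula, not the authors' evaluation code; the function name and the example counts are hypothetical and are not taken from the study.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Return the MCC for a binary confusion matrix.

    tp/tn/fp/fn are counts of true positives, true negatives,
    false positives and false negatives.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the MCC is set to 0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(round(matthews_corrcoef(tp=90, tn=95, fp=5, fn=10), 3))
```

Unlike plain accuracy, the MCC stays informative when the two classes (here, essential and non-essential genes) differ greatly in size.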
Forecasting alterations in protein stability caused by variations holds immense importance. Improving the thermal stability of proteins is important for biomedical and industrial applications. This review discusses the latest methods for predicting the effects of mutations on protein stability, databases containing protein mutations and thermodynamic parameters, and experimental techniques for efficiently assessing protein stability in high‐throughput settings. Various publicly available databases for protein stability prediction are introduced. Furthermore, state‐of‐the‐art computational approaches for anticipating protein stability changes due to variants are reviewed. Each method's types of features, base algorithm, and prediction results are also detailed. Additionally, some experimental approaches for verifying the prediction results of computational methods are introduced. Finally, the review summarizes the progress and challenges of protein stability prediction and discusses potential models for future research directions.
Colorectal cancer is the third most common cancer in males and second in females. This disease can be caused by genetic and acquired/environmental factors. Microsatellite instability (MSI) is one of the major mechanisms in colorectal cancer. This mechanism is a specific condition of genetic hypermutability that results from incompetent DNA mismatch repair. MSI has been applied to classify different colorectal cancer subtypes. However, the effects of MSI status on gene expression are largely unknown. In our study, we integrated the gene expression profile and MSI status of all CRC samples from the TCGA database, and then categorized the CRC samples into three subgroups, namely, MSI‐stable, MSI‐low, and MSI‐high, according to the MSI status. We applied a novel computational method based on machine learning and screened the genes specifically expressed for the different colorectal cancer subtypes. The results showed the distinct mechanisms of the different colorectal cancer subtypes with MSI status and provided the genes that may be the optimal standards to further classify the various molecular subtypes of colorectal cancer with distinct MSI status.
What's new?
Microsatellite instability (MSI), a key genetic mechanism implicated in colorectal cancer (CRC), is linked to drug reactivity and sensitivity in CRC patients and is useful for CRC subtype classification. Yet, little is known about the identity of MSI‐associated genes or their role in CRC. Here, combined analysis of datasets on gene‐expression profile and MSI status enabled the investigation of a number of differentially expressed genes from CRC samples. Genes optimal for the classification of CRC subtypes with different MSI statuses were identified. The gene panel could facilitate the discovery of biomarkers specific for CRCs with known MSI status.
Synthetic lethality is the combination of mutations that leads to cell death. Tumor‐specific synthetic lethality has been targeted in research to improve cancer therapy. With the advances of techniques in molecular biology, such as RNAi and CRISPR/Cas9 gene editing, efforts have been made to systematically identify synthetic lethal interactions, especially for frequently mutated genes in cancers. However, elucidating the mechanism of synthetic lethality remains a challenge because of the complexity of its influencing conditions. In this study, we proposed a new computational method to identify critical functional features that can accurately predict synthetic lethal interactions. This method incorporates several machine learning algorithms and encodes protein‐coding genes by an enrichment system derived from gene ontology terms and Kyoto Encyclopedia of Genes and Genomes pathways to represent their functional features. We built a random forest‐based prediction engine by using 2120 selected features and obtained a Matthews correlation coefficient of 0.532. We examined the top 15 features and found that most of them have potential roles in synthetic lethality according to previous studies. These results demonstrate the ability of our proposed method to predict synthetic lethal interactions and provide a basis for further characterization of these particular genetic combinations.
A computational analysis of synthetic lethality was performed in this study. Synthetic lethality gene pairs were encoded via enrichment theory of Gene Ontology and Kyoto Encyclopedia of Genes and Genomes. Advanced computational methods were adopted to build an optimal prediction model and extract important features.
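The abstract does not state how an enrichment score is computed. One common formulation, assumed here purely for illustration, scores a gene against a GO term or KEGG pathway as the negative base-10 logarithm of a hypergeometric tail probability for the overlap between a gene set associated with the gene (for example, its interaction neighbors) and the genes annotated to the term. All parameter names below are hypothetical.

```python
import math


def enrichment_score(n_background, n_term, n_query, n_overlap):
    """-log10 of the hypergeometric upper-tail p-value (assumed formulation).

    n_background: total number of genes considered
    n_term:       genes annotated to the GO term or pathway
    n_query:      genes in the set associated with the gene of interest
    n_overlap:    genes shared by the two sets
    """
    upper = min(n_term, n_query)
    if not 0 <= n_overlap <= upper:
        raise ValueError("overlap must lie between 0 and min(n_term, n_query)")
    # P(X >= n_overlap) when drawing n_query genes without replacement.
    tail = sum(
        math.comb(n_term, k) * math.comb(n_background - n_term, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    p_value = tail / math.comb(n_background, n_query)
    return -math.log10(p_value)


if __name__ == "__main__":
    # Hypothetical sizes: 10 genes in total, 5 annotated, 5 queried, 5 shared.
    print(enrichment_score(10, 5, 5, 5))
```

Repeating this for every term and pathway would yield one numeric vector per gene, and vectors for a gene pair could then be combined into a single feature vector. How the study actually combines them is not stated in the text above.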
Many differences between different ethnic groups have been observed, such as skin color, eye color, height, susceptibility to some diseases, and response to certain drugs. However, the genetic bases of such differences have been under-investigated. Since the HapMap project, large-scale genotype data from Caucasian, African and Asian population samples have been available. The project found that these populations were located in different areas of the PCA (Principal Component Analysis) plot. However, as an unsupervised method, PCA does not measure the differences in each single nucleotide polymorphism (SNP) among populations.
We applied an advanced mutual information-based feature selection method to detect associations between SNP status and ethnic groups using the latest HapMap Phase 3 release version 3, which included more sub-populations. A total of 299 SNPs were identified, and they accurately predicted the ethnicity of all HapMap populations. The 10-fold cross-validation accuracy of the SMO (sequential minimal optimization) model on the training dataset was 0.901, and the accuracy on the independent test dataset was 0.895.
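The core quantity behind a mutual information-based selection method can be sketched as follows. This is only the relevance term (the mutual information between one SNP's genotypes and the population labels), not the full method used in the study. The genotype coding (0, 1, 2) and the sample data are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(genotypes, labels):
    """Mutual information (in bits) between two discrete sequences."""
    if len(genotypes) != len(labels) or not genotypes:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(genotypes)
    count_x = Counter(genotypes)
    count_y = Counter(labels)
    count_xy = Counter(zip(genotypes, labels))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts.
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


if __name__ == "__main__":
    populations = ["A", "A", "B", "B"]   # hypothetical population labels
    informative_snp = [0, 0, 2, 2]       # genotype tracks the population
    uninformative_snp = [0, 2, 0, 2]     # genotype is independent of it
    print(mutual_information(informative_snp, populations))
    print(mutual_information(uninformative_snp, populations))
```

Ranking SNPs by this score favors markers whose genotype distribution differs between populations, which is the per-SNP measurement that an unsupervised PCA plot does not provide.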
In-depth functional analysis of these SNPs and their nearby genes revealed the genetic bases of skin and eye color differences among populations.
To develop high‐performance thermally activated delayed fluorescence (TADF) exciplex emitters, a novel strategy of introducing a single‐molecule TADF emitter as one of the constituting materials has been presented. Such a new type of exciplex TADF emitter will have two reverse intersystem crossing (RISC) routes on both the pristine TADF molecules and the exciplex emitters, benefiting the utilization of triplet excitons. Based on a newly designed and synthesized single‐molecule TADF emitter MAC, a highly efficient exciplex emitter MAC:PO‐T2T has been obtained. The device based on MAC:PO‐T2T with a weight ratio of 7:3 exhibits a low turn‐on voltage of 2.4 V, high maximum efficiency of 52.1 cd A−1 (current efficiency), 45.5 lm W−1 (power efficiency), and 17.8% (external quantum efficiency, EQE), as well as a high EQE of 12.3% at a luminance of 1000 cd m−2. The device shows the best performance among reported organic light‐emitting devices based on exciplex emitters. Such high efficiency and low efficiency roll‐off should be ascribed to the additional reverse intersystem crossing process on the MAC molecules, showing the advantages of the strategy described in this study.
A new type of high‐performance exciplex thermally activated delayed fluorescence (TADF) emitter is demonstrated by introducing a single‐molecule TADF emitter as one of the constituting materials. The OLED based on the novel emitter shows a low turn‐on voltage of 2.4 V and a maximum external quantum efficiency of 17.8% with mild efficiency roll‐off, which offers a new strategy for designing efficient exciplex emitters.
Background: Chronic inflammatory disorders in atrial fibrillation (AF) contribute to the onset of ischemic stroke. Systemic immune inflammation index (SIII) and system inflammation response index (SIRI) are two novel and convenient measurements that are positively associated with body inflammation. However, little is known regarding the association between SIII/SIRI and the presence of AF among patients with ischemic stroke. Methods: A total of 526 ischemic stroke patients (173 with AF and 353 without AF) were consecutively enrolled in our study from January 2017 to June 2019. SIII and SIRI were measured in both groups. Logistic regression analysis was used to analyse the potential association between SIII/SIRI and the presence of AF. Finally, the correlations between hospitalization expenses, changes in the National Institutes of Health Stroke Scale (NIHSS) scores and SIII/SIRI values were measured. Results: In patients with ischemic stroke, SIII and SIRI values were significantly higher in AF patients than in non-AF patients (all p < 0.001). Moreover, across increasing quartiles of SIII and SIRI in all patients, the proportion of patients with AF gradually increased relative to that of non-AF patients. Logistic regression analyses demonstrated that log-transformed SIII and log-transformed SIRI were independently associated with the presence of AF in patients with ischemic stroke (log-transformed SIII: odds ratio (OR) = 1.047, 95% confidence interval (CI) = 0.322-1.105, p = 0.047; log-transformed SIRI: OR = 6.197, 95% CI = 2.196-17.484, p = 0.001). Finally, positive correlations between hospitalization expenses, changes in the NIHSS scores and SIII/SIRI were found, which were more significant in patients with AF (all p < 0.05). Conclusions: Our study suggests SIII and SIRI are convenient and effective measurements for predicting the presence of AF in patients with ischemic stroke. Moreover, they were correlated with increased financial burden and poor short-term prognosis in AF patients presenting with ischemic stroke.
Keywords: Atrial fibrillation, Systemic immune inflammation index, Systemic inflammation response index, Ischemic stroke
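The abstract does not spell out how the two indices are derived. The sketch below uses their commonly cited definitions, which is an assumption about this particular study: SIII as platelet count × neutrophil count / lymphocyte count, and SIRI as neutrophil count × monocyte count / lymphocyte count, with all counts in units of 10^9 cells per litre. The function names and the example blood counts are hypothetical.

```python
def systemic_immune_inflammation_index(platelets, neutrophils, lymphocytes):
    """SIII = platelets * neutrophils / lymphocytes (counts in 10^9/L)."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return platelets * neutrophils / lymphocytes


def systemic_inflammation_response_index(neutrophils, monocytes, lymphocytes):
    """SIRI = neutrophils * monocytes / lymphocytes (counts in 10^9/L)."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return neutrophils * monocytes / lymphocytes


if __name__ == "__main__":
    # Hypothetical blood counts, for illustration only.
    print(systemic_immune_inflammation_index(200.0, 4.0, 2.0))
    print(systemic_inflammation_response_index(4.0, 0.5, 2.0))
```

Both indices need only a routine complete blood count, which is consistent with their description above as convenient measurements.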
Breast cancer is regarded worldwide as a severe human disease. Various genetic variations, including hereditary and somatic mutations, contribute to the initiation and progression of this disease. The diagnostic parameters of breast cancer are not limited to the conventional protein content and can include newly discovered genetic variants and even genetic modification patterns such as methylation and microRNA. In addition, breast cancer detection extends to detailed breast cancer stratifications to provide subtype-specific indications for further personalized treatment. One genome-wide expression-methylation quantitative trait loci analysis confirmed that different breast cancer subtypes have various methylation patterns. However, recognizing clinically applied (methylation) biomarkers is difficult due to the large number of differentially methylated genes. In this study, we attempted to re-screen a small group of functional biomarkers for the identification and distinction of different breast cancer subtypes with advanced machine learning methods. The findings may contribute to biomarker identification for different breast cancer subtypes and provide a new perspective for differential pathogenesis in breast cancer subtypes.
Protein‐protein interactions (PPIs) form the basis of a myriad of biological pathways and mechanisms, such as the formation of protein complexes or the components of signaling cascades. Here, we reviewed experimental methods for identifying PPI pairs, including yeast two‐hybrid (Y2H), mass spectrometry (MS), co‐localization, and co‐immunoprecipitation. Furthermore, a range of computational methods leveraging biochemical properties, evolutionary history, protein structures and more have enabled identification of additional PPIs. Given the wealth of known PPIs, we reviewed important network methods to construct and analyze networks of PPIs. These methods aid biological discovery through identifying hub genes and dynamic changes in the network, and have been thoroughly applied in various fields of biological research. Lastly, we discussed the challenges and future direction of research utilizing the power of PPI networks.
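As a small illustration of the hub-gene idea mentioned above, the sketch below ranks proteins by degree (number of distinct interaction partners) in an undirected PPI network. Degree is only one of several possible hub criteria, and the review may discuss others. The protein identifiers and the edge list are hypothetical.

```python
from collections import defaultdict


def rank_by_degree(interactions):
    """Return (protein, degree) pairs sorted by decreasing degree.

    interactions: iterable of (protein_a, protein_b) pairs, treated as
    undirected; self-interactions and duplicate pairs are ignored.
    """
    partners = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue
        partners[a].add(b)
        partners[b].add(a)
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(
        ((protein, len(nbrs)) for protein, nbrs in partners.items()),
        key=lambda item: (-item[1], item[0]),
    )


if __name__ == "__main__":
    # Hypothetical interaction pairs, for illustration only.
    edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]
    for protein, degree in rank_by_degree(edges):
        print(protein, degree)
```

In practice a cutoff on this ranking (for example, a top percentile) would be needed to call a protein a hub; that threshold is a modeling choice not specified here.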