Viral infections have been a major health issue over the last decade. Antiviral peptides (AVPs) are a subclass of antimicrobial peptides (AMPs) with substantial potential to protect the human body against various viral diseases. Although antiviral vaccines and medications have been produced in significant numbers, the recent development of AVPs as antiviral agents suggests an effective way to treat virus-affected cells. Intelligent machine learning techniques for developing peptide-based therapeutic agents are also attracting increasing interest because of their significant outcomes. The existing wet-laboratory-based methods are expensive, time-consuming, and cannot effectively screen and predict the targeted motif of antiviral peptides.
In this paper, we propose a novel computational model called Deepstacked-AVPs to discriminate AVPs accurately. The training sequences are numerically encoded using a novel tri-segmentation-based position-specific scoring matrix (PSSM-TS) and word2vec-based semantic features. Composition/Transition/Distribution-Transition (CTDT) is also employed to represent the physicochemical properties based on structural features. The PSSM-TS features, semantic information, and CTDT descriptors are then fused into a single vector to compensate for the limitations of single encoding methods. Information gain (IG) is applied to choose the optimal feature set. A stacked-ensemble classifier is trained on the selected features.
The proposed Deepstacked-AVPs model achieved a predictive accuracy of 96.60%, an area under the curve (AUC) of 0.98, and a precision-recall (PR) value of 0.97 using training samples. In the case of the independent samples, our model obtained an accuracy of 95.15%, an AUC of 0.97, and a PR value of 0.97.
Our Deepstacked-AVPs model outperformed existing models with ~4% and ~2% higher accuracy using training and independent samples, respectively. The reliability and efficacy of the proposed Deepstacked-AVPs model make it a valuable tool for scientists, and it may play a beneficial role in pharmaceutical design and research academia.
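The information gain step mentioned in the abstract above can be illustrated with a minimal Python sketch. It assumes each feature has already been discretized into a few bins, which the abstract does not specify; the function names and the toy data are hypothetical and are not taken from the Deepstacked-AVPs code.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG = H(labels) - sum over values v of p(v) * H(labels | feature == v)."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels, top_k):
    """Return the indices of the top_k feature columns by decreasing IG."""
    scores = [information_gain(col, labels) for col in columns]
    order = sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)
    return order[:top_k]


# Toy data (hypothetical): feature 0 separates the classes, feature 1 does not.
labels = [1, 1, 0, 0]
columns = [["hi", "hi", "lo", "lo"], ["hi", "lo", "hi", "lo"]]
selected = rank_features(columns, labels, top_k=1)
print(selected)
```

Continuous descriptors such as PSSM-derived scores would need binning before this calculation; the binning strategy is a design choice that this sketch leaves open.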
• A computational model is developed for anticancer peptides.
• Three discrete feature extraction methods are used.
• An ensemble feature space is formed.
• Simple majority and GA-based majority voting are used.
• Quite promising results are obtained compared with existing methods.
Cancer is a fatal disease, responsible for one-quarter of all deaths in developed countries. Traditional anticancer therapies, such as chemotherapy and radiation, are highly expensive, susceptible to errors, and ineffective. These conventional techniques induce severe side effects on human cells. Given the perilous impact of cancer, an accurate and highly efficient intelligent computational model is desirable for the identification of anticancer peptides. In this paper, an evolutionary genetic algorithm-based ensemble model, ‘iACP-GAEnsC’, is proposed for the identification of anticancer peptides. In this model, the protein sequences are formulated using three different discrete feature representation methods, i.e., amphiphilic pseudo amino acid composition, g-gap dipeptide composition, and reduced amino acid alphabet composition. The performance of the extracted feature spaces is investigated separately, and the spaces are then merged to exhibit the significance of hybridization. In addition, the predicted results of individual classifiers are combined using an optimized genetic algorithm and a simple majority technique in order to enhance the true classification rate. It is observed that genetic algorithm-based ensemble classification outperforms the individual classifiers as well as the simple majority voting-based ensemble. The genetic algorithm-based ensemble classification performs best on the hybrid feature space, with an accuracy of 96.45%. In comparison to the existing techniques, the ‘iACP-GAEnsC’ model has achieved remarkable improvement in terms of various performance metrics. Based on the simulation results, it is observed that the ‘iACP-GAEnsC’ model might be a leading tool in the field of drug design and proteomics for researchers.
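As an illustration of one of the encodings named above, the following Python sketch computes a g-gap dipeptide composition. It assumes the common definition (frequency of residue pairs separated by g intervening positions, normalized by the number of pairs); the function name and normalization are assumptions, not details confirmed by the abstract.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def g_gap_dipeptide_composition(sequence, gap):
    """Map each of the 400 residue pairs to its frequency at the given gap.

    gap=0 pairs adjacent residues; gap=1 skips one residue, and so on.
    """
    n_pairs = len(sequence) - gap - 1
    if n_pairs <= 0:
        raise ValueError("sequence is too short for this gap")
    counts = {"".join(p): 0 for p in product(AMINO_ACIDS, repeat=2)}
    for i in range(n_pairs):
        pair = sequence[i] + sequence[i + gap + 1]
        if pair in counts:  # pairs with non-standard residues are skipped
            counts[pair] += 1
    return {pair: count / n_pairs for pair, count in counts.items()}


# Hypothetical toy peptide, used only to show the call.
features = g_gap_dipeptide_composition("ACAC", gap=1)
print(len(features), features["AA"], features["CC"])
```

The result is a fixed-length 400-dimensional descriptor regardless of peptide length, which is what allows peptides of different lengths to share one feature space.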
Antioxidant proteins are involved in several biological processes and can protect DNA and cells from the damage of free radicals. These proteins regulate the body's oxidative stress and perform a significant role in many antioxidant-based drugs. The current in vitro approaches are costly, time-consuming, and unable to efficiently screen and identify the targeted motif of antioxidant proteins. In this study, we propose an accurate prediction method, namely StackedEnC-AOP, to discriminate antioxidant proteins. The training sequences are encoded by incorporating a discrete wavelet transform (DWT) into the evolutionary matrix: the PSSM-based images are decomposed via two levels of DWT to form a pseudo position-specific scoring matrix (PsePSSM-DWT) based embedded vector. Additionally, the evolutionary difference formula and composite physicochemical properties methods are employed to collect the structural and sequential descriptors. The combined vector of sequential features, evolutionary descriptors, and physicochemical properties is then produced to cover the flaws of individual encoding schemes. To reduce the computational cost of the combined feature vector, the optimal features are chosen using minimum redundancy and maximum relevance (mRMR). The optimal feature vector is trained using a stacking-based ensemble meta-model. Our StackedEnC-AOP method reported a prediction accuracy of 98.40% and an AUC of 0.99 on the training sequences. For validation on an independent set, the trained StackedEnC-AOP model achieved an accuracy of 96.92% and an AUC of 0.98. StackedEnC-AOP performed significantly better than current computational models, with ~5% and ~3% improved accuracy on the training and independent sets, respectively. The efficacy and consistency of StackedEnC-AOP make it a valuable tool for data scientists, and it can play a key role in research academia and drug design.
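The two-level wavelet decomposition described above can be sketched on a single PSSM column as follows. The abstract does not state which wavelet family is used, so the Haar wavelet here is an assumption chosen for simplicity, as is the padding rule for odd-length signals; the function names are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    signal = list(signal)
    if len(signal) % 2:  # assumption: pad odd-length signals with the last value
        signal.append(signal[-1])
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwt(signal, levels=2):
    """Apply haar_step repeatedly to the approximation coefficients.

    Returns the final approximation and a list of detail bands,
    one per level (finest first).
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


# Hypothetical scores for one amino-acid column of a PSSM.
column = [4.0, 6.0, 10.0, 12.0]
approx, details = haar_dwt(column, levels=2)
print(approx, details)
```

In a feature pipeline, summary statistics of each band (for example mean and standard deviation) would typically be collected per PSSM column to obtain a fixed-length vector; that aggregation step is not shown here.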
Cancer is a leading cause of death globally; it occurs when cellular changes cause the abnormal growth and division of cells. Conventional treatments and wet experimental methods are deemed unsatisfactory because of their huge cost and laborious nature. However, the recent innovation of anticancer peptides (ACPs) offers an effective way to treat cancer-affected cells. Due to the rapid growth of biological sequences, the accurate identification of ACPs has become a difficult task for scientists. Therefore, given the importance of ACPs, an efficient and reliable intelligent model is essential to accurately identify their patterns. In this study, three distinct encoding schemes are employed to obtain features from peptide sequences. K-space amino acid pair (KSAAP) is used to extract highly correlated and effective descriptors. Apart from the sequential features, composite physicochemical properties are applied to gather local structure descriptors. Furthermore, to represent the intrinsic residue information of amino acids, autocovariance is also used. Additionally, a novel two-level feature selection (2LFS) method is utilized to select highly discriminative features and to minimize the dimensionality of the proposed descriptors. Finally, to examine the performance of the proposed model, several learning hypotheses are investigated to select a superior operational engine. To measure the generalization capability, two diverse benchmark datasets are used. The empirical outcomes show that KSAAP using 2LFS reported high classification results on both datasets. The classification outcomes reveal that our proposed cACP-2LFS achieved ~11% higher accuracy than the models reported in the literature so far. It is expected that our proposed model might be useful in the areas of medicine, proteomics, and research academia.
The source code and all datasets are publicly available at https://github.com/shahidawkum/cACP-2LFS.
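The autocovariance descriptor mentioned in the abstract above can be sketched in Python as follows. This assumes the usual lag-based definition over a single physicochemical property and uses the Kyte-Doolittle hydropathy scale as an example property; which properties the cACP-2LFS model actually uses is not stated in the abstract, and the function name is hypothetical.

```python
# Kyte-Doolittle hydropathy index, used here only as an example property.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def autocovariance(sequence, max_lag, scale=KYTE_DOOLITTLE):
    """Autocovariance of one property along the sequence for lags 1..max_lag.

    AC(lag) = sum_i (P_i - mean) * (P_{i+lag} - mean) / (L - lag).
    Residues missing from the scale raise KeyError.
    """
    values = [scale[residue] for residue in sequence]
    length = len(values)
    if max_lag >= length:
        raise ValueError("max_lag must be smaller than the sequence length")
    mean = sum(values) / length
    return [
        sum((values[i] - mean) * (values[i + lag] - mean) for i in range(length - lag))
        / (length - lag)
        for lag in range(1, max_lag + 1)
    ]


# Hypothetical peptide, used only to show the call.
print(autocovariance("GLFDIVKKVV", max_lag=3))
```

With several properties, the per-property vectors would be concatenated, giving max_lag values per property.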
• A computational model is developed for N6-methyladenosine sites.
• PseDNC, PseTNC, STNC, and STTNC discrete methods are utilized for feature extraction.
• KNN, SVM, and PNN are used as classifiers.
• Quite promising results are obtained compared with existing methods.
N6-methyladenosine (m6A) is a vital post-transcriptional modification, which adds another layer of epigenetic regulation at the RNA level. It chemically modifies mRNA, which affects protein expression. An RNA sequence contains many genetic code motifs (GAC). Among these, identifying whether a GAC motif is methylated or not is indispensable. However, with the large number of RNA sequences generated in the post-genomic era, accurately and quickly characterizing these sequences has become a challenging task. In view of this, an intelligent computational model, “iMethyl-STTNC”, that accurately and quickly reflects the motif of the desired classes is proposed for the identification of methyladenosine sites in RNA. In the proposed study, four feature extraction techniques, namely pseudo-dinucleotide composition, pseudo-trinucleotide composition, split-trinucleotide composition, and split-tetra-nucleotide composition (STTNC), are utilized to obtain numerical descriptors. Three different classification algorithms, namely probabilistic neural network, support vector machine (SVM), and K-nearest neighbor, are adopted for prediction. After examining the outcomes of the prediction model on each feature space, SVM using the STTNC feature space reported the highest accuracies of 69.84% and 91.84% on dataset1 and dataset2, respectively. The reported results show that our proposed predictor has achieved encouraging results compared with the approaches reported so far. It is finally reckoned that our developed model might be beneficial for in-depth analysis of genomes and drug development.
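A split tetranucleotide composition can be sketched in Python as below. The sketch assumes that the RNA sequence is cut into three roughly equal segments and that a normalized 4-mer composition is computed for each; the number of segments, the normalization, and the function names are assumptions rather than details given in the abstract.

```python
from itertools import product


def kmer_composition(sequence, k=4, alphabet="ACGU"):
    """Normalized frequency of every k-mer over the alphabet (4**k values)."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = max(len(sequence) - k + 1, 0)
    for i in range(windows):
        kmer = sequence[i:i + k]
        if kmer in counts:  # windows with unknown symbols are skipped
            counts[kmer] += 1
    total = max(windows, 1)  # avoids division by zero for very short segments
    return [counts[kmer] / total for kmer in kmers]


def split_kmer_composition(sequence, k=4, parts=3):
    """Concatenate the k-mer composition of `parts` consecutive segments."""
    size = len(sequence) // parts
    segments = [sequence[i * size:(i + 1) * size] for i in range(parts - 1)]
    segments.append(sequence[(parts - 1) * size:])  # last segment takes the remainder
    vector = []
    for segment in segments:
        vector.extend(kmer_composition(segment, k))
    return vector


# Hypothetical RNA fragment, used only to show the call.
features = split_kmer_composition("AAAACCCCGGGGU")
print(len(features))
```

Splitting before counting keeps some positional information that a single whole-sequence composition would discard, at the cost of a feature vector three times as long.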
Myrmoteras scabrum Moffett, 1985 is located again in the Western Ghats. The species was described based upon a single specimen collected from Cannanore District of the Western Ghats by Moffett (Bull Mus Comp Zool 151(1):1–53, 1985). Only the second known specimen of this species, from the Thanikuddy region of Periyar Tiger Reserve, is reported here. Auto-montage images of the species are provided herewith along with notes on its biology.
Acute myeloid leukemia (AML) stem cells are required for the initiation and maintenance of the disease. Activation of the Wnt/β-catenin pathway is required for the survival and development of AML leukemia stem cells (LSCs); therefore, targeting β-catenin is a potential therapeutic strategy. NUC-7738, a phosphoramidate transformation of 3'-deoxyadenosine (3'-dA) monophosphate, is specifically designed to generate the active anti-cancer metabolite 3'-deoxyadenosine triphosphate (3'-dATP) intracellularly, bypassing key limitations of breakdown, transport, and activation. NUC-7738 is currently in a Phase I/II clinical study for the treatment of patients with advanced solid tumors. Protein expression and immunophenotypic profiling revealed that NUC-7738 caused apoptosis in AML cell lines by reducing PI3K-p110α, phosphorylated Akt (Ser473), and phosphorylated GSK3β (Ser9), resulting in reduced β-catenin, c-Myc, and CD44 expression. NUC-7738 reduced β-catenin nuclear expression in AML cells. NUC-7738 also decreased the percentage of CD34+ CD38- CD123+ cells (LSC-like cells) from 81% to 47% and reduced the total number and size of leukemic colonies. These results indicate that therapeutic targeting of the PI3K/Akt/GSK3β axis can inhibit β-catenin signaling, resulting in reduced clonogenicity and eventual apoptosis of AML cells.
Mycobacterium tuberculosis, a highly perilous pathogen in humans, serves as the causative agent of tuberculosis (TB), affecting nearly 33% of the global population. With the increasing prevalence of multidrug-resistant TB, there is a need for novel and efficacious alternative therapies. Peptide therapies have emerged as a favorable alternative due to their remarkable specificity in targeting cells without affecting healthy cells. However, the experimental identification methods of anti-tubercular peptides (AtbPs) are labor-intensive and costly. Moreover, accurate prediction of AtbPs has become challenging due to the large number of peptide samples. In this paper, we propose an ensemble learning model to enhance the prediction outcomes by addressing the limitations of individual learning models. We formulate the training samples by utilizing four distinct representation methods to numerically encode peptide samples: AAindex, Composition/Transition/Distribution, Dipeptide Deviation from Expected Mean, and Enhanced Grouped Amino Acid Composition. The feature vectors extracted from these methods are fused to develop a compact vector. We evaluate the prediction rates using three different classification models, employing both individual and heterogeneous vectors. Furthermore, we enhance the prediction and training capabilities of the proposed model by using the predicted labels of the individual classifiers to implement an ensemble deep model via a genetic algorithm. On the training and independent datasets, our proposed ensemble learner achieves accuracies of 97.80%, 95.13%, 93.91%, and 94.17% using the RD training, MD training, RD independent, and MD independent datasets, respectively. Our findings demonstrate that the proposed pAtbP-EnC model outperforms existing predictors by reporting approximately 11% higher training accuracy.
We conclude that the pAtbP-EnC predictor will be a valuable tool in the field of pharmaceutical design and research academia. The datasets used and the source code are publicly available at https://github.com/Intelligent-models/pAtbP-EnC2023.
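One way to combine the predicted labels of individual classifiers with a genetic algorithm is to search for voting weights, as in the minimal Python sketch below. This is an assumption about the general technique, not the pAtbP-EnC implementation: the fitness function, crossover, mutation, and all function names and toy data are hypothetical.

```python
import random


def weighted_vote(weights, votes):
    """Binary prediction from per-classifier 0/1 votes and non-negative weights."""
    score = sum(w * v for w, v in zip(weights, votes))
    return 1 if score > 0.5 * sum(weights) else 0


def accuracy(weights, predictions, labels):
    """predictions[i][j] is the vote of classifier j on sample i."""
    hits = sum(weighted_vote(weights, row) == y for row, y in zip(predictions, labels))
    return hits / len(labels)


def ga_voting_weights(predictions, labels, pop_size=20, generations=30, seed=0):
    """Search voting weights with elitism, uniform crossover, and Gaussian mutation."""
    rng = random.Random(seed)
    n = len(predictions[0])

    def fitness(weights):
        return accuracy(weights, predictions, labels)

    # One-hot individuals reproduce each single classifier, so with elitism the
    # result is never worse than the best single classifier on this data.
    population = [[1.0 if i == j else 0.0 for i in range(n)] for j in range(n)]
    while len(population) < pop_size:
        population.append([rng.random() for _ in range(n)])
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[:max(2, pop_size // 4)]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            k = rng.randrange(n)
            child[k] = min(1.0, max(0.0, child[k] + rng.gauss(0, 0.2)))  # mutation
            children.append(child)
        population = elite + children
    return max(population, key=fitness)


# Hypothetical votes of three classifiers on six samples.
labels = [1, 0, 1, 1, 0, 0]
predictions = [[0, 1, 1], [1, 0, 0], [1, 0, 1], [1, 0, 1], [0, 0, 1], [0, 0, 1]]
best = ga_voting_weights(predictions, labels)
print(accuracy(best, predictions, labels))
```

In practice the weights would be fitted on validation predictions rather than on the data used for the final evaluation, to avoid an optimistic accuracy estimate.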
Neuropeptides (NPs) are neuromodulators/neurotransmitters that work as signaling molecules in the central nervous system and perform major roles in physiological and hormone regulation activities. Recently, machine learning-based therapeutic agents have gained the attention of researchers due to their high and reliable prediction results. However, the existing predictors perform unsatisfactorily because of their high execution cost and low predictive performance. Therefore, the development of a reliable predictor is indispensable for scientists to effectively predict NPs. In this study, we present an automatic and computationally effective model for identifying NPs. The evolutionary information is formulated using a bigram position-specific scoring matrix (Bi-PSSM) and K-spaced bigram (KSB). Moreover, for noise reduction, a discrete wavelet transform (DWT) is utilized to form Bi-PSSM_DWT and KSB_DWT based highly discriminative vectors. In addition, one-hot encoding is employed to collect sequential features from peptide samples. Finally, a multi-perspective feature set of sequential and embedded evolutionary information is formed. The optimal features are chosen from the extracted features via Shapley Additive exPlanations (SHAP) by evaluating the contribution of each extracted feature. Six classification models, i.e., XGB, ETC, SVM, ADA, FKNN, and LGBM, are trained on the optimal features. The predicted labels of these learners are then provided to a genetic algorithm to form an ensemble classification approach. Our model achieved predictive accuracies of 94.47% and 92.55% using training sequences and independent sequences, respectively, which is ~3% higher than that of present methods. It is suggested that our presented tool will be beneficial and may play a substantial role in drug development and research academia.
The source code and all datasets are publicly available at https://github.com/shahidawkum/Target-ensC_NP.
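The bigram PSSM encoding named in the abstract above can be sketched in Python as follows, assuming the common definition in which entry (m, n) sums the products of column m at position i and column n at position i + 1. Treating the K-spaced bigram as the same sum with a larger step is an assumption, and the function name is hypothetical.

```python
def bigram_pssm(pssm, step=1):
    """Flattened bigram features from a PSSM given as a list of rows.

    features[m][n] = sum_i pssm[i][m] * pssm[i + step][n]
    step=1 pairs adjacent positions; a larger step pairs positions that
    are farther apart (assumed here to correspond to a K-spaced bigram).
    """
    length = len(pssm)
    width = len(pssm[0])
    features = [[0.0] * width for _ in range(width)]
    for i in range(length - step):
        for m in range(width):
            for n in range(width):
                features[m][n] += pssm[i][m] * pssm[i + step][n]
    return [value for row in features for value in row]


# Hypothetical 3-residue profile with only 2 columns, to keep the output small.
toy_pssm = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
print(bigram_pssm(toy_pssm, step=1))
print(bigram_pssm(toy_pssm, step=2))
```

For a real PSSM with 20 columns, the output has 400 values regardless of sequence length.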
Here we describe the presence of the monotypic and poorly known aphid genus Pseudessigella Hille Ris Lambers (Hemiptera: Aphididae: Lachninae) in India. So far, the genus has only been known from Punjab, Pakistan. Representatives of P. brachychaeta Hille Ris Lambers were collected from Pinus wallichiana A.B. Jacks. in the Yousmarg region of the state of Jammu and Kashmir in India. Hitherto unknown oviparous females and dwarfish males, the latter reported in Eulachnini for the first time, are described and illustrated. The male's antennal sensilla and genitalic morphology are additionally studied and presented using Scanning Electron Microscopy. Notes on the biology, distribution, and previously overlooked generic features are given. We provide morphological identification keys to the genera of the tribe Eulachnini and to the species of aphid living on P. wallichiana.