Peer-reviewed, Open access
  • iAMY-SCM: Improved predicti...
    Charoenkwan, Phasit; Kanthawong, Sakawrat; Nantasenamat, Chanin; Hasan, Md. Mehedi; Shoombuatong, Watshara

    Genomics (San Diego, Calif.), January 2021, Volume: 113, Issue: 1
    Journal Article

    Fast, accurate identification and characterization of amyloid proteins at large scale is essential for understanding their role in therapeutic intervention strategies. Currently, only one in silico model exists for amyloid protein identification: RFAmy, which uses a random forest (RF) model in conjunction with various feature types. However, it suffers from low interpretability for biologists. It is therefore highly desirable to develop a simple and easily interpretable prediction method whose accuracy is robust compared with the existing, more complicated model. In this study, we propose iAMY-SCM, the first scoring card method-based predictor for predicting and analyzing amyloid proteins. iAMY-SCM uses a simple weighted-sum function in conjunction with the propensity scores of dipeptides to identify amyloid proteins. Cross-validation results indicated that iAMY-SCM provided an accuracy of 0.895, which corresponded to 10–22% higher performance than that of widely used machine learning models. Furthermore, iAMY-SCM achieved an accuracy of 0.827 on an independent test, which was comparable to that of RFAmy and approximately 9–13% higher than that of widely used machine learning models. In addition, the estimated propensity scores of amino acids and dipeptides were analyzed to provide insights into the biophysical and biochemical properties of amyloid proteins. These results demonstrate that the proposed iAMY-SCM is efficient and reliable in terms of simplicity, interpretability and implementation. To facilitate use of the proposed iAMY-SCM, a user-friendly and publicly accessible web server has been established at http://camt.pythonanywhere.com/iAMY-SCM. We anticipate that iAMY-SCM will be an important tool for facilitating the large-scale prediction and characterization of amyloid proteins.
• We develop a novel sequence-based predictor named iAMY-SCM for predicting and analyzing amyloid proteins.
• iAMY-SCM was superior to other machine learning models, considering its simplicity, interpretability, and implementation.
• The estimated propensity scores could provide a better understanding of physicochemical properties of amyloid proteins.
• The iAMY-SCM web server was established and made freely available online at http://camt.pythonanywhere.com/iAMY-SCM.
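
The weighted-sum scoring described in the abstract can be illustrated with a minimal sketch. It assumes the usual scoring card formulation, in which a sequence's score is the sum of each dipeptide's propensity score weighted by that dipeptide's relative frequency in the sequence, and the sequence is called amyloid when the score exceeds a threshold. The function names, the toy propensity values, the threshold, and the choice to score unlisted dipeptides as 0 are all hypothetical; they are not the values or code used by iAMY-SCM.

```python
from collections import Counter


def dipeptide_composition(seq):
    """Return the relative frequency of each overlapping dipeptide in seq."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = Counter(pairs)
    total = len(pairs)
    # For sequences shorter than 2 residues, counts is empty and {} is returned.
    return {dp: n / total for dp, n in counts.items()}


def scm_score(seq, propensity):
    """Weighted sum of dipeptide propensity scores (weights = frequencies).

    Assumption: dipeptides absent from the propensity table contribute 0.
    """
    comp = dipeptide_composition(seq)
    return sum(freq * propensity.get(dp, 0.0) for dp, freq in comp.items())


def predict(seq, propensity, threshold):
    """Call a sequence amyloid when its score exceeds the threshold."""
    return scm_score(seq, propensity) > threshold


# Hypothetical propensity scores for illustration only (not from the paper).
toy_propensity = {
    "VV": 900.0, "VI": 800.0, "IV": 800.0,
    "KK": 100.0, "KE": 150.0, "EK": 150.0,
}

if __name__ == "__main__":
    for peptide in ("VVVV", "KEKE"):
        score = scm_score(peptide, toy_propensity)
        label = predict(peptide, toy_propensity, threshold=500.0)
        print(peptide, round(score, 1), label)
```

Because the score is a plain weighted sum, each dipeptide's contribution to a prediction can be read directly from the table, which is the interpretability argument the abstract makes against more complex models.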