  • FPredX: Interpretable model...
    Tam, Chunlai; Zhang, Kam Y. J.

    Proteins: Structure, Function, and Bioinformatics, March 2022, Volume 90, Issue 3
    Journal Article

    Fluorescent protein (FP) design is among the more challenging protein design problems because of the tradeoffs among the multiple properties to be optimized. Despite accumulated efforts in design and characterization, progress toward a full understanding of the sequence–property relationships needed to tackle this multiobjective design problem has been slow. In this study, we approach the problem by developing FPredX, a collection of gradient‐boosted decision tree models that map FP sequences to four major FP design targets: excitation maximum, emission maximum, brightness, and oligomeric state. Trained on one‐hot encoded, multiple-aligned sequences with hyperparameter optimization for each model, the FPredX models showed excellent prediction performance for all target properties compared with existing methods. We further interpreted the FPredX models by comparing the importance of positions along the aligned FP sequence to predictive performance, and we suggest positions that the models deem differentially important for predicting each target property.
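To make the encoding step in the abstract concrete, the following is a minimal sketch of one-hot encoding aligned sequences and of collapsing per-feature importances into one score per alignment position. It is not the FPredX implementation. The alphabet (20 amino acids plus a gap symbol), the function names `one_hot_encode` and `position_importance`, and the toy sequences are all assumptions made for illustration. In practice, the encoded matrix would be passed to a gradient-boosted decision tree library, and the importances would come from the fitted model.

```python
# Assumed alphabet: 20 standard amino acids plus '-' for alignment gaps.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"
INDEX = {aa: i for i, aa in enumerate(ALPHABET)}


def one_hot_encode(aligned_seq):
    """Flatten an aligned sequence into a 0/1 vector of length L * len(ALPHABET)."""
    k = len(ALPHABET)
    vec = [0] * (len(aligned_seq) * k)
    for pos, aa in enumerate(aligned_seq):
        # Symbols outside the alphabet (e.g. 'X') are treated as gaps in this sketch.
        vec[pos * k + INDEX.get(aa, INDEX["-"])] = 1
    return vec


def position_importance(feature_importances, alignment_length):
    """Sum per-feature importances over the alphabet: one score per alignment column."""
    k = len(ALPHABET)
    if len(feature_importances) != alignment_length * k:
        raise ValueError("importance vector does not match alignment length")
    return [
        sum(feature_importances[p * k:(p + 1) * k])
        for p in range(alignment_length)
    ]


# Hypothetical toy alignment (not real FP sequences); all rows share one length.
toy_msa = ["MSK-GE", "MSKGGE", "M-KGAE"]
X = [one_hot_encode(seq) for seq in toy_msa]
```

Each row of `X` has one active feature per alignment column, so a tree model's per-feature importances can be grouped back by column with `position_importance` to compare positions, in the spirit of the interpretation step described above.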