Peer reviewed
  • A Transformer-based visual ...
    Li, Yifan; Liu, Xiaotao; Yuan, Dian; Wang, Jiaoying; Wu, Peng; Liu, Jing

    Pattern Recognition, November 2024, Volume 155
    Journal Article

    Transformers have shown great strength in visual object tracking thanks to their effective attention mechanism, but most prevailing transformer-based trackers explore temporal information only frame by frame and thus overlook the rich context information inherent in videos. To alleviate this problem, we propose IAC-tracker, a transformer-based tracker that learns immediate appearance change information in videos. The proposed tracker enhances the perception of the target's immediate motion state to improve single-target tracking performance. IAC-tracker contains three key components: a spatial information extractor (SIE) with a superior attention mechanism that progressively extracts spatial information, a temporal information extractor (TIE) with a purpose-designed temporal attention mechanism that progressively learns the target's immediate appearance change, and a novel spatial–temporal context enhanced fusion module that integrates the information from the SIE and TIE ahead of the final prediction head. Comparison experiments with state-of-the-art trackers on six challenging datasets demonstrate the superior performance of IAC-tracker at real-time running speed.

    • A transformer tracking framework modeling spatial–temporal features is proposed.
    • A temporal information extractor is proposed to learn immediate appearance change.
    • A spatial–temporal context enhanced fusion module is proposed to integrate features.
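
    The three-part structure named in the abstract (spatial extractor, temporal extractor, fusion) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no architectural detail, so every function name below is hypothetical, the "appearance change" is assumed to be a plain token-wise difference between consecutive frames, and fusion is assumed to be simple concatenation. It only shows how scaled dot-product attention could be reused for a spatial branch and a temporal branch.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = math.sqrt(len(keys[0]))
    output = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        output.append([
            sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))
        ])
    return output


def spatial_features(frame):
    # Hypothetical stand-in for the SIE: self-attention within one frame.
    return attention(frame, frame, frame)


def temporal_features(current, previous):
    # Hypothetical stand-in for the TIE. Assumption: the immediate
    # appearance change is the token-wise difference between two frames,
    # and current-frame tokens attend to those difference tokens.
    change = [[c - p for c, p in zip(ct, pt)]
              for ct, pt in zip(current, previous)]
    return attention(current, change, change)


def fuse(spatial, temporal):
    # Hypothetical fusion: concatenate the two feature vectors per token.
    return [s + t for s, t in zip(spatial, temporal)]


# Two tiny "frames", each with three 2-dimensional tokens (made-up values).
previous = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
current = [[1.0, 0.5], [0.0, 1.0], [0.5, 1.0]]
fused = fuse(spatial_features(current), temporal_features(current, previous))
```

    Here `fused` holds one vector per token, twice the input width, which a prediction head could then consume. A real tracker would use learned projections, multiple heads, and image patch embeddings rather than raw lists.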