 
E-resources
Full text
Peer-reviewed
  • PileNet: A high-and-low pas...
    Yang, Xiaoqi; Duan, Liangliang; Zhou, Quanqiang

    Journal of Visual Communication and Image Representation, June 2024, Volume 102
    Journal Article

    Multi-head self-attentions (MSAs) in Transformers are low-pass filters that tend to suppress high-frequency signals. Convolutional layers (Convs) in a Convolutional Neural Network (CNN) are high-pass filters that tend to capture the high-frequency components of images. CNNs and Transformers therefore carry complementary information, and combining the two is necessary for satisfactory detection results. In this work, we propose PileNet, a novel framework that efficiently combines a CNN and a Transformer for accurate salient object detection (SOD). Specifically, PileNet introduces a complementary encoder that extracts multi-level complementary saliency features. Next, we simplify the complementary features by adjusting the number of channels of all features to a fixed value. By introducing the multi-level feature aggregation (MLFA) and multi-level feature refinement (MLFR) units, low- and high-level features can easily be transmitted to feature blocks at various pyramid levels. Finally, we fuse all the refined saliency features in a U-Net-like structure from top to bottom and use a multi-point supervision mechanism to produce the final saliency maps. Extensive experiments on five widely used saliency benchmark datasets demonstrate that the proposed model accurately locates entire salient objects with clear object boundaries and outperforms sixteen previous state-of-the-art saliency methods across a wide range of metrics.

    Highlights:
    • A high-and-low pass complementary filter is used to generate encoders.
    • We design an effective multi-level feature refinement unit.
    • We design a multi-level feature aggregation unit with shared parameters.
    • A multi-point supervision mechanism is proposed to generate saliency maps.
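The low-pass/high-pass complementarity the abstract builds on can be illustrated with a toy 1-D example. This is not the paper's code: the centered moving average merely stands in for the smoothing effect of self-attention, and its residual stands in for a convolution-style high-frequency response; the point is only that the two bands sum back to the original signal, i.e. they carry complementary information.

```python
def low_pass(x, k=3):
    # Centered moving average: a crude stand-in for the smoothing
    # (low-pass) behaviour attributed to multi-head self-attention.
    half = k // 2
    out = []
    for i in range(len(x)):
        window = x[max(0, i - half):i + half + 1]
        out.append(sum(window) / len(window))
    return out

def high_pass(x, k=3):
    # Residual of the average: a stand-in for the edge-sensitive
    # (high-pass) behaviour attributed to convolutional layers.
    return [xi - li for xi, li in zip(x, low_pass(x, k))]

signal = [0, 0, 1, 1, 0, 0, 5, 0]   # step edges plus an isolated spike
lp = low_pass(signal)
hp = high_pass(signal)

# The spike is attenuated in the low-pass band but kept in the high-pass band.
recon = [a + b for a, b in zip(lp, hp)]
assert all(abs(r - s) < 1e-9 for r, s in zip(recon, signal))
```

The final assertion holds by construction (high_pass is defined as the residual of low_pass), which is the sense in which a low-pass branch and a high-pass branch are complementary rather than redundant.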