Peer reviewed
  • PFD-Net: Pyramid Fourier De...
    Yang, Chaorong; Zhang, Zhaohui

    Computers in Biology and Medicine, April 2024, Volume: 172
    Journal Article

    Medical image segmentation is crucial for accurately locating lesion regions and assisting doctors in diagnosis. However, most existing methods fail to use both local details and global semantic information effectively, and so cannot capture fine-grained content such as small targets and irregular boundaries. To address this issue, we propose a novel Pyramid Fourier Deformable Network (PFD-Net) for medical image segmentation, which leverages the strengths of both CNNs and Transformers. PFD-Net first uses a PVTv2-based Transformer as the primary encoder to capture global information, then enhances both local and global feature representations with the Fast Fourier Convolution Residual (FFCR) module. PFD-Net also introduces the Dilated Deformable Refinement (DDR) module to improve the model's capacity to comprehend the global semantic structures of shape-diverse targets and their irregular boundaries. Lastly, a Cross-Level Fusion Block with deformable convolution (CLFB) combines the decoded feature maps from the final Residual Decoder Block (RDB) with local features from the CNN auxiliary encoder branch, improving the network's ability to perceive targets that resemble the surrounding structures. Extensive experiments were conducted on nine publicly available medical image datasets covering five types of segmentation tasks: polyp, abdominal, cardiac, gland cell, and nuclei segmentation. The qualitative and quantitative results demonstrate that PFD-Net outperforms existing state-of-the-art methods on various evaluation metrics. On the most challenging dataset (ETIS), it achieves the highest mDice, 0.826, which is a 1.8% improvement over the previous best-performing HSNet and a 3.6% improvement over the next-best PVT-CASCADE. Code is available at https://github.com/ChaorongYang/PFD-Net.
• PFD-Net, with a PVTv2-based primary encoder and a CNN-based auxiliary encoder, is proposed for medical image segmentation.
• The FFCR module is proposed to enhance local and global features from the PVTv2 encoder by combining the spatial and frequency domains.
• The DDR module is proposed to enrich objects' semantic information in the feature maps from FFCR.
• The CLFB module is constructed to refine targets' boundaries.
• Achieves competitive performance on nine publicly available datasets for five segmentation tasks.
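
The abstract reports results as mDice, the Dice coefficient averaged over the test images. As a rough illustration only, the sketch below computes the standard Dice score, 2|P ∩ T| / (|P| + |T|), for binary masks and averages it over a set of images. The function names `dice` and `mean_dice`, the flat 0/1 list representation, and the convention that two empty masks score 1.0 are assumptions made for this example; the authors' evaluation protocol may differ (for instance in thresholding or averaging), so their repository remains the reference.

```python
def dice(pred, target):
    """Dice coefficient between two binary masks given as flat 0/1 sequences.

    Hypothetical helper for illustration; not taken from the PFD-Net code.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels that are foreground in both the prediction and the ground truth.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


def mean_dice(pairs):
    """Average the per-image Dice score over (prediction, target) pairs."""
    scores = [dice(pred, target) for pred, target in pairs]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    example_pairs = [
        ([1, 1, 0, 0], [1, 0, 0, 0]),  # partial overlap
        ([0, 1, 1, 0], [0, 1, 1, 0]),  # exact match
    ]
    print(mean_dice(example_pairs))
```

Because Dice is normalized by the combined size of the two masks, small targets such as the polyps in ETIS are penalized heavily for even a few misclassified pixels, which is one reason the metric is sensitive on such datasets.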