Peer reviewed, Open access
  • FRIC: a framework for few-s...
    Zhou, Haonan; Xia, Lurui; Du, Xiaoping; Li, Sen

    International journal of digital earth, 12/2024, Volume: 17, Issue: 1
    Journal Article

    ABSTRACT: Training image captioning (IC) models requires a large number of caption-labeled samples, a requirement that is usually difficult to meet in real remote sensing scenarios. Model performance degrades under these few-shot conditions. We describe the few-shot problems in remote sensing image captioning (RC) and design two research schemes. We then propose FRIC, a few-shot remote sensing image captioning framework. FRIC needs no additional samples and uses a simple base model. It seeks performance gains from split samples while reducing the negative effects of noise. Unlike previous works that use 100% of the samples to simulate few-shot scenarios, FRIC uses less than 1.0% of the data to simulate actual few-shot scenarios. Whereas previous works focus on improving the encoder, FRIC focuses on optimizing the decoder with parameter ensemble, multi-model ensemble, and self-distillation. FRIC can train a simple base model with limited caption-labeled samples to generate captions that meet human expectations. FRIC shows clear advantages over other methods when trained with only 0.8% of the samples in RC datasets. No previous work has used such a small amount of data to train an RC model. In addition, ablation experiments verify the effectiveness of the components in FRIC.
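    The abstract names parameter ensemble as one of the decoder-side components but does not say how it is computed. As a rough illustration only, the sketch below assumes the common reading of the term: element-wise averaging of the parameters of several checkpoints of the same model. The function name `average_parameters`, the parameter names `decoder.w` and `decoder.b`, and the toy values are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# A "checkpoint" here is a plain mapping from parameter name to a flat
# list of floats. This representation is an assumption for illustration.
Params = Dict[str, List[float]]


def average_parameters(checkpoints: List[Params]) -> Params:
    """Return the element-wise mean of each parameter across checkpoints."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    averaged: Params = {}
    for name in checkpoints[0]:
        # Collect the same parameter from every checkpoint.
        vectors = [ckpt[name] for ckpt in checkpoints]
        # zip(*vectors) groups the i-th entry of every vector together.
        averaged[name] = [sum(vals) / len(vals) for vals in zip(*vectors)]
    return averaged


# Two hypothetical checkpoints of the same (toy) decoder.
checkpoints = [
    {"decoder.w": [1.0, 2.0], "decoder.b": [0.0]},
    {"decoder.w": [3.0, 4.0], "decoder.b": [1.0]},
]
ensemble = average_parameters(checkpoints)
```

    Averaging in parameter space yields a single model, so inference costs the same as for one checkpoint. A multi-model ensemble, by contrast, typically combines the outputs of several models at prediction time. Whether FRIC follows either pattern exactly is not stated in the abstract.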