Pan-sharpening is the process of merging a high-resolution (HR) panchromatic (PAN) image with its corresponding low-resolution (LR) multi-spectral (MS) image to create an HR-MS pan-sharpened image. However, because the sensors differ in location, characteristics, and acquisition time, PAN and MS image pairs often exhibit varying amounts of misalignment. Conventional deep-learning-based methods trained with such misaligned PAN-MS image pairs suffer from artifacts such as double edges and blur in the resulting pan-sharpened images. In this paper, we propose a novel framework called shift-invariant pan-sharpening with moving object alignment (SIPSA-Net), which is the first method to take into account such large misalignment of moving object regions for pan-sharpening. SIPSA-Net has a feature alignment module (FAM) that can align one feature to another, even across the two different PAN and MS domains. For better alignment in pan-sharpened images, we design a new shift-invariant spectral loss that ignores the inherent misalignment in the original MS input, thereby having the same effect as optimizing the spectral loss with a well-aligned MS image. Extensive experimental results show that SIPSA-Net generates pan-sharpened images with remarkable improvements in visual quality and alignment compared to state-of-the-art methods.
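The abstract above does not give the formula of the shift-invariant spectral loss. One way to illustrate the idea of a loss that ignores a small misalignment is to take the minimum L1 error over a window of candidate shifts. The sketch below makes that assumption and simplifies further by using a single global shift on a single-band image; the function names and the `max_shift` parameter are hypothetical and are not taken from the paper.

```python
from typing import List

Image = List[List[float]]  # single-band image as rows of pixel values


def shifted_l1(pred: Image, target: Image, dy: int, dx: int) -> float:
    """Mean absolute error between pred[y][x] and target[y+dy][x+dx],
    computed only over the region where both indices are valid."""
    h, w = len(pred), len(pred[0])
    total, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            ty, tx = y + dy, x + dx
            if 0 <= ty < h and 0 <= tx < w:
                total += abs(pred[y][x] - target[ty][tx])
                count += 1
    return total / count


def shift_invariant_l1(pred: Image, target: Image, max_shift: int = 1) -> float:
    """Smallest shifted L1 error over all shifts in [-max_shift, max_shift]^2.
    Assumption: one global shift; a real loss might pick shifts per region."""
    if max_shift >= min(len(pred), len(pred[0])):
        raise ValueError("max_shift must be smaller than the image size")
    shifts = range(-max_shift, max_shift + 1)
    return min(shifted_l1(pred, target, dy, dx) for dy in shifts for dx in shifts)


# Toy example: the target is the prediction moved one pixel to the right.
pred = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
target = [[0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
plain = shifted_l1(pred, target, 0, 0)       # penalizes the misalignment
invariant = shift_invariant_l1(pred, target)  # finds the matching shift
```

In this toy case the plain L1 error is nonzero even though the two images differ only by a one-pixel shift, while the minimum over shifts is zero, which is the behaviour the abstract attributes to optimizing against a well-aligned MS image.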
Recently, the analysis and use of Synthetic Aperture Radar (SAR) imagery have become crucial for surveillance, military operations, and environmental monitoring. A common challenge with SAR images is the presence of speckle noise, which can hinder their interpretability. To enhance the clarity of SAR images, this paper introduces a novel SAR-to-Electro-Optical (EO) image translation (SET) network, called SGCL-SET, which is the first to incorporate EO object label information for stable translation. We use a pre-trained segmentation network to supply segmentation regions and their labels when learning the SET. By utilizing this segmentation and label information, SGCL-SET can be trained to effectively learn the translation for regions with confusing contexts. In comprehensive experiments on our KOMPSAT dataset, SGCL-SET significantly outperforms all previous methods by large margins across nine image quality evaluation metrics.
Satellite synthetic aperture radar (SAR) images are immensely valuable because they can be obtained regardless of weather and time conditions. However, SAR images have severe noise and less contextual information, which makes them harder to interpret. Translating SAR images to electro-optical (EO) images is therefore highly desirable for easier interpretation. In this article, we propose a novel coarse-to-fine context-aware SAR-to-EO translation (CFCA-SET) framework and a misalignment-resistant (MR) loss for misaligned pairs of SAR-EO images. With our auxiliary learning of SAR-to-near-infrared (NIR) translation, CFCA-SET is trained in two stages: 1) the low-resolution SAR-to-EO translation is learned in the coarse stage via a local self-attention module that helps diminish the SAR noise, and 2) the resulting output is used as guidance in the fine stage to generate the high-resolution SAR colorization. Our proposed auxiliary learning of SAR-to-NIR translation leads CFCA-SET to learn the distinguishing characteristics of various SAR objects with less confusion in a context-aware manner. To handle the inevitable misalignment between SAR and EO images, we design a new MR loss function. Extensive experimental results show that CFCA-SET generates more recognizable and understandable EO-like images than other methods in terms of nine image quality metrics. CFCA-SET surpasses the state-of-the-art methods on two datasets (QXS and CASET) with the following improvements: PSNR (3.6%, 29%), ERGAS (7.4%, 30%), SSIM (15%, 15%), SAM (21%, 38%), $D_{S}$ (16%, 13%), QNR (1.5%, 3.1%), CHD (18%, 12%), LPIPS (4.2%, 8%), and FID (9.0%, 33%).
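Two of the nine metrics listed above, PSNR and SAM (spectral angle mapper), have short standard definitions that can be sketched directly. The code below follows those textbook definitions and is not the authors' evaluation code; the data layout (flat lists of pixel values for PSNR, one band tuple per pixel for SAM) and the `max_value` default are assumptions made for the example.

```python
import math
from typing import Sequence


def psnr(ref: Sequence[float], est: Sequence[float], max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    mse = sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


def sam_degrees(ref_pixels: Sequence[Sequence[float]],
                est_pixels: Sequence[Sequence[float]]) -> float:
    """Mean spectral angle (degrees) between per-pixel band vectors.
    Pixels with a zero-norm vector are skipped (an assumption of this sketch)."""
    angles = []
    for r, e in zip(ref_pixels, est_pixels):
        dot = sum(a * b for a, b in zip(r, e))
        norm_r = math.sqrt(sum(a * a for a in r))
        norm_e = math.sqrt(sum(b * b for b in e))
        if norm_r == 0 or norm_e == 0:
            continue
        # Clamp to [-1, 1] to guard against floating-point overshoot.
        cos_angle = max(-1.0, min(1.0, dot / (norm_r * norm_e)))
        angles.append(math.degrees(math.acos(cos_angle)))
    return sum(angles) / len(angles) if angles else 0.0


psnr_db = psnr([0.0, 0.0], [0.1, 0.1])              # MSE of 0.01 on a [0, 1] scale
angle_orthogonal = sam_degrees([(1.0, 0.0)], [(0.0, 1.0)])
angle_scaled = sam_degrees([(1.0, 1.0)], [(2.0, 2.0)])
```

Higher PSNR is better, while lower SAM is better: SAM depends only on the direction of the spectral vector, so a pure brightness scaling of a pixel leaves the angle near zero.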
Recent advances in deep learning have shown impressive performance for pan-sharpening. Pan-sharpening is the task of enhancing the spatial resolution of a multi-spectral (MS) image by exploiting the high-frequency information of its corresponding panchromatic (PAN) image. Many deep-learning-based pan-sharpening methods have been developed recently, surpassing the performance of traditional pan-sharpening approaches. However, most of them are trained at lower scales using misaligned PAN-MS training pairs, which leads to undesired artifacts and unsatisfactory visual quality. In this paper, we propose an unsupervised learning framework with registration learning for pan-sharpening, called UPSNet. UPSNet can be trained effectively at the original scales and implicitly learns the registration between PAN and MS images without any dedicated registration module. Additionally, we design two novel loss functions for training UPSNet: a guided-filter-based color loss between network outputs and aligned MS targets, and a dual-gradient detail loss between network outputs and PAN inputs. Extensive experimental results show that UPSNet generates pan-sharpened images with remarkable improvements in visual quality and registration compared to state-of-the-art methods.
Many real-world image recognition problems, such as diagnostic medical imaging exams, are “long-tailed” – there are a few common findings followed by many more relatively rare conditions. In chest radiography, diagnosis is both a long-tailed and multi-label problem, as patients often present with multiple findings simultaneously. While researchers have begun to study the problem of long-tailed learning in medical image recognition, few have studied the interaction of label imbalance and label co-occurrence posed by long-tailed, multi-label disease classification. To engage with the research community on this emerging topic, we conducted an open challenge, CXR-LT, on long-tailed, multi-label thorax disease classification from chest X-rays (CXRs). We publicly release a large-scale benchmark dataset of over 350,000 CXRs, each labeled with at least one of 26 clinical findings following a long-tailed distribution. We synthesize common themes of top-performing solutions, providing practical recommendations for long-tailed, multi-label medical image classification. Finally, we use these insights to propose a path forward involving vision-language foundation models for few- and zero-shot disease classification.
• Conducted a challenge on long-tailed, multi-label classification from chest X-ray.
• Synthesized insights from top-performing CXR-LT challenge solutions.
• Released labels for >350k chest X-rays for long-tailed disease classification.
• Multimodal foundation models may advance long-tailed medical image classification.
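The CXR-LT abstract does not say which techniques the top-performing solutions used. To make the label imbalance it describes concrete, the sketch below counts positives per class in a multi-label matrix and derives a negatives-to-positives ratio, a common heuristic for weighting the positive term of a per-class binary cross-entropy loss. The function names and the toy label matrix are hypothetical and are not drawn from the challenge data.

```python
from typing import List, Sequence


def class_counts(labels: Sequence[Sequence[int]], num_classes: int) -> List[int]:
    """Number of positive samples per class in a multi-label 0/1 matrix."""
    counts = [0] * num_classes
    for row in labels:
        for c, value in enumerate(row):
            counts[c] += int(value)
    return counts


def positive_weights(counts: Sequence[int], num_samples: int) -> List[float]:
    """Negatives-to-positives ratio per class; rare classes get larger weights.
    Classes with no positives get weight 0.0 (an assumption of this sketch)."""
    return [(num_samples - c) / c if c > 0 else 0.0 for c in counts]


# Toy long-tailed data: class 0 is common, classes 1 and 2 are rare.
# Each row is one image; a row may have several positive findings.
labels = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 0, 1],
]
counts = class_counts(labels, num_classes=3)
weights = positive_weights(counts, num_samples=len(labels))
```

Note that per-class weighting of this kind treats each finding independently, so it addresses imbalance but not the label co-occurrence that the abstract highlights as the understudied part of the problem.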