•Different types of uncertainties for deep learning-based medical image segmentation are analyzed.•A general aleatoric uncertainty estimation method based on test-time augmentation is proposed.•A theoretical formulation of test-time augmentation is proposed.•The proposed method is validated on 2D fetal brain segmentation and 3D brain tumor segmentation tasks.
Despite their state-of-the-art performance for medical image segmentation, deep convolutional neural networks (CNNs) have rarely provided uncertainty estimations for their segmentation outputs, e.g., model (epistemic) and image-based (aleatoric) uncertainties. In this work, we analyze these different types of uncertainties for CNN-based 2D and 3D medical image segmentation tasks at both the pixel level and the structure level. We additionally propose a test-time augmentation-based aleatoric uncertainty estimation method to analyze the effect of different transformations of the input image on the segmentation output. Test-time augmentation has previously been used to improve segmentation accuracy, yet it has not been formulated in a consistent mathematical framework. Hence, we also propose a theoretical formulation of test-time augmentation, in which a distribution of the prediction is estimated by Monte Carlo simulation with prior distributions of the parameters of an image acquisition model involving image transformations and noise. We compare and combine our proposed aleatoric uncertainty with model uncertainty. Experiments with segmentation of fetal brains and brain tumors from 2D and 3D Magnetic Resonance Images (MRI) showed that 1) the test-time augmentation-based aleatoric uncertainty provides a better uncertainty estimation than calculating the test-time dropout-based model uncertainty alone and helps to reduce overconfident incorrect predictions, and 2) our test-time augmentation outperforms a single-prediction baseline and dropout-based multiple predictions.
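The Monte Carlo test-time augmentation procedure described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the acquisition model here is limited to a random horizontal flip plus additive Gaussian noise, and `predict` is a hypothetical stand-in for a trained segmentation CNN.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict(image):
    # Hypothetical stand-in for a trained segmentation CNN: maps each
    # pixel intensity to a foreground probability. Replace with a real model.
    return 1.0 / (1.0 + np.exp(-image))

def tta_uncertainty(image, n_samples=20, noise_std=0.1):
    # Monte Carlo test-time augmentation: sample transforms from a simple
    # acquisition model (random horizontal flip + Gaussian noise), predict,
    # invert the spatial transform, and take the per-pixel mean/variance.
    preds = []
    for _ in range(n_samples):
        flipped = rng.random() < 0.5
        aug = np.flip(image, axis=1) if flipped else image
        aug = aug + rng.normal(0.0, noise_std, size=image.shape)
        p = predict(aug)
        if flipped:
            p = np.flip(p, axis=1)  # map the prediction back to image space
        preds.append(p)
    preds = np.stack(preds)
    # mean = final prediction, variance = aleatoric uncertainty estimate
    return preds.mean(axis=0), preds.var(axis=0)

image = rng.normal(size=(32, 32))
mean_pred, uncertainty = tta_uncertainty(image)
```

High-variance pixels flag regions where the segmentation is sensitive to plausible acquisition perturbations, which is where overconfident errors tend to occur.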
The state-of-the-art models for medical image segmentation are variants of U-Net and fully convolutional networks (FCN). Despite their success, these models have two limitations: (1) their optimal depth is a priori unknown, requiring extensive architecture search or an inefficient ensemble of models of varying depths; and (2) their skip connections impose an unnecessarily restrictive fusion scheme, forcing aggregation only at the same-scale feature maps of the encoder and decoder sub-networks. To overcome these two limitations, we propose UNet++, a new neural architecture for semantic and instance segmentation, by (1) alleviating the unknown network depth with an efficient ensemble of U-Nets of varying depths, which partially share an encoder and co-learn simultaneously using deep supervision; (2) redesigning skip connections to aggregate features of varying semantic scales at the decoder sub-networks, leading to a highly flexible feature fusion scheme; and (3) devising a pruning scheme to accelerate the inference speed of UNet++. We have evaluated UNet++ using six different medical image segmentation datasets, covering multiple imaging modalities such as computed tomography (CT), magnetic resonance imaging (MRI), and electron microscopy (EM), and demonstrated that (1) UNet++ consistently outperforms the baseline models for the task of semantic segmentation across different datasets and backbone architectures; (2) UNet++ enhances segmentation quality of varying-size objects, an improvement over the fixed-depth U-Net; (3) Mask RCNN++ (Mask R-CNN with the UNet++ design) outperforms the original Mask R-CNN for the task of instance segmentation; and (4) pruned UNet++ models achieve significant speedup while showing only modest performance degradation. Our implementation and pre-trained models are available at https://github.com/MrGiovanni/UNetPlusPlus.
Due to the poor prognosis of pancreatic cancer, accurate early detection and segmentation are critical for improving treatment outcomes. However, pancreatic segmentation is challenged by blurred boundaries, high shape variability, and class imbalance. To tackle these problems, we propose a multiscale attention network with shape context and prior constraint for robust pancreas segmentation. Specifically, we propose a Multi-scale Feature Extraction Module (MFE) and a Mixed-scale Attention Integration Module (MAI) to address unclear pancreas boundaries. Furthermore, a Shape Context Memory (SCM) module is introduced to jointly model semantics across scales and pancreatic shape. An Active Shape Model (ASM) is further used to model the shape priors. Experiments on the NIH and MSD datasets demonstrate the effectiveness of our model, which improves the state-of-the-art Dice score by 1.01% and 1.03%, respectively. Our architecture provides robust segmentation performance against blurred boundaries and against variations in the scale and shape of the pancreas.
•Introduce MFE and MAI modules for clearer pancreas segmentation boundaries.•Propose SCM module for combined pancreas location and shape modeling.•Develop ASM module to guide pancreas shape with prior constraints.•Use weighted BCE and IoU loss functions to tackle background-pancreas area imbalance.
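The weighted BCE plus IoU loss named in the highlights can be sketched as below. This is an illustrative formulation, not the paper's exact loss: the foreground weight `w_fg` and the soft-IoU form are assumptions chosen to show how scarce pancreas pixels can be up-weighted against the dominant background.

```python
import numpy as np

def weighted_bce_iou_loss(pred, target, w_fg=5.0, eps=1e-7):
    # Weighted binary cross-entropy plus soft-IoU loss for the severe
    # background-pancreas class imbalance.
    # pred: per-pixel probabilities in (0, 1); target: binary labels.
    # w_fg up-weights the scarce foreground pixels (value is illustrative).
    pred = np.clip(pred, eps, 1.0 - eps)
    weights = np.where(target == 1, w_fg, 1.0)
    bce = -np.mean(weights * (target * np.log(pred)
                              + (1 - target) * np.log(1 - pred)))
    # Soft IoU: differentiable overlap term computed on probabilities.
    inter = np.sum(pred * target)
    union = np.sum(pred + target - pred * target)
    soft_iou = 1.0 - (inter + eps) / (union + eps)
    return bce + soft_iou

# Toy example: a small foreground square in a mostly-background image.
target = np.zeros((8, 8))
target[2:5, 2:5] = 1.0
good = np.clip(target, 0.05, 0.95)        # near-correct prediction
bad = np.clip(1.0 - target, 0.05, 0.95)   # inverted prediction
```

The BCE term penalizes per-pixel errors (with the foreground emphasized), while the IoU term directly rewards region overlap, which is less sensitive to the background dominating the pixel count.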
Image Segmentation Using Deep Learning: A Survey. Minaee, Shervin; Boykov, Yuri; Porikli, Fatih; et al. IEEE Transactions on Pattern Analysis and Machine Intelligence, 07/2022, Volume 44, Issue 7. Journal article, peer reviewed, open access.
Image segmentation is a key task in computer vision and image processing, with important applications such as scene understanding, medical image analysis, robotic perception, video surveillance, augmented reality, and image compression, and numerous segmentation algorithms are found in the literature. Against this backdrop, the broad success of deep learning (DL) has prompted the development of new image segmentation approaches leveraging DL models. We provide a comprehensive review of this recent literature, covering the spectrum of pioneering efforts in semantic and instance segmentation, including convolutional pixel-labeling networks, encoder-decoder architectures, multiscale and pyramid-based approaches, recurrent networks, visual attention models, and generative models in adversarial settings. We investigate the relationships, strengths, and challenges of these DL-based segmentation models, examine the widely used datasets, compare performances, and discuss promising research directions.
The U-shaped network is an end-to-end convolutional neural network (CNN). In the electron microscopy segmentation task of the ISBI 2012 challenge, the concise architecture and outstanding performance of the U-shaped network were impressive, and a variety of segmentation models based on this architecture have since been proposed for medical image segmentation. We present a comprehensive literature review of U-shaped networks applied to medical image segmentation tasks, focusing on the architectures, extended mechanisms, and application areas in these studies. The aim of this survey is twofold. First, we report the different extended U-shaped networks, discuss the main state-of-the-art extended mechanisms, including the residual, dense, dilated, attention, multi-module, and ensemble mechanisms, and analyze their pros and cons. Second, this survey provides an overview of studies in the main application areas of U-shaped networks, including brain tumor, stroke, white matter hyperintensities (WMHs), eye, cardiac, liver, musculoskeletal, skin cancer, and neuronal pathology. Finally, we summarize the current U-shaped networks and point out open challenges and directions for future research.
•This study streamlines the GT U-Net architecture by removing higher-level group Transformer modules.•Rough spatial attention and channel attention are adopted to obtain more reasonable attention coefficients.•We propose a retinal vessel segmentation network that incorporates a rough attention fusion module.
The morphological changes of retinal vessels are of significant diagnostic value for early ophthalmic diseases and can aid in identifying other conditions such as diabetes and cardiovascular diseases. However, precise segmentation is challenging due to the complex structure of retinal vessels. To address these issues, we propose a Rough Attention Fusion Module (RAFM). This module employs max-pooling and average-pooling to define the upper and lower bounds of feature significance, introducing upper and lower weight matrices to obtain more reasonable attention coefficients. This enables the model to focus more accurately on important features in retinal images. Additionally, we integrate the RAFM into the GTS U-Net model, a simplified version of the GT U-Net model, which enhances segmentation accuracy while reducing computational complexity. Ultimately, we construct a retinal vessel segmentation network based on the RAFM and the Group Transformer. The network is evaluated on the public DRIVE color fundus image dataset, achieving an Accuracy, F1 score, and AUC of 0.9641, 0.8506, and 0.9820, respectively. Compared with prevalent mainstream retinal vessel segmentation networks, our proposed network demonstrates competitive performance.
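The upper/lower-bound attention idea described above can be illustrated with a small channel-attention sketch. This is an assumption-laden toy, not the paper's published equations: the function name, the identity-weight example, and averaging the two projections before the sigmoid gate are all illustrative choices.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rough_channel_attention(feat, w_upper, w_lower):
    # Sketch of rough-set-flavored channel attention: global max pooling
    # gives an upper bound on each channel's significance, global average
    # pooling a lower bound; separate weight matrices (w_upper, w_lower)
    # project the two descriptors, and their mean is gated by a sigmoid
    # to produce per-channel attention coefficients.
    upper = feat.max(axis=(1, 2))    # (C,) upper-bound descriptor
    lower = feat.mean(axis=(1, 2))   # (C,) lower-bound descriptor
    coeff = sigmoid(0.5 * (w_upper @ upper + w_lower @ lower))  # (C,) in (0, 1)
    return feat * coeff[:, None, None]

rng = np.random.default_rng(0)
feat = rng.normal(size=(4, 8, 8))   # (channels, height, width)
w = np.eye(4)                        # identity projections for illustration
out = rough_channel_attention(feat, w, w)
```

Because the sigmoid output lies in (0, 1), each channel is scaled down in proportion to its estimated significance rather than hard-selected, which keeps the module differentiable.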
Medical image segmentation is an important task in medical imaging and diagnosis. Data augmentation can substantially improve the accuracy of medical image segmentation when the dataset contains only a small number of medical images. However, data augmentation methods for medical images are usually based on large models that require an extensive search space, and excessively complex models often impose a heavy burden on general healthcare organizations or researchers. To address this problem, we propose data augmentation methods that are simple to implement and simple to transplant across various models. Our new methods, KeepMask and KeepMix, can be easily ported to a variety of models and provide high performance. These methods augment data without any effect on the target organ or lesion and can also be adapted to multi-class segmentation. KeepMask and KeepMix can not only perturb the background of an existing medical image but also add target organs that are not present and generate new images based on the original. We evaluated our methods on both binary-class and multi-class datasets and obtained better performance; numerous experiments show that segmentations predicted with our methods have more accurate boundaries.
•Achieves higher accuracy and predicts more accurate boundaries.•Simple to transplant across various models.•Can be used for binary and multi-class tasks.•Images generated with the proposed data augmentation method are more consistent with the original dataset distributions.
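The background-perturbation idea behind KeepMask can be sketched as below. This is an illustrative reconstruction, not the authors' code: the Gaussian-noise perturbation is a placeholder for any background transform, and the function name is hypothetical.

```python
import numpy as np

def keepmask_augment(image, mask, rng, noise_std=0.2):
    # KeepMask-style augmentation sketch: perturb only background pixels
    # (here with additive Gaussian noise as a placeholder transform) while
    # every pixel under the target mask is kept unchanged, so the organ
    # or lesion and its segmentation label are unaffected.
    out = image.copy()
    background = (mask == 0)
    out[background] += rng.normal(0.0, noise_std, size=int(background.sum()))
    return out

rng = np.random.default_rng(0)
image = rng.normal(size=(16, 16))
mask = np.zeros((16, 16), dtype=int)
mask[4:10, 4:10] = 1                 # toy target region
aug = keepmask_augment(image, mask, rng)
```

Because the target region and its label are untouched, the augmented pair can be used for training directly, without re-annotating, and the same idea extends to multi-class masks by treating all labeled classes as "kept" pixels.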