The Perception-Distortion Tradeoff Blau, Yochai; Michaeli, Tomer
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
Conference Proceeding
Image restoration algorithms are typically evaluated by some distortion measure (e.g. PSNR, SSIM, IFC, VIF) or by human opinion scores that quantify perceived perceptual quality. In this paper, we ...prove mathematically that distortion and perceptual quality are at odds with each other. Specifically, we study the optimal probability for correctly discriminating the outputs of an image restoration algorithm from real images. We show that as the mean distortion decreases, this probability must increase (indicating worse perceptual quality). As opposed to the common belief, this result holds true for any distortion measure, and is not only a problem of the PSNR or SSIM criteria. However, as we show experimentally, for some measures it is less severe (e.g. distance between VGG features). We also show that generative-adversarial-nets (GANs) provide a principled way to approach the perception-distortion bound. This constitutes theoretical support to their observed success in low-level vision tasks. Based on our analysis, we propose a new methodology for evaluating image restoration methods, and use it to perform an extensive comparison between recent super-resolution algorithms.
Colorectal cancer is a leading cause of death. Colonoscopy is the criterion standard for detection and removal of precancerous lesions and has been shown to reduce mortality. The polyp miss rate ...during colonoscopies is 22% to 28%. DEEP DEtection of Elusive Polyps (DEEP2) is a new polyp detection system based on deep learning that alerts the operator in real time to the presence and location of polyps. The primary outcome was the performance of DEEP2 on the detection of elusive polyps.
The DEEP2 system was trained on 3611 hours of colonoscopy videos derived from 2 sources and was validated on a set comprising 1393 hours from a third unrelated source. Ground truth labeling was provided by offline gastroenterologist annotators who were able to watch the video in slow motion and pause and rewind as required. To assess applicability, stability, and user experience and to obtain some preliminary data on performance in a real-life scenario, a preliminary prospective clinical validation study was performed comprising 100 procedures.
DEEP2 achieved a sensitivity of 97.1% at 4.6 false alarms per video for all polyps and of 88.5% and 84.9% for polyps in the field of view for less than 5 and 2 seconds, respectively. DEEP2 was able to detect polyps not seen by live real-time endoscopists or offline annotators in an average of .22 polyps per sequence. In the clinical validation study, the system detected an average of .89 additional polyps per procedure. No adverse events occurred.
DEEP2 has a high sensitivity for polyp detection and was effective in increasing the detection of polyps both in colonoscopy videos and in real procedures with a low number of false alarms. (Clinical trial registration number: NCT04693078.)
Display omitted
Detecting Deficient Coverage in Colonoscopies Freedman, Daniel; Blau, Yochai; Katzir, Liran ...
IEEE transactions on medical imaging,
2020-Nov., 2020-11-00, 20201101, Letnik:
39, Številka:
11
Journal Article
Odprti dostop
Colonoscopy is tool of choice for preventing Colorectal Cancer, by detecting and removing polyps before they become cancerous. However, colonoscopy is hampered by the fact that endoscopists routinely ...miss 22-28% of polyps. While some of these missed polyps appear in the endoscopist's field of view, others are missed simply because of substandard coverage of the procedure, i.e. not all of the colon is seen. This paper attempts to rectify the problem of substandard coverage in colonoscopy through the introduction of the C2D2 (Colonoscopy Coverage Deficiency via Depth) algorithm which detects deficient coverage, and can thereby alert the endoscopist to revisit a given area. More specifically, C2D2 consists of two separate algorithms: the first performs depth estimation of the colon given an ordinary RGB video stream; while the second computes coverage given these depth estimates. Rather than compute coverage for the entire colon, our algorithm computes coverage locally, on a segment-by-segment basis; C2D2 can then indicate in real-time whether a particular area of the colon has suffered from deficient coverage, and if so the endoscopist can return to that area. Our coverage algorithm is the first such algorithm to be evaluated in a large-scale way; while our depth estimation technique is the first calibration-free unsupervised method applied to colonoscopies. The C2D2 algorithm achieves state of the art results in the detection of deficient coverage. On synthetic sequences with ground truth, it is 2.4 times more accurate than human experts; while on real sequences, C2D2 achieves a 93.0% agreement with experts.
Image restoration algorithms are typically evaluated by some distortion measure (e.g. PSNR, SSIM, IFC, VIF) or by human opinion scores that quantify perceived perceptual quality. In this paper, we ...prove mathematically that distortion and perceptual quality are at odds with each other. Specifically, we study the optimal probability for correctly discriminating the outputs of an image restoration algorithm from real images. We show that as the mean distortion decreases, this probability must increase (indicating worse perceptual quality). As opposed to the common belief, this result holds true for any distortion measure, and is not only a problem of the PSNR or SSIM criteria. We also show that generative-adversarial-nets (GANs) provide a principled way to approach the perception-distortion bound. This constitutes theoretical support to their observed success in low-level vision tasks. Based on our analysis, we propose a new methodology for evaluating image restoration methods, and use it to perform an extensive comparison between recent super-resolution algorithms.
The information bottleneck (IB) has been suggested as a fundamental principle governing performance in deep neural nets (DNNs). This idea sparked research on the information plane dynamics during ...training with the cross-entropy loss, and on using the IB of some "bottleneck" layer as a loss function. However, the claim that reaching the maximal value of the IB Lagrangian in each layer leads to optimal performance, was in fact never directly confirmed. In this paper, we propose a direct way of validating this hypothesis, using layer-by-layer training with the IB loss. In accordance with the original theory, we train each DNN layer explicitly with the IB objective (and without any classification loss), and freeze it before moving on to train the next layer. While mutual information (MI) is generally hard to estimate in high dimensions, we show that in the case of MI between DNN layers, this can be done quite accurately using a modification of the recently proposed mutual information neural estimator. Interestingly, we find that layer-by-layer training with the IB loss leads to accuracy which is on-par with end-to-end training with the cross entropy loss. This is, thus, the first direct experimental illustration of the link between the IB value in each layer, and a net's performance.
Lossy compression algorithms are typically designed and analyzed through the lens of Shannon's rate-distortion theory, where the goal is to achieve the lowest possible distortion (e.g., low MSE or ...high SSIM) at any given bit rate. However, in recent years, it has become increasingly accepted that "low distortion" is not a synonym for "high perceptual quality", and in fact optimization of one often comes at the expense of the other. In light of this understanding, it is natural to seek for a generalization of rate-distortion theory which takes perceptual quality into account. In this paper, we adopt the mathematical definition of perceptual quality recently proposed by Blau & Michaeli (2018), and use it to study the three-way tradeoff between rate, distortion, and perception. We show that restricting the perceptual quality to be high, generally leads to an elevation of the rate-distortion curve, thus necessitating a sacrifice in either rate or distortion. We prove several fundamental properties of this triple-tradeoff, calculate it in closed form for a Bernoulli source, and illustrate it visually on a toy MNIST example.