In this paper, we revisit the classical perspective-n-point (PnP) problem and propose the first non-iterative O(n) solution that is fast, generally applicable, and globally optimal. Our basic idea is to formulate the PnP problem as a functional minimization problem and retrieve all its stationary points using the Gröbner basis technique. The novelty lies in a non-unit quaternion representation to parameterize the rotation and a simple but elegant formulation of the PnP problem as an unconstrained optimization problem. Interestingly, the polynomial system arising from its first-order optimality condition exhibits two-fold symmetry, a nice property that can be exploited to improve the speed and numerical stability of a Gröbner basis solver. Experimental results demonstrate that, in terms of accuracy, our proposed solution is consistently more accurate than the state-of-the-art O(n) methods and is even comparable with the reprojection error minimization method.
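The non-unit quaternion idea can be illustrated with a short numpy sketch: since the rotation is R(q) = M(q)/(qᵀq) with M quadratic in q, it is invariant to the scale of q, so no unit-norm constraint is needed during minimization. The function names and the simple image-plane cost below are illustrative placeholders, not the paper's exact objective.

```python
import numpy as np

def rot_from_quat(q):
    # Rotation matrix from a NON-unit quaternion q = (a, b, c, d):
    # R(q) = M(q) / (q^T q), where M(q) is quadratic in q, so the result
    # is unchanged when q is scaled -- no unit-norm constraint required.
    a, b, c, d = q
    n = q @ q
    M = np.array([
        [a*a + b*b - c*c - d*d, 2*(b*c - a*d),         2*(b*d + a*c)],
        [2*(b*c + a*d),         a*a - b*b + c*c - d*d, 2*(c*d - a*b)],
        [2*(b*d - a*c),         2*(c*d + a*b),         a*a - b*b - c*c + d*d],
    ])
    return M / n

def pnp_cost(q, t, pts3d, obs):
    # Sum of squared image-plane residuals for normalized observations
    # (an illustrative stand-in for the paper's functional).
    proj = (rot_from_quat(q) @ pts3d.T).T + t
    uv = proj[:, :2] / proj[:, 2:3]
    return np.sum((uv - obs) ** 2)
```

Because `pnp_cost` is a rational function of the four quaternion entries, its first-order optimality condition is a polynomial system, which is what the Gröbner basis solver operates on.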
Single-sensor imaging using the Bayer color filter array (CFA) and demosaicking is well established for current compact and low-cost color digital cameras. An extension from the CFA to a multispectral filter array (MSFA) enables us to acquire a multispectral image in one shot without increased size or cost. However, multispectral demosaicking for the MSFA has been a challenging problem because of the very sparse sampling of each spectral band in the MSFA. In this paper, we propose a high-performance multispectral demosaicking algorithm and, at the same time, a novel MSFA pattern that is suitable for our proposed algorithm. Our key idea is the use of the guided filter to interpolate each spectral band. To generate an effective guide image, our proposed MSFA pattern keeps the sampling density of the G-band as high as that of the Bayer CFA and arrays each spectral band so that an adaptive kernel can be estimated directly from raw MSFA data. Given these two advantages, we effectively generate the guide image from the most densely sampled G-band using the adaptive kernel. In the experiments, we demonstrate that our proposed algorithm with our proposed MSFA pattern outperforms existing algorithms and provides better color fidelity compared with a conventional color imaging system with the Bayer CFA. We also show real applications using a multispectral camera prototype we built.
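For reference, the guided filter underlying this approach can be sketched in a few lines of numpy/scipy. This is a minimal gray-scale version with a plain box window; the adaptive-kernel variant described above is not reproduced here.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(I, p, r=2, eps=1e-3):
    """Minimal gray-scale guided filter: output is locally linear in guide I."""
    box = lambda x: uniform_filter(x, size=2 * r + 1, mode="reflect")
    mean_I, mean_p = box(I), box(p)
    cov_Ip = box(I * p) - mean_I * mean_p   # local covariance of guide and input
    var_I = box(I * I) - mean_I * mean_I    # local variance of the guide
    a = cov_Ip / (var_I + eps)              # per-window linear coefficient
    b = mean_p - a * mean_I                 # per-window offset
    return box(a) * I + box(b)              # average the overlapping linear models
```

In the demosaicking setting, `I` would be the densely sampled G-band guide and `p` a tentative estimate of a sparsely sampled spectral band; the local linear model transfers the guide's edge structure into the interpolated band.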
A division-of-focal-plane or microgrid image polarimeter enables us to acquire a set of polarization images in one shot. Since the polarimeter consists of an image sensor equipped with a monochrome or color polarization filter array (MPFA or CPFA), the demosaicking process to interpolate missing pixel values plays a crucial role in obtaining high-quality polarization images. In this paper, we propose a novel MPFA demosaicking method based on edge-aware residual interpolation (EARI) and also extend it to CPFA demosaicking. The key to EARI is a new edge detector for generating an effective guide image used to interpolate the missing pixel values. We also present a newly constructed full color-polarization image dataset captured using a 3-CCD camera and a rotating polarizer. Using the dataset, we experimentally demonstrate that our EARI-based method outperforms existing methods in MPFA and CPFA demosaicking.
Color image demosaicking for the Bayer color filter array is an essential image processing operation for acquiring high-quality color images. Recently, residual interpolation (RI)-based algorithms have demonstrated superior demosaicking performance over conventional color difference interpolation-based algorithms. In this paper, we propose adaptive residual interpolation (ARI), which improves existing RI-based algorithms by adaptively combining two RI-based algorithms and selecting a suitable iteration number at each pixel. Both choices are made using a unified criterion that evaluates the validity of an RI-based algorithm. Experimental comparisons using standard color image datasets demonstrate that ARI can improve existing RI-based algorithms by more than 0.6 dB in the color peak signal-to-noise ratio and can outperform state-of-the-art algorithms based on training images. We further extend ARI to a multispectral filter array, in which more than three spectral bands are arrayed, and demonstrate that ARI achieves state-of-the-art performance for the task of multispectral image demosaicking as well.
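The core residual interpolation idea, estimating a tentative band from a guide, interpolating only the residual at the sampled pixels, and adding it back, can be sketched as follows. The windowed masked averaging is a crude stand-in for the guided-filter step used by actual RI algorithms, and all function names are illustrative.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def _masked_mean(x, mask, size):
    # Windowed average of x over valid (mask == True) pixels only.
    num = uniform_filter(np.where(mask, x, 0.0), size)
    den = uniform_filter(mask.astype(float), size)
    return num / np.maximum(den, 1e-8)

def residual_interpolation(guide, sparse, mask, size=5):
    # Tentative estimate: offset the guide so its local mean matches that of
    # the sparsely observed band (a crude stand-in for guided filtering).
    tentative = guide + _masked_mean(sparse - guide, mask, size)
    # Interpolate the residual (observed minus tentative) and add it back;
    # residuals are smoother than the band itself, so they interpolate better.
    residual = _masked_mean(sparse - tentative, mask, size)
    return tentative + residual
```

ARI's contribution sits on top of this primitive: it switches between two such RI variants and picks the iteration count per pixel according to a validity criterion.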
Deep Sensing for Compressive Video Acquisition
Yoshida, Michitaka; Torii, Akihiko; Okutomi, Masatoshi ...
Sensors (Basel, Switzerland), 08/2023, Volume 23, Issue 17
Journal Article; Peer-reviewed; Open access
A camera captures multidimensional information of the real world by convolving it into two dimensions using a sensing matrix. The original multidimensional information is then reconstructed from the captured images. Traditionally, multidimensional information has been captured by uniform sampling, but by optimizing the sensing matrix, we can capture images more efficiently and reconstruct the multidimensional information with high quality. Although compressive video sensing requires random sampling as a theoretical optimum, in practice the design of the sensing matrix faces many hardware limitations (such as exposure and color filter patterns). Existing studies have found that random sampling is not always the best solution for compressive sensing, because the optimal sampling pattern depends on the scene context, and it is hard to manually design both a sampling pattern and a reconstruction algorithm. In this paper, we propose an end-to-end learning approach that jointly optimizes the sampling pattern and the reconstruction decoder. We apply this deep sensing approach to the video compressive sensing problem: we model the spatio-temporal sampling and the color filter pattern using a convolutional neural network constrained by the hardware limitations during network training. We demonstrate that the proposed method outperforms manually designed patterns in both gray-scale and color video acquisition.
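The forward model being optimized can be sketched in numpy: a per-pixel binary exposure pattern compresses T frames into a single captured image. Here the pattern is random (the classical compressive-sensing baseline); in the paper it is instead a learned, hardware-constrained layer trained jointly with a CNN decoder.

```python
import numpy as np

rng = np.random.default_rng(0)
T, H, W = 4, 8, 8                      # frames, height, width (toy sizes)
video = rng.random((T, H, W))          # the multidimensional scene information

# Per-pixel binary exposure pattern: which frames each pixel integrates.
pattern = rng.integers(0, 2, size=(T, H, W)).astype(float)

# Forward model: the sensor records one image, the temporal sum of the
# masked frames -- T frames compressed into a single snapshot.
capture = (pattern * video).sum(axis=0)
```

Reconstruction then amounts to inverting this many-to-one mapping; the paper's decoder is a CNN trained end-to-end together with `pattern`.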
Background
Cycle-consistent generative adversarial network (CycleGAN) is a deep neural network model that performs image-to-image translations. We generated virtual indigo carmine (IC) chromoendoscopy images of gastric neoplasms using CycleGAN and compared their diagnostic performance with that of white light endoscopy (WLE).
Methods
WLE and IC images of 176 patients with gastric neoplasms who underwent endoscopic resection were obtained. We used 1,633 images (911 WLE and 722 IC) of 146 cases in the training dataset to develop virtual IC images using CycleGAN. The remaining 30 WLE images were translated into 30 virtual IC images using the trained CycleGAN and used for validation. The lesion borders were evaluated by 118 endoscopists from 22 institutions using the 60 paired virtual IC and WLE images. The lesion area concordance rate and successful whole-lesion diagnosis were compared.
Results
The lesion area concordance rate based on the pathological diagnosis was lower in virtual IC than in WLE (44.1% vs. 48.5%, p < 0.01). The successful whole-lesion diagnosis rate was higher in the virtual IC than in the WLE images; however, the difference was not significant (28.2% vs. 26.4%, p = 0.11). Conversely, subgroup analyses revealed a significantly higher diagnosis rate in virtual IC than in WLE for depressed morphology (41.9% vs. 36.9%, p = 0.02), differentiated histology (27.6% vs. 24.8%, p = 0.02), smaller lesion size (42.3% vs. 38.3%, p = 0.01), and assessment by expert endoscopists (27.3% vs. 23.6%, p = 0.03).
Conclusions
The diagnostic ability of virtual IC was higher for some lesions, but not completely superior to that of WLE. Adjustments are required to improve the imaging system’s performance.
Image-based reconstruction of an object’s 3D shape having the wavelength-by-wavelength spectral reflectance property enables higher-fidelity object 3D modeling compared with typical RGB-based modeling. In this paper, we propose a novel projector–camera system for practical and low-cost acquisition of a dense object 3D model with the spectral reflectance property. Different from existing spectral 3D data acquisition systems that use a dedicated multispectral camera or light, we use a standard RGB camera and an off-the-shelf projector as active illumination for both the 3D reconstruction and the spectral reflectance estimation. We first propose a calibration-free multi-view structured-light method to reconstruct the 3D points while estimating the intrinsic parameters and the poses of both the camera and the projector, which are alternately moved around the object during our image acquisition procedure. We then exploit the projector for multispectral imaging and estimate the spectral reflectance of each 3D point based on a novel spectral reflectance estimation method considering the geometric relationship between the reconstructed 3D points and the estimated projector positions. Experimental results on both synthetic and real data demonstrate that our system can precisely acquire a dense spectral 3D model using off-the-shelf devices.
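The per-point spectral reflectance estimation step can be illustrated as a linear least-squares problem in numpy. The camera sensitivities, illumination spectra, and band/illumination counts below are synthetic placeholders, and the paper's geometry-dependent shading terms and priors are omitted.

```python
import numpy as np

rng = np.random.default_rng(1)
n_bands = 31                    # e.g. 400-700 nm sampled every 10 nm (assumed)
n_illum = 12                    # number of projector illuminations (assumed)

cam_sens = rng.random((3, n_bands))          # RGB camera sensitivities
proj_spec = rng.random((n_illum, n_bands))   # spectra of the illuminations

# Observation model per 3D point: for illumination i and channel k,
#   c[3*i + k] = sum_w cam_sens[k, w] * proj_spec[i, w] * reflectance[w]
A = np.concatenate([cam_sens * proj_spec[i] for i in range(n_illum)])

reflectance = rng.random(n_bands)            # ground-truth spectrum (synthetic)
c = A @ reflectance                          # simulated RGB observations

# Least-squares estimate of the per-point spectral reflectance.
r_hat, *_ = np.linalg.lstsq(A, c, rcond=None)
```

With 3 channels per illumination, 12 projector illuminations yield 36 equations for 31 unknown reflectance samples, so the unregularized system is already overdetermined.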
A division-of-focal-plane or microgrid image polarimeter enables us to acquire a set of polarization images in one shot. Since the polarimeter consists of an image sensor equipped with a monochrome or color polarization filter array (MPFA or CPFA), the demosaicking process to interpolate missing pixel values plays a crucial role in obtaining high-quality polarization images. In this study, we propose a novel MPFA demosaicking method based on intensity-guided residual interpolation (IGRI) and extend it to CPFA demosaicking. The key to IGRI is generating an effective intensity guide image, for which we propose two methods that consider four-directional intensity and polarization edge information. We also construct a new full color-polarization image dataset captured using a 3-CCD RGB camera and a rotating polarizer. Using the constructed dataset, we experimentally validate that our IGRI-based methods outperform existing methods in MPFA and CPFA demosaicking.
In this paper, we propose a deep snapshot high dynamic range (HDR) imaging framework that can effectively reconstruct an HDR image from the RAW data captured using a multi-exposure color filter array (ME-CFA), which consists of a mosaic pattern of RGB filters with different exposure levels. To effectively learn the HDR image reconstruction network, we introduce the idea of luminance normalization, which simultaneously enables effective loss computation and input data normalization by considering relative local contrasts in the “normalized-by-luminance” HDR domain. This idea enables the network to handle errors in both bright and dark areas equally, regardless of absolute luminance levels, which significantly improves the visual image quality. Experimental results using public HDR image datasets demonstrate that our framework outperforms other snapshot methods and produces high-quality HDR images with fewer visual artifacts, resulting in more than 4 dB color peak signal-to-noise ratio improvement in the linear HDR domain.
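The luminance-normalization idea for the loss can be sketched as follows; the mean-over-channels luminance proxy and the L1 form are illustrative assumptions, not the paper's exact loss.

```python
import numpy as np

def luminance_normalized_l1(pred, gt, eps=1e-3):
    # L1 loss in the "normalized-by-luminance" domain: dividing by the
    # ground-truth luminance makes errors relative rather than absolute,
    # so dark and bright regions are weighted comparably despite the huge
    # dynamic range of linear HDR values.
    lum = gt.mean(axis=-1, keepdims=True)   # simple per-pixel luminance proxy
    return np.mean(np.abs((pred - gt) / (lum + eps)))
```

Without this normalization, an absolute error that is visually severe in a dark region would contribute almost nothing to the loss compared with the same error in a bright region.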
In practice, image classification needs to account for image degradations: digital images usually contain degradations such as JPEG compression, yet a classification network trained only on clean images performs poorly on degraded ones. A common approach to this problem is to train classification networks on degraded images with various levels of degradation, e.g., various quality factors for JPEG compression. However, such networks usually lose accuracy on clean images because they have been trained as an average over the degradation levels. This paper aims to construct a classification network for degraded images without sacrificing performance on clean images. We propose a network that learns the classification of degraded images and their degradation levels via multi-task learning. Classifying degraded images is the main task, while estimating degradation levels is a sub-task that reinforces the classification ability. In the proposed network, the feature extractor of the classifier is trained against image features obtained from a classification network for clean images, using consistency regularization with a cosine similarity loss. Experimental results using different types of degradations, including JPEG compression, Gaussian blur, Gaussian noise, and salt-and-pepper noise, show that the proposed network can classify degraded images without sacrificing performance on clean images.
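The consistency-regularization term can be sketched in numpy as one minus the cosine similarity between the degraded-image features and the clean-network features; the function name and feature shapes are illustrative.

```python
import numpy as np

def cosine_consistency_loss(feat_degraded, feat_clean, eps=1e-8):
    # 1 - cosine similarity between the features of a degraded image and the
    # (frozen) features of its clean counterpart, averaged over the batch;
    # the loss is 0 when the feature directions are perfectly aligned.
    num = np.sum(feat_degraded * feat_clean, axis=-1)
    den = (np.linalg.norm(feat_degraded, axis=-1)
           * np.linalg.norm(feat_clean, axis=-1) + eps)
    return np.mean(1.0 - num / den)
```

Because cosine similarity ignores feature magnitude, this regularizer pulls the degraded-image feature extractor toward the clean network's feature directions without forcing the scales to match.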