This paper introduces a new algorithmic technique for solving certain problems in geometric computer vision. The main novelty of the method is a branch-and-bound search over rotation space, which is used in this paper to determine camera orientation. By searching over all possible rotations, problems can be reduced to known fixed-rotation problems for which optimal solutions have been previously given. In particular, a method is developed for the estimation of the essential matrix, giving the first guaranteed optimal algorithm for estimating the relative pose using a cost function based on reprojection errors. Recently, convex optimization techniques have been shown to provide optimal solutions to many of the common problems in structure from motion. However, they do not apply to problems involving rotations. The search method described in this paper allows such problems to be solved optimally. Apart from essential matrix estimation, the algorithm is applied to the camera pose problem, where it also yields an optimal solution. The approach has been implemented and tested on a number of both synthetically generated and real data sets with good performance.
• A fully automatic system for abdominal organ segmentation is presented.
• Regional convolutional neural networks are used for organwise segmentation.
• State-of-the-art results on abdominal organ segmentation challenge.
A fully automatic system for abdominal organ segmentation is presented. As a first step, the organ is localized via a robust and efficient feature registration method, in which the center of the organ is estimated together with a region of interest surrounding the center. Then, a convolutional neural network performing voxelwise classification is applied. Two convolutional neural networks of different architecture are compared. The first one has a structure similar to networks used for classification and is applied using a sliding window approach. The second one has a structure allowing it to be applied in a fully convolutional manner, reducing computation time. Despite limited training data, our experimental results are on par with state-of-the-art approaches that have been developed over many years. More specifically, the method is applied to the MICCAI2015 challenge “Multi-Atlas Labeling Beyond the Cranial Vault” in the free competition for organ segmentation in the abdomen. The method performed well for both types of convolutional neural networks: the fully convolutional network achieved a mean Dice coefficient of 0.767, and the network applied with a sliding window achieved 0.757.
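For readers unfamiliar with the metric, the Dice coefficient reported above measures the overlap between a predicted mask and a reference mask as twice the size of their intersection divided by the sum of their sizes. The following is a minimal sketch for a single binary mask; the function name `dice_coefficient`, the flat 0/1 list representation, and the convention for two empty masks are assumptions made for illustration, and how the challenge averages the score over organs and scans is not stated here.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    pred_size = sum(1 for p in pred if p)
    truth_size = sum(1 for t in truth if t)
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    if pred_size + truth_size == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * overlap / (pred_size + truth_size)


# Toy example: six voxels, three labeled in each mask, two in common.
pred = [1, 1, 1, 0, 0, 0]
truth = [0, 1, 1, 1, 0, 0]
score = dice_coefficient(pred, truth)  # 2 * 2 / (3 + 3)
```

A score of 1 means the masks coincide and 0 means they do not overlap at all.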
Camera pose estimation in known scenes is a 3D geometry task recently tackled by multiple learning algorithms. Many regress precise geometric quantities, like poses or 3D points, from an input image. This either fails to generalize to new viewpoints or ties the model parameters to a specific scene. In this paper, we go Back to the Feature: we argue that deep networks should focus on learning robust and invariant visual features, while the geometric estimation should be left to principled algorithms. We introduce PixLoc, a scene-agnostic neural network that estimates an accurate 6-DoF pose from an image and a 3D model. Our approach is based on the direct alignment of multiscale deep features, casting camera localization as metric learning. PixLoc learns strong data priors by end-to-end training from pixels to pose and exhibits exceptional generalization to new scenes by separating model parameters and scene geometry. The system can localize in large environments given coarse pose priors but also improve the accuracy of sparse feature matching by jointly refining keypoints and poses with little overhead. The code will be publicly available at github.com/cvg/pixloc.
Variations in the shape and appearance of anatomical structures in medical images are often relevant radiological signs of disease. Automatic tools can help automate parts of this manual process. A cloud-based evaluation framework is presented in this paper including results of benchmarking current state-of-the-art medical imaging algorithms for anatomical structure segmentation and landmark detection: the VISCERAL Anatomy benchmarks. The algorithms are implemented in virtual machines in the cloud, where participants can access only the training data; the benchmark administrators then run the algorithms privately to compare their performance objectively on an unseen common test set. Overall, 120 computed tomography and magnetic resonance patient volumes were manually annotated to create a standard Gold Corpus containing a total of 1295 structures and 1760 landmarks. Ten participants contributed automatic algorithms for the organ segmentation task, and three for the landmark localization task. Different algorithms obtained the best scores in the four available imaging modalities and for subsets of anatomical structures. The annotation framework, resulting data set, evaluation setup, results and performance analysis from the three VISCERAL Anatomy benchmarks are presented in this article. Both the VISCERAL data set and the Silver Corpus, generated by fusing the participant algorithms' outputs on a larger set of non-manually-annotated medical images, are available to the research community.
Long-term visual localization is the problem of estimating the camera pose of a given query image in a scene whose appearance changes over time. It is an important problem in practice that is, for example, encountered in autonomous driving. In order to gain robustness to such changes, long-term localization approaches often use semantic segmentations as an invariant scene representation, as the semantic meaning of each scene part should not be affected by seasonal and other changes. However, these representations are typically not very discriminative due to the very limited number of available classes. In this paper, we propose a novel neural network, the Fine-Grained Segmentation Network (FGSN), that can be used to provide image segmentations with a larger number of labels and can be trained in a self-supervised fashion. In addition, we show how FGSNs can be trained to output consistent labels across seasonal changes. We show through extensive experiments that integrating the fine-grained segmentations produced by our FGSNs into existing localization algorithms leads to substantial improvements in localization performance.
We present the first method to handle curvature regularity in region-based image segmentation and inpainting that is independent of initialization.
To this end we start from a new formulation of length-based optimization schemes, based on surface continuation constraints, and discuss the connections to existing schemes. The formulation is based on a cell complex and considers basic regions and boundary elements. The corresponding optimization problem is cast as an integer linear program.
We then show how the method can be extended to include curvature regularity, again cast as an integer linear program. Here, we are considering pairs of boundary elements to reflect curvature. Moreover, a constraint set is derived to ensure that the boundary variables indeed reflect the boundary of the regions described by the region variables.
We show that by solving the linear programming relaxation one gets reasonably close to the global optimum, and that curvature regularity is indeed much better suited in the presence of long and thin objects compared to standard length regularity.
We introduce a framework for computing statistically optimal estimates of geometric reconstruction problems. While traditional algorithms often suffer from either local minima or non-optimality, or a combination of both, we pursue the goal of achieving global solutions of the statistically optimal cost function. Our approach is based on a hierarchy of convex relaxations to solve non-convex optimization problems with polynomials. These convex relaxations generate a monotone sequence of lower bounds and we show how one can detect whether the global optimum is attained at a given relaxation. The technique is applied to a number of classical vision problems: triangulation, camera pose, homography estimation and last, but not least, epipolar geometry estimation. Experimental validation on both synthetic and real data is provided. In practice, only a few relaxations are needed for attaining the global optimum.
Why is it that semidefinite relaxations have been so successful in numerous applications in computer vision and robotics for solving non-convex optimization problems involving rotations? In studying the empirical performance, we note that there are few failure cases reported in the literature, in particular for estimation problems with a single rotation, motivating us to gain further theoretical understanding. A general framework based on tools from algebraic geometry is introduced for analyzing the power of semidefinite relaxations of problems with quadratic objective functions and rotational constraints. Applications include registration, hand–eye calibration, and rotation averaging. We characterize the extreme points and show that there exist failure cases for which the relaxation is not tight, even in the case of a single rotation. We also show that some problem classes are always tight given an appropriate parametrization. Our theoretical findings are accompanied by numerical simulations, providing further evidence and understanding of the results.
We consider the problem of localizing a novel image in a large 3D model. In principle, this is just an instance of camera pose estimation, but the scale introduces some challenging problems. For one, it makes the correspondence problem very difficult and it is likely that there will be a significant rate of outliers to handle. In this paper we use recent theoretical as well as technical advances to tackle these problems. Many modern cameras and phones have gravitational sensors that allow us to reduce the search space. Further, there are new techniques to efficiently and reliably deal with extreme rates of outliers. We extend these methods to camera pose estimation by using accurate approximations and fast polynomial solvers. Experimental results are given demonstrating that it is possible to reliably estimate the camera pose even when more than 99% of the correspondences are outliers.
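One common way a gravity reading reduces the search space is that it fixes the vertical direction, so the unknown camera rotation is left with a single angle about that axis. The sketch below illustrates this under stated assumptions: the world "down" direction is taken as (0, 0, -1), the rotation maps camera coordinates to world coordinates, and all function names (`align_rotation`, `yaw_rotation`, `candidate_rotation`) are hypothetical. It is not claimed to be the parametrization used in the paper.

```python
import math

DOWN = [0.0, 0.0, -1.0]  # assumed world gravity direction


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def normalize(x):
    n = math.sqrt(sum(c * c for c in x))
    return [c / n for c in x]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(A, x):
    return [sum(A[i][k] * x[k] for k in range(3)) for i in range(3)]


def align_rotation(a, b):
    """Rotation matrix taking unit vector a onto unit vector b."""
    c = sum(x * y for x, y in zip(a, b))
    if c < -1.0 + 1e-12:
        # Antiparallel vectors: rotate 180 degrees about an axis orthogonal to a.
        helper = [1.0, 0.0, 0.0] if abs(a[0]) < 0.9 else [0.0, 1.0, 0.0]
        u = normalize(cross(a, helper))
        return [[2.0 * u[i] * u[j] - (1.0 if i == j else 0.0)
                 for j in range(3)] for i in range(3)]
    v = cross(a, b)
    K = [[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]]
    K2 = matmul(K, K)
    # Rodrigues form: R = I + K + K^2 / (1 + c)
    return [[(1.0 if i == j else 0.0) + K[i][j] + K2[i][j] / (1.0 + c)
             for j in range(3)] for i in range(3)]


def yaw_rotation(theta):
    """Rotation by theta about the world vertical (z) axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def candidate_rotation(gravity_cam, theta):
    """Camera-to-world rotation consistent with the measured gravity direction."""
    g = normalize(gravity_cam)
    return matmul(yaw_rotation(theta), align_rotation(g, DOWN))
```

Every value of `theta` yields a rotation that sends the measured gravity direction to the world vertical, so a search over rotations only needs to cover this one angle rather than all three rotational degrees of freedom.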
This paper investigates a classical problem in computer vision: Given corresponding points in multiple images, when is there a unique projective reconstruction of the 3D geometry of the scene points and the camera positions? A set of points and cameras is said to be critical when there is more than one way of realizing the resulting image points. For two views, it has been known for almost a century that the critical configurations consist of points and cameras lying on a ruled quadric surface. We give a classification of all possible critical configurations for any number of points in three images, and show that in most cases, the ambiguity extends to any number of cameras. The underlying framework for deriving the critical sets is projective geometry. Using a generalization of Pascal's Theorem, we prove that any number of cameras and scene points on an elliptic quartic form a critical set. Another important class of critical configurations consists of cameras and points on rational quartics. The theoretical results are accompanied by many examples and illustrations.