We present a novel approach to measuring similarity between shapes and exploit it for object recognition. In our framework, the measurement of similarity is preceded by: (1) solving for ...correspondences between points on the two shapes; (2) using the correspondences to estimate an aligning transform. In order to solve the correspondence problem, we attach a descriptor, the shape context, to each point. The shape context at a reference point captures the distribution of the remaining points relative to it, thus offering a globally discriminative characterization. Corresponding points on two similar shapes will have similar shape contexts, enabling us to solve for correspondences as an optimal assignment problem. Given the point correspondences, we estimate the transformation that best aligns the two shapes; regularized thin-plate splines provide a flexible class of transformation maps for this purpose. The dissimilarity between the two shapes is computed as a sum of matching errors between corresponding points, together with a term measuring the magnitude of the aligning transform. We treat recognition in a nearest-neighbor classification framework as the problem of finding the stored prototype shape that is maximally similar to that in the image. Results are presented for silhouettes, trademarks, handwritten digits, and the COIL data set.
Retrieving images from large and varied collections using image content as a key is a challenging and important problem. We present a new image representation that provides a transformation from the ...raw pixel data to a small set of image regions that are coherent in color and texture. This "Blobworld" representation is created by clustering pixels in a joint color-texture-position feature space. The segmentation algorithm is fully automatic and has been run on a collection of 10,000 natural images. We describe a system that uses the Blobworld representation to retrieve images from this collection. An important aspect of the system is that the user is allowed to view the internal representation of the submitted image and the query results. Similar systems do not offer the user this view into the workings of the system; consequently, query results from these systems can be inexplicable, despite the availability of knobs for adjusting the similarity metrics. By finding image regions that roughly correspond to objects, we allow querying at the level of objects rather than global image properties. We present results indicating that querying for images using Blobworld produces higher precision than does querying using color and texture histograms of the entire image in cases where the image contains distinctive objects.
Efficient shape matching using shape contexts Mori, G.; Belongie, S.; Malik, J.
IEEE transactions on pattern analysis and machine intelligence,
11/2005, Letnik:
27, Številka:
11
Journal Article
Recenzirano
Odprti dostop
We demonstrate that shape contexts can be used to quickly prune a search for similar shapes. We present two algorithms for rapid shape retrieval: representative shape contexts, performing comparisons ...based on a small number of shape contexts, and shapemes, using vector quantization in the space of shape contexts to obtain prototypical shape pieces.
Spectral grouping using the Nystrom method Fowlkes, C.; Belongie, S.; Chung, F. ...
IEEE transactions on pattern analysis and machine intelligence,
2004-Feb., 2004-Feb, 2004-02-00, 20040201, Letnik:
26, Številka:
2
Journal Article
Recenzirano
Odprti dostop
Spectral graph theoretic methods have recently shown great promise for the problem of image segmentation. However, due to the computational demands of these approaches, applications to large problems ...such as spatiotemporal data and high resolution imagery have been slow to appear. The contribution of this paper is a method that substantially reduces the computational requirements of grouping algorithms based on spectral partitioning making it feasible to apply them to very large grouping problems. Our approach is based on a technique for the numerical solution of eigenfunction problems known as the Nystrom method. This method allows one to extrapolate the complete grouping solution using only a small number of samples. In doing so, we leverage the fact that there are far fewer coherent groups in a scene than pixels.
Relative ranking of facial attractiveness Altwaijry, Hani; Belongie, S.
2013 IEEE Workshop on Applications of Computer Vision (WACV),
2013-Jan., 20130101
Conference Proceeding, Journal Article
Automatic evaluation of human facial attractiveness is a challenging problem that has received relatively little attention from the computer vision community. Previous work in this area have posed ...attractiveness as a classification problem. However, for applications that require fine-grained relationships between objects, learning to rank has been shown to be superior over the direct interpretation of classifier scores as ranks 27. In this paper, we propose and implement a personalized relative beauty ranking system. Given training data of faces sorted based on a subject's personal taste, we learn how to rank novel faces according to that person's taste. Using a blend of Facial Geometric Relations, HOG, GIST, L*a*b* Color Histograms, and Dense-SIFT + PCA feature types, our system achieves an average accuracy of 63% on pairwise comparisons of novel test faces. We examine the effectiveness of our method through lesion testing and find that the most effective feature types for predicting beauty preferences are HOG, GIST, and Dense-SIFT + PCA features.
Normalized cuts in 3-D for spinal MRI segmentation Carballido-Gamio, J.; Belongie, S.J.; Majumdar, S.
IEEE transactions on medical imaging,
2004-Jan., 2004-Jan, 2004-01-00, 20040101, Letnik:
23, Številka:
1
Journal Article
Odprti dostop
Segmentation of medical images has become an indispensable process to perform quantitative analysis of images of human organs and their functions. Normalized Cuts (NCut) is a spectral graph theoretic ...method that readily admits combinations of different features for image segmentation. The computational demand imposed by NCut has been successfully alleviated with the Nystro/spl uml/m approximation method for applications different than medical imaging. In this paper we discuss the application of NCut with the Nystro/spl uml/m approximation method to segment vertebral bodies from sagittal T1-weighted magnetic resonance images of the spine. The magnetic resonance images were preprocessed by the anisotropic diffusion algorithm, and three-dimensional local histograms of brightness was chosen as the segmentation feature. Results of the segmentation as well as limitations and challenges in this area are presented.
In this paper, we address the problem of tracking an object in a video given its location in the first frame and no other information. Recently, a class of tracking techniques called "tracking by ...detection" has been shown to give promising results at real-time speeds. These methods train a discriminative classifier in an online manner to separate the object from the background. This classifier bootstraps itself by using the current tracker state to extract positive and negative examples from the current frame. Slight inaccuracies in the tracker can therefore lead to incorrectly labeled training examples, which degrade the classifier and can cause drift. In this paper, we show that using Multiple Instance Learning (MIL) instead of traditional supervised learning avoids these problems and can therefore lead to a more robust tracker with fewer parameter tweaks. We propose a novel online MIL algorithm for object tracking that achieves superior results with real-time performance. We present thorough experimental results (both qualitative and quantitative) on a number of challenging video clips.
Beyond pairwise clustering Agarwal, S.; Jongwoo Lim; Zelnik-Manor, L. ...
2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05),
2005, Letnik:
2
Conference Proceeding
Odprti dostop
We consider the problem of clustering in domains where the affinity relations are not dyadic (pairwise), but rather triadic, tetradic or higher. The problem is an instance of the hypergraph ...partitioning problem. We propose a two-step algorithm for solving this problem. In the first step we use a novel scheme to approximate the hypergraph using a weighted graph. In the second step a spectral partitioning algorithm is used to partition the vertices of this graph. The algorithm is capable of handling hyperedges of all orders including order two, thus incorporating information of all orders simultaneously. We present a theoretical analysis that relates our algorithm to an existing hypergraph partitioning algorithm and explain the reasons for its superior performance. We report the performance of our algorithm on a variety of computer vision problems and compare it to several existing hypergraph partitioning algorithms.
End-to-end scene text recognition Kai Wang; Babenko, B.; Belongie, S.
2011 International Conference on Computer Vision,
2011-Nov.
Conference Proceeding
This paper focuses on the problem of word detection and recognition in natural images. The problem is significantly more challenging than reading text in scanned documents, and has only recently ...gained attention from the computer vision community. Sub-components of the problem, such as text detection and cropped image word recognition, have been studied in isolation 7, 4, 20. However, what is unclear is how these recent approaches contribute to solving the end-to-end problem of word recognition. We fill this gap by constructing and evaluating two systems. The first, representing the de facto state-of-the-art, is a two stage pipeline consisting of text detection followed by a leading OCR engine. The second is a system rooted in generic object recognition, an extension of our previous work in 20. We show that the latter approach achieves superior performance. While scene text recognition has generally been treated with highly domain-specific methods, our results demonstrate the suitability of applying generic computer vision methods. Adopting this approach opens the door for real world scene text recognition to benefit from the rapid advances that have been taking place in object recognition.