Recent studies have obtained superior performance in image recognition tasks by using, as an image representation, the fully connected layer activations of Convolutional Neural Networks (CNNs) trained with various kinds of images. However, the CNN representation is not well suited to fine-grained image recognition tasks such as food image recognition. To improve the performance of the CNN representation in food image recognition, we propose a novel image representation comprising the covariances of convolutional layer feature maps. In experiments on the ETHZ Food-101 dataset, our method achieved 58.65% average accuracy, outperforming previous methods such as the Bag-of-Visual-Words Histogram, the Improved Fisher Vector, and CNN-SVM.
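As an illustration of the covariance idea, a minimal sketch of turning one convolutional layer's activations into a descriptor might look as follows. The paper's exact layer choice and any post-processing (e.g. matrix normalization before comparison) are not specified here; this only shows the channel-covariance computation itself.

```python
import numpy as np

def covariance_descriptor(feature_maps):
    """Build an image descriptor from the covariance of CNN feature maps.

    feature_maps: array of shape (C, H, W) -- activations of one
    convolutional layer (C channels over an H x W spatial grid).
    Returns the upper-triangular part of the C x C channel covariance
    matrix as a flat vector.
    """
    C, H, W = feature_maps.shape
    X = feature_maps.reshape(C, H * W)      # each column = one spatial location
    X = X - X.mean(axis=1, keepdims=True)   # center each channel
    cov = (X @ X.T) / (H * W - 1)           # C x C channel covariance
    iu = np.triu_indices(C)                 # covariance is symmetric,
    return cov[iu]                          # so keep the upper triangle only

# toy example: 8 channels over a 6x6 spatial grid
maps = np.random.default_rng(0).normal(size=(8, 6, 6))
desc = covariance_descriptor(maps)
print(desc.shape)   # 8 * (8 + 1) / 2 = 36 entries
```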
•Build a large-scale 3D shape retrieval benchmark that supports multi-modal queries.
•Evaluate 26 3D shape retrieval methods using 3 types of metrics.
•Solicit and identify state-of-the-art methods and promising related techniques.
•Perform detailed analysis on diverse methods w.r.t. accuracy and efficiency.
•Make the benchmark and evaluation tools freely available to the community.
Large-scale 3D shape retrieval has become an important research direction in content-based 3D shape retrieval. To promote this research area, we organized two Shape Retrieval Contest (SHREC) tracks in 2014, on large-scale comprehensive and on sketch-based 3D model retrieval. Both tracks were based on a unified large-scale benchmark that supports multimodal queries (3D models and sketches). This benchmark contains 13,680 sketches and 8,987 3D models, divided into 171 distinct classes. It was compiled to be a superset of existing benchmarks and presents a new challenge to retrieval methods, as it comprises generic models as well as domain-specific model types. Twelve and six distinct 3D shape retrieval methods competed in these two contests, respectively. To measure and compare the performance of the participating and other promising Query-by-Model or Query-by-Sketch 3D shape retrieval methods, and to solicit state-of-the-art approaches, we perform a more comprehensive comparison of twenty-six retrieval methods (the eighteen originally participating algorithms and eight additional state-of-the-art or new methods) by evaluating them on the common benchmark. The benchmark, results, and evaluation tools are publicly available at our websites (http://www.itl.nist.gov/iad/vug/sharp/contest/2014/Generic3D/ and http://www.itl.nist.gov/iad/vug/sharp/contest/2014/SBR/).
3D shape models have become abundant in many application fields, including 3D CAD/CAM, augmented/mixed reality (AR/MR), and entertainment. Creating 3D shape models from scratch, however, is still very expensive, so efficient and accurate shape retrieval methods are essential for reusing existing 3D shape models. To retrieve similar 3D shape models, one provides an arbitrary 3D shape as a query. Most research on 3D shape retrieval has used a "whole" shape as a query (whole-to-whole shape retrieval), whereas a "part" shape as a query (part-to-whole shape retrieval) is more practically demanded, especially in mechanical engineering with 3D CAD/CAM applications, since a "part" shape is naturally obtained from a 3D range scanner. In this paper, we focus on an efficient method for part-to-whole shape retrieval, where the "part" shape is assumed to be given by a 3D range scanner. Specifically, we propose a Super-Vector coding feature built from SURF local features extracted from the View-Normal-Angle image, i.e., an image synthesized by taking into account the angle between the view vector and the surface normal vector, together with the depth-buffered image. In addition, we propose a weighted whole-to-whole re-ranking method that exploits global information based on the result of part-to-whole shape retrieval. Through experiments, we demonstrate that our proposed method outperforms the previous methods, with or without re-ranking.
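The View-Normal-Angle image is built from the angle between the view vector and the surface normal. Assuming per-pixel unit normals are available from the renderer, the per-pixel value could be sketched as below; the actual resolution, background handling, and mapping to pixel intensities used in the paper are assumptions here.

```python
import numpy as np

def view_normal_angle(view_dir, normals):
    """Per-pixel value from the angle between a fixed view direction and
    the surface normal -- the quantity a View-Normal-Angle image encodes.

    view_dir: (3,) vector from the surface toward the camera.
    normals:  (H, W, 3) unit surface normals per pixel.
    Returns (H, W) values in [0, 1]: the cosine of the angle, clamped
    at 0 for back-facing surfaces.
    """
    v = np.asarray(view_dir, dtype=float)
    v = v / np.linalg.norm(v)
    cos_angle = normals @ v                 # dot product at every pixel
    return np.clip(cos_angle, 0.0, 1.0)

# a flat patch facing the camera renders at full intensity
flat = np.zeros((4, 4, 3))
flat[..., 2] = 1.0                          # all normals point at +z
img = view_normal_angle([0, 0, 1], flat)
```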
We propose a new method of similarity search for 3D shape models, given an arbitrary 3D shape as a query. The method achieves high search performance in part through our unique feature vector, called the Multi-Fourier Spectra Descriptor (MFSD), and in part by augmenting the feature vector with spectral clustering. The MFSD is composed of four independent Fourier spectra with periphery enhancement. It allows us to faithfully capture the inherent characteristics of an arbitrary 3D shape object regardless of the dimension, orientation, and original location of the object when it is first defined. Given a 3D shape database, the augmentation with spectral clustering begins by computing the p-minimum spanning tree of the whole data set, where p is a number usually much smaller than m, the size of the whole 3D shape data set. We then define the affinity matrix, a square matrix of size m by m, where each element denotes the distance between two shape objects. The distance is computed in advance by traversing the p-minimum spanning tree. Eigenvalue decomposition is then applied to the affinity matrix to reduce its dimensionality, followed by grouping into k clusters. The cluster information is kept to augment search performance when a query is given. With a series of benchmark data sets, we demonstrate that our approach outperforms previously known methods for 3D shape retrieval.
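The offline clustering step described above (affinity matrix, eigenvalue decomposition, grouping into k clusters) can be sketched as standard spectral clustering over a precomputed distance matrix. The p-minimum-spanning-tree distance computation itself and the paper's exact affinity definition are not reproduced; the Gaussian affinity and normalized Laplacian below are common textbook choices.

```python
import numpy as np

def spectral_clusters(dist, k, sigma=1.0, iters=50):
    """Spectral clustering from a precomputed m x m distance matrix.

    Steps: Gaussian affinity -> normalized graph Laplacian ->
    k smallest eigenvectors -> plain k-means on the spectral embedding.
    """
    m = dist.shape[0]
    A = np.exp(-np.asarray(dist, dtype=float) ** 2 / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)                      # no self-affinity
    deg = A.sum(axis=1)
    d_is = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    L = np.eye(m) - d_is[:, None] * A * d_is[None, :]   # normalized Laplacian
    w, V = np.linalg.eigh(L)                      # eigenvalues in ascending order
    Y = V[:, :k]                                  # embedding: k smallest eigenvectors
    Y = Y / np.maximum(np.linalg.norm(Y, axis=1, keepdims=True), 1e-12)
    # deterministic farthest-point initialization for plain k-means
    centers = [Y[0]]
    for _ in range(1, k):
        c = np.array(centers)
        d2 = np.min(((Y[:, None, :] - c[None, :, :]) ** 2).sum(-1), axis=1)
        centers.append(Y[int(np.argmax(d2))])
    centers = np.array(centers)
    labels = np.zeros(m, dtype=int)
    for _ in range(iters):
        labels = np.argmin(((Y[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
        for j in range(k):
            mask = labels == j
            if mask.any():
                centers[j] = Y[mask].mean(axis=0)
    return labels

# two well-separated groups of objects should fall into different clusters
pts = np.concatenate([np.zeros(5), np.full(5, 10.0)])
dist = np.abs(pts[:, None] - pts[None, :])
labels = spectral_clusters(dist, k=2)
```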
We present a new dimensionality reduction method, called SimRank similarity preserving projection (SSPP), that finds a subspace preserving semantic similarity among data represented with SimRank similarity on a bipartite graph. In a tagged 3D model dataset, the relationship between 3D models and keywords is represented with a bipartite graph. For shape-based 3D model auto-annotation, we capture the relationship among tagged 3D models with SSPP. Experimental results show that our method outperforms the baseline and previous methods.
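A minimal sketch of the SimRank similarity on a models-keywords bipartite graph follows; the SSPP projection itself is beyond this sketch, and the matrix-form iteration with decay constant c = 0.8 is a common SimRank formulation rather than taken from the paper.

```python
import numpy as np

def bipartite_simrank(adj, c=0.8, iters=20):
    """SimRank on a bipartite graph of models x keywords.

    adj[i, j] = 1 if model i is tagged with keyword j.
    Two models are similar if they are tagged with similar keywords,
    and two keywords are similar if they tag similar models.
    Returns (S_models, S_keywords) similarity matrices.
    """
    n, m = adj.shape
    Sm, Sk = np.eye(n), np.eye(m)
    deg_m = np.maximum(adj.sum(axis=1), 1)   # avoid division by zero
    deg_k = np.maximum(adj.sum(axis=0), 1)
    for _ in range(iters):
        # model-model similarity via the keywords they share
        new_Sm = c * (adj @ Sk @ adj.T) / np.outer(deg_m, deg_m)
        np.fill_diagonal(new_Sm, 1.0)        # every node is fully similar to itself
        # keyword-keyword similarity via the models they tag
        new_Sk = c * (adj.T @ Sm @ adj) / np.outer(deg_k, deg_k)
        np.fill_diagonal(new_Sk, 1.0)
        Sm, Sk = new_Sm, new_Sk
    return Sm, Sk

# models 0 and 1 share a keyword; model 2 uses a different one
adj = np.array([[1, 0], [1, 0], [0, 1]], dtype=float)
Sm, Sk = bipartite_simrank(adj)
```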
We propose a new shape descriptor based on local and global features for 3D shape retrieval. The local features include "slope density" and "chain code", while the global feature is a "histogram of distances between a randomly chosen point and the shortest-distance polygon from that point". We evaluate our proposed shape descriptor and compare it with several previous methods using the Engineering Shape Benchmark (ESB). In our experiments, our method outperforms previous methods in terms of Precision@1 (P@1).
In this paper, we propose a new feature vector for 3D object retrieval, which we call the Local Feature Correlation Descriptor (LCoD). Given a 3D object, we first render depth-buffer images from multiple viewpoints. We then extract local features from each depth-buffer image. For every depth-buffer image, we compute the correlation matrix of the local features and define LCoD as the vector obtained from the elements of the correlation matrix. Our experiments on the Princeton Shape Benchmark show that LCoD achieves a First Tier score of 0.4708, exhibiting higher search performance than conventional techniques.
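The core of the idea, building a descriptor from the elements of a correlation matrix of local features, can be sketched as follows. The paper's local feature type, normalization, and multi-view aggregation are not reproduced; this shows the per-image correlation step only.

```python
import numpy as np

def lcod_like_descriptor(local_features):
    """Descriptor from the correlation matrix of local features.

    local_features: (n_features, dim) -- one row per local feature
    extracted from a single depth-buffer image.
    Returns the upper-triangular elements of the dim x dim correlation
    matrix as a flat vector (the matrix is symmetric).
    """
    X = np.asarray(local_features, dtype=float)
    corr = np.corrcoef(X, rowvar=False)     # columns = feature dimensions
    iu = np.triu_indices(corr.shape[0])
    return corr[iu]

# 100 local features of dimension 5 -> a 5*(5+1)/2 = 15-dim descriptor
rng = np.random.default_rng(1)
desc = lcod_like_descriptor(rng.normal(size=(100, 5)))
print(desc.shape)   # (15,)
```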
Research on 3D shape retrieval has become popular recently. However, only a small amount of research has focused on partial 3D shape retrieval. In this paper, we propose a new feature, KAZE+VLAD, for partial 3D shape retrieval. KAZE+VLAD first extracts KAZE local features from depth-buffer images generated from multiple viewpoints of the 3D object, and then integrates the local features into a feature of the 3D object with an encoding method; we use VLAD for encoding. We used a modified Princeton Segmentation Benchmark and conducted comparative experiments between our proposed feature and previous features.
In recent years, research on shape similarity search for 3D objects has attracted attention. Considering practical applications, however, it is important to support partial retrieval, in which a "whole" object can be retrieved from a "part". In this paper, we propose a new feature, KAZE+VLAD, for partial retrieval of 3D objects. KAZE+VLAD first extracts KAZE local features from depth-buffer images generated from multiple viewpoints of a 3D object, and then integrates the set of local features into a feature of the 3D object using VLAD as the encoding method. Using features that capture local shape enables partial retrieval of 3D objects. In comparative experiments on a partial-retrieval dataset built from the Princeton Segmentation Benchmark, KAZE+VLAD achieved better retrieval performance than conventional methods such as D2 and MFSD.
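The VLAD pooling step described above can be sketched as follows. KAZE extraction (e.g. via an image-processing library) and codebook training by k-means are assumed to happen elsewhere; the power and L2 normalizations are common VLAD practice rather than details taken from the paper.

```python
import numpy as np

def vlad_encode(local_feats, codebook):
    """Pool a set of local features into one VLAD vector.

    local_feats: (n, d) local features extracted from one 3D object's views.
    codebook:    (k, d) visual words, e.g. k-means centers from training data.
    Returns a (k * d,) L2-normalized VLAD descriptor.
    """
    local_feats = np.asarray(local_feats, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    k, d = codebook.shape
    # assign each local feature to its nearest codeword
    d2 = ((local_feats[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    assign = np.argmin(d2, axis=1)
    # accumulate residuals (feature minus codeword) per codeword
    V = np.zeros((k, d))
    for x, a in zip(local_feats, assign):
        V[a] += x - codebook[a]
    V = V.reshape(-1)
    # power normalization, then L2 normalization
    V = np.sign(V) * np.sqrt(np.abs(V))
    n = np.linalg.norm(V)
    return V / n if n > 0 else V

# 50 local features of dimension 4, codebook of 3 words -> 12-dim VLAD
rng = np.random.default_rng(2)
v = vlad_encode(rng.normal(size=(50, 4)), rng.normal(size=(3, 4)))
```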
Benchmark for photo-based 3D shape retrieval. Tatsuma, Atsushi; Tashiro, Shoki; Aono, Masaki. Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific, Dec. 2014. Conference Proceeding.
With the increased use of 3D shape objects in a wide range of fields, photo-based 3D shape retrieval has recently attracted attention. Photo-based 3D shape retrieval is a newly developed search technique that retrieves 3D shape objects using a 2D photo image as the search query. We report the development of a benchmark dataset for evaluating the search performance of photo-based 3D shape retrieval algorithms. The dataset comprises 1875 2D query images and 200 3D target objects. In addition, we conduct comparative experiments with representative feature extraction methods for photo-based 3D shape retrieval using the benchmark dataset. The experimental results suggest that feature extraction methods based on image gradients return superior search results.