•Proposing “transcoding 3D shape representations” for unsupervised feature learning.•Implementing the approach as a DNN called Shape Auto-Transcoder (SAT).•Evaluating SAT under scenarios of retrieval and classification of 3D shapes.
Unsupervised learning of 3D shape features is a challenging yet important problem for organizing large collections of 3D shape models that lack annotations. Recently proposed neural network-based approaches attempt to learn meaningful 3D shape features by autoencoding a single 3D shape representation such as voxels, a 3D point set, or multiview 2D images. However, a single shape representation is not sufficient to train an effective 3D shape feature extractor, as no existing shape representation can fully describe the geometry of a 3D shape by itself. In this paper, we propose transcoding across multiple 3D shape representations as an unsupervised method for obtaining expressive 3D shape features. A neural network called Shape Auto-Transcoder (SAT) learns to extract 3D shape features via cross-prediction of multiple heterogeneous 3D shape representations. The architecture and training objective of SAT are carefully designed to obtain an effective feature embedding. Experimental evaluation under 3D model retrieval and 3D model classification scenarios demonstrates the high accuracy as well as the compactness of the proposed 3D shape feature. The code of SAT is available at https://github.com/takahikof/ShapeAutoTranscoder.
3D point set reconstruction is an important and challenging 3D shape analysis task. Current state-of-the-art algorithms for 3D point set reconstruction employ a deep neural network (DNN) with an encoder–decoder architecture. Recently, decoder DNNs that fold multiple 2D planar patches to reconstruct a 3D shape have seen some success. These “patch-folding” decoders are adept at approximating smooth surfaces of 3D objects. However, 3D point sets generated by these decoders often lack local geometric detail, as the 2D planar patches tend to overly constrain the folding process. In this paper, we propose a novel decoder DNN for 3D point sets called Hyperplane Mixing and Folding Net (HMF-Net). HMF-Net feeds less constrained hyperplane patches, rather than 2D planar patches, into the folding process. As its core building block, HMF-Net uses a stack of token-mixing layers to effectively learn global consistency among the hyperplane patches. In addition to HMF-Net, we also propose a novel loss for 3D point set reconstruction called Weighted Chamfer Distance (WCD). WCD weights, or amplifies, the loss from parts of a shape that are highly variable across training samples by emphasizing higher point-pair distances between a generated point set and its ground-truth point set. This helps the decoder DNN learn shape details better. We comprehensively evaluate our algorithm under three 3D point set reconstruction scenarios, namely, shape completion, shape upsampling, and shape reconstruction from 2D images. Experimental results demonstrate that our algorithm yields higher accuracy than existing 3D point set reconstruction algorithms.
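The core mechanism of WCD can be sketched in a few lines of NumPy. The weighting scheme below (a softmax over nearest-neighbor squared distances, controlled by a sharpness parameter `alpha`) is an illustrative assumption, not the paper's exact formulation; it shows only the idea of amplifying loss terms with higher point-pair distances.

```python
import numpy as np

def chamfer_terms(generated, target):
    """Per-point nearest-neighbor squared distances in both directions."""
    diff = generated[:, None, :] - target[None, :, :]   # (N, M, 3)
    d2 = np.sum(diff ** 2, axis=-1)                     # (N, M)
    return d2.min(axis=1), d2.min(axis=0)               # gen→tgt, tgt→gen

def weighted_chamfer(generated, target, alpha=1.0):
    """Weighted Chamfer Distance sketch: larger point-pair distances
    receive larger softmax weights, emphasizing poorly fit regions.
    (alpha and the softmax weighting are hypothetical choices.)"""
    loss = 0.0
    for d in chamfer_terms(generated, target):
        w = np.exp(alpha * d)
        w = w / w.sum()            # normalized weights, sum to 1
        loss += np.sum(w * d)      # weighted mean instead of plain mean
    return loss
```

With `alpha = 0` the weights become uniform and the plain Chamfer distance is recovered, so the weighting can be annealed if desired.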
•Proposed a part-based 3D model retrieval (P3DMR) algorithm using deep learning.•Created two benchmark datasets for P3DMR.•Demonstrated effectiveness of the proposed approach for P3DMR.
Given a query that specifies a partial 3D shape, a Part-based 3D Model Retrieval (P3DMR) system finds 3D shapes whose part or parts match the query. One approach to P3DMR is to partition or segment whole models into sub-parts and perform query-part-to-target-parts matching. Whatever the definition of a part, e.g., a rectangular volume in Euclidean space or a region segmented on a mesh manifold, this computation is very costly: for each whole 3D shape in the database, the part-whole matching must account for the varying position, scale, and orientation of the segmented sub-parts. Another approach, aiming to make part-whole matching efficient, approximates the part-whole inclusion test with a single comparison between a pair of features, one representing the part-based query and the other representing the whole shape; aggregating local geometric features of parts into one feature per whole 3D shape, e.g., via the Bag-of-Features approach, is an example. This approach has so far suffered from inaccuracy, as the aggregation is not optimized for the part-whole inclusion test of 3D shapes. This paper proposes a novel P3DMR algorithm called Part-Whole Relation Embedding network (PWRE-net) that effectively and efficiently performs the part-whole inclusion test via a learned embedding into a common feature space. Using a deep neural network, PWRE-net learns, from a large number of part-whole shape pairs, a common embedding of partial shapes and their associated whole shapes. For training, datasets containing part-whole shape pairs are created automatically from unlabeled 3D models. Experimental evaluation shows that PWRE-net outperforms existing algorithms in terms of both retrieval accuracy and efficiency.
Unsupervised learning of feature representations is a challenging yet important problem for analyzing large collections of multimedia data that lack semantic labels. Recently proposed neural network-based unsupervised learning approaches have succeeded in obtaining features appropriate for classification of multimedia data. However, unsupervised learning of feature representations adapted to content-based matching, comparison, or retrieval of multimedia data has not been explored well. To obtain such retrieval-adapted features, we introduce the idea of combining diffusion distance on a feature manifold with neural network-based unsupervised feature learning. This idea is realized as a novel algorithm called DeepDiffusion (DD). DD simultaneously optimizes two components: a feature embedding by a deep neural network and a distance metric that leverages diffusion on a latent feature manifold. DD relies on its loss function rather than on a specific encoder architecture; it can thus be applied to diverse multimedia data types with their respective encoder architectures. Experimental evaluation using 3D shapes and 2D images demonstrates the versatility as well as the high accuracy of the DD algorithm. Code is available at https://github.com/takahikof/DeepDiffusion.
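The general flavor of diffusion on a latent feature manifold can be illustrated with a manifold-ranking-style iteration over a kNN graph of feature vectors. This is an illustrative sketch, not DD's actual loss or training procedure; the function name, parameters, and Gaussian/kNN affinity are all assumptions.

```python
import numpy as np

def diffusion_similarity(features, k=3, steps=10, alpha=0.9):
    """Sketch of manifold diffusion: relevance diffuses over a kNN
    affinity graph of the features, so similarity reflects the data
    manifold rather than raw pairwise distances."""
    n = len(features)
    d2 = np.sum((features[:, None] - features[None, :]) ** 2, axis=-1)
    sigma2 = np.mean(d2) + 1e-12
    A = np.exp(-d2 / sigma2)                # Gaussian affinity
    np.fill_diagonal(A, 0.0)
    keep = np.argsort(-A, axis=1)[:, :k]    # k strongest neighbors per row
    M = np.zeros_like(A)
    rows = np.arange(n)[:, None]
    M[rows, keep] = A[rows, keep]
    M = np.maximum(M, M.T)                  # symmetrize the kNN graph
    P = M / M.sum(axis=1, keepdims=True)    # row-stochastic transition matrix
    # iterate F <- alpha * P F + (1 - alpha) * I (one query per column)
    F = np.eye(n)
    for _ in range(steps):
        F = alpha * (P @ F) + (1.0 - alpha) * np.eye(n)
    return F  # F[i, j]: diffused relevance of item i for query j
```

Relevance stays high within a cluster of the manifold and decays across clusters, which is the behavior a retrieval-adapted metric wants.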
Non-rigid 3D shape retrieval has become an active and important research topic in content-based 3D object retrieval. The aim of this paper is to measure and compare the performance of state-of-the-art methods for non-rigid 3D shape retrieval. The paper develops a new benchmark consisting of 600 non-rigid 3D watertight meshes, equally divided into 30 categories, and carries out experiments for 11 different algorithms, whose retrieval accuracies are evaluated using six commonly used measures. Models and evaluation tools of the new benchmark are publicly available on our website.
► We develop a new benchmark consisting of 600 non-rigid 3D watertight meshes. ► We evaluate and compare 11 non-rigid 3D shape retrieval methods. ► We find that no single method performs best for all kinds of objects. ► Models and evaluation tools of the new benchmark are publicly available.
•Build a large-scale 3D shape retrieval benchmark that supports multi-modal queries.•Evaluate 26 3D shape retrieval methods using 3 types of metrics.•Solicit and identify state-of-the-art methods and promising related techniques.•Perform detailed analysis of the diverse methods w.r.t. accuracy and efficiency.•Make the benchmark and evaluation tools freely available to the community.
Large-scale 3D shape retrieval has become an important research direction in content-based 3D shape retrieval. To promote this research area, we organized two Shape Retrieval Contest (SHREC) 2014 tracks on large-scale comprehensive and sketch-based 3D model retrieval. Both tracks were based on a unified large-scale benchmark that supports multimodal queries (3D models and sketches). This benchmark contains 13,680 sketches and 8,987 3D models, divided into 171 distinct classes. It was compiled to be a superset of existing benchmarks and presents a new challenge to retrieval methods, as it comprises generic models as well as domain-specific model types. Twelve and six distinct 3D shape retrieval methods competed with each other in these two contests, respectively. To measure and compare the performance of the participating and other promising Query-by-Model or Query-by-Sketch 3D shape retrieval methods, and to solicit state-of-the-art approaches, we perform a more comprehensive comparison of twenty-six retrieval methods (eighteen originally participating algorithms and eight additional state-of-the-art or new ones) by evaluating them on the common benchmark. The benchmark, results, and evaluation tools are publicly available at our websites (http://www.itl.nist.gov/iad/vug/sharp/contest/2014/Generic3D/ and http://www.itl.nist.gov/iad/vug/sharp/contest/2014/SBR/).
•Build a small-scale and a large-scale sketch-based 3D model retrieval benchmark.•Evaluate the 15 best sketch-based 3D model retrieval algorithms on the two benchmarks.•Solicit and identify state-of-the-art methods and promising related techniques.•Incisive analysis of the diverse methods w.r.t. scalability and efficiency.•The benchmarks and evaluation tools provide a good reference for the community.
Sketch-based 3D shape retrieval has become an important research topic in content-based 3D object retrieval. To foster this research area, we organized two Shape Retrieval Contest (SHREC) tracks on this topic in 2012 and 2013, based on a small-scale and a large-scale benchmark, respectively. Six and five (nine in total) distinct sketch-based 3D shape retrieval methods competed with each other in these two contests. To measure and compare the performance of the top participating and other existing promising sketch-based 3D shape retrieval methods, and to solicit state-of-the-art approaches, we perform a more comprehensive comparison of the fifteen best retrieval methods (four top participating algorithms and eleven additional state-of-the-art methods) by evaluating each method on both benchmarks. The benchmarks, results, and evaluation tools for the two tracks are publicly available on our websites.
Sketch-Based 3D Object Retrieval (SB3DOR) algorithms retrieve 3D models similar to hand-drawn sketch queries. Sketching is one of the most effective modalities for querying 3D models by shape. However, comparing a sketch, which is a 2D image, with a 3D model is not straightforward. Most SB3DOR algorithms compare a sketch image with images of 3D models rendered from multiple viewpoints. Even so, the retrieval accuracies of state-of-the-art SB3DOR algorithms are still not satisfactory, due in part to the stylistic variation, semantic influence, abstraction, and drawing errors found in sketches. To improve the retrieval accuracy of SB3DOR systems, we propose a manifold-based similarity metric learning algorithm that relates two kinds of features, namely those of sketches and those of 3D models. Features in a high-dimensional space often lie on a lower-dimensional subspace, or manifold, and feature similarity can be computed more accurately on the manifold than in the original high-dimensional space. Our Cross-Domain Manifold Ranking (CDMR) algorithm keeps the two distinct feature manifolds, one for sketch features and the other for 3D model features, intact. The two manifolds are interlinked by inter-feature similarities to form a Cross-Domain Manifold (CDM); if available, semantic labels may also be used in forming the CDM. Relevance diffusion over the CDM is then used to compute similarities between a sketch and the 3D models in a database. Experimental evaluation showed that the CDMR algorithm produces higher retrieval accuracy than the algorithms we compared against.
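The central idea of linking two intra-domain feature manifolds with cross-domain similarities and diffusing relevance from a sketch query can be sketched with a block affinity matrix. The function names, Gaussian affinities, and manifold-ranking update below are illustrative assumptions, not CDMR's exact formulation.

```python
import numpy as np

def gaussian_affinity(X, Y, sigma=1.0):
    """Gaussian similarity between two sets of feature vectors."""
    d2 = np.sum((X[:, None] - Y[None, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def cdmr_rank(sketch_feats, model_feats, query_idx,
              sigma=1.0, alpha=0.9, steps=20):
    """Cross-domain manifold sketch: two intra-domain affinity blocks
    interlinked by cross-domain similarities; relevance diffuses from
    a sketch query over the combined graph to score the 3D models."""
    Wss = gaussian_affinity(sketch_feats, sketch_feats, sigma)
    Wmm = gaussian_affinity(model_feats, model_feats, sigma)
    Wsm = gaussian_affinity(sketch_feats, model_feats, sigma)
    W = np.block([[Wss, Wsm], [Wsm.T, Wmm]])   # cross-domain manifold
    np.fill_diagonal(W, 0.0)
    D = W.sum(axis=1)
    S = W / np.sqrt(D[:, None] * D[None, :])   # symmetric normalization
    y = np.zeros(len(W))
    y[query_idx] = 1.0                         # one-hot sketch query
    f = y.copy()
    for _ in range(steps):
        f = alpha * (S @ f) + (1.0 - alpha) * y
    return f[len(sketch_feats):]               # relevance scores of models
```

Because diffusion runs over the joint graph, a model can receive relevance both directly from the query sketch and indirectly via similar sketches and similar models.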
Invariance against rotation of 3D objects is one of the essential properties for 3D shape analysis. Recently proposed algorithms (e.g., [9, 10, 12, 13]) have achieved rotation-invariant 3D point set analysis by using inherently rotation-invariant 3D shape features, i.e., distances and angles among 3D points, as input to Deep Neural Networks (DNNs). The DNNs capture spatial hierarchy and context among the geometric features to produce accurate analytical results. In this paper, we delve further into the DNN-based approach to rotation-invariant and highly accurate 3D point set analysis. In particular, we focus on segmentation of 3D point sets, one of the most challenging 3D point set analysis tasks. We propose a novel DNN for 3D point set segmentation called Rotation-invariant and Multi-scale feature Graph convolutional neural network, or RMGnet. RMGnet is more flexible than previous methods, as it accepts as input any handcrafted 3D shape feature that is rotation-invariant. In addition, to accurately segment 3D point sets composed of parts of various sizes, we randomize the scales at which the handcrafted features are extracted and perform multi-resolution analysis of the features with the DNN. Experimental evaluation demonstrates the high segmentation accuracy as well as the rotation invariance of the proposed RMGnet.
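The kind of handcrafted rotation-invariant input such networks consume can be illustrated with distances and angles, which depend only on relative geometry. The particular feature choice below (two nearest-neighbor distances plus the angle between the neighbor directions, per point) is an illustrative assumption, not RMGnet's prescribed feature set.

```python
import numpy as np

def pairwise_invariants(points):
    """Per-point rotation-invariant features: distances to the two
    nearest neighbors and the cosine of the angle between the two
    neighbor directions. All are unchanged by rigid rotation."""
    diff = points[:, None, :] - points[None, :, :]    # diff[i, j] = p_i - p_j
    dist = np.linalg.norm(diff, axis=-1)
    feats = []
    for i in range(len(points)):
        order = np.argsort(dist[i])[1:3]              # two nearest neighbors
        v1 = diff[order[0], i]                        # vector i -> neighbor 1
        v2 = diff[order[1], i]                        # vector i -> neighbor 2
        cos = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
        feats.append([dist[i, order[0]], dist[i, order[1]], cos])
    return np.array(feats)

def random_rotation(seed=0):
    """Random proper 3D rotation via QR decomposition of a Gaussian matrix."""
    rng = np.random.default_rng(seed)
    Q, R = np.linalg.qr(rng.normal(size=(3, 3)))
    Q *= np.sign(np.diag(R))                          # make Q unique
    if np.linalg.det(Q) < 0:                          # ensure det(Q) = +1
        Q[:, 0] = -Q[:, 0]
    return Q
```

Applying any rotation `Q` to the point set leaves these features numerically identical, which is exactly the property that lets a DNN consuming them inherit rotation invariance.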