We present a learning-based method for 6 DoF pose estimation of rigid objects in point cloud data. Many recent learning-based approaches rely primarily on RGB information for detecting objects, in some cases with an added refinement step using depth data. Our method consumes unordered point sets, with or without RGB information, from initial detection to the final transformation estimation stage. This allows us to achieve accurate pose estimates, in some cases surpassing state-of-the-art methods trained on the same data.
Pose estimation is the task of determining the 6D pose of an object in a scene. Pose estimation extends the abilities and flexibility of robotic set-ups. However, the system must be configured for the use case to perform adequately. This configuration is time-consuming and limits the usability of pose estimation and, thereby, of robotic systems. Deep learning can bypass this configuration procedure by learning parameters directly from a dataset. However, obtaining this training data can also be very time-consuming. The use of synthetic training data avoids this data collection problem, but the training procedure must then be configured to overcome the domain gap. Additionally, the pose estimation parameters also need to be configured. This configuration is jokingly known as grad student descent, as parameters are manually adjusted until satisfactory results are obtained. This paper presents a method for automatic configuration using only synthetic data. This is accomplished by learning the domain randomization during network training, and then using the learned domain randomization to optimize the pose estimation parameters. The developed approach achieves state-of-the-art performance of 82.0 % recall on the challenging OCCLUSION dataset, outperforming all previous methods by a large margin. These results demonstrate the validity of automatically setting up pose estimation using purely synthetic data.
A performance evaluation of point pair features
Kiforenko, Lilita; Drost, Bertram; Tombari, Federico et al.
Computer vision and image understanding, January 2018, Volume 166
Journal Article, Peer reviewed
•This work presents an evaluation of PPF features on a large set of 3D scenes.
•First, the internal variations of PPFs are evaluated.
•Then, PPFs are compared to local histogram feature descriptors.
•The evaluation is made at both the feature and the pose estimation level.
More than a decade ago, point pair features (PPFs) were introduced, showing great potential for 3D object detection and pose estimation under very different conditions. Many modifications have been made to the original PPF, in each case showing varying degrees of improvement for specific datasets. However, to the best of our knowledge, no comprehensive evaluation of these features has been made. In this work, we evaluate PPFs on a large set of 3D scenes. We not only compare PPFs to local point cloud descriptors, but also investigate the internal variations of PPFs (different types of relations between two points). Our comparison is made on 7 publicly available datasets, showing variation in a number of parameters, e.g. acquisition technique, the number of objects/scenes, and the amount of occlusion and clutter. We evaluate feature performance both at a point-wise object-scene correspondence level and for overall object detection and pose estimation in a RANSAC pipeline. Additionally, we present object detection and pose estimation results for the original, voting-based PPF algorithm. Our results show that PPF is, in general, the top performer; however, on datasets with low-resolution data, local histogram features show higher performance than PPFs. We also found that PPFs degrade faster than most local histogram features under disturbances such as occlusion and clutter; however, PPFs still remain more descriptive on an absolute scale. The main contribution of this paper is a detailed analysis of PPFs, which highlights the conditions under which PPFs perform particularly well, as well as their main weaknesses.
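The internal variations evaluated above all build on the classic four-dimensional point pair feature of Drost et al.: for two oriented surface points, F = (||d||, ∠(n1, d), ∠(n2, d), ∠(n1, n2)) with d = p2 − p1. As a concrete reference point, a minimal NumPy sketch of that original feature (the helper names `point_pair_feature` and `_angle` are ours):

```python
import numpy as np

def _angle(a, b):
    # Angle between two vectors in [0, pi], clipped for numerical safety.
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def point_pair_feature(p1, n1, p2, n2):
    # F(p1, p2) = (||d||, angle(n1, d), angle(n2, d), angle(n1, n2)), d = p2 - p1
    d = p2 - p1
    return (float(np.linalg.norm(d)), _angle(n1, d), _angle(n2, d), _angle(n1, n2))

# Two points one unit apart on a flat patch (parallel normals):
f = point_pair_feature(np.array([0., 0., 0.]), np.array([0., 0., 1.]),
                       np.array([1., 0., 0.]), np.array([0., 0., 1.]))
# f = (1.0, pi/2, pi/2, 0.0)
```

Because the feature is built only from distances and relative angles, it is invariant to rigid transformations, which is what makes it usable as a global hash key in the voting scheme.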
Individual testing of samples is time- and cost-intensive, particularly during an ongoing pandemic. Better practical alternatives to individual testing can significantly decrease the burden of disease on the healthcare system. Herein, we present the clinical validation of Segtnan™ on 3929 patients. Segtnan™ is available as a mobile application entailing an AI-integrated personalized risk assessment approach with a novel data-driven equation for pooling of biological samples. The AI was selected from a comparison between 15 machine learning classifiers (highest accuracy = 80.14%) and a feed-forward neural network with an accuracy of 81.38% in predicting rRT-PCR test results based on a designed survey with minimal clinical questions. Furthermore, we derived a novel pool-size equation from the pooling data of 54 published original studies. The results demonstrated a testing capacity increase of 750%, 60%, and 5% at prevalence rates of 0.05%, 22%, and 50%, respectively. Compared to Dorfman's method, our novel equation saved significantly more tests at high prevalence, i.e., 28% (p = 0.006), 40% (p = 0.00001), and 66% (p = 0.02). Lastly, we illustrated the feasibility of Segtnan™ usage in clinically complex settings such as emergency and psychiatric departments.
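The paper's own pool-size equation is not reproduced in the abstract, but the classic Dorfman baseline it is compared against is easy to state: with pool size k and prevalence p, every person costs 1/k of a pool test, plus k retests whenever the pool is positive (probability 1 − (1 − p)^k), so the expected number of tests per person is 1/k + 1 − (1 − p)^k. A minimal sketch of that baseline (function names are ours; the numbers illustrate Dorfman pooling only, not Segtnan™'s equation):

```python
def dorfman_tests_per_person(p, k):
    """Expected tests per person under classic Dorfman pooling:
    1/k for the pool test, plus k individual retests with probability
    1 - (1-p)^k that the pool comes back positive."""
    return 1.0 / k + 1.0 - (1.0 - p) ** k

def best_pool_size(p, k_max=100):
    # Brute-force the pool size that minimizes expected tests per person.
    return min(range(2, k_max + 1), key=lambda k: dorfman_tests_per_person(p, k))

for p in (0.0005, 0.05, 0.22):
    k = best_pool_size(p)
    t = dorfman_tests_per_person(p, k)
    print(f"prevalence {p:.2%}: pool size {k}, {t:.3f} tests/person "
          f"({1 / t:.1f}x capacity)")
```

At very low prevalence the optimal pool grows large and the capacity gain is dramatic, while near 30% prevalence pooling barely helps, which is consistent with the trend of the figures reported above.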
Teaching a Robot the Semantics of Assembly Tasks
Savarimuthu, Thiusius Rajeeth; Buch, Anders Glent; Schlette, Christian et al.
IEEE Transactions on Systems, Man, and Cybernetics: Systems, May 2018, Volume 48, Issue 5
Journal Article, Peer reviewed, Open access
We present a three-level cognitive system in a learning by demonstration context. The system allows for learning and transfer on the sensorimotor level as well as the planning level. The fundamentally different data structures associated with these two levels are connected by an efficient mid-level representation based on so-called "semantic event chains." We describe details of the representations and quantify the effect of the associated learning procedures for each level under different amounts of noise. Moreover, we demonstrate the performance of the overall system in three demonstrations performed at a project review. The described system has a technology readiness level (TRL) of 4, which in an ongoing follow-up project will be raised to TRL 6.
We present an approach to learn dense, continuous 2D-3D correspondence distributions over the surface of objects from data with no prior knowledge of visual ambiguities like symmetry. We also present a new method for 6D pose estimation of rigid objects using the learnt distributions to sample, score and refine pose hypotheses. The correspondence distributions are learnt with a contrastive loss, represented in object-specific latent spaces by an encoder-decoder query model and a small fully connected key model. Our method is unsupervised with respect to visual ambiguities, yet we show that the query and key models learn to represent accurate multi-modal surface distributions. Our pose estimation method improves the state of the art significantly on the comprehensive BOP Challenge, trained purely on synthetic data, even compared with methods trained on real data. The project site is at surfemb.github.io.
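The core object in this formulation is a distribution over 3D surface points for each 2D query: a query embedding is compared against key embeddings of sampled surface points, and a softmax over the similarities yields the correspondence distribution. The following is a simplified toy sketch of that query/key idea (all names and dimensions are ours, not the authors' implementation; a symmetric object would put comparable mass on several surface points rather than one):

```python
import numpy as np

def correspondence_distribution(query, keys):
    """Softmax over query-key similarities: a distribution over candidate
    3D surface points for a single 2D query pixel. Simplified sketch of
    a contrastive query/key formulation."""
    logits = keys @ query            # (N,) one similarity per surface point
    logits -= logits.max()           # subtract the max for numerical stability
    w = np.exp(logits)
    return w / w.sum()

rng = np.random.default_rng(0)
keys = rng.normal(size=(1000, 64))   # toy key embeddings for 1000 surface points
query = keys[42]                     # a query aligned with surface point 42
p = correspondence_distribution(query, keys)
# p sums to 1 and concentrates its mass on the matching surface point
```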
It is possible to associate a highly constrained subset of relative 6 DoF poses between two 3D shapes, as long as the local surface orientation, the normal vector, is available at every surface point. Local shape features can be used to find putative point correspondences between the models due to their ability to handle noisy and incomplete data. However, this correspondence set is usually contaminated by outliers in practical scenarios, which has led to many past contributions based on robust detectors such as the Hough transform or RANSAC. The key insight of our work is that a single correspondence between oriented points on the two models is constrained to cast votes in a 1 DoF rotational subgroup of the full group of poses, SE(3). Kernel density estimation allows the set of votes to be combined efficiently to determine a full 6 DoF candidate pose between the models. This modal pose with the highest density is stable under challenging conditions, such as noise, clutter, and occlusions, and provides the output estimate of our method. We first analyze the robustness of our method with respect to noise and show that it handles high outlier rates much better than RANSAC for the task of 6 DoF pose estimation. We then apply our method to four state-of-the-art datasets for 3D object recognition that contain occluded and cluttered scenes. Our method achieves perfect recall on two LIDAR datasets and outperforms competing methods on two RGB-D datasets, thus setting a new standard for general 3D object recognition using point cloud data.
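The full method estimates density over SE(3); as a deliberately simplified 1-D illustration of the vote-then-KDE idea (all names are ours, and this is not the authors' implementation), the sketch below recovers the modal rotation angle from a set of votes even when outlier votes are in the majority, which is the robustness property the abstract contrasts with RANSAC:

```python
import numpy as np

def kde_mode(votes, bandwidth=0.1, grid=360):
    """Gaussian KDE over circular votes (angles in radians); returns the
    modal angle. A 1-D stand-in for density estimation over SE(3)."""
    thetas = np.linspace(-np.pi, np.pi, grid, endpoint=False)
    # Wrapped angular difference in (-pi, pi] between grid points and votes.
    diff = (thetas[:, None] - votes[None, :] + np.pi) % (2 * np.pi) - np.pi
    density = np.exp(-0.5 * (diff / bandwidth) ** 2).sum(axis=1)
    return thetas[density.argmax()]

rng = np.random.default_rng(1)
true_angle = 0.8
inliers = true_angle + rng.normal(scale=0.05, size=40)   # consistent votes
outliers = rng.uniform(-np.pi, np.pi, size=60)           # 60% scattered outliers
est = kde_mode(np.concatenate([inliers, outliers]))
# est lands near true_angle despite the outlier majority
```

Because the mode only needs the inlier votes to agree, not a clean sample of them, the estimate degrades gracefully as the outlier rate grows.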
Purpose
– The purpose of this paper is to propose a new algorithm based on programming by demonstration and exception strategies to solve assembly tasks such as peg-in-hole.
Design/methodology/approach
– Data describing the demonstrated tasks are obtained by kinesthetic guiding. The demonstrated trajectories are transferred to new robot workspaces using three-dimensional (3D) vision. Noise introduced by vision when transferring the task to a new configuration could cause the execution to fail, but such problems are resolved through exception strategies.
Findings
– This paper demonstrates that the proposed approach, combined with exception strategies, outperforms traditional approaches to robot-based assembly. Experimental evaluation was carried out on the Cranfield Benchmark, which constitutes a standardized assembly task in robotics. A statistical evaluation was also performed, based on experiments carried out on two different robotic platforms.
Practical implications
– The developed framework can have an important impact for robot assembly processes, which are among the most important applications of industrial robots. Our future plans involve implementation of our framework in a commercially available robot controller.
Originality/value
– This paper proposes a new approach to robot assembly based on the Learning by Demonstration (LbD) paradigm. The proposed framework enables new assembly tasks to be programmed quickly, without the need for detailed analysis of the geometric and dynamic characteristics of the workpieces involved in the assembly task. The algorithm provides effective disturbance rejection, improved stability and increased overall performance. The proposed exception strategies increase the success rate of the algorithm when the task is transferred to new areas of the workspace, where it is necessary to deal with vision noise and altered dynamic characteristics of the task.
Since the introduction of modern deep learning methods for object pose estimation, test accuracy and efficiency have increased significantly. For training, however, large amounts of annotated training data are required for good performance. While the use of synthetic training data removes the need for manual annotation, there is currently a large performance gap between methods trained on real and synthetic data. This paper introduces a new method which bridges this gap. Most methods trained on synthetic data use 2D images, as domain randomization in 2D is more developed. To obtain precise poses, many of these methods perform a final refinement using 3D data. Our method integrates the 3D data into the network to increase the accuracy of the pose estimation. To allow for domain randomization in 3D, a sensor-based data augmentation has been developed. Additionally, we introduce the SparseEdge feature, which uses a wider search space during point cloud propagation to avoid relying on specific features without increasing run-time. Experiments on three large pose estimation benchmarks show that the presented method outperforms previous methods trained on synthetic data and achieves results comparable to existing methods trained on real data.
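The abstract does not spell out the sensor-based augmentation, but the general shape of 3D domain randomization on point clouds is standard: randomly drop points (missing sensor returns), jitter coordinates (depth noise), and apply small random rigid transforms. A generic sketch under those assumptions (names and parameter values are ours, not the paper's sensor model):

```python
import numpy as np

def augment_cloud(points, rng, jitter=0.002, drop=0.1, max_angle=0.1):
    """Generic 3D domain randomization for an (N, 3) point cloud:
    random dropout, per-point Gaussian jitter, and a small random
    rotation about z. An illustrative stand-in, not the paper's
    specific sensor-based augmentation."""
    keep = rng.random(len(points)) > drop              # simulate missing returns
    pts = points[keep] + rng.normal(scale=jitter, size=(int(keep.sum()), 3))
    theta = rng.uniform(-max_angle, max_angle)         # small pose perturbation
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return pts @ R.T

rng = np.random.default_rng(0)
cloud = rng.normal(size=(2048, 3))
aug = augment_cloud(cloud, rng)                        # perturbed training sample
```

Applying a fresh perturbation every epoch forces the network to rely on geometry that survives sensor noise, which is the point of randomizing in 3D rather than only in 2D image space.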
Abstract
The ability to quantify and compare typography has potential in many disciplines such as marketing, branding, education, and literacy studies. However, formal features of typography have been difficult to operationalize for quantitative analysis. The article proposes a quantitative, distinctive feature-based framework for describing and comparing fonts. The analyses made using the framework yield a clear and quantifiable separation of well-established typographical categories. It is also sensitive enough to pick up even small variations between fonts. The framework can aid in developing a more generally accepted typographical meta-language that allows for comparison and cross-fertilization of typographical knowledge across disciplines.