Teaching a Robot the Semantics of Assembly Tasks
Savarimuthu, Thiusius Rajeeth; Buch, Anders Glent; Schlette, Christian; et al.
IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 48, no. 5, May 2018
Journal Article
Peer reviewed
Open access
We present a three-level cognitive system in a learning by demonstration context. The system allows for learning and transfer on the sensorimotor level as well as on the planning level. The fundamentally different data structures associated with these two levels are connected by an efficient mid-level representation based on so-called "semantic event chains." We describe details of the representations and quantify the effect of the associated learning procedures for each level under different amounts of noise. Moreover, we demonstrate the performance of the overall system with three demonstrations performed at a project review. The described system has a technology readiness level (TRL) of 4, which will be raised to TRL 6 in an ongoing follow-up project.
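The event-chain idea can be sketched compactly: represent each observation frame as the set of segment pairs that currently touch, and keep only the time steps at which some touching relation flips, i.e., the "events". A minimal Python sketch under that assumption (the pair names and input encoding below are illustrative, not the paper's data format):

```python
def semantic_event_chain(frames):
    """Compress a sequence of contact graphs (each frame = the set of
    touching segment pairs) into an event chain: per pair, keep only
    the columns where some touching relation changes."""
    pairs = sorted({p for graph in frames for p in graph})
    # One binary row per segment pair: 1 = touching in that frame.
    rows = {p: [int(p in graph) for graph in frames] for p in pairs}
    # Keep the first frame plus every frame where any relation flips.
    keep = [0] + [t for t in range(1, len(frames))
                  if any(rows[p][t] != rows[p][t - 1] for p in pairs)]
    return {p: [rows[p][t] for t in keep] for p in pairs}

# Toy peg-in-hole sequence: hand holds peg, peg meets hole, hand releases.
frames = [{"hand-peg"}, {"hand-peg"},
          {"hand-peg", "peg-hole"}, {"hand-peg", "peg-hole"},
          {"peg-hole"}]
sec = semantic_event_chain(frames)  # five frames compress to three events
```

Two plans produce equivalent manipulations exactly when their compressed chains match, which is what makes this a useful bridge between sensorimotor data and symbolic planning.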
In this paper, we review current knowledge on tool use development in infants in order to provide relevant information to cognitive developmental roboticists seeking to design artificial systems that develop tool use abilities. This information covers: 1) sketching developmental pathways leading to tool use competences; 2) the characterization of learning and test situations; 3) the crystallization of seven mechanisms underlying the developmental process; and 4) the formulation of a number of challenges and recommendations for designing artificial systems that exhibit tool use abilities in complex contexts.
Computational modeling of the primate visual system yields insights of potential relevance to some of the challenges that computer vision is facing, such as object recognition and categorization, motion detection and activity recognition, or vision-based navigation and manipulation. This paper reviews some functional principles and structures that are generally thought to underlie the primate visual cortex, and attempts to extract biological principles that could further advance computer vision research. Organized for a computer vision audience, we present functional principles of the processing hierarchies present in the primate visual system, considering recent discoveries in neurophysiology. The hierarchical processing in the primate visual system is characterized by a sequence of different levels of processing (on the order of 10) that constitute a deep hierarchy, in contrast to the flat vision architectures predominantly used in today's mainstream computer vision. We hope that the functional description of the deep hierarchies realized in the primate visual system provides valuable insights for the design of computer vision algorithms, fostering increasingly productive interaction between biological and computer vision research.
A performance evaluation of point pair features
Kiforenko, Lilita; Drost, Bertram; Tombari, Federico; et al.
Computer Vision and Image Understanding, vol. 166, January 2018
Journal Article
Peer reviewed
• This work presents an evaluation of PPF features on a large set of 3D scenes.
• First, the internal variations of PPFs are evaluated.
• Then, PPFs are compared to local histogram feature descriptors.
• The evaluation is made at the feature and pose estimation levels.
More than a decade ago, point pair features (PPFs) were introduced, showing great potential for 3D object detection and pose estimation under very different conditions. Many modifications have been made to the original PPF, in each case showing varying degrees of improvement for specific datasets. However, to the best of our knowledge, no comprehensive evaluation of these features has been made. In this work, we evaluate PPFs on a large set of 3D scenes. We not only compare PPFs to local point cloud descriptors, but also investigate the internal variations of PPFs (different types of relations between two points). Our comparison is made on 7 publicly available datasets that vary in a number of parameters, e.g., acquisition technique, the number of objects/scenes, and the amount of occlusion and clutter. We evaluate feature performance both at a point-wise object-scene correspondence level and for overall object detection and pose estimation in a RANSAC pipeline. Additionally, we present object detection and pose estimation results for the original, voting-based PPF algorithm. Our results show that PPFs are in general the top performer; however, on datasets with low-resolution data, local histogram features show higher performance than PPFs. We also found that, compared to most local histogram features, PPFs degrade faster under disturbances such as occlusion and clutter; nevertheless, they remain more descriptive on an absolute scale. The main contribution of this paper is a detailed analysis of PPFs, which highlights the conditions under which PPFs perform particularly well, as well as their main weaknesses.
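The classic four-dimensional PPF relates two oriented surface points by their distance, the angle between each normal and the connecting line, and the angle between the two normals. A minimal sketch of this standard definition in plain Python (unit-length normals assumed):

```python
import math

def point_pair_feature(p1, n1, p2, n2):
    """4D point pair feature between two oriented points (p, n):
    (||d||, angle(n1, d), angle(n2, d), angle(n1, n2))."""
    d = [b - a for a, b in zip(p1, p2)]
    dist = math.sqrt(sum(c * c for c in d))
    d_hat = [c / dist for c in d]

    def angle(u, v):
        # Clamp the dot product to guard against rounding just outside [-1, 1].
        return math.acos(max(-1.0, min(1.0, sum(a * b for a, b in zip(u, v)))))

    return (dist, angle(n1, d_hat), angle(n2, d_hat), angle(n1, n2))

# Two points one unit apart on a flat patch (parallel normals):
f = point_pair_feature((0, 0, 0), (0, 0, 1), (1, 0, 0), (0, 0, 1))
```

The "internal variations" evaluated in the paper replace or extend these four relations (e.g., using color or curvature), while the feature's pairwise, pose-invariant structure stays the same.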
In Mobile Manipulation (MM), navigation and manipulation are generally solved as subsequent, disjoint tasks. Combined optimization of navigation and manipulation costs can improve the time efficiency of MM. However, this is challenging, as the precise object pose estimates necessary for such combined optimization are often not available until the later stages of MM. Moreover, optimizing navigation and manipulation costs with conventional planning methods under uncertain object pose estimates can lead to failures and thus requires re-planning. In the presence of object pose uncertainty, pre-active approaches are therefore preferred. We propose such a pre-active approach for determining the base pose and pre-grasp manipulator configuration to improve the time efficiency of MM. We devise a Reinforcement Learning (RL) based solution that learns suitable base poses for grasping and pre-grasp manipulator configurations using layered learning, which guides exploration and enables sample-efficient learning. Further, we accelerate the learning of pre-grasp manipulator configurations by providing dense rewards using a predictor network trained on previously learned base poses for grasping. Our experiments validate that, in the presence of uncertain object pose estimates, the proposed approach reduces execution time. Finally, we show that our policy learned in simulation can be easily transferred to a real robot.
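As a rough illustration of the dense-reward idea, the shaping term can be the negative distance between the current pre-grasp manipulator configuration and the one suggested by a predictor for the chosen base pose. Every name and interface below is hypothetical, a sketch of reward shaping rather than the paper's implementation:

```python
def dense_reward(base_pose, joints, predictor, weight=1.0):
    """Shaped (dense) reward for pre-grasp learning: negative Euclidean
    distance between the current joint configuration and the one a
    predictor network suggests for this base pose. All names here are
    hypothetical, not the paper's API."""
    target = predictor(base_pose)
    err = sum((a - b) ** 2 for a, b in zip(joints, target)) ** 0.5
    return -weight * err

# Toy predictor that always suggests the zero configuration:
r = dense_reward((0.0, 0.0, 0.0), [3.0, 4.0, 0.0], lambda bp: [0.0, 0.0, 0.0])
```

Compared with a sparse success/failure signal, such a term gives the RL agent a gradient toward useful configurations at every step, which is what makes the learning sample-efficient.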
Humans, but also robots, learn to improve their behavior. Without existing knowledge, learning either needs to be explorative and thus slow, or, to be more efficient, it needs to rely on supervision, which may not always be available. However, once some knowledge base exists, an agent can make use of it to improve learning efficiency and speed. This happens for children at around the age of three, when they very quickly begin to assimilate new information by making guided guesses about how it fits their prior knowledge. This is a very efficient generative learning mechanism in the sense that existing knowledge is generalized into as-yet unexplored, novel domains. So far, generative learning has not been employed for robots, and robot learning remains a slow and tedious process. The goal of the current study is to devise, for the first time, a general framework for a generative process that improves learning and can be applied at all levels of the robot's cognitive architecture. To this end, we introduce the concept of structural bootstrapping, borrowed and modified from child language acquisition, to define a probabilistic process that uses existing knowledge together with new observations to supplement our robot's database with missing information about planning-, object-, and action-relevant entities. In a kitchen scenario, we use the example of making batter by pouring and mixing two components and show that the agent can efficiently acquire new knowledge about planning operators and objects, as well as the required motor patterns for stirring, by structural bootstrapping. We also show benchmarks demonstrating how structural bootstrapping improves performance.
We propose a new methodology for the learning and adaptation of manipulation skills that involve physical contact with the environment. Pure position control is unsuitable for such tasks because even small errors in the desired trajectory can cause significant deviations from the desired forces and torques. The proposed algorithm takes a reference Cartesian trajectory and a force/torque profile as input and adapts the movement so that the resulting forces and torques match the reference profiles. The learning algorithm is based on dynamic movement primitives and a quaternion representation of orientation, which provide the mathematical machinery for efficient and stable adaptation. Experimentally, we show that the robot's performance can be significantly improved within a few iteration steps, compensating for vision and other errors that might arise during the execution of the task. We also show that our methodology is suitable both for robots with admittance control and for robots with impedance control.
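The adaptation loop can be sketched as an iterative-learning-style update that shifts the positional trajectory in proportion to the force-tracking error at each sample. The paper encodes the adapted movement in a dynamic movement primitive; the sketch below keeps raw per-sample offsets instead, and the `execute` interface and gain are illustrative assumptions:

```python
def adapt_trajectory(ref_forces, execute, gain=5e-4, iterations=20):
    """Iteratively shift a reference trajectory so that measured contact
    forces track a reference force profile. `execute` runs one rollout
    with the current positional offsets and returns the measured forces
    (an ILC-style sketch; not the paper's DMP-based formulation)."""
    offsets = [0.0] * len(ref_forces)
    for _ in range(iterations):
        measured = execute(offsets)
        # Push deeper where force is too low, retract where it is too high.
        offsets = [o + gain * (fr - fm)
                   for o, fr, fm in zip(offsets, ref_forces, measured)]
    return offsets

# Toy contact model: measured force = stiffness * offset (stiffness 1000 N/m).
offsets = adapt_trajectory([10.0, 10.0], lambda o: [1000.0 * x for x in o])
```

In the toy model the offsets converge to 0.01 m, the depth at which the 1000 N/m contact produces the desired 10 N, mirroring how a few iterations can compensate for vision errors in the real task.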
Demographic change is expected to challenge many societies in the next few decades if today's standards of services in, e.g., elder care are to be maintained. Robots are considered a way to at least partially mitigate this challenge; however, robots are rarely applied in the welfare domain. This paper describes the development of a concept for a novel welfare robot that is modular and affordable. The development is based on a participatory design process and takes the strengths and limitations of selected, commercially available robots into account. This work contributes a design methodology specific to welfare robots and a resulting robot concept that addresses three use cases in a care center. The concept includes multi-modal robot perception that facilitates proactive robot behavior for achieving smooth interactions with end-users.
This paper presents a robot system for performing pick and place operations with deformable objects. The system uses a structured light scanner to capture a point cloud of the object to be grasped. This point cloud is then analyzed to determine a pick and place action. Finally, the determined action is executed by the robot to solve the task. The robotic placement strategy contains several free parameters, which should be chosen in a context-specific manner. To determine these parameters, we rely on simulation-based optimization for the individual use cases. The entire system is tested extensively in real-world trials. First, the reliability of the grasp is evaluated on 7 different types of pork cuts. Then, the validity of the simulation-based optimization of the placement strategy is evaluated on the 2 most dissimilar pork cuts, to show the generality of the overall approach.
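Simulation-based optimization of the free placement parameters can be as simple as scoring sampled parameter sets in the simulator and keeping the best. The sketch below uses random search with a hypothetical `simulate` scoring function and parameter bounds; it is a generic stand-in, not the paper's optimizer:

```python
import random

def optimize_placement(simulate, bounds, n_samples=200, seed=0):
    """Random-search optimization of free placement parameters:
    sample each parameter uniformly within its bounds, score the
    resulting placement in simulation, keep the best. `simulate`
    and the parameter names are hypothetical."""
    rng = random.Random(seed)  # fixed seed for reproducible tuning runs
    best, best_score = None, float("-inf")
    for _ in range(n_samples):
        params = {k: rng.uniform(lo, hi) for k, (lo, hi) in bounds.items()}
        score = simulate(params)
        if score > best_score:
            best, best_score = params, score
    return best, best_score

# Toy objective: placements score best at a drop height of 0.1 m.
best, score = optimize_placement(lambda p: -(p["drop_height"] - 0.1) ** 2,
                                 {"drop_height": (0.0, 0.5)})
```

Because each use case (here, each pork cut) gets its own optimization run, the same pipeline transfers across objects by re-scoring rather than re-engineering.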
We provide new insights into the problem of shape feature description and matching, techniques that are often applied within 3D object recognition pipelines. We subject several state-of-the-art features to systematic evaluations based on multiple datasets from different sources in a uniform manner. We have carefully prepared and performed a neutral test on datasets for which the descriptors have shown good recognition performance. Our results expose an important fallacy in previous results, namely that the performance of a recognition system does not correlate well with the performance of the descriptor it employs. In addition, we evaluate several aspects of the matching task, including the efficiency of the different features and the potential of dimension reduction. To arrive at better generalization properties, we introduce a method for fusing several feature matches with limited processing overhead. Our fused feature matches provide a significant increase in matching accuracy, which is consistent over all tested datasets. Finally, we benchmark all features in a 3D object recognition setting, providing further evidence of the advantage of fused features, both in terms of accuracy and efficiency.
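One common, low-overhead way to fuse matches from several descriptors is rank-level fusion: for each candidate correspondence, sum the rank it receives under each descriptor's distance table, and keep the candidate with the lowest total. This is a generic sketch of that strategy, not necessarily the paper's exact fusion rule:

```python
def fuse_matches(distance_tables):
    """Rank-level fusion of feature matches for one query feature.
    `distance_tables` holds, per descriptor, the distances from the
    query to every candidate target. Returns the index of the target
    with the lowest summed rank across descriptors."""
    n_targets = len(distance_tables[0])
    total_rank = [0] * n_targets
    for dists in distance_tables:
        order = sorted(range(n_targets), key=lambda j: dists[j])
        for rank, j in enumerate(order):
            total_rank[j] += rank  # 0 = best match under this descriptor
    return min(range(n_targets), key=lambda j: total_rank[j])

# Two descriptors, three candidate targets; both rank target 1 first.
best = fuse_matches([[0.9, 0.1, 0.5], [0.8, 0.2, 0.3]])
```

Working on ranks rather than raw distances sidesteps the fact that different descriptors live on incomparable distance scales, which is one reason fused matches generalize across datasets.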