Accurate detection and estimation of pallet poses from color and depth (RGB-D) data are integral components of many advanced intelligent warehouse systems. State-of-the-art object pose estimation methods follow a two-stage process, relying on off-the-shelf segmentation or object detection in the initial stage and subsequently predicting the pose of objects from cropped images. The cropped patches may include both the target object and irrelevant information, such as background or other objects, leading to challenges in handling pallets in warehouse settings with heavy occlusions from loaded objects. In this study, we propose an innovative deep learning-based approach to address the occlusion problem in pallet pose estimation from RGB-D images. Inspired by the selective attention mechanism in human perception, our model learns to identify and attenuate the significance of features in occluded regions, focusing on the visible and informative areas for accurate pose estimation. Instead of directly estimating pallet poses from cropped patches as in existing methods, we introduce two feature map re-weighting modules with cross-modal attention. These modules effectively filter out features from occluded regions and the background, enhancing pose estimation accuracy. Furthermore, we introduce a large-scale annotated pallet dataset specifically designed to capture occlusion scenarios in warehouse environments, facilitating comprehensive training and evaluation. Experimental results on the newly collected pallet dataset show that our proposed method increases accuracy by 13.5% compared to state-of-the-art methods.
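The cross-modal re-weighting idea above can be illustrated with a minimal sketch. The function below is a hand-crafted toy stand-in, not the paper's learned module: it scores each spatial location by the agreement between its RGB and depth features and scales the RGB feature by that score, so locations where the modalities disagree (e.g. occluded regions) are attenuated. All names here (`reweight_features`, the dot-product "agreement" score) are illustrative assumptions.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def reweight_features(rgb_feats, depth_feats):
    """Toy cross-modal re-weighting: each location's RGB feature is scaled
    by a confidence score in (0, 1) derived from its agreement with the
    corresponding depth feature.  In the paper the weighting is learned;
    here a raw dot product stands in as the agreement measure."""
    weighted = []
    for f_rgb, f_d in zip(rgb_feats, depth_feats):
        # dot product as a crude cross-modal agreement, squashed to (0, 1)
        score = sigmoid(sum(a * b for a, b in zip(f_rgb, f_d)))
        weighted.append([score * a for a in f_rgb])
    return weighted
```

With this toy scoring, a location whose RGB and depth features point the same way keeps most of its magnitude, while an anti-correlated (occluded-looking) location is suppressed toward zero.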
The task of object pose estimation in computer vision heavily relies on both color (RGB) and depth (D) images to provide crucial appearance and geometric information, assisting algorithms in understanding occlusions and object geometry, thereby enhancing accuracy. However, the dependency on specialized sensors capable of capturing depth poses challenges in terms of cost and availability. Consequently, researchers are exploring methods to estimate object poses solely from RGB images. Nevertheless, this approach encounters difficulties in handling occlusions, discerning object geometry, and resolving ambiguities arising from similar color or texture patterns. This paper introduces a novel geometry-aware method for object pose estimation utilizing RGB images as input to determine the poses of multiple object instances. Our approach leverages both depth and color images during training but relies only on color images during inference. Departing from traditional depth sensors, our method computes predicted point clouds directly from estimated depth images derived from RGB inputs. A key innovation lies in the formulation of a multi-scale fusion module adept at seamlessly integrating features extracted from RGB images with those inferred from the predicted point clouds. This fusion process significantly fortifies the pose estimation pipeline by harnessing the strengths of both modalities, resulting in notably improved object poses. Extensive experimentation demonstrates that our approach markedly outperforms state-of-the-art RGB-based methods on the Occluded-LINEMOD and YCB-Video datasets. Moreover, our method achieves competitive results compared to RGB-D approaches that necessitate both RGB and depth data from physical sensors.
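The step of turning an estimated depth image into a predicted point cloud is standard pinhole back-projection, sketched below. This is the textbook formula, not code from the paper; the intrinsics `fx, fy, cx, cy` are assumed known from camera calibration.

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map (H x W, metres) into a 3-D point cloud
    with the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    Invalid (non-positive) depths are skipped."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points
```

In an RGB-only pipeline such as the one described, `depth` would come from a monocular depth estimation network rather than a physical sensor; the back-projection itself is identical.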
Hand-object configuration recovery is an important task in computer vision. The estimation of pose and shape for both hands and objects during interactive scenarios has various applications, particularly in augmented reality, virtual reality, and imitation-based robot learning. The problem is particularly challenging when the hand is interacting with objects in the environment, as this setting features both extreme occlusions and non-trivial shape deformations. While existing works treat the problem of estimating hand configurations (that is, pose and shape parameters) in isolation from the recovery of parameters related to the object acted upon, we stipulate that the two problems are related and can be solved more accurately concurrently. We introduce an approach that jointly learns the features of hand and object from color and depth (RGB-D) images. Our approach fuses appearance and geometric features in an adaptive manner, which allows us to accent or suppress features that are more meaningful for the downstream task of hand-object configuration recovery. We combine a deep Hough voting strategy that builds on our adaptive features with a graph convolutional network (GCN) to learn the interaction relationships between the hand and the held object. Experimental results demonstrate that our proposed approach consistently outperforms state-of-the-art methods on popular datasets.
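The core of a deep-Hough-voting strategy can be shown in a few lines. In the minimal sketch below, each seed point casts a vote (seed plus a predicted offset toward the target center) and the center estimate is the mean of the votes; in the actual method the offsets are regressed by a network and votes are clustered rather than simply averaged, so this is only an illustrative simplification.

```python
def vote_for_center(seeds, offsets):
    """Hough-voting aggregation: each seed point casts a vote
    (seed + predicted offset); the centre estimate is the vote mean.
    Offsets are given here; in a learned system a network predicts them."""
    votes = [tuple(s + o for s, o in zip(seed, off))
             for seed, off in zip(seeds, offsets)]
    n = len(votes)
    return tuple(sum(v[k] for v in votes) / n for k in range(len(votes[0])))
```

The appeal of voting under occlusion is that every visible point contributes evidence about the hidden center, so the estimate degrades gracefully as points are occluded.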
Grasp detection plays a pivotal role in robotic manipulation, allowing robots to interact with and manipulate objects in their surroundings. Traditionally, this has relied on three-dimensional (3D) point cloud data acquired from specialized depth cameras. However, the limited availability of such sensors in real-world scenarios poses a significant challenge. In many practical applications, robots operate in diverse environments where obtaining high-quality 3D point cloud data may be impractical or impossible. This paper introduces an innovative approach to grasp generation using color images, thereby eliminating the need for dedicated depth sensors. Our method capitalizes on advanced deep learning techniques for depth estimation directly from color images. Instead of relying on conventional depth sensors, our approach computes predicted point clouds based on estimated depth images derived directly from Red-Green-Blue (RGB) input data. To our knowledge, this is the first study to explore the use of predicted depth data for grasp detection, moving away from the traditional dependence on depth sensors. The novelty of this work is the development of a fusion module that seamlessly integrates features extracted from RGB images with those inferred from the predicted point clouds. Additionally, we adapt a voting mechanism from our previous work (VoteGrasp) to enhance robustness to occlusion and generate collision-free grasps. Experimental evaluations conducted on standard datasets validate the effectiveness of our approach, demonstrating its superior performance in generating grasp configurations compared to existing methods. With our proposed method, we achieved a significant 4% improvement in average precision compared to state-of-the-art grasp detection methods. Furthermore, our method demonstrates promising practical viability through real robot grasping experiments, achieving an impressive 84% success rate.
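One ingredient of collision-free grasp generation is rejecting candidate grasps that would intersect the scene. The sketch below is a deliberately simplified stand-in for such a check (not the paper's actual collision test): it discards any grasp center closer than a clearance margin to a scene point; a real system would test the full gripper geometry.

```python
def filter_colliding_grasps(grasps, cloud, clearance=0.02):
    """Keep only grasp centres at least `clearance` metres from every
    scene point (a toy proxy for a full gripper-geometry collision check)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [g for g in grasps
            if all(dist2(g, p) >= clearance ** 2 for p in cloud)]
```

With predicted (rather than sensed) point clouds, the clearance margin would typically be inflated to absorb depth estimation error; the 2 cm default here is an arbitrary illustrative value.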
Object recognition and pose estimation are critical components in autonomous robot manipulation systems, playing a crucial role in enabling robots to interact effectively with the environment. During actual execution, the robot must recognize the object in the current scene, estimate its pose, and then select a feasible grasp pose from the pre-defined grasp configurations. While most existing methods primarily focus on pose estimation, they often neglect the graspability and reachability aspects. This oversight can lead to inefficiencies and failures during execution. In this study, we introduce an innovative graspability-aware object pose estimation framework. Our proposed approach not only estimates the poses of multiple objects in cluttered scenes but also identifies graspable areas. This enables the system to concentrate its efforts on specific points or regions of an object that are suitable for grasping. It leverages both depth and color images to extract geometric and appearance features. To effectively combine these diverse features, we have developed an adaptive fusion module. Additionally, the fused features are further enhanced through a graspability-aware feature enhancement module. The key innovation of our method lies in improving the discriminability and robustness of the features used for object pose estimation. We have achieved state-of-the-art results on public datasets when compared to several baseline methods. In real robot experiments conducted on a Franka Emika robot arm equipped with an Intel RealSense camera and a two-finger gripper, we consistently achieved high success rates, even in cluttered scenes.
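The execution step described above (selecting a pre-defined grasp once the object pose is known) amounts to mapping grasps from the object frame into the camera or world frame with the estimated rigid transform. The sketch below shows only that geometric lookup step, with plain rotation matrix `R` and translation `t` as assumed inputs; feasibility filtering (graspability, reachability) would follow it.

```python
def transform_grasps(R, t, grasps_obj_frame):
    """Map pre-defined grasp positions from the object frame into the
    world frame using the estimated object pose: p_world = R @ p_obj + t."""
    def matvec(M, v):
        return tuple(sum(M[i][j] * v[j] for j in range(3)) for i in range(3))
    return [tuple(m + ti for m, ti in zip(matvec(R, g), t))
            for g in grasps_obj_frame]
```

A graspability-aware pipeline would then rank these transformed candidates by the predicted graspable regions instead of treating them all as equally valid.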
Eight known compounds and one new compound, a eudesmane-type sesquiterpene (1), were isolated from the leaves of Artemisia vulgaris. Their structures were elucidated by HR-ESI-MS and NMR analyses.
Grasp generation is a crucial task in robotics, especially in unstructured environments, where robots must identify suitable grasp locations on objects and determine the grasp configuration. Recent advances in deep learning have led to the development of end-to-end models for 6-DOF grasp generation that can learn to directly map from input point clouds to grasp configurations without intermediate processing steps. However, these models often treat all points in a scene equally, leading to suboptimal results in cluttered contexts, where the informativeness of points varies widely due to occlusion. While attention mechanisms have shown promise in improving the accuracy and efficiency of various tasks in occluded scenes, their effectiveness in improving grasp generation performance is still an active area of research. Inspired by this potential, we explore the power of attention mechanisms in improving grasp generation from 3D point clouds. Building upon our previous work, VoteGrasp (2022), we integrate a wide range of attention modules and compare their effects and characteristics to identify the most successful combination for enhancing grasp generation performance. We also extend VoteGrasp by adding a semantic object classification loss to the loss function, making our method more flexible than existing approaches. Based on the detailed experiments and analysis, our research provides valuable insights into the use of attention mechanisms for 3D point cloud grasp generation, highlighting their potential to improve the accuracy and efficiency of robotic systems.
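The attention modules compared in work like this are generally variants of the basic scaled dot-product attention, sketched below over per-point feature vectors. This is the standard textbook operation, not any specific module from the study, and the small unprojected feature vectors are illustrative assumptions.

```python
import math

def scaled_dot_attention(queries, keys, values):
    """Scaled dot-product attention over point features:
    out_q = softmax(q . K / sqrt(d)) @ V.  Points whose keys align with
    the query dominate the output, letting the model weight informative
    points more heavily than occluded or background ones."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        m = max(scores)                       # subtract max for stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        w = [e / z for e in exps]
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out
```

Self-attention (queries, keys, and values all derived from the same point set) is the typical configuration in point cloud grasp networks; cross-attention variants swap in features from another modality or scale.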
•Infection status of major helminthic diseases in Vietnam is provided.
•Helminth control programs of the past are reviewed.
•The main achievements are evaluated.
•Remaining and emerging problems are identified.
•Future work is outlined.
In Vietnam, helminthiases remain a major threat to public health and contribute to the maintenance of poverty in highly endemic regions. Through increased awareness of the damaging effects caused by helminthiases, the Vietnamese government has implemented many national programs over the past 30 years for the prevention and control of the most important helminthiases, such as lymphatic filariasis, soil-transmitted helminths, food-borne zoonotic helminths, and others. Various control strategies have been applied to reduce or eliminate these worms, e.g. mass drug administration, economic development, control of vectors or intermediate hosts, public health interventions through education, proper composting procedures for excreta potentially containing helminth eggs, and the expansion of food supply chains and improved technologies for the production and inspection of food products. These control measures have resulted in a significant reduction in the distribution and transmission of helminth infections and have improved the overall living conditions and health outcomes of Vietnamese citizens. However, several helminth diseases persist in some endemic areas, especially where poverty is widespread and local traditions include the consumption of raw foods, especially fish and meats. This manuscript provides an overview of the helminth infection prevention and control programs conducted in Vietnam, their achieved results, lessons learned, and future directions.