Accurate registration is critical for robotic mapping and simultaneous localization and mapping (SLAM). Sparse or non-uniform point clouds can be very challenging to register, even in ideal environments. Prior work by Holz et al. developed a mesh-based extension to the popular generalized iterative closest point (GICP) algorithm that can accurately register sparse clouds where unmodified GICP would fail. This paper builds on that work by expanding the comparison between the two algorithms across multiple data sets at a greater range of distances. The results confirm that Mesh-GICP is more accurate, more precise, and faster; they also show that it can successfully register scans 4–17 times farther apart than GICP. In two experiments, this paper uses Mesh-GICP to compare three registration methods (pairwise, metascan, and keyscan) in two situations: one in a visual odometry (VO) style, the other in a mapping style. The results show that the keyscan method is the most accurate of the three as long as there is sufficient overlap between the target and source clouds; where overlap is insufficient, pairwise matching is more accurate.
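For readers unfamiliar with the baseline these comparisons build on, the core iteration can be sketched in a few lines of NumPy. This is a generic point-to-point ICP, not Holz et al.'s Mesh-GICP or even GICP (which use local covariances/meshes instead of raw point distances); the function names are illustrative:

```python
import numpy as np

def nearest_neighbors(src, tgt):
    # Brute-force correspondence: for each source point, index of the closest target point.
    d = np.linalg.norm(src[:, None, :] - tgt[None, :, :], axis=2)
    return np.argmin(d, axis=1)

def best_rigid_transform(src, tgt):
    # Closed-form SVD (Kabsch) solution for R, t minimizing ||R @ src + t - tgt||.
    cs, ct = src.mean(0), tgt.mean(0)
    H = (src - cs).T @ (tgt - ct)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:       # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, ct - R @ cs

def icp(src, tgt, iters=20):
    # Alternate between correspondence search and closed-form alignment.
    R_tot, t_tot = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(iters):
        idx = nearest_neighbors(cur, tgt)
        R, t = best_rigid_transform(cur, tgt[idx])
        cur = cur @ R.T + t
        R_tot, t_tot = R @ R_tot, R @ t_tot + t
    return R_tot, t_tot
```

Mesh-GICP replaces the point-to-point residual above with distances to locally reconstructed surface elements, which is what lets it survive the sparse, non-uniform clouds discussed in the abstract.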
Detection and tracking of moving objects (DATMO) in an urban environment using Light Detection and Ranging (LiDAR) is a major challenge for autonomous vehicles due to sparse point clouds, multiple moving directions, diverse traffic participants, and computational load. To address this complexity, this study presents a novel model-free approach for DATMO using 2D LiDAR implemented on autonomous vehicles. The approach classifies moving points in the point cloud using a predicted Static Obstacle Map (SOM), generated via interaction between a Geometric Model-Free Approach (GMFA) and the SOM, and estimates the state of each moving object via GMFA. The motion of each point, represented by the state of the moving objects, updates the SOM, and the interaction between GMFA and the SOM estimates the correspondence between consecutive point clouds in real time. The proposed approach was evaluated using an RT-Range system and a labeled dataset. The accuracy of the estimated yaw angle and velocity of a moving vehicle was quantitatively evaluated with the RT-Range, and performance is significantly improved compared with geometric model-based tracking (MBT). The estimation of the yaw angle, which strongly affects detection of the cut-in/cut-out intention of the target vehicle, is shown to be markedly improved. On the labeled dataset, false positives and false negatives are suppressed more effectively than with MBT.
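The static-map classification step described above can be illustrated with a toy 2D occupancy grid: points that land in cells the predicted static map considers free are flagged as moving. This is a minimal sketch of the idea only, not the authors' GMFA/SOM implementation; `build_som`, `classify_moving`, and the grid parameters are illustrative:

```python
import numpy as np

def build_som(points, res=0.5, size=40.0):
    # Accumulate 2D LiDAR hits into a coarse boolean occupancy grid
    # (standing in for the predicted Static Obstacle Map).
    n = int(size / res)
    grid = np.zeros((n, n), dtype=bool)
    ij = np.floor((points + size / 2) / res).astype(int)
    ok = (ij >= 0).all(1) & (ij < n).all(1)
    grid[ij[ok, 0], ij[ok, 1]] = True
    return grid

def classify_moving(points, som, res=0.5, size=40.0):
    # A point is flagged as moving when its cell is free in the static map.
    n = som.shape[0]
    ij = np.floor((points + size / 2) / res).astype(int)
    ij = np.clip(ij, 0, n - 1)
    return ~som[ij[:, 0], ij[:, 1]]
```

In the paper this loop is closed: the tracked object states in turn update the map, so static structure and moving points refine each other frame by frame.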
Multimodal 3D object detection has gained significant attention due to the fusion of light detection and ranging (LiDAR) and RGB data. Existing 3D detection models in autonomous driving are typically trained on dense point cloud data from high-specification LiDAR sensors. However, budgetary constraints often lead to the adoption of low point-per-second (PPS) LiDAR sensors in real-world scenarios, and a low PPS specification yields a sparse point cloud. In this case, existing models trained on dense, high-PPS data cannot achieve optimal performance on a sparse point cloud. To address this problem, we propose DenseSphere for robust multimodal 3D object detection under a sparse point cloud. Reflecting the data acquisition process of LiDAR sensors, DenseSphere includes a spherical coordinate-based point upsampler: points are interpolated in the horizontal or vertical direction using bilateral interpolation, and the interpolated points are refined using dilated pyramid blocks with various receptive fields. For efficient fusion with the generated dense point cloud, we use a graph-based detector and hierarchical layers. We then demonstrate the performance of DenseSphere by comparing it with other multimodal 3D object detection models through experiments. The visual results and source code with pretrained models are available at https://github.com/Jung-jongwon/DenseSphere.
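The key intuition behind spherical-coordinate upsampling is that a spinning LiDAR samples on a (range, azimuth, elevation) grid, so new points are most naturally interpolated between adjacent beams in that space rather than in Cartesian XYZ. A minimal sketch under that assumption (plain midpoint interpolation between vertical rings; DenseSphere's actual upsampler is learned and refined by dilated pyramid blocks):

```python
import numpy as np

def to_spherical(xyz):
    # Cartesian -> (range, azimuth, elevation).
    r = np.linalg.norm(xyz, axis=1)
    az = np.arctan2(xyz[:, 1], xyz[:, 0])
    el = np.arcsin(np.clip(xyz[:, 2] / np.maximum(r, 1e-9), -1.0, 1.0))
    return np.stack([r, az, el], axis=1)

def to_cartesian(rae):
    r, az, el = rae.T
    return np.stack([r * np.cos(el) * np.cos(az),
                     r * np.cos(el) * np.sin(az),
                     r * np.sin(el)], axis=1)

def upsample_vertical(beams):
    # beams: list of (N, 3) xyz arrays, one per LiDAR ring, sharing azimuth sampling.
    # Insert one interpolated ring between each adjacent pair by averaging
    # range and elevation per azimuth column (linear interp in spherical coords).
    out = [beams[0]]
    for lo, hi in zip(beams[:-1], beams[1:]):
        s_lo, s_hi = to_spherical(lo), to_spherical(hi)
        mid = s_lo.copy()
        mid[:, 0] = 0.5 * (s_lo[:, 0] + s_hi[:, 0])   # range
        mid[:, 2] = 0.5 * (s_lo[:, 2] + s_hi[:, 2])   # elevation
        out += [to_cartesian(mid), hi]
    return np.concatenate(out, axis=0)
```

Interpolating ranges rather than XYZ keeps the synthetic points on plausible surfaces between beams, which is why the sparse-to-dense step is done in sensor coordinates.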
In this paper, we propose a Geometric Model-Free Approach with a Particle Filter (GMFA-PF) that uses automotive LiDAR for real-time tracking of moving objects in an urban driving environment. GMFA-PF proved to be lightweight, completing processing within the sensing period of the LiDAR on a single CPU. The proposed GMFA-PF tracks and estimates moving objects without any assumptions on the geometry of the target, enabling efficient tracking of multiple object classes with robustness to sparse point clouds. Points on moving objects are classified via a predicted Static Obstacle Map (STOM). A likelihood field is generated from the classified points and used in particle filtering to estimate each moving object's pose, shape, and speed. Quantitative and qualitative comparisons with Geometric Model-Based Tracking (MBT), a Deep Neural Network (DNN), and GMFA are performed for GMFA-PF using urban driving and scenario driving data gathered on an autonomous vehicle fitted with close-to-market sensors. The proposed approach shows robust tracking and accurate estimation in both sparse and dense point clouds; GMFA-PF achieves improved tracking performance in dense traffic and reduces yaw estimation delay compared with the other methods. Autonomous vehicles with GMFA-PF demonstrated autonomous driving on urban roads.
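The likelihood-field weighting at the heart of such a particle filter can be sketched in a few lines: each particle (here reduced to a candidate 2D object center; the paper's state also includes pose, shape, and speed) is scored by how likely the observed points are under a Gaussian sensor model, and the state estimate is the weighted mean. A toy sketch under those simplifying assumptions, not the GMFA-PF implementation:

```python
import numpy as np

def likelihood_field_weights(particles, points, sigma=0.5):
    # particles: (P, 2) candidate centers; points: (N, 2) classified moving points.
    # Log-likelihood of each particle = sum of Gaussian log-densities of
    # point-to-center distances (the "likelihood field" score).
    d = np.linalg.norm(points[None, :, :] - particles[:, None, :], axis=2)
    logw = -0.5 * (d ** 2).sum(axis=1) / sigma ** 2
    logw -= logw.max()                      # subtract max for numerical stability
    w = np.exp(logw)
    return w / w.sum()

def estimate_center(particles, points):
    # Weighted-mean state estimate over the particle set.
    w = likelihood_field_weights(particles, points)
    return (particles * w[:, None]).sum(axis=0)
```

A full filter would add motion-model prediction and resampling between frames; the scoring step above is where the classified moving points enter.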
3D scene completion (SC) has made progress over the last three years. From the perspective of mobile robot applications, SC should support downstream tasks (i.e., mapping and perception) instead of only predicting completed scenes. However, as low-cost few-beam LiDAR is widely used on mobile robots, the gap between SC and downstream tasks is large. To generate high-quality completion results, the bottleneck lies in the triple sparsity of the input, the ground truth (GT) occupancy, and the GT foreground. To deal with this triple sparsity, we present an extreme sparse scene completion network (ESC-Net). First, input sparsity hides most of the spatial information of the scene; a feature completion (FC) decoder is designed to mine spatial features via feature-level completion. Second, GT occupancy sparsity hinders representation learning of the real scene with continuous surfaces; a multi-view multi-task attention (MMA) loss is presented to recover high-quality object boundaries by correcting occupancy and semantic labels of regions in 3D and bird's-eye-view (BEV) spaces. Third, GT foreground sparsity, i.e., the imbalance between foreground and background GT labels, causes inaccurate local 3D object completion; a combination network (ESC-Net-D) is presented to recover the 3D structural details of both foreground and background. Experiments on the KITTI and SemanticPOSS datasets show that ESC-Net outperforms current methods not only on the completion task but also on downstream tasks (i.e., 3D registration and 3D object detection). Hence, we believe ESC-Net will benefit the mobile robot community. Source code will be released soon.
Capturing the motion of humans without markers in a moving scene is a challenging problem. Typical methods, such as those based on fixed cameras or wearable sensors, are limited in measurement range and number of targets. We therefore propose a markerless motion capture technique to estimate the 3D pose of humans using a multi-UAV system. Exploiting the large observation baselines between the UAVs in the system, we present an algorithm that simultaneously estimates the poses of the airborne cameras and builds a sparse point cloud of the background in their overlapping views. A 3σ rule based on reprojection error is employed to eliminate incorrect conjugate keypoint pairs, enhancing the precision of the airborne camera pose estimation. To verify the accuracy of the developed technique, an indoor test was conducted against a commercial capture system; the results show that the average measurement error of the 3D human joints is less than 30 mm. Furthermore, we demonstrate the feasibility of our approach in outdoor scenes, including multi-person and moving scenarios.
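The 3σ outlier rejection mentioned above is a standard statistical gate: compute each correspondence's reprojection error, then discard pairs whose error exceeds the mean plus three standard deviations. A minimal sketch of that rule (a generic pinhole reprojection; the function names are illustrative, not from the paper):

```python
import numpy as np

def reprojection_errors(P, X, uv):
    # P: 3x4 camera projection matrix, X: (N, 3) triangulated points,
    # uv: (N, 2) observed keypoints. Returns per-point pixel error.
    Xh = np.hstack([X, np.ones((len(X), 1))])
    proj = Xh @ P.T
    proj = proj[:, :2] / proj[:, 2:3]       # perspective divide
    return np.linalg.norm(proj - uv, axis=1)

def three_sigma_filter(errors):
    # Keep correspondences within mean + 3 * std of the error distribution.
    mu, sd = errors.mean(), errors.std()
    return errors <= mu + 3 * sd
```

Dropping these gross mismatches before bundle adjustment keeps a few bad conjugate pairs from corrupting the camera pose solution.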
This article introduces a novel approach to hand gesture recognition using sparse time-series point cloud data obtained from a short-range mmWave radar sensor. Our proposed method not only avoids the need for complex data format conversions but also operates efficiently on sparse time-series point cloud data, yielding significant advantages in processing time and storage consumption. This study focuses on accurately classifying point cloud sequences representing hand gestures without complex sequence modeling. The proposed methods include a modified PointNet configuration suited to gesture recognition and optimized point cloud preprocessing. Sequential features of the input data are supplied to the proposed model by integrating frame-order information into the vector representation of each point, while point augmentation and sampling normalize point clouds whose sizes vary with the type and position of the hand gesture. The performance of a point cloud-based recognition model operating on such sparse data can be improved by preserving a fixed input shape. Performance experiments demonstrate the superiority of the proposed methods in classification performance over existing recurrent neural network (RNN) approaches and PointNet, and the results provide insights for selecting optimal parameters in specific application environments. In conclusion, this study presents a robust system for hand gesture recognition, offering accurate classification of point cloud sequences without complex data format conversion; the simplicity of data processing and reduced computational cost are notable advantages, contributing to the development of cost-effective and efficient hand gesture recognition systems.
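The preprocessing described above has two parts: tag each point with its (normalized) frame index so temporal order survives pooling, then resample to a fixed point count so the network input shape is constant. A minimal sketch under those assumptions (`normalize_sequence` and the parameter choices are illustrative, not the paper's exact pipeline):

```python
import numpy as np

def normalize_sequence(frames, n_points=256, rng=None):
    # frames: list of (Ni, 3) point arrays from consecutive radar frames.
    # Append the normalized frame index as a 4th feature, pool all frames,
    # then up/down-sample to a fixed n_points for a constant input shape.
    if rng is None:
        rng = np.random.default_rng(0)
    T = len(frames)
    feats = [np.hstack([f, np.full((len(f), 1), t / max(T - 1, 1))])
             for t, f in enumerate(frames)]
    cloud = np.concatenate(feats, axis=0)
    # Sample with replacement only when the gesture produced too few points.
    idx = rng.choice(len(cloud), size=n_points, replace=len(cloud) < n_points)
    return cloud[idx]
```

Because the order information lives in the per-point feature rather than in the sequence structure, a permutation-invariant model such as PointNet can consume the pooled cloud directly, with no RNN-style sequence modeling.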
TP242; TD67; Environmental perception and underground navigation are important research directions in coal mine intelligence and informatization, and are critical to achieving unmanned, fully automated, intelligent coal mining operations. With the increasingly close integration and maturation of fifth-generation mobile network (5G) and millimeter-wave imaging radar hardware and software, millimeter-wave sensing and communication are being applied in more fields. 5G communication, with its high data rate, low latency, and high bandwidth, is bringing enormous change to existing radio communication technology. At the same time, compared with LiDAR, millimeter-wave radar is low-cost and interference-resistant, and its 3D point clouds (3 dimension point cloud, 3D) are one to two orders of magnitude sparser than laser point clouds, so it is drawing increasing attention for 3D imaging and Simultaneous Localization and Mapping (SLAM) in underground environments. V2X (Vehicle to Everything) technology based on 5G communication, combined with millimeter-wave SLAM navigation, offers a new solution for the autonomous navigation of coal mine robots. This paper systematically reviews: the current problems facing autonomous navigation of coal mine robots and the realization of intelligent coal mining; recent advances in millimeter-wave imaging at home and abroad; communication and signal acquisition methods for millimeter-wave radar module groups in underground environments; sparse feature extraction problems encountered in high-resolution imaging; processing strategies and algorithm evaluation for sparse point clouds; the research status and development directions of deep learning for millimeter-wave sparse point cloud processing; and the research status of SLAM algorithms in different environments together with SLAM navigation algorithms. The difficulties and challenges of applying SLAM mapping, path planning, and obstacle avoidance in underground coal mine environments are summarized, and prospects are presented for new applications in which millimeter-wave communication and navigation complement each other in complex future coal mine environments.
In vehicle pose estimation based on roadside LiDAR for cooperative perception, the measurement distance, angle, and laser resolution directly affect the quality of the target point cloud. For incomplete and sparse point clouds, current methods are either less accurate, because correspondences solved from local descriptors degrade, or insufficiently robust, because the number of effective boundary points is reduced. To address these weaknesses, this paper proposes a registration algorithm, Environment Constraint Principal Component-Iterative Closest Point (ECPC-ICP), which integrates road information constraints. The road normal feature is extracted, and the principal component of the vehicle point cloud matrix under the road normal constraint is computed as the initial pose estimate; an accurate 6D pose is then obtained through point-to-point ICP registration. Based on the measurement characteristics of roadside LiDARs, this paper also defines a point cloud sparseness measure and tests existing algorithms on point cloud data of varying sparseness. Simulated experiments show that the positioning MAE of ECPC-ICP is about 0.5% of the vehicle scale, the orientation MAE is about 0.26°, and the average registration success rate is 95.5%, demonstrating improved accuracy and robustness over current methods. In the real test environment, the positioning MAE is about 2.6% of the vehicle scale, and the average time cost is 53.19 ms, proving the accuracy and effectiveness of ECPC-ICP in practical applications.
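The initialization idea (principal component under a road normal constraint) can be sketched directly: project the vehicle points onto the road plane, take the dominant PCA direction of the projected points as the heading axis, and read off the initial yaw. A minimal sketch of that step only, with an illustrative function name; the full ECPC-ICP pipeline then refines this with point-to-point ICP:

```python
import numpy as np

def initial_yaw(points, road_normal):
    # Constrain the principal-component analysis to the road plane: remove
    # the component of each point along the road normal, centre, and take
    # the first right-singular vector as the vehicle's heading axis.
    n = road_normal / np.linalg.norm(road_normal)
    flat = points - np.outer(points @ n, n)        # project onto road plane
    flat -= flat.mean(axis=0)
    _, _, Vt = np.linalg.svd(flat, full_matrices=False)
    axis = Vt[0]                                   # dominant in-plane direction
    return np.arctan2(axis[1], axis[0]) % np.pi    # heading, modulo 180 degrees
```

The 180° ambiguity (a vehicle's long axis does not distinguish front from back) is inherent to PCA and is one reason a subsequent ICP refinement stage is still needed.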
In this paper, we propose a novel approach that enables simultaneous localization and mapping (SLAM) and object recognition using visual sensor data in open environments, and that is capable of working on sparse point clouds. In the proposed algorithm, ORB-SLAM uses the current and previous monocular video frames to determine the observer's position and a cloud of points representing objects in the environment, while a deep neural network uses the current frame to detect and recognize objects (OR). Next, the sparse point cloud returned by the SLAM algorithm is compared with the areas recognized by the OR network: because each point in the 3D map has a counterpart in the current frame, the points matching the areas recognized by the OR algorithm can be filtered out. A clustering algorithm then finds regions where these points are densely distributed, determining the spatial positions of the detected objects, and a principal component analysis (PCA)-based heuristic estimates the bounding boxes of the detected objects. The image processing pipeline, which uses the sparse point clouds generated by SLAM to locate objects recognized by the deep neural network, and the aforementioned PCA heuristic are the main novelties of our solution. In contrast to state-of-the-art approaches, our algorithm does not require additional computations such as generating dense point clouds for object positioning, which greatly simplifies the task. We have evaluated our approach on a large benchmark dataset using various state-of-the-art OR architectures (YOLO, MobileNet, RetinaNet) and clustering algorithms (DBSCAN and OPTICS), obtaining promising results. Both our source code and evaluation data sets are available for download, so our results can be easily reproduced.
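The final step (a PCA-based bounding box for a clustered group of points) is the standard oriented-bounding-box construction: rotate the cluster into its principal axes, take axis-aligned extents there, and map the box back. A minimal generic sketch, not necessarily the paper's exact heuristic; `pca_bounding_box` is an illustrative name:

```python
import numpy as np

def pca_bounding_box(cluster):
    # cluster: (N, 3) points belonging to one detected object.
    # Principal axes via SVD of the centred points, extents measured in
    # that frame, and the box centre mapped back to world coordinates.
    c = cluster.mean(axis=0)
    _, _, Vt = np.linalg.svd(cluster - c, full_matrices=False)
    local = (cluster - c) @ Vt.T            # coordinates in the principal frame
    lo, hi = local.min(axis=0), local.max(axis=0)
    center = c + 0.5 * (lo + hi) @ Vt
    return center, Vt, hi - lo              # centre, axis rows, extents
```

Because only the cluster's own sparse points are used, no dense reconstruction is needed to obtain a usable 3D box, which is the simplification the abstract highlights.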