• Despite scanning differences, canopy profiles remain regular even in dense areas.
• Using canopy profiles as primitives, transforming feature norms for the forest.
• A new, high-resolution, 256-dimensional feature descriptor for forest scenes.
• A fresh approach to registering UAV- and terrestrial-based forest point clouds.
In this study, we present a cross-platform point cloud registration (PCR) framework for the automated alignment of UAV-based and terrestrial forest LiDAR point clouds. The framework leverages canopy profile skyline (CPS) descriptors and feature orientation information to support registration. Given the irregular, natural distribution of point clouds captured in crown environments, conventional registration techniques that operate on geometric primitives such as points, lines, and planes are prone to failure. We first analyze the high resolution and robustness of a skyline formed from successive tree canopy profiles as a feature element in forest data. We then highlight an intriguing property: the tree CPSs obtained from the same location exhibit remarkably similar shapes and trends whether they come from UAV-based or terrestrial scans. We therefore propose a novel feature descriptor spanning M dimensions across 8 directions to establish reliable feature correspondences, and then apply directional constraints together with one-shot-estimation RANSAC to achieve automated registration of UAV-based and terrestrial forest point clouds. Finally, the standard ICP algorithm, combined with a constraint strategy based on the geometric morphology of tree trunks, is used to refine the registration results. We conducted tests on five datasets with heterogeneous tree species and structures, and the results demonstrate that the proposed approach achieves state-of-the-art performance.
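The core idea of a canopy-profile-skyline descriptor can be sketched as follows. The abstract gives no implementation details, so everything below is an assumption: the functions `cps_descriptor` and `cps_similarity`, the radial-bin layout, and the correlation-based matching score are hypothetical illustrations of a descriptor with several radial bins per each of 8 azimuth directions.

```python
import numpy as np

def cps_descriptor(points, center, n_dirs=8, n_bins=16, max_r=30.0):
    """Hypothetical canopy-profile-skyline descriptor: for each of
    n_dirs azimuth sectors around the (x, y) location `center`,
    record the maximum canopy height in n_bins radial bins."""
    xy = points[:, :2] - center
    r = np.hypot(xy[:, 0], xy[:, 1])
    theta = np.mod(np.arctan2(xy[:, 1], xy[:, 0]), 2 * np.pi)
    dir_idx = np.minimum((theta / (2 * np.pi) * n_dirs).astype(int), n_dirs - 1)
    bin_idx = np.minimum((r / max_r * n_bins).astype(int), n_bins - 1)
    keep = r < max_r
    d = np.zeros((n_dirs, n_bins))
    # per-cell maximum height (the "skyline" of the canopy profile)
    np.maximum.at(d, (dir_idx[keep], bin_idx[keep]), points[keep, 2])
    return d

def cps_similarity(d1, d2):
    """Normalized correlation between two skyline descriptors."""
    a, b = d1.ravel(), d2.ravel()
    a = (a - a.mean()) / (a.std() + 1e-9)
    b = (b - b.mean()) / (b.std() + 1e-9)
    return float(np.mean(a * b))
```

Because both a UAV scan and a terrestrial scan see the same canopy tops from a given location, descriptors computed at corresponding locations should correlate strongly, which is what makes them usable for cross-platform correspondence search.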
Point cloud semantic segmentation in urban scenes plays a vital role in intelligent city modeling, autonomous driving, and urban planning. Deep learning-based point cloud semantic segmentation methods have achieved significant improvements. However, accurate semantic segmentation in large scenes remains challenging due to complex elements, the variety of scene classes, occlusions, and noise. In addition, most methods need to split the original point cloud into multiple blocks before processing and cannot directly handle large-scale point clouds. We propose a novel context-aware network (CAN) that can directly process large-scale point clouds. In the proposed network, a local feature aggregation module (LFAM) preserves rich geometric details in the raw point cloud and reduces information loss during feature extraction. In combination with a global context aggregation module (GCAM), the network captures long-range dependencies to enhance its feature representation and suppress noise. Finally, a context-aware upsampling module (CAUM) is embedded into the network to capture global perception from a broad perspective. The ensemble of low-level and high-level features facilitates effective and efficient refinement of 3-D point cloud features. Comprehensive experiments were carried out on three large-scale point cloud datasets covering both outdoor and indoor environments. The results show that the proposed method outperforms state-of-the-art semantic segmentation networks, with overall accuracies (OA) of 96.01%, 95.0%, and 88.55% on Tongji-3D, Semantic3D, and Stanford large-scale 3-D indoor spaces (S3DIS), respectively.
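The general pattern behind a local feature aggregation module can be illustrated in a few lines. This is a generic KNN-grouping-and-pooling sketch in the spirit of such modules, not the paper's actual LFAM: the function name, the brute-force neighbor search, and the max-pooling choice are all assumptions for illustration.

```python
import numpy as np

def local_feature_aggregation(xyz, feats, k=8):
    """Generic sketch of KNN-based local aggregation: each point
    pools the features of its k nearest neighbours together with
    their relative offsets, preserving local geometric detail."""
    # brute-force pairwise squared distances (fine for a sketch, O(n^2))
    d2 = ((xyz[:, None, :] - xyz[None, :, :]) ** 2).sum(-1)
    idx = np.argsort(d2, axis=1)[:, :k]                   # (n, k) neighbour ids
    rel = xyz[idx] - xyz[:, None, :]                      # (n, k, 3) offsets
    grouped = np.concatenate([rel, feats[idx]], axis=-1)  # (n, k, 3 + c)
    return grouped.max(axis=1)                            # max-pool over neighbours
```

In a real network, the grouped tensor would pass through shared MLPs before pooling, and the neighbor search would use a spatial index rather than a dense distance matrix.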
In recent years, point clouds have been widely used in powerline inspection, smart cities, autonomous driving, and other fields. Deep learning-based point cloud processing methods have attracted growing attention thanks to developments in laser scanning technology and machine learning. However, recent methods largely ignore global contextual relationships and do not fully exploit the complementarity between local features and high-level geometric information. To address this problem, we propose a novel deep neural network, the Dense connection-based Kernel Point Network (DenseKPNet), which greatly expands the receptive field of kernel point convolution to effectively extract rich semantic context and valuable geometric features from local regions. Specifically, we first design a multiscale kernel point convolution module to extract initial geometric features from coarse to fine. We then design a dense connection module that efficiently learns more expressive local geometric features while capturing rich contextual information. In addition, we propose a kernel point convolution attention module (KPCAM) that captures global interdependencies between points and strengthens the discriminativeness of effective features. We evaluate our method on public indoor and outdoor datasets; the qualitative and quantitative experimental results show the effectiveness of DenseKPNet, whose mIoU reaches 68.9% on S3DIS and 77.9% on Semantic3D.
Mobile mapping is applied widely in society, for example, in asset management, fleet management, construction planning, road safety, and maintenance optimization. Yet, further advances in these technologies are called for. Advances can be radical, such as changes to the prevailing paradigms in mobile mapping, or incremental, such as improvements to state-of-the-art mobile mapping methods. With current multi-sensor systems in mobile mapping, laser-scanned data are often registered into point clouds with the aid of global navigation satellite system (GNSS) positioning or simultaneous localization and mapping (SLAM) techniques, and then labeled and colored with the aid of machine learning methods and digital camera data. These multi-sensor platforms are beginning to undergo further advancements via the addition of multi-spectral and other sensors and via the development of machine learning techniques for processing this multi-modal data. Embedded systems and minimalistic system designs are also attracting attention, from both academic and commercial perspectives. This book contains the accepted publications of the Special Issue 'Advances in Mobile Mapping Technologies' of the Remote Sensing journal. It consists of works introducing a new mobile mapping dataset ('Paris CARLA 3D'), system calibration studies, SLAM topics, and multiple deep learning works for asset detection. We, the Guest Editors, Ville Lehtola from the University of Twente, Netherlands, Andreas Nüchter from the University of Würzburg, Germany, and François Goulette from Mines Paris - PSL University, France, wish to thank all the authors who contributed to this collection.
Existing airborne laser scanning (ALS) point cloud semantic segmentation approaches are limited by their overreliance on plentiful point-wise annotations, which further confines their ability to generalize to new scenes. To overcome these problems, a novel three-stage multiprototype relational network (Thr-MPRNet) is proposed for few-shot ALS point cloud semantic segmentation that transfers knowledge from well-annotated photogrammetric point clouds. In Thr-MPRNet, a 3-D few-shot learning (FSL) structure containing a feature learner (F-L) and a relation learner (R-L) learns meta-knowledge from multiple point-wise tasks, and a multiprototype generator represents the semantic distribution of point clouds and can dynamically adapt to large-scale scenarios. To transfer knowledge across domains, Thr-MPRNet is trained in a unified framework with three task-based learning stages: prior knowledge is first meta-learned from the source photogrammetric point clouds and then transferred to novel target datasets with a few labeled ALS point clouds. Finally, Thr-MPRNet can be flexibly generalized to unlabeled target ALS point clouds without retraining from scratch. In the experiments, the SensatUrban dataset serves as the source photogrammetric point clouds, and two ALS point cloud datasets (ISPRS and DALES) are used to evaluate the few-shot semantic segmentation ability of the proposed method. The experiments demonstrate that Thr-MPRNet obtains promising generalization performance on different target datasets; more importantly, it outperforms supervised networks trained with 10% labeled samples. In summary, the proposed method achieves state-of-the-art cross-domain semantic segmentation performance and greatly alleviates the dependence on ALS point cloud annotations.
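The idea of representing each class by multiple prototypes, rather than one mean embedding, can be sketched with a small k-means-style generator and nearest-prototype classification. This is a generic illustration of multi-prototype few-shot classification under assumed details (function names, the Lloyd-iteration refinement, and the fixed prototype count `m` are not from the paper).

```python
import numpy as np

def multi_prototypes(support_feats, support_labels, n_classes, m=2, seed=0):
    """Sketch of a multi-prototype generator: represent each class by
    m prototypes (refined with a few k-means steps) instead of a
    single mean vector, to cover multi-modal class distributions."""
    rng = np.random.RandomState(seed)
    protos, proto_labels = [], []
    for c in range(n_classes):
        f = support_feats[support_labels == c]
        centers = f[rng.choice(len(f), m, replace=False)]
        for _ in range(10):  # a few Lloyd iterations
            assign = np.argmin(((f[:, None] - centers[None]) ** 2).sum(-1), axis=1)
            for j in range(m):
                if np.any(assign == j):
                    centers[j] = f[assign == j].mean(0)
        protos.append(centers)
        proto_labels += [c] * m
    return np.vstack(protos), np.array(proto_labels)

def classify(query_feats, protos, proto_labels):
    """Label each query point by its nearest prototype."""
    d2 = ((query_feats[:, None] - protos[None]) ** 2).sum(-1)
    return proto_labels[np.argmin(d2, axis=1)]
```

In the few-shot setting described above, `support_feats` would be embeddings of the few labeled ALS points, and `classify` would label the remaining unlabeled points without retraining.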
Dynamic point clouds are a potential new frontier in visual communication systems. A few articles have addressed the compression of point clouds, but very few references exist on exploring temporal redundancies. This paper presents a novel motion-compensated approach to encoding dynamic voxelized point clouds at low bit rates. A simple coder breaks the voxelized point cloud at each frame into blocks of voxels. Each block is either encoded in intra-frame mode or is replaced by a motion-compensated version of a block in the previous frame. The decision is optimized in a rate-distortion sense. In this way, both the geometry and the color are encoded with distortion, allowing for reduced bit-rates. In-loop filtering is employed to minimize compression artifacts caused by distortion in the geometry information. Simulations reveal that this simple motion-compensated coder can efficiently extend the compression range of dynamic voxelized point clouds to rates below what intra-frame coding alone can accommodate, trading rate for geometry accuracy.
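The per-block mode decision described above is a classic Lagrangian rate-distortion choice. The sketch below shows the textbook form of that decision, J = D + λR, with a hypothetical function name and signature; the paper's coder measures D and R from its actual geometry and color codecs.

```python
def choose_block_mode(d_intra, r_intra, d_mc, r_mc, lam):
    """Pick intra coding or motion compensation for one block by
    comparing Lagrangian costs J = D + lambda * R. Larger lambda
    favours the cheaper (lower-rate) mode, trading distortion for
    bit rate."""
    j_intra = d_intra + lam * r_intra
    j_mc = d_mc + lam * r_mc
    return ("intra", j_intra) if j_intra <= j_mc else ("mc", j_mc)
```

Sweeping `lam` traces out the coder's rate-distortion curve: at low bit rates (large `lam`), motion-compensated blocks are chosen more often, which is exactly how the coder reaches rates below what intra-frame coding alone can accommodate.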
We describe an effective and efficient method for point-wise semantic classification of 3D point clouds. The method can handle unstructured and inhomogeneous point clouds such as those derived from static terrestrial LiDAR or photogrammetric reconstruction, and it is computationally efficient, making it possible to process point clouds with many millions of points in a matter of minutes. The key issue, both to cope with strong variations in point density and to bring down computation time, turns out to be careful handling of neighborhood relations. By choosing appropriate definitions of a point's (multi-scale) neighborhood, we obtain a feature set that is both expressive and fast to compute. We evaluate our classification method both on benchmark data from a mobile mapping platform and on a variety of large terrestrial laser scans with greatly varying point density. The proposed feature set outperforms the state of the art with respect to per-point classification accuracy, while at the same time being much faster to compute.
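Features computed from a point's neighborhood are typically covariance-based shape descriptors; the sketch below shows the standard eigenvalue features (linearity, planarity, scattering) commonly used in this line of work. The exact feature set and neighborhood definitions of the paper may differ, so treat this as a representative example rather than the method itself.

```python
import numpy as np

def eigen_features(neighborhood):
    """Covariance-based shape features of a point neighbourhood
    (an (n, 3) array). Linearity, planarity, and scattering are
    derived from the sorted eigenvalues l1 >= l2 >= l3 of the 3x3
    covariance matrix and sum to 1."""
    c = np.cov(neighborhood.T)
    w = np.sort(np.linalg.eigvalsh(c))[::-1]   # descending eigenvalues
    l1, l2, l3 = np.maximum(w, 1e-12)          # guard against tiny negatives
    return {
        "linearity": (l1 - l2) / l1,   # ~1 for wire-like structures
        "planarity": (l2 - l3) / l1,   # ~1 for facades, ground
        "scattering": l3 / l1,         # ~1 for vegetation-like volumes
    }
```

Computing these at several neighborhood scales, as the abstract describes, yields a multi-scale feature vector per point; the dominant cost is the neighbor search, which is why careful handling of neighborhood relations governs both accuracy and runtime.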
A study on the quality evaluation of point clouds in the presence of coding distortions is presented. For that, four different point cloud coding solutions, notably the standardized MPEG codecs G-PCC and V-PCC, a deep learning-based coding solution (RS-DLPCC), and Draco, are compared using a subjective evaluation methodology. Furthermore, several full-reference, reduced-reference, and no-reference point cloud quality metrics are evaluated. Two different point cloud normal computation methods were tested for the metrics that rely on them, notably the CloudCompare quadric fitting method with radii of 5, 10, and 20 and the Meshlab KNN method with K = 6, 10, and 18. To generalize the results, the objective quality metrics were also benchmarked on a public database with mean opinion scores available. To evaluate the statistical differences between the metrics, the Krasula method was employed. The Point Cloud Quality Metric showed the best performance and a very good representation of the subjective results, and it was the metric with the most statistically significant results. It was also revealed that the CloudCompare quadric fitting method with radii 10 and 20 produced the most reliable normals for the metrics dependent on them. Finally, the study revealed that the most commonly used metrics fail to accurately predict compression quality when artifacts generated by deep learning methods are present.
3D dense captioning is the process of generating natural language descriptions for objects in a 3D scene represented as RGB-D scans or point clouds. Three problems currently limit the performance of existing methods. First, existing methods randomly select points from the point cloud, leading to the exclusion of important points, or the inclusion of low-value points, for the detected objects; this, in turn, degrades the quality of the generated descriptions. Although our previously proposed method, Recurrent Point Clouds Selection (RPCS), mitigates this issue, it possesses an inaccurate termination criterion that causes unexpected interruptions in the loop. Furthermore, existing methods rely on an older object detector, which limits their performance. Finally, the descriptions generated by existing methods describe only individual detected objects in the scene, which may be inconvenient for viewers from a visual perspective. To address these problems, we propose RPCS v2.0, which features several improvements over our original design. To maintain high-quality descriptions while avoiding the unexpected interruptions inherent to the original RPCS method, we propose a modified termination criterion that continuously compares the element counts of the bad list. To address the limitations of the older object detector, we implemented a newer object detector to further improve performance. Finally, to ensure user-friendly visuals, we leveraged a language model to summarize the generated descriptions, producing a comprehensive representation of the entire scene. Our proposed approach, referred to as ScanT3D, significantly mitigated suboptimal description generation and outperformed state-of-the-art methods by a large margin (65.84% CIDEr@0.5IoU).