Neural Radiance Fields (NeRFs) offer state-of-the-art quality in synthesizing novel views of complex 3D scenes from a small subset of base images. For NeRFs to perform optimally, the registration of base images has to follow certain assumptions, including maintaining a constant distance between the camera and the object. We can address this limitation by training NeRFs with 3D point clouds instead of images; however, a straightforward substitution is impossible due to the sparsity of point clouds in under-sampled regions, which leads to incomplete reconstructions. To solve this problem, we propose an auto-encoder-based architecture that leverages the hypernetwork paradigm to transfer 3D points, together with their associated color values, through a lower-dimensional latent space and to generate the weights of a NeRF model. This way, we accommodate the sparsity of 3D point clouds and fully exploit the potential of point cloud data. As a side benefit, our method offers an implicit way of representing 3D scenes and objects that can be employed to condition NeRFs and hence generalize the models beyond objects seen during training. The empirical evaluation confirms the advantages of our method over conventional NeRFs and demonstrates its superiority in practical applications.
•We propose a method that converts point clouds to NeRFs.
•We use the hypernetwork paradigm as a conditioning mechanism.
•Our method offers a generative model.
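To make the hypernetwork paradigm above concrete, the sketch below shows the core mechanism: one network emits the flattened weights of another, which is then evaluated functionally. All layer sizes, the latent dimension, and the single-linear-layer hypernetwork are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 16  # size of the point-cloud latent code (assumption)
# Tiny NeRF-style head for illustration: xyz -> rgba via one hidden layer.
TARGET_SHAPES = [(3, 32), (32,), (32, 4), (4,)]
N_TARGET = sum(int(np.prod(s)) for s in TARGET_SHAPES)

# Hypernetwork: a single (random, untrained) linear map for illustration.
W_hyper = rng.normal(0.0, 0.05, (LATENT_DIM, N_TARGET))

def generate_target_weights(z):
    """Map a latent code z to the weight tensors of the target network."""
    flat = z @ W_hyper
    weights, i = [], 0
    for shape in TARGET_SHAPES:
        n = int(np.prod(shape))
        weights.append(flat[i:i + n].reshape(shape))
        i += n
    return weights

def target_forward(xyz, weights):
    """Evaluate the generated NeRF-style MLP at 3D query points."""
    w1, b1, w2, b2 = weights
    h = np.maximum(xyz @ w1 + b1, 0.0)  # ReLU hidden layer
    return h @ w2 + b2                  # raw rgba output

z = rng.normal(size=LATENT_DIM)         # latent code from the encoder (assumed)
weights = generate_target_weights(z)
out = target_forward(rng.normal(size=(5, 3)), weights)
print(out.shape)  # (5, 4): rgba for 5 query points
```

In training, gradients would flow through the generated weights back into the hypernetwork, so a single hypernetwork can produce a dedicated NeRF per input point cloud.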
Semantic segmentation, also called scene labeling, refers to the process of assigning a semantic label (e.g. car, people, and road) to each pixel of an image. It is an essential data processing step ...for robots and other unmanned systems to understand the surrounding scene. Despite decades of efforts, semantic segmentation is still a very challenging task due to large variations in natural scenes. In this paper, we provide a systematic review of recent advances in this field. In particular, three categories of methods are reviewed and compared, including those based on hand-engineered features, learned features and weakly supervised learning. In addition, we describe a number of popular datasets aiming for facilitating the development of new segmentation algorithms. In order to demonstrate the advantages and disadvantages of different semantic segmentation models, we conduct a series of comparisons between them. Deep discussions about the comparisons are also provided. Finally, this review is concluded by discussing future directions and challenges in this important field of research.
We propose a geometry-guided neural network architecture for robust and detail-preserving surface normal estimation on unstructured point clouds. Previous deep normal estimators usually estimate the normal directly from the neighbors of a query point, which leads to poor performance. The proposed network is composed of a weight learning sub-network (WL-Net) and a lightweight normal learning sub-network (NL-Net). WL-Net first predicts point-wise weights for generating an optimized point set (OPS) from the input. Then, NL-Net estimates a more accurate normal from the OPS, especially when the local geometry is complex. To boost the weight learning ability of WL-Net, we introduce two forms of geometric guidance into the network. First, we design a weight guidance using the deviations between the neighbor points and the ground-truth tangent plane of the query point. This deviation guidance offers a "ground truth" for the weights of reliable inliers and outliers determined by the tangent plane. Second, we integrate normals at multiple scales into the input. This further improves performance and robustness without relying on the multi-branch networks employed in previous multi-scale normal estimators, making our method more efficient. Qualitative and quantitative evaluations demonstrate the advantages of our approach over state-of-the-art methods in terms of estimation accuracy, model size, and inference time. Code is available at https://github.com/2429581027/local-geometric-guided.
•A new two-step normal estimation method.
•Geometric priors are integrated into a deep-learning framework.
•The multi-scale architecture is replaced by multi-scale geometric input.
•Achieves an angle error of 10.79, compared with 11.78 for the previous state of the art.
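The two-step idea (weight the neighborhood, then estimate the normal from the re-weighted points) can be sketched with a classical stand-in: a weighted PCA plane fit where the weights come from a simple residual heuristic rather than the learned WL-Net. This is a minimal illustration of the principle, not the paper's network.

```python
import numpy as np

def estimate_normal(neighbors, weights=None):
    """Normal = eigenvector of the smallest eigenvalue of the (weighted)
    covariance of the neighborhood."""
    w = np.ones(len(neighbors)) if weights is None else weights
    c = np.average(neighbors, axis=0, weights=w)
    d = neighbors - c
    cov = (d * w[:, None]).T @ d / w.sum()
    eigvals, eigvecs = np.linalg.eigh(cov)  # ascending eigenvalues
    return eigvecs[:, 0]

# Noisy samples of the z=0 plane plus one gross outlier off the surface.
rng = np.random.default_rng(1)
pts = np.column_stack([rng.uniform(-1, 1, (50, 2)), rng.normal(0, 0.01, 50)])
pts = np.vstack([pts, [0.0, 0.0, 1.0]])

# Step 1: initial fit, then down-weight points with large plane residuals
# (a hand-crafted proxy for the learned point-wise weights).
n0 = estimate_normal(pts)
resid = np.abs((pts - pts.mean(axis=0)) @ n0)
w = np.exp(-(resid / (resid.mean() + 1e-9)) ** 2)

# Step 2: re-estimate the normal from the re-weighted ("optimized") point set.
n1 = estimate_normal(pts, w)
print(abs(n1[2]))  # close to 1: the z=0 plane normal is recovered
```

The learned weights in WL-Net play the role of `w` here, but are predicted by a network and supervised with the tangent-plane deviation guidance described in the abstract.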
Precisely estimating a robot’s pose in a prior, global map is a fundamental capability for mobile robotics, e.g., autonomous driving or exploration in disaster zones. This task, however, remains challenging in unstructured, dynamic environments, where local features are not discriminative enough and global scene descriptors provide only coarse information. We therefore present SegMap: a map representation solution for localization and mapping based on the extraction of segments in 3D point clouds. Working at the level of segments offers increased invariance to viewpoint and local structural changes, and facilitates real-time processing of large-scale 3D data. SegMap exploits a single compact data-driven descriptor for performing multiple tasks: global localization, 3D dense map reconstruction, and semantic information extraction. The performance of SegMap is evaluated in multiple urban driving and search-and-rescue experiments. We show that the learned SegMap descriptor has superior segment retrieval capabilities compared with state-of-the-art handcrafted descriptors; as a consequence, we achieve higher localization accuracy and a 6% increase in recall. These segment-based localizations allow us to reduce open-loop odometry drift by up to 50%. SegMap is available open source, along with easy-to-run demonstrations.
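The retrieval step at the heart of segment-based localization can be sketched as a nearest-neighbor search in descriptor space: segments extracted from the local scan are matched against the descriptors stored in the global map. The descriptor dimensionality, the random descriptors, and the brute-force matching below are illustrative assumptions, not SegMap's implementation.

```python
import numpy as np

rng = np.random.default_rng(3)

D = 64                                # assumed compact descriptor size
map_desc = rng.normal(size=(500, D))  # descriptors of 500 global-map segments

def retrieve(query_desc, map_desc, k=3):
    """Return the indices of the k closest map segments per query descriptor."""
    d = np.linalg.norm(map_desc[None, :, :] - query_desc[:, None, :], axis=-1)
    return np.argsort(d, axis=1)[:, :k]

# Simulate re-observing map segments 7 and 42 with small descriptor noise,
# as would happen when the robot revisits a mapped area.
queries = map_desc[[7, 42]] + rng.normal(0.0, 0.01, (2, D))
matches = retrieve(queries, map_desc)
print(matches[:, 0])  # nearest match per query: the re-observed segments
```

In a full system, the candidate matches would then be verified geometrically (e.g., with a consistency check over several segment correspondences) before accepting a localization.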
Digitalization of Nuclear Power Plants (NPPs) is critical for their safe and effective operation and maintenance. The development of Digital Twins (DTs) of NPP legacy assets and subsystems is key to achieving this goal. Doing this effectively requires a framework for the intelligent allocation of limited resources. Such a framework is developed here by synthesizing emerging best practices with NPP operators' needs for legacy asset management. Within the framework, a pipeline employs deep-learning object detection to read and locate equipment tags in images, computes their locations in the corresponding 3D point clouds, and then relates that data to an asset management system. The pipeline is premised on preserving and augmenting existing NPP asset management processes, which preclude options such as RFID tags or barcodes. It is a significant step toward more efficient development of DTs of legacy assets. The contributions are framed in the context of a typical Canadian legacy NPP.
•A framework that defines the key aspects of legacy assets for Digital Twins.
•Automatic link between photographic records and asset management software.
•Performance of tag detection and optical character recognition on real images.
•Pipeline for efficient linking of assets in point clouds to management systems.
•Identification of asset locations using legacy asset tags in nuclear power plants.
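The final linking step of such a pipeline can be sketched as follows: an OCR-read tag string is matched against the asset management records, and the tag's 3D location computed from the point cloud is attached to the matching record. The data model, tag IDs, and field names here are hypothetical, purely for illustration.

```python
# Hypothetical asset management records keyed by equipment tag ID.
assets = {
    'P-101': {'name': 'Primary pump', 'location_3d': None},
    'V-205': {'name': 'Isolation valve', 'location_3d': None},
}

# Pipeline output: (OCR text of a detected tag, 3D location in the point cloud).
detections = [
    ('P-101', (12.4, 3.1, 0.9)),
    ('V-205', (8.7, 1.2, 1.5)),
    ('X-999', (0.0, 0.0, 0.0)),  # OCR result with no matching record
]

unmatched = []
for tag, xyz in detections:
    if tag in assets:
        assets[tag]['location_3d'] = xyz  # link pipeline output to the record
    else:
        unmatched.append(tag)             # flag for manual review

print(assets['P-101']['location_3d'], unmatched)
```

Flagging unmatched detections rather than discarding them matters in a legacy setting, where OCR errors and undocumented tags are expected and human review remains part of the process.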
Traditional satellite- or land use-derived indicators for assessing residential green space (RGS) exposure have limitations in predicting health benefits, owing to individual differences in the absorption of ‘real’ green exposure. This study developed a novel framework, the Greenspace Exposure Composite Indices (GCIs), which modifies objective RGS metrics by incorporating residents’ subjective factors. First, three objective RGS indicators were established based on 3D point clouds: the overall green exposure index (GEI), the floor-level green exposure index (FGEI), and the activity-site green exposure index (AGEI). Individual factors (i.e., perception, emotion, and behaviour towards RGS) are then applied as weights to these objective indicators through the Brunswikian lens model to obtain the modified GCIs. We also used this framework to examine the effects of the RGS indicators on environmental satisfaction in a case study of 1594 participants in Nanjing, China. A random forest model was used to examine the associations between the GCIs and environmental satisfaction, and the results showed that the GCIs explained environmental satisfaction better than traditional objective indicators. Our findings demonstrate that using subjective indicators to refine objective RGS indicators offers advantages in predicting environmental satisfaction. The framework is also applicable to predicting the potential effects of RGS exposure on other health-related outcomes.
•Individual subjective factors are integrated to modify the RGS exposure assessment.
•Objective RGS exposure metrics are developed based on 3D point clouds.
•The Brunswikian lens model integrates individual factors into objective indicators.
•A random forest model validates the framework's merits in predicting environmental satisfaction.
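The weighting idea can be sketched in its simplest form: a composite index as a subjectively weighted combination of the objective indicators. The indicator values, the weights, and the normalized weighted sum below are illustrative assumptions, not the paper's fitted lens-model coefficients.

```python
def composite_index(objective, subjective_weights):
    """Weighted combination of objective indicators, with weights normalized
    so they sum to one. Both arguments are dicts keyed by indicator name."""
    total = sum(subjective_weights.values())
    return sum(objective[k] * subjective_weights[k] / total for k in objective)

# Hypothetical objective exposure values for one resident.
objective = {'GEI': 0.42, 'FGEI': 0.30, 'AGEI': 0.55}
# Hypothetical weights derived from perception/emotion/behaviour responses.
weights = {'GEI': 0.2, 'FGEI': 0.3, 'AGEI': 0.5}

gci = composite_index(objective, weights)
print(round(gci, 3))  # 0.449
```

The point of the construction is that two residents with identical objective exposure but different behaviour (e.g., one who actually uses the activity sites) end up with different composite indices, which is what lets the GCIs track satisfaction more closely than the raw metrics.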
Automatic construction progress documentation and metric evaluation of execution work in confined building interiors require particularly reliable geometric evaluation and interpretation of statistically uncertain as-built point clouds. This paper presents a method for high-resolution change detection based on dense 3D point clouds from terrestrial laser scanning (TLS) and the discretization of space by voxels. To evaluate the metric accuracy of a BIM according to the Level of Accuracy (LOA) specification, the effects of laser range measurements on the occupancy of space are modeled with belief functions and evaluated using the Dempster-Shafer theory of evidence. The application is demonstrated on point cloud data from multi-temporal scanning campaigns of real indoor reconstructions. The results show that TLS point clouds are suitable for verifying a given BIM up to LOA 40 if special attention is paid to the scanning geometry during acquisition. The proposed method can be used to document construction progress and to verify, and even update, the LOA status of a given BIM, confirming valid and BIM-compliant as-built models for further planning.
•A new method for fusing model uncertainty with the uncertainties of point clouds.
•Consideration of indoor scanning geometry for accuracy assessment.
•Voxel-based change detection and self-occlusion analysis.
•Experiments on progress evaluation on two real construction sites.
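The evidential fusion at the core of this approach can be sketched with Dempster's rule of combination on a two-element frame {occupied, empty}, where mass may also rest on the whole frame ("unknown"). The mass values below are illustrative, not calibrated to any scanner model.

```python
def combine(m1, m2):
    """Dempster's rule for masses over {'occ', 'emp', 'theta'},
    where 'theta' is the full frame {occupied, empty} (ignorance)."""
    sets = {'occ': {'occ'}, 'emp': {'emp'}, 'theta': {'occ', 'emp'}}
    out = {'occ': 0.0, 'emp': 0.0, 'theta': 0.0}
    conflict = 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = sets[a] & sets[b]
            if not inter:
                conflict += ma * mb          # contradictory evidence
            elif inter == {'occ', 'emp'}:
                out['theta'] += ma * mb
            else:
                out[inter.pop()] += ma * mb
    k = 1.0 - conflict                       # renormalize over non-conflict
    return {s: v / k for s, v in out.items()}

# Two range measurements for the same voxel: one weakly and one strongly
# supporting "occupied".
m1 = {'occ': 0.6, 'emp': 0.1, 'theta': 0.3}
m2 = {'occ': 0.8, 'emp': 0.0, 'theta': 0.2}
fused = combine(m1, m2)
print(fused)  # mass on 'occ' grows, residual ignorance shrinks
```

Repeating the combination over all measurements intersecting a voxel yields the per-voxel occupancy evidence against which the BIM geometry can then be checked.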
To reduce the cost of storing, processing, and visualizing a large-scale point cloud, we propose a randomized resampling strategy that selects a representative subset of points while preserving application-dependent features. The strategy is based on graphs, which can represent underlying surfaces and lend themselves well to efficient computation. We use a general feature-extraction operator to represent application-dependent features and propose a general reconstruction error to evaluate the quality of resampling; by minimizing this error, we obtain a general form of the optimal resampling distribution. The proposed resampling distribution is guaranteed to be shift-, rotation-, and scale-invariant in three-dimensional space. We then specify the feature-extraction operator to be a graph filter and study specific resampling strategies based on all-pass, low-pass, and high-pass graph filtering, as well as graph filter banks. We validate the proposed methods on three applications: large-scale visualization, accurate registration, and robust shape modeling, demonstrating the effectiveness and efficiency of the proposed resampling methods.
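A minimal sketch of the high-pass variant: score each point by the response of a high-pass graph filter (I - A)x on a k-NN graph, where x holds the point coordinates and A is the row-normalized adjacency; large responses mark feature points, and the resampling distribution is taken proportional to the response norm. The graph construction and parameters are simplifying assumptions, not the paper's exact operator.

```python
import numpy as np

rng = np.random.default_rng(2)

def knn_graph(pts, k=5):
    """Symmetrized, row-normalized k-NN adjacency (brute force, for small n)."""
    d = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    A = np.zeros((len(pts),) * 2)
    rows = np.arange(len(pts))[:, None]
    A[rows, np.argsort(d, axis=1)[:, :k]] = 1.0
    A = np.maximum(A, A.T)
    return A / A.sum(axis=1, keepdims=True)

def highpass_scores(pts, k=5):
    A = knn_graph(pts, k)
    f = pts - A @ pts  # (I - A)x: deviation of each point from its neighborhood
    return np.linalg.norm(f, axis=1)

# Flat 10x10 grid plus one raised feature point; the feature should dominate.
g = np.linspace(0.0, 1.0, 10)
grid = np.array([(x, y, 0.0) for x in g for y in g])
pts = np.vstack([grid, [0.5, 0.5, 0.5]])

scores = highpass_scores(pts)
prob = scores / scores.sum()  # resampling distribution over points
sample = rng.choice(len(pts), size=20, replace=False, p=prob)
print(int(np.argmax(scores)))  # 100: the raised point scores highest
```

Sampling from `prob` keeps contour and feature points with high probability while thinning flat regions, which is the behavior the high-pass strategy targets for registration.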
To accelerate the understanding of the relationship between genotype and phenotype, plant scientists and plant breeders are looking for more advanced phenotyping systems that provide more detailed phenotypic information about plants. Most current systems provide information at the whole-plant level and not at the level of specific plant parts such as leaves, nodes, and stems. Computer vision makes it possible to extract information about plant parts from images. However, the segmentation of plant parts is a challenging problem due to the inherent variation in the appearance and shape of natural objects. In this paper, deep-learning methods are proposed to deal with this variation. Moreover, a multi-view approach is taken that allows the integration of information from the two-dimensional (2D) images into a three-dimensional (3D) point-cloud model of the plant. Specifically, a fully convolutional network (FCN) and a Mask R-CNN (region-based convolutional neural network) were used for semantic and instance segmentation on the 2D images. The different viewpoints were then combined to segment the 3D point cloud. The performance of the 2D and multi-view approaches was evaluated on tomato seedling plants. Our results show that the integration of information in 3D outperforms the 2D approach, because errors in 2D do not persist across the different viewpoints and can therefore be overcome in 3D.
•A multi-view deep-learning method was developed for plant-part segmentation in 3D.
•Deep neural networks were used for 2D semantic and instance segmentation of plants.
•Results show that the integration of information in 3D outperforms the 2D approach.
•Errors in 2D can be corrected by combining multiple viewpoints.
•A qualitative analysis of errors identified possible improvements to the 3D method.
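The fusion step that lets 3D outperform 2D can be sketched as follows: each 3D point collects the 2D labels predicted in the views where it is visible, and its final label is the majority vote, so an error in one view is outvoted by the others. The voting rule and the label data here are illustrative assumptions, not the paper's exact aggregation.

```python
from collections import Counter

def fuse_labels(per_view_labels):
    """per_view_labels: {point_id: [2D label from each view where the point is
    visible]}. Returns the majority-vote label per 3D point."""
    return {pid: Counter(labels).most_common(1)[0][0]
            for pid, labels in per_view_labels.items()}

# Point 0 is mislabeled 'leaf' in one view; the other three views outvote it.
votes = {
    0: ['stem', 'leaf', 'stem', 'stem'],
    1: ['leaf', 'leaf', 'leaf'],
    2: ['node', 'stem', 'node'],
}
fused = fuse_labels(votes)
print(fused)  # {0: 'stem', 1: 'leaf', 2: 'node'}
```

This captures why per-view errors are "not persistent" in 3D: as long as a misclassification does not repeat across the majority of viewpoints, the fused point cloud recovers the correct part label.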