•An improved DBSCAN method is proposed for 3D intersecting-plane segmentation which can detect the correct boundary points.
•An approach for automatic point selection is proposed for modeling a plane.
•An adaptive coplanar threshold is designed for differentiating planes.
This paper introduces a new DBSCAN-based method for boundary detection and plane segmentation in 3D point clouds. The proposed method is based on candidate sample selection in 3D space and plane validity detection, revising the classical DBSCAN clustering algorithm to obtain a valid fitting plane. Technically, a coplanar threshold is designed as an additional clustering condition: 3D points whose distances to the fitting plane satisfy the threshold are grouped into one cluster. The threshold value is automatically adjusted to the local distribution of samples in the input dataset, so no parameter tuning is required. Because a cluster contains only data points belonging to one plane, the method detects planar objects and correctly recovers the boundaries between different planes. Experimental evaluations are performed on both synthetic and real point cloud datasets. Results show that the proposed approach achieves effective planar segmentation and high-quality segmentation of intersection boundaries.
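The coplanar condition described above can be sketched in a few lines: fit a plane to a candidate neighborhood, then admit points into the cluster only if their point-to-plane distance stays within a threshold. This is an illustrative reconstruction, not the authors' code; in particular, the fixed `threshold` argument stands in for the paper's adaptive, data-driven value.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through a point set: returns (centroid, unit normal).
    The right singular vector for the smallest singular value is the normal."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    return centroid, vt[-1]

def coplanar_filter(points, centroid, normal, threshold):
    """Keep only points whose orthogonal distance to the plane is within
    the coplanar threshold -- the extra clustering condition in the text."""
    dist = np.abs((points - centroid) @ normal)
    return dist <= threshold
```

In a full implementation this filter would be applied inside DBSCAN's cluster-expansion step, so that an eps-neighbor on a different plane is rejected and the intersection boundary stays sharp.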
To improve the efficiency of robots picking apples in challenging orchard environments, a method for precisely detecting apples and planning the picking sequence is proposed. Firstly, the EfficientFormer network serves as the backbone of YOLOV5, yielding the EF-YOLOV5s network used to locate apples in difficult scenes; meanwhile, the Soft Non-Maximum Suppression (NMS) algorithm is adopted to accurately identify overlapping apples. Secondly, adjacently detected apples are automatically divided into picking clusters by an improved density-based spatial clustering of applications with noise (DBSCAN). Finally, the harvest order is determined from a Gaussian distance weight combined with a significance level, guiding the robot through rapid picking. In the experiment, the average precision of this method is 98.84%, which is 4.3% higher than that of YOLOV5s. Meanwhile, the average picking success rate and picking time are 94.8% and 2.86 seconds, respectively. Compared with sequential and random planning, the picking success rate of the proposed method is increased by 6.8% and 13.1%, respectively. The research shows that this method can accurately detect apples in complex environments and improve picking efficiency, providing technical support for harvesting robots.
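The grouping-then-ordering idea can be sketched as follows. The paper's improved DBSCAN is not public here, so the sketch uses the fact that DBSCAN with `min_samples=1` degenerates to eps-connected components, and a plain Gaussian distance weight from each cluster centroid stands in for the paper's combined Gauss-distance and significance weighting; `eps` and `sigma` are purely illustrative values.

```python
import numpy as np

def eps_components(centers, eps):
    """DBSCAN with min_samples=1 reduces to eps-connected components:
    enough to illustrate splitting detected apples into picking clusters."""
    n = len(centers)
    labels = -np.ones(n, dtype=int)
    cur = 0
    for i in range(n):
        if labels[i] != -1:
            continue
        stack, labels[i] = [i], cur
        while stack:
            p = stack.pop()
            near = np.where(np.linalg.norm(centers - centers[p], axis=1) <= eps)[0]
            for q in near:
                if labels[q] == -1:
                    labels[q] = cur
                    stack.append(q)
        cur += 1
    return labels

def picking_order(centers, labels, sigma=100.0):
    """Order apples inside each cluster by a Gaussian distance weight
    from the cluster centroid (highest weight, i.e. closest, first)."""
    order = []
    for lab in np.unique(labels):
        idx = np.where(labels == lab)[0]
        d = np.linalg.norm(centers[idx] - centers[idx].mean(axis=0), axis=1)
        w = np.exp(-d ** 2 / (2 * sigma ** 2))
        order.extend(idx[np.argsort(-w)].tolist())
    return order
```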
Clustering is a technique that allows data to be organized into groups of similar objects. DBSCAN (Density-Based Spatial Clustering of Applications with Noise) constitutes a popular clustering algorithm that relies on a density-based notion of cluster and is designed to discover clusters of arbitrary shape. The computational complexity of DBSCAN is dominated by the calculation of the ϵ-neighborhood for every object in the dataset. Thus, the efficiency of DBSCAN can be improved in two different ways: (1) by reducing the overall number of ϵ-neighborhood queries (also known as region queries), or (2) by reducing the complexity of the nearest neighbor search conducted for each region query. This paper deals with the first issue by considering the most relevant region query strategies for DBSCAN, all of which inspect the neighborhoods of only a subset of the objects in the dataset. We comparatively evaluate these region query strategies (or DBSCAN variants) in terms of clustering effectiveness and efficiency; additionally, a novel region query strategy is introduced in this work. The results show that some DBSCAN variants are only slightly inferior to DBSCAN in terms of effectiveness, while greatly improving its efficiency. Among these variants, the novel one outperforms the rest.
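A minimal reference implementation makes the cost structure concrete: in textbook DBSCAN every object triggers exactly one region query, and that per-object query count is precisely the budget the surveyed variants reduce. This sketch is illustrative (brute-force distances, no spatial index), not code from any of the papers discussed.

```python
import numpy as np

def dbscan(X, eps, min_pts):
    """Classic DBSCAN; returns labels (-1 = noise) and the number of
    region queries issued. Each object is queried exactly once."""
    n = len(X)
    labels = np.full(n, -2)  # -2 = unvisited, -1 = noise
    queries = 0

    def region_query(i):
        nonlocal queries
        queries += 1
        return np.where(np.linalg.norm(X - X[i], axis=1) <= eps)[0]

    cluster = -1
    for i in range(n):
        if labels[i] != -2:
            continue
        neigh = region_query(i)
        if len(neigh) < min_pts:
            labels[i] = -1  # may later be absorbed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        seeds = list(neigh)
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise becomes a border point
            if labels[j] != -2:
                continue
            labels[j] = cluster
            jn = region_query(j)
            if len(jn) >= min_pts:
                seeds.extend(jn)  # core point: expand its neighborhood
    return labels, queries
```

The variants compared in the paper would replace the "query every unvisited object" policy with strategies that expand only a subset of objects, trading a little effectiveness for far fewer region queries.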
Intelligent fault diagnosis technology, as a promising approach, is gradually playing an irreplaceable role in ensuring the safety, reliability, and efficiency of mechanical equipment. However, in real-world industrial scenarios, obtaining adequate high-quality label information is typically challenging and unrealistic, resulting in the performance degradation of most existing supervised learning-based diagnosis models and necessitating the development of unsupervised intelligent diagnostic models. In addition, the sample independence hypothesis is widely used in existing studies, which ignores the further mining of relevant auxiliary information between samples and its positive effect on performance improvement. To overcome these challenges, a novel intelligent fault diagnosis framework, called the convolutional capsule auto-encoder-based unsupervised directed hierarchical graph network with clustering representation (CCAE-UDHGN-CR), is established and employed in unlabeled scenarios. First, a novel convolutional capsule auto-encoder (CCAE), which combines reconstruction loss and semantic clustering loss, is constructed and used to extract deep coding features that contain attribute information of the sample itself. Then, with the assistance of a cosine similarity measurement strategy, the internal correlation between samples is fully mined, and on this basis, the deep coding features are converted into a graph sample set, which serves as the input of the subsequent unsupervised directed hierarchical graph network (UDHGN). Finally, the deep representation features extracted by the UDHGN are fed into the density-based spatial clustering of applications with noise (DBSCAN) model to determine category information. A total of three cases based on key functional components and a manipulator are employed for performance verification.
The comprehensive diagnosis results all show that the proposed CCAE-UDHGN-CR model can effectively alleviate the dependence on label information while maintaining excellent diagnosis performance.
This paper focuses on removing mismatches from given putative feature matches created typically based on descriptor similarity. To achieve this goal, existing attempts usually involve estimating the image transformation under a geometrical constraint, where a pre-defined transformation model is demanded. This severely limits the applicability, as the transformation could vary with different data and is complex and hard to model in many real-world tasks. From a novel perspective, this paper casts feature matching as a spatial clustering problem with outliers. The main idea is to adaptively cluster the putative matches into several motion-consistent clusters together with an outlier/mismatch cluster. To implement the spatial clustering, we customize the classic density-based spatial clustering of applications with noise (DBSCAN) in the context of feature matching, which enables our approach to achieve quasi-linear time complexity. We also design an iterative clustering strategy to promote the matching performance in the case of severely degraded data. Extensive experiments on several datasets involving different types of image transformations demonstrate the superiority of our approach over state-of-the-art alternatives. Our approach is also applied to near-duplicate image retrieval and co-segmentation and achieves promising performance.
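The motion-consistency idea can be sketched in its simplest form: each putative match induces a displacement vector, correct matches share similar displacements with many neighbors, and isolated displacements correspond to DBSCAN's noise label. This density test is a heavily simplified stand-in for the paper's customized DBSCAN (it ignores spatial position and non-rigid motion), and `eps`/`min_pts` are illustrative values.

```python
import numpy as np

def motion_consistent(matches_src, matches_dst, eps=5.0, min_pts=3):
    """Flag putative matches whose displacement vector has at least
    min_pts displacement vectors within eps (itself included).
    Isolated motions are treated as mismatches, mirroring DBSCAN noise."""
    disp = matches_dst - matches_src
    # pairwise distances between displacement vectors
    d = np.linalg.norm(disp[:, None, :] - disp[None, :, :], axis=2)
    return (d <= eps).sum(axis=1) >= min_pts
```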
•A novel, scalable classification approach for aerial laser scanning.
•Pre-gridding in three dimensions allows parallelizable feature calculation.
•Application of a random forest approach using pre-calculated features.
•Enables 99% reduction in the machine learning processing.
•Fully automated approach demonstrated for 2 urban areas.
The opportunities now afforded by increasingly available, dense, aerial urban LiDAR point clouds (greater than 100 pts/m2) are arguably stymied by their sheer size, which precludes the effective use of many tools designed for point cloud data mining and classification. This paper introduces the point cloud voxel classification (PCVC) method, an automated, two-step solution for classifying terabytes of data without overwhelming the computational infrastructure. First, the point cloud is voxelized to reduce the number of points that need to be processed sequentially. Next, descriptive voxel attributes are assigned to aid in further classification. These attributes describe the point distribution within each voxel and the voxel's geo-location. They include 5 point descriptors (density, standard deviation, clustered points, fitted plane, and plane's angle) and 2 voxel position attributes (elevation and neighbors). A random forest algorithm is then used for final classification of the object within each voxel using four categories: ground, roof, wall, and vegetation. The proposed approach was evaluated using a 297,126,417-point dataset from a 1 km2 area in Dublin, Ireland and a 50% denser, 13,912,692-point dataset covering 150 m2 of New York City. PCVC's main advantage is scalability, achieved through a 99% reduction in the number of points that need to be sequentially categorized. Additionally, PCVC demonstrated strong classification results (precision of 0.92, recall of 0.91, and F1-score of 0.92) compared to previous work on the same dataset (precision of 0.82-0.91, recall of 0.86-0.89, and F1-score of 0.85-0.90).
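The voxelization step can be sketched simply: map each point to an integer voxel index, then summarize the points falling into each voxel. The two attributes computed here (point count as a density proxy, per-axis standard deviation) are only a subset of the descriptors listed above, chosen for brevity; this is not the PCVC implementation.

```python
import numpy as np
from collections import defaultdict

def voxelize(points, voxel_size):
    """Group points by integer voxel index and attach simple per-voxel
    attributes: point count (density proxy) and per-axis std deviation."""
    idx = np.floor(points / voxel_size).astype(int)
    voxels = defaultdict(list)
    for key, p in zip(map(tuple, idx), points):
        voxels[key].append(p)
    attrs = {}
    for key, pts in voxels.items():
        pts = np.asarray(pts)
        attrs[key] = {"count": len(pts), "std": pts.std(axis=0)}
    return attrs
```

A classifier such as a random forest would then consume one feature row per voxel instead of one per point, which is where the order-of-magnitude reduction in sequential processing comes from.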
The majority of photovoltaic (PV) systems in the Netherlands are small scale, installed on rooftops, where the lack of onsite global tilted irradiance (GTI) measurements and the frequent presence of shadow due to objects in the close vicinity pose a challenge to their monitoring. In this study, a new algorithmic tool is introduced that creates a reference dataset through the combination of datasets of the unshaded PV systems in the surrounding area. It subsequently compares the created reference dataset with the one of the PV system of interest, detects any energy loss, and clusters the distinctive loss due to shadow created by the surrounding objects. The new algorithm is applied successfully to a number of different cases of shaded PV systems. Finally, suggestions on the unsupervised use of the algorithm by any monitoring platform are discussed, along with the algorithm's limitations and suggestions for further research.
Cluster detection is important and widely used in a variety of applications, including public health, public safety, transportation, and so on. Given a collection of data points, we aim to detect density-connected spatial clusters with varying geometric shapes and densities, under the constraint that the clusters are statistically significant. The problem is challenging, because many societal applications and domain science studies have low tolerance for spurious results, and clusters may have arbitrary shapes and varying densities. As a classical topic in data mining and learning, a myriad of techniques have been developed to detect clusters with both varying shapes and densities (e.g., density-based, hierarchical, spectral, or deep clustering methods). However, the vast majority of these techniques do not consider statistical rigor and are susceptible to detecting spurious clusters formed as a result of natural randomness. On the other hand, scan statistic approaches explicitly control the rate of spurious results, but they typically assume a single "hotspot" of over-density, and many rely on further assumptions such as a tessellated input space. To unite the strengths of both lines of work, we propose a statistically robust formulation of a multi-scale DBSCAN, namely Significant DBSCAN+, to identify significant clusters that are density connected. As we will show, incorporation of statistical rigor is a powerful mechanism that allows the new Significant DBSCAN+ to outperform state-of-the-art clustering techniques in various scenarios. We also propose computational enhancements to speed up the proposed approach.
Experimental results show that Significant DBSCAN+ can simultaneously improve the success rate of true cluster detection (e.g., 10-20% increases in absolute F1 scores) and substantially reduce the rate of spurious results (e.g., from thousands or hundreds of spurious detections to none or just a few across 100 datasets), and the acceleration methods can improve efficiency for both clustered and non-clustered data.
DBSCAN is a widely used clustering algorithm based on density metrics that can efficiently identify clusters of uniform density. However, when different clusters have varying densities, the corresponding clustering results may be poor. To address this issue, we propose a multi-density DBSCAN based on relative density (MDBSCAN), which achieves better results on clusters with multiple densities. The intuition of our work is simple but effective: we first divide the dataset into a low-density part and a high-density part, and then take a divide-and-conquer approach, handling each part separately so that they do not interfere with each other. Specifically, the proposed MDBSCAN consists of three steps: (i) extract the low-density data points in the dataset by relative density; (ii) find natural clusters among the identified low-density data points; (iii) cluster the remaining data points (those outside the natural clusters) using DBSCAN and assign the noise points it produces to the nearest clusters. To verify the effectiveness of the proposed MDBSCAN algorithm, we conduct experiments on ten synthetic datasets and six real-world datasets. Experimental results demonstrate that the proposed MDBSCAN algorithm outperforms the original DBSCAN and six extensions of DBSCAN, including two state-of-the-art algorithms (DRL-DBSCAN and AMD-DBSCAN), in most cases.
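Step (i) of the outline above can be sketched with a k-nearest-neighbor density estimate: a point's density is taken as the inverse of its mean distance to its k nearest neighbors, normalized by the dataset maximum, and points below a cut-off form the low-density part. The paper's own relative-density definition and cut-off are not reproduced here; `k=3` and the `0.2` threshold are purely illustrative.

```python
import numpy as np

def knn_relative_density(X, k=3):
    """Relative density via k-NN distances: inverse of the mean distance
    to the k nearest neighbours, normalised to [0, 1] by the maximum."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    d.sort(axis=1)                        # column 0 is the self-distance
    dens = 1.0 / d[:, 1:k + 1].mean(axis=1)
    return dens / dens.max()

# illustrative split: one dense blob plus three scattered points
X = np.vstack([np.random.default_rng(0).normal(0, 0.1, (20, 2)),
               np.array([[5., 5.], [8., -3.], [-6., 7.]])])
rel = knn_relative_density(X)
low = rel < 0.2  # hypothetical cut-off; the paper derives its own
```

Steps (ii) and (iii) would then cluster the low-density points on their own and run ordinary DBSCAN on the rest, so that neither density regime dictates the parameters for the other.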