The density-based spatial clustering of applications with noise (DBSCAN) algorithm is a well-known algorithm for spatially clustering point clouds. It can be applied to many applications, such as crack detection, rockfall detection, and glacier movement detection. Traditional DBSCAN requires two predefined parameters. Suitable values of these parameters depend on the distribution of the input point cloud, so estimating them is challenging. This paper proposes a new version of DBSCAN that tunes these parameters automatically. The proposed method consists of two processes: initial parameter estimation based on grid analysis, and DBSCAN based on the divide-and-conquer approach (DC-DBSCAN), which recursively performs DBSCAN on each cluster separately. To verify the proposed method, we applied it to a 3D point cloud dataset that was used to analyze rockfall events at the Puigcercós cliff, Spain. The total number of data points used in this study was 15,567. The experimental results show that the proposed method outperforms traditional DBSCAN in terms of purity and normalized mutual information (NMI) scores. The purity scores of the proposed method and traditional DBSCAN were 96.22% and 91.09%, respectively, and the NMI scores were 0.78 and 0.49, respectively. The proposed method can also detect events that traditional DBSCAN cannot.
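For orientation, the two predefined parameters of traditional DBSCAN are commonly called the neighborhood radius and the minimum number of points. The following is a minimal sketch of the classic algorithm in plain Python, not the DC-DBSCAN method of the abstract; the names `eps`, `min_pts`, `region_query`, and the toy points are assumptions made for illustration.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep growing
                queue.extend(j_neighbors)
    return labels

# Toy data (hypothetical): two dense squares and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

With `eps=0.5` every point in the toy data is labeled noise, which illustrates why suitable parameter values depend on the distribution of the input.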
Today, maritime transportation represents 90% of international trade volume, and more than 50,000 vessels sail the oceans every day. Therefore, reducing maritime transportation security risks through systematic modelling and surveillance should be a high priority in the maritime domain. Statistically, the majority of maritime accidents are caused by human error due to fatigue or misjudgment. Auto-vessels equipped with autonomous and semi-autonomous systems can reduce the reliance on human intervention, thus making maritime navigation safer. This paper presents a clustering method for route planning and trajectory anomaly detection, which are essential parts of auto-vessel system design and development. We present the development of an enhanced density-based spatial clustering of applications with noise (DBSCAN) method that can be applied to historical or real-time Automatic Identification System (AIS) data, so that vessel routes can be modelled and trajectory anomalies can be detected. The proposed methodology develops an optimized trajectory clustering approach in two stages. Firstly, the attribute dimension of the vessel’s positioning data is increased, so that other characteristics such as velocity and direction are considered in the clustering process along with geospatial information. Secondly, the DBSCAN clustering model is enhanced by introducing the Mahalanobis distance metric, which accounts for the correlations among the cluster points, aiming to make the identification process more accurate and to reduce the computational cost.
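To illustrate why the Mahalanobis distance suits records that mix position, velocity, and direction, the sketch below computes it for a few hypothetical AIS-like rows of the form (longitude, latitude, speed, course). The sample values, the function names, and the use of a plain sample covariance are assumptions; the sketch ignores the circular nature of course angles and is not the authors' implementation.

```python
import math

def covariance(rows):
    """Sample covariance matrix of equal-length feature rows."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[k] for r in rows) / n for k in range(d)]
    return [[sum((r[a] - mean[a]) * (r[b] - mean[b]) for r in rows) / (n - 1)
             for b in range(d)] for a in range(d)]

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if the matrix is singular.
    """
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def mahalanobis(u, v, cov):
    """sqrt((u - v)^T cov^-1 (u - v)), computed without forming the inverse."""
    diff = [a - b for a, b in zip(u, v)]
    return math.sqrt(sum(d * s for d, s in zip(diff, solve(cov, diff))))

# Hypothetical records: (longitude, latitude, speed in knots, course in degrees).
records = [
    (103.80, 1.20, 12.0, 45.0),
    (103.82, 1.21, 12.5, 47.0),
    (103.85, 1.23, 11.8, 44.0),
    (103.87, 1.26, 12.2, 50.0),
    (103.90, 1.27, 13.0, 46.0),
    (103.95, 1.31, 12.4, 43.0),
]
cov = covariance(records)
d = mahalanobis(records[0], records[1], cov)
print(f"Mahalanobis distance between the first two records: {d:.3f}")
```

Because the covariance rescales each attribute and accounts for correlation between attributes, degrees of longitude, knots, and degrees of course can share one distance without hand-picked weights. Such a function could replace the Euclidean distance in a DBSCAN neighborhood query.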
•MR tractography data need to be clustered properly for reliable data interpretation.
•Several studies of MR tractography clustering have used the DBSCAN method.
•DBSCAN is an unsupervised clustering method for unimodal vector datasets.
•DBSCAN is designed for unimodal vector datasets, whereas tractography data are multimodal vector data.
•We overcame that limitation by using the fiber-distance matrix of the tractography dataset.
MR tractography from diffusion tensor imaging provides a non-invasive way to explore white matter pathways in the human brain. However, extracting reliable anatomical information from these data requires reliable and effective clustering methodologies. In this paper, we implemented a new version of a robust unsupervised clustering method for MR tractography data using the density-based spatial clustering of applications with noise (DBSCAN) algorithm.
Conventional DBSCAN clustering methods for MR tractography data use each fiber’s start and end point as well as the distance between start and end points. Instead, in this study, we extracted and used a fiber-distance matrix generated for all fiber combinations from the tractography dataset in DBSCAN clustering. The two DBSCAN parameters—minimum point number and maximum radius of the neighborhood—were selected according to the value generated with the cluster stability index (CSI).
Performing the proposed CSI-optimized DBSCAN-based clustering method on MR tractography data of the superior longitudinal fasciculus generated six robust, non-overlapping clusters that are neuroanatomically related.
Conventional DBSCAN-based clustering methods have intrinsic error potential in the clustering results due to deviations in fiber shape and fiber location. The proposed method did not exhibit clustering error caused by deviation in fiber trajectory or fiber location.
We implemented a new, robust DBSCAN-based fiber clustering method for MR tractography data. The CSI-optimized DBSCAN-based unsupervised clustering is applicable to investigation of the neuroconnectome and the fiber structure of the brain.
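The fiber-distance matrix mentioned above can be sketched as follows. The abstract does not state which fiber-to-fiber distance was used, so the symmetrised mean closest-point distance here is an assumption, as are the function names and the toy fibers.

```python
import math

def mean_closest_point_distance(f, g):
    """Average, over the points of f, of the distance to the nearest point of g."""
    return sum(min(math.dist(p, q) for q in g) for p in f) / len(f)

def fiber_distance(f, g):
    """Symmetrised distance between two fibers given as lists of 3D points."""
    return 0.5 * (mean_closest_point_distance(f, g)
                  + mean_closest_point_distance(g, f))

def fiber_distance_matrix(fibers):
    """Pairwise distances for all fiber combinations (symmetric, zero diagonal)."""
    n = len(fibers)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = fiber_distance(fibers[i], fibers[j])
    return dist

def neighbors_from_matrix(dist, i, eps):
    """DBSCAN neighborhood query that reads a precomputed distance matrix."""
    return [j for j, d in enumerate(dist[i]) if d <= eps]

# Toy fibers (hypothetical): two parallel neighbors and one distant fiber.
a = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
b = [(0, 1, 0), (1, 1, 0), (2, 1, 0)]
c = [(0, 0, 10), (1, 0, 10), (2, 0, 10)]
dist = fiber_distance_matrix([a, b, c])
print(dist[0][1])                            # 1.0
print(neighbors_from_matrix(dist, 0, 1.5))   # [0, 1]
```

A distance of this kind uses every point along the fiber and does not depend on the order in which the points are stored, unlike a comparison of start and end points alone.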
On-body device position awareness plays an important role in providing smartphone-based services with high levels of usability and quality. Traditionally, the positions supported by the system have been assumed to be fixed at design time. Thus, if a user stores his/her terminal in an unsupported position, the system forcibly classifies it into one of the supported positions. In contrast, we propose a framework that discovers new positions not initially supported by the system and adds them as recognition targets via labeling by a user and re-training on-the-fly. In this article, we focus on the component that identifies a set of samples derived from a single storing position, which we call new position candidate identification. Clustering is applied as a key component to prepare a reliable dataset for re-training and to reduce the user's burden of labeling. Specifically, density-based spatial clustering of applications with noise (DBSCAN) is employed because it does not require the number of clusters in advance. We propose a method of finding an optimal value of a main parameter, Eps-neighborhood, which affects the accuracy of the resultant clusters. Simulation-based experiments show that the proposed method performs as if the number of new positions were known in advance. Furthermore, we clarify the timing of performing the new position candidate identification process, for which we propose criteria for qualifying a cluster as one comprising a new position.
To address the low quality of outlier detection results caused by the irregular spatial distribution of crowdsourced bathymetric data, an intelligently optimized 3D-DBSCAN method is proposed for detecting outliers in such data. Firstly, the potential of the classic DBSCAN algorithm for clustering irregularly distributed data is explored in three-dimensional space, and the algorithm is upgraded to the 3D-DBSCAN algorithm. Three key parameters affecting the 3D-DBSCAN results are identified, namely the minimum number of objects in the neighborhood, the horizontal neighborhood radius, and the vertical neighborhood radius. Then, the K-nearest neighbor method and the genetic algorithm are used together to set the initial range of the parameters and to search for their optimal values. The optimal combination solution vector of the key parameters of 3D-DBSCAN under different distribution conditions is acquired adaptively by defining and calculating the silhouette coefficient index. Finally, spatial clustering analysis of the crowdsourced bathymetric data and intelligent outlier detection are conducted using the optimal combination solution vector as the input parameters of the 3D-DBSCAN model. The experimental results show that the proposed method can not only detect outliers in crowdsourced bathymetric data after adaptively optimizing the parameters, but also detect outliers by analyzing the distance relationship between points in 3D space, thereby overcoming the limitations of traditional methods in detecting water depth anomalies at specific positions and irregularly distributed anomalies.
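One way to read the three parameters is as a cylindrical neighborhood: a horizontal radius in the x-y plane and a separate vertical tolerance in depth. The abstract does not say how the two radii are combined, so the cylinder, the function names, and the toy soundings below are assumptions. The sketch flags DBSCAN noise points, meaning points that are neither core points nor inside the neighborhood of a core point.

```python
import math

def neighbors_3d(points, i, r_h, r_v):
    """Indices within horizontal radius r_h and vertical tolerance r_v of point i."""
    x, y, z = points[i]
    return [j for j, (qx, qy, qz) in enumerate(points)
            if math.hypot(qx - x, qy - y) <= r_h and abs(qz - z) <= r_v]

def noise_points(points, r_h, r_v, min_pts):
    """Indices of points that are not core and have no core point nearby."""
    hoods = [neighbors_3d(points, i, r_h, r_v) for i in range(len(points))]
    core = [len(h) >= min_pts for h in hoods]
    return [i for i, h in enumerate(hoods)
            if not core[i] and not any(core[j] for j in h)]

# Toy soundings (hypothetical): (x, y, depth). The last one sits among the
# others horizontally but is far deeper.
soundings = [
    (0.0, 0.0, 10.0),
    (1.0, 0.0, 10.2),
    (0.0, 1.0, 9.9),
    (1.0, 1.0, 10.1),
    (0.5, 0.5, 25.0),
]
outliers = noise_points(soundings, r_h=1.5, r_v=0.5, min_pts=3)
print(outliers)  # [4]
```

Separating the two radii lets the depth tolerance be much tighter than the horizontal search distance, which a single spherical radius cannot express when horizontal spacing and depth variation differ in scale.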
Monitoring activity in computer networks is required to detect anomalous activities. This monitoring model is known as an intrusion detection system (IDS). Most IDS model developments are based on machine learning. Developing such a model requires a sufficient amount of network activity data, both normal and anomalous. The amount of available data also slows the learning process of the IDS, and the resulting performance is sometimes not proportional to the amount of data. This study proposes an IDS model that combines a modified DBSCAN with the CART algorithm. DBSCAN is modified to reduce the data by adding a MinNeighborhood parameter, which is used to determine the distance of the density to the cluster center point, which will then be marked for deletion. The test results, using the Kaggle and KDDCup99 datasets, show that the proposed system model is able to maintain a classification accuracy above 90% for 80% data reduction. This performance was accompanied by a decrease in computation time, from 91.8 ms to 31.1 ms for the Kaggle dataset and from 5.535 seconds to 1.120 seconds for the KDDCup99 dataset.
Accurate and efficient estimation of forest volume or biomass is critical for carbon cycles, forest management, and the timber industry. Individual tree detection and segmentation (ITDS) is the first and key step to ensure the accurate extraction of detailed forest structure parameters from LiDAR (light detection and ranging). However, ITDS remains challenging with UAV-LiDAR (LiDAR from Unmanned Aerial Vehicles) in broadleaved forests due to the irregular and overlapping canopies. We developed an efficient and accurate ITDS framework for broadleaved forests based on UAV-LiDAR point clouds. It involves ITD (individual tree detection) from point clouds taken during the leaf-off season, initial ITS (individual tree segmentation) based on the seed points from ITD, and improvement of the initial ITS through a refining process. The results indicate that the newly proposed strategy efficiently provides accurate results for ITDS. We show the following: (1) point-cloud-based ITD methods, especially Mean Shift, perform better for seed point selection than CHM-based (Canopy Height Model) ITD methods on the point clouds from leaf-off seasons; (2) seed points significantly improved the accuracy and efficiency of ITS algorithms; (3) the refining process using DBSCAN (density-based spatial clustering of applications with noise) and kNN (k-Nearest Neighbor classifier) classification significantly reduced edge errors in ITS results. Our study developed a novel ITDS strategy for UAV-LiDAR point clouds that demonstrates proficiency in dense deciduous broadleaved forests, and the proposed ITDS framework could be applied to single-phase point clouds instead of multi-temporal LiDAR data in the future if the point clouds have detailed tree trunk points.
Based on outlier detection algorithms, a feasible quantification method for supraharmonic emission signals is presented. It is designed to meet the requirements of high resolution and low data volume simultaneously in the frequency domain. The proposed method was developed from the skewed distribution data model and the self-tuning parameters of the density-based spatial clustering of applications with noise (DBSCAN) algorithm. Specifically, the data distribution of the supraharmonic band was first analyzed with the Jarque–Bera test. The threshold was determined based on the distribution model to filter out noise. Subsequently, the DBSCAN clustering algorithm parameters were adjusted automatically according to the k-dist curve slope variation and the dichotomy parameter seeking algorithm, followed by the clustering. The supraharmonic emission points were analyzed as outliers. Finally, simulated and experimental data were applied to verify the effectiveness of the proposed method. On the basis of the detection results, a spectrum with the same resolution as the original spectrum was obtained. The amount of data declined by more than three orders of magnitude compared to the original spectrum. The presented method will benefit the quantitative analysis of the amplitude and frequency of supraharmonic emissions.
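The k-dist curve referred to above is the sorted list of each point's distance to its k-th nearest neighbor; a sharp rise in its slope is a common cue for the neighborhood radius. The sketch below picks the value just before the largest increase in slope. This simple rule is an assumption made for illustration and is not the dichotomy parameter seeking algorithm of the abstract; the function names and toy points are also hypothetical.

```python
import math

def k_distances(points, k):
    """Sorted distance from each point to its k-th nearest other point."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result)

def eps_from_knee(kdist):
    """Curve value just before the largest increase in slope (needs >= 3 values)."""
    slopes = [b - a for a, b in zip(kdist, kdist[1:])]
    changes = [b - a for a, b in zip(slopes, slopes[1:])]
    # changes[m] compares the slope after kdist[m + 1] with the slope before it.
    knee = max(range(len(changes)), key=changes.__getitem__) + 1
    return kdist[knee]

# Toy data (hypothetical): one dense group and two isolated points.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10), (20, 0)]
curve = k_distances(points, k=2)
eps = eps_from_knee(curve)
print(eps)  # 1.0
```

Points whose k-distance lies beyond the knee are the ones DBSCAN would tend to leave unclustered at that radius, which is how outliers emerge from the clustering step.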