A key challenge in contemporary ecology and conservation is the accurate tracking of the spatial distribution of various human impacts, such as fishing. While coastal fisheries in national waters are closely monitored in some countries, existing maps of fishing effort elsewhere are fraught with uncertainty, especially in remote areas and the High Seas. Better understanding of the behavior of the global fishing fleets is required in order to prioritize and enforce fisheries management and conservation measures worldwide. Satellite-based Automatic Identification Systems (S-AIS) are now commonly installed on most ocean-going vessels and have been proposed as a novel tool to explore the movements of fishing fleets in near real time. Here we present approaches to identify fishing activity from S-AIS data for three dominant fishing gear types: trawl, longline and purse seine. Using a large dataset containing worldwide fishing vessel tracks from 2011-2015, we developed three methods to detect and map fishing activities. For trawlers, we produced a Hidden Markov Model (HMM) using vessel speed as the observation variable. For longliners, we designed a Data Mining (DM) approach using an algorithm inspired by studies on animal movement. For purse seiners, a multi-layered filtering strategy based on vessel speed and operation time was implemented. Validation against expert-labeled datasets showed average detection accuracies of 83% for trawlers and longliners, and 97% for purse seiners. Our study represents the first comprehensive approach to detect and identify potential fishing behavior for three major gear types operating on a global scale. We hope that this work will enable new efforts to assess the spatial and temporal distribution of global fishing effort and make global fisheries activities transparent to ocean scientists, managers and the public.
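The trawler method above uses an HMM with vessel speed as the observation. A minimal sketch of that idea, assuming two hidden states and Gaussian speed emissions, is given here. The state names, transition probabilities, and speed parameters are illustrative assumptions, not values from the study, and decoding uses the standard Viterbi algorithm.

```python
import math

# All parameters below are hypothetical, chosen only for illustration.
STATES = ("fishing", "steaming")
START = (0.5, 0.5)                 # initial state probabilities
TRANS = ((0.9, 0.1), (0.1, 0.9))   # TRANS[i][j] = P(next state j | state i)
EMIT = ((3.5, 1.0), (10.0, 2.0))   # (mean, std) of speed in knots per state


def log_gauss(x, mean, std):
    """Log-density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def viterbi(speeds):
    """Return the most likely state label for each speed observation."""
    n = len(STATES)
    score = [math.log(START[s]) + log_gauss(speeds[0], *EMIT[s]) for s in range(n)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for x in speeds[1:]:
        prev, pointers, score = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda p: prev[p] + math.log(TRANS[p][s]))
            pointers.append(best)
            score.append(prev[best] + math.log(TRANS[best][s]) + log_gauss(x, *EMIT[s]))
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [STATES[s] for s in path]
```

With these assumed parameters, a run of speeds near 3 to 4 knots between faster segments is labeled as fishing. In practice the parameters would be estimated from labeled or unlabeled tracks rather than fixed by hand.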
Aquatic non-indigenous species (NIS) pose significant threats to biodiversity, disrupting ecosystems and inflicting substantial economic damages across agriculture, forestry, and fisheries. Due to the fast growth of global trade and transportation networks, NIS have been introduced and spread unintentionally in new environments. This study develops a new physics-informed model to forecast maritime shipping traffic between port regions worldwide. The predictions provided by this model, in turn, are used as input for risk assessment of NIS spread through transportation networks to evaluate the capability of our solution. Inspired by the gravity model for international trade, our model considers various factors that influence the likelihood and impact of vessel activities, such as shipping flux density, distance between ports, trade flow, and centrality measures of transportation hubs. Accordingly, this paper introduces transformers to gravity models to rebuild the short- and long-term dependencies that make the risk analysis feasible. Thus, we introduce a physics-inspired framework that achieves an 89% binary accuracy for existing and non-existing trajectories and an 84.8% accuracy for the number of vessels flowing between key port areas, representing more than 10% improvement over the traditional deep-gravity model. Along these lines, this research contributes to a better understanding of NIS risk assessment. It allows policymakers, conservationists, and stakeholders to prioritize management actions by identifying high-risk invasion pathways. In addition, our model is versatile and can include new data sources, making it suitable for assessing international vessel traffic flow in a changing global landscape.
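For orientation, the classical gravity model that inspires this line of work can be sketched in a few lines. This is the textbook form only, not the transformer-based model described above. The function names, the constant `k`, and the distance exponent `beta` are illustrative assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def gravity_flow(mass_i, mass_j, distance_km, k=1.0, beta=2.0):
    """Classical gravity estimate: flow grows with both 'masses', decays with distance.

    mass_i and mass_j stand for any port-level activity measure (for example
    a shipping flux density); k and beta are hypothetical fitted constants.
    """
    return k * mass_i * mass_j / distance_km ** beta
```

In a learned variant, the fixed functional form is replaced by a model that takes the same kinds of inputs (port attributes and distance) and predicts the flow.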
Data clustering is one of the most popular techniques in data mining. It is a process of partitioning an unlabeled dataset into groups, where each group contains objects which are similar to each other with respect to a certain similarity measure and different from those of other groups. Clustering high-dimensional data is the cluster analysis of data which have anywhere from a few dozen to many thousands of dimensions. Such high-dimensional data spaces are often encountered in areas such as medicine, bioinformatics, biology, recommendation systems and the clustering of text documents. Many algorithms for large data sets have been proposed in the literature using different techniques. However, conventional algorithms have some shortcomings, such as slow convergence and sensitivity to initialization values. Particle Swarm Optimization (PSO) is a population-based global search algorithm that uses the principles of the social behavior of swarms. PSO tends to produce better results on complicated and multi-peak problems. This paper presents a literature survey on the PSO algorithm and its variants applied to clustering high-dimensional data. An attempt is made to provide a guide for researchers who are working in the area of PSO and high-dimensional data clustering.
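To make the idea of PSO-based clustering concrete, here is a minimal sketch of one common formulation, in which each particle encodes a full set of k centroids and the fitness is the sum of squared distances from each point to its nearest centroid. The hyperparameters (`w`, `c1`, `c2`, swarm size, iteration count) are conventional illustrative defaults, not recommendations from the survey.

```python
import random


def sse(centroids, points):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(
        min(sum((p - c) ** 2 for p, c in zip(pt, cen)) for cen in centroids)
        for pt in points
    )


def pso_cluster(points, k, n_particles=15, n_iter=60, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO over flattened centroid vectors (illustrative sketch)."""
    rng = random.Random(seed)
    dim = len(points[0])
    size = k * dim

    def decode(pos):
        return [pos[i * dim:(i + 1) * dim] for i in range(k)]

    # Initialise each particle from k randomly chosen data points.
    pos = [[x for pt in rng.sample(points, k) for x in pt] for _ in range(n_particles)]
    vel = [[0.0] * size for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [sse(decode(p), points) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    history = [gbest_cost]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(size):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            cost = sse(decode(pos[i]), points)
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = pos[i][:], cost
        history.append(gbest_cost)
    return decode(gbest), gbest_cost, history
```

The returned `history` records the best cost after each iteration, which is non-increasing by construction and gives a quick view of convergence speed.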
Automatic Identification System (AIS) messages are useful for tracking vessel activity across oceans worldwide using radio links and satellite transceivers. Such data play a significant role in tracking vessel activity and mapping mobility patterns such as those found during fishing activities. Accordingly, this paper proposes a geometric-driven semi-supervised approach for fishing activity detection from AIS data. Through the proposed methodology, it is shown how to explore the information included in the messages to extract features describing the geometry of the vessel route. To this end, we leverage the unsupervised nature of cluster analysis to label the trajectory geometry, highlighting changes in the vessel’s moving pattern, which tend to indicate fishing activity. The labels obtained by the proposed unsupervised approach are used to detect fishing activities, which we approach as a time-series classification task. We propose a solution using recurrent neural networks on AIS data streams that reaches an overall F-score of roughly 87% on the whole trajectories of 50 different unseen fishing vessels. Such results are accompanied by a broad benchmark study assessing the performance of different Recurrent Neural Network (RNN) architectures. In conclusion, this work contributes by proposing a thorough process that includes data preparation, labeling, data modeling, and model validation. Therefore, we present a novel solution for mobility pattern detection that relies upon unfolding the geometry observed in the trajectory.
In this paper, we propose leveraging causal generative learning as an interpretable tool for explaining image classifiers. Specifically, we present a generative counterfactual inference approach to study the influence of visual features (pixels) as well as causal factors through generative learning. To this end, we first uncover the most influential pixels on a classifier’s decision by computing both Shapley and contrastive explanations for counterfactual images with different attribute values. We then establish a Monte Carlo mechanism using the generator of a causal generative model in order to adapt Shapley explainers to produce feature importances for the human-interpretable attributes of a causal dataset. This method is applied to the case where a classifier has been trained exclusively on the images of the causal dataset. Finally, we present optimization methods for creating counterfactual explanations of classifiers by means of counterfactual inference, proposing straightforward approaches for both differentiable and arbitrary classifiers. We exploit the Morpho-MNIST causal dataset as a case study for exploring our proposed methods for generating counterfactual explanations. However, our methods are also applicable to other causal datasets containing image data. We employ visual explanation methods from the OmnixAI open source toolkit to compare them with our proposed methods. By employing quantitative metrics to measure the interpretability of counterfactual explanations, we find that our proposed methods of counterfactual explanation offer more interpretable explanations compared to those generated from OmnixAI. This finding suggests that our methods are well-suited for generating highly interpretable counterfactual explanations on causal datasets.
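The Monte Carlo adaptation of Shapley explainers mentioned above can be illustrated with the generic permutation-sampling estimator of Shapley values. This is a sketch of the general technique only. In the setting described, `value_fn` would wrap the causal generator and the classifier, and that wrapper is not shown here. The function name and signature are hypothetical.

```python
import random


def shapley_mc(value_fn, n_features, n_samples=200, seed=0):
    """Estimate Shapley values by averaging marginal contributions
    over randomly sampled feature orderings.

    value_fn takes a frozenset of feature indices (the coalition) and
    returns a number, e.g. a classifier score when only those attributes
    are set to their counterfactual values (an assumed interpretation).
    """
    rng = random.Random(seed)
    phi = [0.0] * n_features
    for _ in range(n_samples):
        order = list(range(n_features))
        rng.shuffle(order)
        coalition = set()
        prev = value_fn(frozenset())
        for f in order:
            coalition.add(f)
            cur = value_fn(frozenset(coalition))
            phi[f] += cur - prev  # marginal contribution of f in this ordering
            prev = cur
    return [p / n_samples for p in phi]
```

For an additive value function, every marginal contribution equals the feature's own weight, so the estimate is exact regardless of the number of samples. For general value functions the estimate converges as `n_samples` grows.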
Anomaly detection is a fundamental problem in data science and is one of the most studied topics in machine learning. This problem has been addressed in different contexts and domains. This article investigates anomalous data within time series data in the maritime sector. Since there is no annotated dataset for this purpose, in this study, we apply an unsupervised approach. Our method benefits from the unsupervised learning feature of autoencoders. We utilize the reconstruction error as a signal for anomaly detection. For this purpose, we estimate the probability density function of the reconstruction error and find different levels of abnormality based on statistical attributes of the density of error. Our results demonstrate the effectiveness of this approach for localizing irregular patterns in the trajectory of vessel movements.
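One simple way to turn reconstruction errors into graded abnormality levels is sketched here, assuming the error density is summarized by its mean and standard deviation. The Gaussian-style thresholds at 2 and 3 standard deviations are an illustrative assumption, since the article does not state which statistical attributes of the density it uses. The autoencoder itself is out of scope, so the errors are taken as given.

```python
import statistics


def fit_thresholds(train_errors, ks=(2.0, 3.0)):
    """Derive abnormality thresholds from reconstruction errors on normal data.

    Each threshold is mean + k * std; the multipliers in ks are hypothetical.
    """
    mu = statistics.fmean(train_errors)
    sigma = statistics.pstdev(train_errors)
    return [mu + k * sigma for k in ks]


def abnormality_level(error, thresholds):
    """Number of thresholds exceeded: 0 = normal, higher = more abnormal."""
    return sum(error > t for t in thresholds)
```

A kernel density estimate with quantile-based cut-offs would be a natural alternative when the error distribution is clearly non-Gaussian.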
Automatic classification of vessel types in the maritime domain is a challenging problem due to the complexity of the moving patterns in the ocean that are collected by the Automatic Identification System (AIS). In this study, we explore the usability of different patterns extracted from univariate and multivariate autoregressive modeling for classifying ship types. In order to assess the differentiation power of these features, we apply different supervised machine learning classification algorithms and assess the performance of trajectory classification of four different vessel types. In addition, we study the effect of region specification for distinguishing the vessels. The proposed approach produced an accuracy of 86%, which confirms that the features obtained from autoregression modeling can identify vessel types effectively. In addition, we demonstrate that the performance of classification can be enhanced further by considering the location of movement.
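As an illustration of how autoregressive modeling can yield trajectory features, the sketch below fits a univariate AR(1) model to a single series (for example, speed over ground) by ordinary least squares and returns its parameters as a feature vector. The choice of order 1 and of these three features is an assumption for illustration. The study's actual model orders and feature set are not specified here.

```python
import math


def ar1_features(series):
    """Fit x[t] = intercept + slope * x[t-1] + noise by least squares.

    Returns (intercept, slope, residual_std), usable as classifier inputs.
    """
    prev, cur = series[:-1], series[1:]
    n = len(prev)
    if n < 2:
        raise ValueError("need at least three observations")
    mean_prev = sum(prev) / n
    mean_cur = sum(cur) / n
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    if var_prev == 0:
        raise ValueError("series is constant; AR(1) slope is undefined")
    cov = sum((p - mean_prev) * (c - mean_cur) for p, c in zip(prev, cur))
    slope = cov / var_prev
    intercept = mean_cur - slope * mean_prev
    residuals = [c - (intercept + slope * p) for p, c in zip(prev, cur)]
    residual_std = math.sqrt(sum(r * r for r in residuals) / n)
    return intercept, slope, residual_std
```

Feature vectors computed this way for each trajectory (and, in the multivariate case, for several channels jointly) can then be passed to any standard supervised classifier.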
Building a rich and informative model from raw data is a hard but valuable process with many applications. Ship routing and scheduling are two essential operations in the maritime industry that can save a lot of resources if they are optimally designed, but still need a lot of information to be successful. Past and recent works in the field assume the availability of information such as berth time-windows, cargo volumes, container handling productivity at ports, and cruising speed. They employ navigation maps that contain information about the major sailing paths and have knowledge about larger and smaller ports and offshore platforms. In this work, we present a methodology for extracting information about the navigation network for an area, using data from the trajectories of multiple vessels, which are collected using the Automatic Identification System (AIS). We introduce a method for identifying the points of major interest in the trajectory of a vessel and two clustering techniques for identifying: i) key areas in the monitored region such as ports, platforms or areas where vessels change their course (e.g., capes); and ii) the speed and course patterns of ships of a particular type when they follow a typical route. The resulting information is modeled using a network abstraction where nodes correspond to the areas identified by the first clustering technique. Afterwards, edges are enriched with information about the groups extracted using the second clustering technique. A first analysis on a real dataset in the area of the eastern Mediterranean Sea demonstrates the capabilities of the proposed model and the information it can provide. The use of the model in an outlier behavior detection task also shows interesting results.
Federated learning (FL) is an emerging distributed machine learning paradigm that preserves privacy by not revealing private local data. However, there are still limitations. On one hand, users’ private information can be deduced from local outputs. On the other hand, privacy, efficiency, and accuracy are conflicting goals that are hard to satisfy simultaneously. To tackle these problems, we propose a novel privacy-preserving FL (HEFL-LDP) algorithm, which integrates semi-homomorphic encryption and local differential privacy. While reducing the computational and communication burden, HEFL-LDP resists model inversion attacks and membership inference attacks from a server or malicious client. Moreover, a new utility optimization strategy with accuracy-oriented privacy parameter adjustment and model shuffling is proposed to solve the problem of accuracy decline. The security and cost of the algorithm are verified through theoretical analysis and proof. Comprehensive experimental evaluations on the MNIST dataset and CIFAR-10 dataset demonstrate that HEFL-LDP significantly reduces the privacy budget and outperforms existing algorithms in computational cost and accuracy.
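The local differential privacy component can be illustrated with the standard Laplace mechanism applied to a clipped model update on the client side. This is a generic sketch, not the HEFL-LDP algorithm: the encryption, shuffling, and parameter adjustment steps are omitted, and the function names and the per-coordinate treatment of the privacy budget are assumptions.

```python
import random


def clip(value, bound):
    """Clamp a value to [-bound, bound] so its sensitivity is at most 2 * bound."""
    return max(-bound, min(bound, value))


def laplace_scale(bound, epsilon):
    """Laplace noise scale for a value clipped to [-bound, bound]."""
    return 2.0 * bound / epsilon


def perturb_update(update, bound, epsilon, rng=None):
    """Clip each coordinate and add Laplace noise before it leaves the client.

    epsilon is spent per coordinate here (an illustrative simplification).
    """
    rng = rng or random.Random()
    scale = laplace_scale(bound, epsilon)
    noisy = []
    for value in update:
        # The difference of two i.i.d. exponentials is Laplace(0, scale).
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        noisy.append(clip(value, bound) + noise)
    return noisy
```

A smaller `epsilon` gives stronger privacy but larger noise, which is the accuracy trade-off that a utility optimization strategy has to manage.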
In the maritime environment, the Automatic Identification System (AIS) contains information related to vessel trajectories that can be used to detect unusual maritime occurrences and maritime traffic patterns. To detect such occurrences with supervised learning methods the AIS messages must be manually annotated, which can be a demanding process. Therefore, unsupervised methods are used to identify anomalous traffic patterns based on vessel trajectories. Typically, dense regions of maritime activity are studied to capture common traffic patterns which help identify trajectories that do not follow the norm. However, these approaches cannot detect anomalous behaviors along common pathways or incorporate time-related events into the analysis. Such challenges motivate the approach taken in this work by using auto-regressive techniques to model vessel trajectories and clustering analyses to explore behavior patterns of vessels. Results confirm that the Auto-regressive Integrated Moving Average (ARIMA) and Ornstein-Uhlenbeck (OU) processes are able to model the trajectories and can be used with density-based spatial clustering of applications with noise (DBSCAN), hierarchical clustering (HC), and spectral clustering (SC) to identify different behavioral patterns.
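To show how an OU process can summarize a trajectory channel, the sketch below uses the exact discretization of the OU process, under which consecutive samples follow a linear AR(1) relation. The mean-reversion rate `theta` and long-run mean `mu` are recovered from a least-squares fit. The function names are hypothetical, and noise estimation and the subsequent clustering step are omitted.

```python
import math


def ou_mean_path(x0, theta, mu, dt, n_steps):
    """Noise-free OU trajectory: x[t+1] = mu + (x[t] - mu) * exp(-theta * dt)."""
    decay = math.exp(-theta * dt)
    path = [x0]
    for _ in range(n_steps):
        path.append(mu + (path[-1] - mu) * decay)
    return path


def fit_ou(series, dt):
    """Estimate (theta, mu) from evenly sampled data via an AR(1) regression.

    With slope a and intercept b of x[t+1] on x[t]:
    theta = -ln(a) / dt and mu = b / (1 - a).
    """
    prev, cur = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_cur = sum(cur) / n
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    if var_prev == 0:
        raise ValueError("series is constant; OU parameters are not identifiable")
    cov = sum((p - mean_prev) * (c - mean_cur) for p, c in zip(prev, cur))
    a = cov / var_prev
    b = mean_cur - a * mean_prev
    if not 0.0 < a < 1.0:
        raise ValueError("series is not mean-reverting under this model")
    return -math.log(a) / dt, b / (1.0 - a)
```

The fitted pair `(theta, mu)` per trajectory gives a compact feature vector that a clustering method such as DBSCAN could then group into behavioral patterns.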