The detection, localization, and interpretation of defects in textured surfaces pose challenges for automatic visual inspection. Both fully-supervised and weakly-supervised approaches have been proposed: fully-supervised methods yield good results but require complex region proposal processes and labeled datasets, whereas weakly-supervised methods rely on inexact labels that are less informative than full annotations. This paper introduces an alternative inexactly supervised methodology that performs defect detection and localization, along with a novel graphical interpretation of detected defects in textured surface images, using image-level labels and without relying on region proposal algorithms or explicit defect annotations. The methodology employs block decomposition and bags as a representation under multiple instance learning, where feature vectors are generated by a convolutional neural network with transfer learning. Dissimilarities between bags are computed, and class labels are assigned using a variant of the k-nearest neighbor algorithm. A baseline methodology using multiple instance learning and low-level feature extraction is also considered as a reference for comparison. The contribution of this study is a simple but powerful methodology that yields graphically interpretable detection and localization results, enhancing the understanding of the detection outcomes. The proposed methodology is extensively evaluated on three datasets, one real and two synthetic, reporting various performance metrics and examples of localization and interpretation results. Average accuracies of 0.9722 ± 0.0058 and 0.9817 ± 0.0099 on the synthetic datasets, together with a series of visualizations of the defect detection and localization results, demonstrate the competitiveness of the proposal.
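The bag-dissimilarity and nearest-neighbor steps described above can be sketched as follows. This is a minimal illustration with toy 2-D instance features and a mean-minimum bag dissimilarity; the paper's actual CNN-based features, block decomposition, and exact dissimilarity measure are not reproduced here.

```python
import math

def mean_min_dist(bag_a, bag_b):
    # Mean over bag_a's instances of the distance to the closest instance
    # in bag_b (one common bag dissimilarity; an illustrative choice here).
    return sum(min(math.dist(a, b) for b in bag_b) for a in bag_a) / len(bag_a)

def knn_bag_label(query_bag, train_bags, train_labels, k=1):
    # k-NN over bag dissimilarities: vote among the k closest training bags.
    ranked = sorted(zip(train_bags, train_labels),
                    key=lambda bl: mean_min_dist(query_bag, bl[0]))
    votes = [lbl for _, lbl in ranked[:k]]
    return max(set(votes), key=votes.count)

# Toy bags of 2-D instance features: "defect" bags contain an outlying block.
train_bags = [
    [(0.1, 0.1), (0.2, 0.1)],        # ok
    [(0.0, 0.2), (0.1, 0.3)],        # ok
    [(0.1, 0.1), (5.0, 5.0)],        # defect
    [(0.2, 0.2), (5.1, 4.9)],        # defect
]
train_labels = ["ok", "ok", "defect", "defect"]
query = [(0.15, 0.12), (4.9, 5.2)]
print(knn_bag_label(query, train_bags, train_labels, k=1))  # → defect
```

Because the defective region contributes an outlying instance, the query bag's mean-minimum dissimilarity to the "defect" bags is far smaller than to the "ok" bags, so the nearest-neighbor vote flags it as defective.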
Numerous acoustic features have been proposed as useful measures to characterize natural soundscapes, which can be employed to examine the impact of land transformation on the audible properties of a location. The extensive collection of available features demands an examination to identify the most informative and discriminative ones for a given problem. In this study, we conduct an empirical investigation into the selection of acoustic features for discriminating between highly and moderately transformed versions of four Colombian soundscapes: moorlands, coffee plantations, dry tropical forests, and pastures. We employ classical supervised feature selection techniques along with exploratory tools such as correlation matrices and scatter plots. Our results indicate that a few acoustic features are sufficient to differentiate between the classes. Specifically, those features that estimate acoustic complexity via intrinsic variability of sound intensities, or biodiversity through species richness or abundance in specific frequency bands, are the most discriminative ones. These findings suggest that the selection of acoustic features can assist in analyzing and distinguishing between different soundscapes.
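As a minimal illustration of supervised acoustic-feature selection, the sketch below ranks features by a Fisher-style class-separability score (between-class mean gap over within-class spread). The feature names and values are hypothetical toy data, not measurements from the study.

```python
from statistics import mean, pvariance

def fisher_score(values, labels):
    # Fisher score: between-class scatter divided by within-class scatter.
    classes = sorted(set(labels))
    groups = {c: [v for v, l in zip(values, labels) if l == c] for c in classes}
    num = sum(len(g) * (mean(g) - mean(values)) ** 2 for g in groups.values())
    den = sum(len(g) * pvariance(g) for g in groups.values())
    return num / den if den else float("inf")

# Toy data: a complexity-like feature separates the classes, noise does not.
labels = ["high", "high", "high", "mod", "mod", "mod"]
features = {
    "ACI":   [0.9, 0.8, 0.85, 0.2, 0.25, 0.15],   # discriminative
    "noise": [0.5, 0.1, 0.9, 0.4, 0.6, 0.2],      # uninformative
}
ranking = sorted(features, key=lambda f: fisher_score(features[f], labels),
                 reverse=True)
print(ranking)  # → ['ACI', 'noise']
```

A classical filter-style selector would keep only the top-ranked features, mirroring the study's finding that a handful of complexity- and biodiversity-related indices suffice.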
Distinguishing among the different seismic volcanic patterns is still one of the most important and labor-intensive tasks for volcano monitoring. This task could be lightened and made free from subjective bias by using automatic classification techniques. In this context, a core but often overlooked issue is the choice of an appropriate representation of the data to be classified. Recently, it has been suggested that using a relative representation (i.e. proximities, namely dissimilarities on pairs of objects) instead of an absolute one (i.e. features, namely measurements on single objects) is advantageous to exploit the relational information contained in the dissimilarities to derive highly discriminant vector spaces, where any classifier can be used. According to that motivation, this paper investigates the suitability of a dynamic time warping (DTW) dissimilarity-based vector representation for the classification of seismic patterns. Results show the usefulness of such a representation in the seismic pattern classification scenario, including analyses of potential benefits from recent advances in the dissimilarity-based paradigm such as the proper selection of representation sets and the combination of different dissimilarity representations that might be available for the same data.
• A representation, based on the DTW measure, is proposed for seismic classification.
• Recent advances of the dissimilarity-based representation are investigated for DTW.
• Experiments with a large-scope dataset confirm the suitability of the DTW-space.
• The proposed space, when derived from spectrograms, is the best representation.
• Selecting small representation sets reduces the number of required DTW comparisons.
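A minimal sketch of the DTW dissimilarity-based representation: each signal is embedded as its vector of DTW distances to a small representation set, after which any vector-space classifier can be applied. The prototypes below are hypothetical toy sequences, not seismic records.

```python
def dtw(a, b):
    # Classic dynamic time warping cost between two 1-D sequences.
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def dissimilarity_vector(signal, representation_set):
    # Embed a signal as its DTW distances to the representation set; a small
    # set keeps the number of required DTW comparisons low.
    return [dtw(signal, proto) for proto in representation_set]

prototypes = [[0, 1, 2, 1, 0], [0, 0, 5, 0, 0]]   # hypothetical prototypes
x = [0, 1, 2, 2, 1, 0]                             # time-stretched first shape
print(dissimilarity_vector(x, prototypes))
```

The warping absorbs the time stretch, so `x` has zero cost against the first prototype and a clearly positive cost against the second, making the two shapes separable in the dissimilarity space.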
Sound synthesis refers to the creation of original acoustic signals with broad applications in artistic innovation, such as music creation for games and videos. Nonetheless, machine learning architectures face numerous challenges when learning musical structures from arbitrary corpora, since patterns borrowed from other contexts must be adapted to a concrete composition objective. Using Labeled Correlation Alignment (LCA), we propose an approach to sonify neural responses to affective music-listening data, identifying the brain features that are most congruent with the simultaneously extracted auditory features. To deal with inter- and intra-subject variability, a combination of Phase Locking Value and Gaussian Functional Connectivity is employed. The proposed two-step LCA approach first couples the input features to a set of emotion labels using Centered Kernel Alignment; this step is followed by canonical correlation analysis to select the multimodal representations with the strongest relationships. LCA enables physiological explanation by adding a backward transformation that estimates the matching contribution of each extracted brain neural feature set. Correlation estimates and partition quality serve as performance measures. The evaluation uses a Vector Quantized Variational AutoEncoder to create an acoustic envelope from the tested Affective Music-Listening database. Validation results demonstrate the ability of the developed LCA approach to generate low-level music based on neural activity elicited by emotions while maintaining the ability to distinguish between the acoustic outputs.
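The Centered Kernel Alignment used in the coupling stage can be sketched as below, assuming linear kernels and toy one-dimensional feature sets (the actual brain and auditory features, and the labels, are far richer; this only shows the alignment computation itself).

```python
def center(K):
    # Double-center a symmetric kernel matrix (equivalent to H K H).
    n = len(K)
    row = [sum(r) / n for r in K]
    tot = sum(row) / n
    return [[K[i][j] - row[i] - row[j] + tot for j in range(n)] for i in range(n)]

def cka(K, L):
    # Centered Kernel Alignment: normalized Frobenius inner product of the
    # centered kernel matrices; values near 1 indicate strong alignment.
    Kc, Lc = center(K), center(L)
    dot = lambda A, B: sum(A[i][j] * B[i][j]
                           for i in range(len(A)) for j in range(len(A)))
    return dot(Kc, Lc) / (dot(Kc, Kc) ** 0.5 * dot(Lc, Lc) ** 0.5)

def linear_kernel(X):
    return [[sum(a * b for a, b in zip(x, y)) for y in X] for x in X]

# Toy check: a feature set aligned with the labels scores higher than noise.
labels  = [[1.0], [1.0], [-1.0], [-1.0]]
aligned = [[2.0], [1.5], [-1.8], [-2.2]]
noisy   = [[0.3], [-0.9], [0.4], [-0.1]]
Ky = linear_kernel(labels)
print(cka(linear_kernel(aligned), Ky), cka(linear_kernel(noisy), Ky))
```

Feature sets whose kernel aligns with the label kernel get high scores, which is the criterion the coupling stage can use to retain label-congruent brain features before the canonical correlation step.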
• An incremental algorithm for MIL based on classifier ensembles is presented.
• The proposed algorithm deals with different types of changes in the target concept.
• Experiments show that the approach can be effectively used in real-world applications.
Most Multiple Instance Learning (MIL) algorithms are designed with the assumption that the target concept is stationary in time, i.e. it is drawn from a stationary unknown distribution. However, in real industrial applications, like automatic visual inspection, where defects may evolve, MIL has to deal with changing target concepts whose statistical characteristics may vary over time. Despite this fact, there is little discussion about how to learn from data in non-stationary environments (or data with concept drift) using multiple instance learners. In this work, an incremental MIL algorithm is proposed to learn non-stationary and recurrent target concepts in industrial visual inspection applications. Experiments on both synthetic and real-world datasets, the latter coming from automatic visual inspection tasks in industry, show that the proposed approach is able to handle changing target concepts over time.
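A minimal chunk-based ensemble for drifting bag concepts might look as follows. The base learner (a crude per-class summary of bag maxima) and the accuracy-based weighting scheme are illustrative stand-ins, not the paper's actual algorithm.

```python
class ChunkEnsemble:
    # One ensemble member is trained per data chunk; members are re-weighted
    # on the newest chunk and the weakest is dropped when over budget, so the
    # ensemble tracks a drifting target concept.
    def __init__(self, max_members=3):
        self.members = []          # (model, weight) pairs
        self.max_members = max_members

    def _train_member(self, bags, labels):
        # Base learner: per-class mean of bag maxima (a crude MIL summary).
        protos = {}
        for bag, lbl in zip(bags, labels):
            protos.setdefault(lbl, []).append(max(bag))
        return {lbl: sum(v) / len(v) for lbl, v in protos.items()}

    def _member_predict(self, model, bag):
        return min(model, key=lambda lbl: abs(model[lbl] - max(bag)))

    def update(self, bags, labels):
        # Re-weight existing members by accuracy on the newest chunk, add a
        # member trained on it, and keep only the best max_members.
        for i, (m, _) in enumerate(self.members):
            acc = sum(self._member_predict(m, b) == l
                      for b, l in zip(bags, labels)) / len(bags)
            self.members[i] = (m, acc)
        self.members.append((self._train_member(bags, labels), 1.0))
        self.members.sort(key=lambda mw: mw[1], reverse=True)
        self.members = self.members[: self.max_members]

    def predict(self, bag):
        votes = {}
        for m, w in self.members:
            lbl = self._member_predict(m, bag)
            votes[lbl] = votes.get(lbl, 0.0) + w
        return max(votes, key=votes.get)

ens = ChunkEnsemble()
ens.update([[1, 5], [0, 1]], ["defect", "ok"])
ens.update([[2, 9], [0, 1]], ["defect", "ok"])   # the concept has drifted
print(ens.predict([0, 8]))  # → defect
```

Because the second chunk adds a member fitted to the drifted concept while accurate older members are retained, the ensemble can cover both recurrent and newly emerged defect appearances.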
Training a given learning-based forecasting method to a satisfactory level of performance often requires a large dataset. Indeed, any data-driven method requires examples that provide a satisfactory representation of what we wish to model in order to work properly, which often implies using large datasets to ensure that the phenomenon of interest is properly sampled. However, learning from time series composed of too many samples can also be a problem, since the computational requirements of learning algorithms can grow polynomially with the training set size. In order to identify representative examples of a dataset, we propose a methodology that uses clustering-based stratification of time series to select a training data subset. The principle for constructing a representative sample set with this method is to select heterogeneous instances drawn from all the clusters composing the dataset. Our results show that with a small number of training examples, obtained through the proposed clustering-based stratification, we can preserve the performance and improve the stability of models such as artificial neural networks and support vector regression, while training at a much lower computational cost. We illustrate the methodology by forecasting the one-step-ahead Hourly Ontario Energy Price (HOEP).
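The clustering-based stratification can be sketched as follows, assuming scalar examples and a plain 1-D k-means with deterministic initialization (illustrative simplifications, not the paper's exact setup):

```python
import random

def kmeans(points, k, iters=20):
    # Plain 1-D k-means with evenly spaced deterministic initialization.
    centers = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda c: abs(p - centers[c]))].append(p)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return clusters

def stratified_subset(points, k, per_cluster, seed=0):
    # Pick a few instances from every cluster so the reduced training set
    # still covers all regimes present in the full dataset.
    rng = random.Random(seed)
    subset = []
    for cluster in kmeans(points, k):
        subset += rng.sample(cluster, min(per_cluster, len(cluster)))
    return subset

# Toy "price" examples drawn from three regimes: low, medium, high.
data = [10, 11, 12, 50, 51, 52, 90, 91, 92]
sample = stratified_subset(data, k=3, per_cluster=1)
print(sorted(sample))
```

Taking one instance per cluster yields a three-example training set that still spans all three regimes, which is the heterogeneity the stratification aims to preserve while cutting training cost.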
Classification of birdsong recordings can be naturally formulated as a multiple instance problem, where bags of instances are represented by either features or dissimilarities. In bioacoustics, bags typically correspond to regions of interest in spectrograms, which are detected after a segmentation stage of the audio recordings. In this paper, we use different dissimilarity measures between bags and explore whether the subsequent application of metric learning/adaptation methods and the construction of dissimilarity spaces allow increasing the classification performance of birdsong recordings. A publicly available bioacoustic data set is used for the experiments. Our results suggest, in the first place, that appropriate dissimilarity measures are those which capture most of the overall differences between bags, such as the modified Hausdorff distance and the mean minimum distance; in the second place, they confirm the benefit from adapting the applied dissimilarity measure as well as the potential further enhancement of the classification performance by building dissimilarity spaces and increasing training set sizes.
• Birdsong recordings can be classified by different dissimilarity-based strategies.
• Modified Hausdorff and mean-minimum are appropriate distances to compare birdsongs.
• When adapting distances, a good and simple metric learning technique is preferred.
• Metric learning followed by a dissimilarity space enhances birdsong recognition.
• Metric learning and dissimilarity spaces effectively exploit the training set.
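The two bag dissimilarities highlighted above can be sketched directly. Both build on directed mean-minimum distances between bags of instances; the symmetrization used for the mean-minimum distance here is one common reading of that name.

```python
import math

def _directed_mean_min(A, B):
    # Average over A of each instance's distance to its nearest neighbor in B.
    return sum(min(math.dist(a, b) for b in B) for a in A) / len(A)

def modified_hausdorff(A, B):
    # Modified Hausdorff distance: the larger of the two directed
    # mean-minimum distances between the bags.
    return max(_directed_mean_min(A, B), _directed_mean_min(B, A))

def mean_min(A, B):
    # Mean-minimum distance: symmetric average of the two directed terms.
    return 0.5 * (_directed_mean_min(A, B) + _directed_mean_min(B, A))

# Toy bags of 2-D instances (e.g. features of spectrogram regions of interest).
bag1 = [(0.0, 0.0), (1.0, 0.0)]
bag2 = [(0.0, 1.0), (1.0, 1.0)]
print(modified_hausdorff(bag1, bag2), mean_min(bag1, bag2))  # → 1.0 1.0
```

Because both measures average over all instances rather than taking a single extreme, they capture the overall difference between bags, which is what made them effective for comparing birdsongs.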
In this paper we investigate the exploitation of non-linear scaling of distances for advanced nearest neighbor classification. Starting from the recently found relation between the Hypersphere Classifier (HC) [1] and the Adaptive Nearest Neighbor rule (ANN) [2], we propose PowerHC, an improved version of HC in which distances are normalized using a non-linear mapping; non-linear scaling of data, whose usefulness for feature spaces has already been assessed, has hardly been investigated for distances. A thorough experimental evaluation, involving 24 datasets and a challenging real-world scenario of seismic signal classification, confirms the suitability of the proposed approach.
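A sketch of an HC-style rule combined with a non-linear distance mapping. The square-root scaling and the nearest-enemy radius used below are illustrative assumptions, not necessarily PowerHC's exact normalization or HC's exact radius definition.

```python
def hc_classify(d_query, D_train, labels, scale=lambda d: d ** 0.5):
    # HC-style rule: assign the query to the class of the training object
    # minimizing (scaled distance to query) - (object's radius), where the
    # radius is taken as the scaled distance to its nearest enemy, i.e. the
    # closest training object of a different class.
    n = len(labels)
    radius = [min(scale(D_train[i][j]) for j in range(n) if labels[j] != labels[i])
              for i in range(n)]
    scores = [scale(d_query[i]) - radius[i] for i in range(n)]
    return labels[min(range(n), key=scores.__getitem__)]

labels = ["A", "A", "B"]
D = [[0.0, 1.0, 4.0],        # pairwise distances among training objects
     [1.0, 0.0, 4.0],
     [4.0, 4.0, 0.0]]
q = [9.0, 9.0, 1.0]          # raw distances from the query to each object
print(hc_classify(q, D, labels))  # → B
```

The non-linear mapping compresses large distances before the radii are subtracted, so it changes the margin-like scores rather than just monotonically reordering neighbors as it would in plain 1-NN.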
The Rectified Nearest Feature Line Segment (RNFLS) classifier is an improved version of the Nearest Feature Line (NFL) classification rule. RNFLS corrects two drawbacks of NFL, namely the interpolation and extrapolation inaccuracies, by applying two consecutive processes, segmentation and rectification, to the initial set of feature lines. The main drawbacks of this technique, affecting both the training and test phases, are the high computational cost of the rectification procedure and the exponential explosion of the number of lines. We propose a cheaper version of RNFLS based on a characterization of the points that should form good lines. The characterization relies on a recent neighborhood-based principle that categorizes objects into four types: safe, borderline, rare, and outlier, depending on the position of each point with respect to the other classes. The proposed approach is a variant of RNFLS in the sense that it only considers lines between safe points, which allows a drastic reduction in the computational burden imposed by RNFLS. A thorough empirical analysis on different public data sets shows that our approach is, in general, not significantly different from RNFLS in accuracy, but cheaper, since likely irrelevant feature line segments are never considered.
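The neighborhood-based typing that selects safe points can be sketched as follows. The k=3 thresholds mapping neighbor counts to point types are illustrative; the original principle distinguishes rare points from outliers with a finer-grained rule.

```python
import math

def categorize(points, labels, k=3):
    # Type each point by how many of its k nearest neighbors share its label:
    # mostly same-class neighbors -> safe; one -> borderline; none -> rare/outlier.
    types = []
    for i, p in enumerate(points):
        nbrs = sorted((math.dist(p, q), j)
                      for j, q in enumerate(points) if j != i)[:k]
        same = sum(labels[j] == labels[i] for _, j in nbrs)
        if same >= 2:
            types.append("safe")
        elif same == 1:
            types.append("borderline")
        else:
            types.append("rare/outlier")
    return types

# Toy set: the last "B" point sits deep inside class-A territory.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (0.6, 0.6)]
labels = ["A", "A", "A", "B", "B", "B", "B"]
types = categorize(points, labels)
print(types)
# The RNFLS variant then builds feature line segments only between same-class
# safe points, skipping every pair involving a non-safe point.
safe_idx = [i for i, t in enumerate(types) if t == "safe"]
```

Excluding the mislocated "B" point removes all the extrapolating line segments it would otherwise generate, which is exactly the source of the computational savings over full RNFLS.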