Conventional machine learning and data mining techniques learn from training data for future inference under a major assumption: that the future data lie in the same feature space, or follow the same distribution, as the training data. However, because human-labeled training data are limited, training data that share the target data's feature space or distribution cannot be guaranteed to be sufficient to avoid over-fitting. In real-world applications, apart from data in the target domain, related data from a different domain can also be included to expand our prior knowledge about the target data. Transfer learning addresses such cross-domain learning problems by extracting useful information from data in a related domain and transferring it to the target tasks. In recent years, with transfer learning applied to visual categorization, some typical problems, e.g., view divergence in action recognition and concept drift in image classification, can be efficiently solved. In this paper, we survey state-of-the-art transfer learning algorithms for visual categorization applications such as object recognition, image classification, and human action recognition.
Deep neural networks are at the forefront of machine learning research. However, despite achieving impressive performance on complex tasks, they can be very sensitive: small perturbations of inputs can be sufficient to induce incorrect behavior. Such perturbations, called adversarial examples, are intentionally designed to test the network's sensitivity to distribution drifts. Given their surprisingly small size, a wide body of literature conjectures about their existence and how this phenomenon can be mitigated. In this article, we discuss the impact of adversarial examples on the security, safety, and robustness of neural networks. We start by introducing the hypotheses behind their existence, the methods used to construct or protect against them, and the capacity to transfer adversarial examples between different machine learning models. Altogether, the goal is to provide a comprehensive and self-contained survey of this growing field of research.
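To make the construction of such perturbations concrete, the sketch below applies a single fast-gradient-sign step (FGSM, one of the standard construction methods surveys of this field cover) to a toy logistic classifier. The weights, input, and epsilon are hypothetical values chosen only for illustration, not taken from any particular paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fgsm(x, y, w, b, eps):
    """One FGSM step: move each input coordinate by eps in the
    direction that increases the binary cross-entropy loss."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    # d(loss)/d(x_i) = (p - y) * w_i for logistic loss
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: 1.0 if g > 0 else (-1.0 if g < 0 else 0.0)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]

# Hypothetical model and input: x is classified positive (score 0.7 > 0).
w, b = [2.0, -1.0], 0.0
x, y = [0.4, 0.1], 1
# A bounded perturbation (each coordinate moves by at most eps)
# is enough to flip the sign of the decision score.
x_adv = fgsm(x, y, w, b, eps=0.5)
```

Even on this two-dimensional toy model, the perturbed input crosses the decision boundary while each coordinate changes by at most eps, which mirrors the "surprisingly small size" property discussed above.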
Speeded-Up Robust Features (SURF). Bay, Herbert; Ess, Andreas; Tuytelaars, Tinne; et al. Computer Vision and Image Understanding, vol. 110, no. 3, June 2008. Journal article, peer reviewed, open access.
This article presents a novel scale- and rotation-invariant detector and descriptor, coined SURF (Speeded-Up Robust Features). SURF approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster.
This is achieved by relying on integral images for image convolutions; by building on the strengths of the leading existing detectors and descriptors (specifically, using a Hessian matrix-based measure for the detector, and a distribution-based descriptor); and by simplifying these methods to the essential. This leads to a combination of novel detection, description, and matching steps.
The paper encompasses a detailed description of the detector and descriptor and then explores the effects of the most important parameters. We conclude the article with SURF’s application to two challenging, yet converse goals: camera calibration as a special case of image registration, and object recognition. Our experiments underline SURF’s usefulness in a broad range of topics in computer vision.
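The integral-image trick that underlies SURF's speed can be sketched in a few lines: once the summed-area table is built, the sum over any axis-aligned box (and hence any box filter used to approximate Hessian responses) costs only four lookups, independent of box size. The code below is a minimal illustration of that idea, not the authors' implementation.

```python
def integral_image(img):
    """Summed-area table: ii[y][x] = sum of img over rows 0..y, cols 0..x."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row = 0
        for x in range(w):
            row += img[y][x]
            ii[y][x] = row + (ii[y - 1][x] if y > 0 else 0)
    return ii

def box_sum(ii, y0, x0, y1, x1):
    """Sum of the rectangle [y0..y1] x [x0..x1] in four table lookups."""
    s = ii[y1][x1]
    if y0 > 0:
        s -= ii[y0 - 1][x1]
    if x0 > 0:
        s -= ii[y1][x0 - 1]
    if y0 > 0 and x0 > 0:
        s += ii[y0 - 1][x0 - 1]
    return s
```

Because `box_sum` is constant-time, box filters of any scale cost the same, which is what lets SURF evaluate its approximate Hessian measure efficiently across scales.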
Gait Recognition Based on Deep Learning: A Survey. Filipi Gonçalves dos Santos, Claudio; Oliveira, Diego de Souza; A. Passos, Leandro; et al. ACM Computing Surveys, vol. 55, no. 2, March 2023. Journal article, peer reviewed, open access.
In general, biometry-based control systems cannot rely on an individual's expected behavior or cooperation to operate appropriately. Instead, such systems should be aware of malicious procedures for unauthorized access attempts. Some works in the literature suggest addressing the problem through gait recognition. Such methods aim to identify human beings through intrinsic, perceptible features, regardless of clothing or accessories. Although the issue denotes a relatively long-standing challenge, most of the techniques developed to handle it suffer from several drawbacks, including weak feature extraction and low classification rates. However, deep learning-based approaches have recently emerged as a robust set of tools for dealing with virtually any image- and computer-vision-related problem, providing excellent results for gait recognition as well. Therefore, this work provides a compilation of recent works on biometric detection through gait recognition, with a focus on deep learning approaches, emphasizing their benefits and exposing their weaknesses. It also presents categorized and characterized descriptions of the datasets, approaches, and architectures employed to tackle the associated constraints.
Current RGB-D salient object detection (SOD) methods use the depth stream as complementary information to the RGB stream. However, the depth maps in existing RGB-D SOD datasets are usually of low quality, and most RGB-D SOD networks trained on these datasets produce error-prone results. In this paper, we propose a novel Complementary Depth Network (CDNet) to effectively exploit saliency-informative depth features for RGB-D SOD. To alleviate the influence of low-quality depth maps on RGB-D SOD, we propose selecting saliency-informative depth maps as the training targets and leveraging RGB features to estimate meaningful depth maps. In addition, to learn robust depth features for accurate prediction, we propose a new dynamic scheme that fuses the depth features extracted from the original and estimated depth maps with adaptive weights. Furthermore, we design a two-stage cross-modal feature fusion scheme to integrate the depth features with the RGB ones, further improving the performance of our CDNet on RGB-D SOD. Experiments on seven benchmark datasets demonstrate that our CDNet outperforms state-of-the-art RGB-D SOD methods. The code is publicly available at https://github.com/blanclist/CDNet .
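To make the general idea of fusing two feature sources with adaptive weights concrete, here is a minimal sketch: two per-branch quality scores are turned into convex weights via a softmax, and the feature vectors are blended accordingly. The scoring branch, feature shapes, and values here are hypothetical; CDNet's actual dynamic fusion operates on convolutional feature maps with learned weights.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def fuse(feat_orig, feat_est, score_orig, score_est):
    """Convex combination of two depth-feature vectors.
    The quality scores would come from a small scoring branch
    in a real network; here they are plain inputs."""
    w_o, w_e = softmax([score_orig, score_est])
    return [w_o * a + w_e * b for a, b in zip(feat_orig, feat_est)]
```

With equal scores the two branches contribute equally; as one branch's score grows, the fused feature smoothly approaches that branch, which is the behavior an adaptive-weight scheme is after.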
The physical world provides humans with continuous streams of experience in both space and time. The human mind, however, can parse and organize this continuous input into discrete, individual units. In the current work, we characterize the representational signatures of basic units of human experience across the spatial (object) and temporal (event) domains. We propose that there are three shared, abstract signatures of individuation underlying the basic units of representation across the two domains. Specifically, individuated entities in both the spatial domain (objects) and temporal domain (bounded events) resist restructuring, have distinct parts, and do not tolerate breaks; unindividuated entities in both the spatial domain (substances) and the temporal domain (unbounded events) lack these features. In three experiments, we confirm these principles and discuss their significance for cognitive and linguistic theories of objects and events.
In recent years, with the development of sensors, communication networks, and deep learning, drones have been widely used for object detection, tracking, and positioning. However, task execution is often inefficient, and some complex algorithms still rely on large servers, which is unacceptable in rescue and traffic-scheduling tasks. Designing fast algorithms that can run on an airborne computer can effectively solve this problem. In this paper, an object detection and location system for drones is proposed. We combine ST-YOLO, an improved object detection algorithm based on YOLOX and Swin Transformer, with a visual positioning algorithm, and deploy it on the airborne end using TensorRT to detect and locate objects during the flight of the drone. Field experiments show that the established system and algorithm are effective.
•An approach for faster and cheaper PV information collection is proposed.
•An algorithm to detect photovoltaic arrays in aerial imagery is tested.
•Results demonstrate the efficacy of the PV information collection approach.
•The results are the first of their kind for solar photovoltaic array detection.
•The data is publicly available: https://dx.doi.org/10.6084/m9.figshare.3385780.v1.
The quantity of small scale solar photovoltaic (PV) arrays in the United States has grown rapidly in recent years. As a result, there is substantial interest in high quality information about the quantity, power capacity, and energy generated by such arrays, including at a high spatial resolution (e.g., cities, counties, or other small regions). Unfortunately, existing methods for obtaining this information, such as surveys and utility interconnection filings, are limited in their completeness and spatial resolution. This work presents a computer algorithm that automatically detects PV panels using very high resolution color satellite imagery. The approach potentially offers a fast, scalable method for obtaining accurate information on PV array location and size, and at much higher spatial resolutions than are currently available. The method is validated using a very large (135 km²) collection of publicly available (Bradbury et al., 2016) aerial imagery, with over 2700 human annotated PV array locations. The results demonstrate the algorithm is highly effective on a per-pixel basis. It is likewise effective at object-level PV array detection, but with significant potential for improvement in estimating the precise shape/size of the PV arrays. These results are the first of their kind for the detection of solar PV in aerial imagery, demonstrating the feasibility of the approach and establishing a baseline performance for future investigations.
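Per-pixel evaluation of the kind reported here reduces to counting true and false positives over binary masks. The helper below is an illustrative sketch of such a metric; the mask encoding (1 = PV pixel) and function name are assumptions, not the paper's actual evaluation code.

```python
def pixel_precision_recall(pred, truth):
    """Per-pixel precision and recall for two binary masks
    given as lists of rows, with 1 marking a PV pixel."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # detected PV pixel that is really PV
            elif p and not t:
                fp += 1          # false alarm
            elif t and not p:
                fn += 1          # missed PV pixel
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Object-level evaluation, by contrast, groups connected pixels into array candidates before matching them to annotations, which is why the two numbers reported above can diverge.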
In real-world transfer learning tasks, especially cross-modal applications, the source domain and the target domain often have different features and distributions, a setting known as the heterogeneous domain adaptation (HDA) problem. Yet, owing to the challenges of HDA, existing methods focus on either alleviating the feature discrepancy or mitigating the distribution divergence. In fact, optimizing one of them can reinforce the other. In this paper, we propose a novel HDA method that optimizes both feature discrepancy and distribution divergence in a unified objective function. Specifically, we present progressive alignment, which first learns a new transferable feature space by dictionary-sharing coding, and then aligns the distribution gaps in the new space. Unlike previous HDA methods that are limited to specific scenarios, our approach can handle diverse features with arbitrary dimensions. Extensive experiments on various transfer learning tasks, such as image classification, text categorization, and text-to-image recognition, verify the superiority of our method against several state-of-the-art approaches.