Deep learning has become a standard approach to machine vision in recent years. Despite several advances, it requires large amounts of annotated data. However, in many applications, large-scale data acquisition and annotation is expensive and data imbalance is an intrinsic problem. To address these challenges, we propose a novel synthetic database generation method that only requires (i) arbitrary natural images, i.e., it does not demand real images from the target domain, and (ii) templates of the traffic signs. Our method does not aim to replace training with real data but to be a viable option when real data are lacking. Results with data from multiple countries show that the synthetic database, generated without human effort, is effective for training a deep traffic sign detector. On large datasets, training with a fully synthetic dataset almost matches the performance of training with a real one. Compared to training with a smaller dataset of real images, training with synthetic images increased the accuracy by 12.25%. The proposed method also improves the performance of the detector when target-domain data are available.
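As an illustration of the general idea, the sketch below pastes a randomly transformed traffic-sign template onto an arbitrary natural background image and records the resulting bounding box as the annotation. This is a minimal sketch of the concept, not the paper's actual pipeline; the function name, the choice of transforms, and the parameter ranges are assumptions.

```python
# Illustrative sketch: paste a randomly transformed traffic-sign template onto
# an arbitrary natural image and record the bounding box as the detection label.
import random
from PIL import Image, ImageEnhance

def synthesize_sample(background_path, template_path, out_size=(640, 480)):
    bg = Image.open(background_path).convert("RGB").resize(out_size)
    sign = Image.open(template_path).convert("RGBA")

    # Random scale, rotation, and brightness to mimic real-world variation.
    scale = random.uniform(0.05, 0.25)
    w = int(out_size[0] * scale)
    sign = sign.resize((w, max(1, int(w * sign.height / sign.width))))
    sign = sign.rotate(random.uniform(-15, 15), expand=True)
    sign = ImageEnhance.Brightness(sign).enhance(random.uniform(0.6, 1.3))

    # Random placement; the template's alpha channel masks the paste.
    x = random.randint(0, max(0, out_size[0] - sign.width))
    y = random.randint(0, max(0, out_size[1] - sign.height))
    bg.paste(sign, (x, y), sign)

    bbox = (x, y, x + sign.width, y + sign.height)  # annotation comes for free
    return bg, bbox
```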
Self-driving cars: A survey. Badue, Claudine; Guidolini, Rânik; Carneiro, Raphael Vivacqua; et al. Expert Systems with Applications, vol. 165, 03/2021. Peer-reviewed journal article.
We survey research on self-driving cars published in the literature focusing on autonomous cars developed since the DARPA challenges, which are equipped with an autonomy system that can be categorized as SAE level 3 or higher. The architecture of the autonomy system of self-driving cars is typically organized into the perception system and the decision-making system. The perception system is generally divided into many subsystems responsible for tasks such as self-driving-car localization, static obstacle mapping, moving obstacle detection and tracking, road mapping, traffic signalization detection and recognition, among others. The decision-making system is commonly partitioned as well into many subsystems responsible for tasks such as route planning, path planning, behavior selection, motion planning, and control. In this survey, we present the typical architecture of the autonomy system of self-driving cars. We also review research on relevant methods for perception and decision making. Furthermore, we present a detailed description of the architecture of the autonomy system of the self-driving car developed at the Universidade Federal do Espírito Santo (UFES), named Intelligent Autonomous Robotics Automobile (IARA). Finally, we list prominent self-driving car research platforms developed by academia and technology companies, and reported in the media.
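To make the decomposition described above concrete, here is a minimal, purely illustrative skeleton of such a modular autonomy stack: perception subsystems producing a world model that the decision-making subsystems consume. Every class, method, and field name is a hypothetical placeholder; it does not reflect IARA's actual implementation.

```python
# Illustrative placeholder skeleton of the typical autonomy architecture:
# perception subsystems feed decision-making subsystems.
class Perception:
    def run(self, sensors):
        return {
            "pose": None,        # self-driving-car localization
            "static_map": None,  # static obstacle mapping
            "movers": None,      # moving obstacle detection and tracking
            "signs": None,       # traffic signalization detection/recognition
        }

class DecisionMaking:
    def run(self, world):
        route = None        # route planning
        path = None         # path planning
        behavior = None     # behavior selection
        trajectory = None   # motion planning
        return trajectory   # handed to the low-level controller

def autonomy_step(sensors):
    world = Perception().run(sensors)
    return DecisionMaking().run(world)
```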
•Recent developments in autonomous driving from academic and industry points of view.
•Breakdown of the main aspects comprising autonomous driving and their evolution.
•Autonomous driving architecture review and proposal.
•Simple, yet powerful, method to copy a black-box CNN model with random natural images.
•Some constraints are waived and copy attacks are performed with less information.
•Understanding copy attacks with random natural images.
•Thorough evaluation of copycat models created with random natural images.
The recent success of convolutional neural networks has enabled companies to develop neural-based products, which demand an expensive process involving data acquisition and annotation, as well as model generation, usually requiring experts. Given all these costs, companies are concerned about the security of their models against copies and deliver them as black boxes accessed through APIs. Nonetheless, we argue that even black-box models still have some vulnerabilities. In a preliminary work, we presented a simple, yet powerful, method to copy black-box models by querying them with natural random images. In this work, we consolidate and extend the copycat method: (i) some constraints are waived; (ii) an extensive evaluation with several problems is performed; (iii) models are copied between different architectures; and (iv) a deeper analysis is performed by looking at the copycat behavior. Results show that natural random images are effective for generating copycats for several problems.
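As a rough illustration of the copycat procedure, the sketch below queries a black-box classifier with random natural images and trains a copy on its hard-label answers. It is a hedged sketch: `target_api`, the dataset of natural images, the ResNet-18 architecture, and the hyperparameters are assumptions, not the paper's exact setup.

```python
# Sketch of the copycat idea: query the black-box target model with random
# natural images and train a copy on its (hard-label) predictions.
import torch
import torchvision
from torch.utils.data import DataLoader

def build_copycat(target_api, natural_images, num_classes, epochs=10):
    """target_api(batch) -> predicted class indices (LongTensor) from the black box."""
    copycat = torchvision.models.resnet18(num_classes=num_classes)
    opt = torch.optim.SGD(copycat.parameters(), lr=0.01, momentum=0.9)
    loss_fn = torch.nn.CrossEntropyLoss()
    loader = DataLoader(natural_images, batch_size=64, shuffle=True)

    for _ in range(epochs):
        for images in loader:
            with torch.no_grad():
                fake_labels = target_api(images)   # black-box hard labels
            opt.zero_grad()
            loss = loss_fn(copycat(images), fake_labels)
            loss.backward()
            opt.step()
    return copycat
```

In practice the black-box answers would typically be collected once and cached, but the loop above keeps the idea in one place.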
Stroke is a neurological condition that usually results in the loss of voluntary control of body movements, making it difficult for individuals to perform activities of daily living (ADLs). Brain-computer interfaces (BCIs) integrated into robotic systems, such as motorized mini exercise bikes (MMEBs), have been demonstrated to be suitable for restoring gait-related functions. However, kinematic estimation of continuous motion in BCI systems based on electroencephalography (EEG) remains a challenge for the scientific community. This study proposes a comparative analysis to evaluate two artificial neural network (ANN)-based decoders to estimate three lower-limb kinematic parameters: x- and y-axis position of the ankle and knee joint angle during pedaling tasks. Long short-term memory (LSTM) was used as a recurrent neural network (RNN), which reached Pearson correlation coefficient (PCC) scores close to 0.58 by reconstructing kinematic parameters from the EEG features on the delta band using a time window of 250 ms. These estimates were evaluated through kinematic variance analysis, where our proposed algorithm showed promising results for identifying pedaling and rest periods, which could increase the usability of classification tasks. Additionally, negative linear correlations were found between pedaling speed and decoder performance, indicating that kinematic parameters at slower speeds may be easier to estimate. The results indicate that deep learning (DL)-based methods are feasible for estimating lower-limb kinematic parameters during pedaling tasks using EEG signals. This study opens new possibilities for implementing more robust controllers for MMEBs and BCIs based on continuous decoding, which may allow for maximizing the degrees of freedom and personalizing rehabilitation.
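The sketch below shows a minimal LSTM decoder of the kind described above: windows of delta-band EEG features are mapped to the three kinematic parameters, and predictions are scored with Pearson's correlation. The shapes, hidden size, and training details are assumptions, not the study's exact configuration.

```python
# Minimal sketch of an LSTM decoder for EEG-to-kinematics regression.
import torch
import torch.nn as nn

class KinematicsLSTM(nn.Module):
    def __init__(self, n_channels, hidden=64, n_outputs=3):
        super().__init__()
        self.lstm = nn.LSTM(n_channels, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_outputs)   # ankle x, ankle y, knee angle

    def forward(self, x):             # x: (batch, time, channels), e.g. a 250 ms window
        out, _ = self.lstm(x)
        return self.head(out[:, -1])  # kinematics at the end of the window

def pearson_r(pred, target):
    """Column-wise Pearson correlation between predictions and ground truth."""
    pred = pred - pred.mean(dim=0)
    target = target - target.mean(dim=0)
    return (pred * target).sum(dim=0) / (pred.norm(dim=0) * target.norm(dim=0))
```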
•An interactive framework for reconstruction of strip-shredded documents.
•The user locks and forbids pairs automatically selected by the recommender module.
•Four query strategies for recommending the pairs of shreds to be annotated.
•A novel methodology to assess the human impact on the quality of a reconstruction.
•Annotating 25% of the shreds can yield an error reduction of more than 40%.
The advances in machine learning – particularly in deep learning – have enabled automating the reconstruction of shredded documents with significant accuracy. However, despite the recent remarkable results, the state of the art in fully automatic reconstruction still has room for improvement, mainly due to imprecision in evaluating how well the shreds fit each other (compatibility/cost evaluation). To tackle this problem, we propose a human-in-the-loop reconstruction framework that takes user inputs to improve the solutions (permutations of shreds). In our approach, the user verifies whether adjacent shreds of a solution are also adjacent in the original document. Unlike the current literature, our framework includes a recommender module that automatically selects pairs of shreds to be analyzed by a human. Four recommendation strategies were proposed and evaluated. Results achieved by coupling deep learning reconstruction methods with our framework show that introducing the human in the loop can reduce errors by more than 40%.
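The sketch below illustrates the human-in-the-loop idea: a recommender ranks shred pairs by how uncertain the compatibility model is about them, asks the user whether they are truly adjacent, and locks or forbids them accordingly. The least-confidence strategy shown is one plausible query strategy, not necessarily the paper's best-performing one, and all names are placeholders.

```python
# Sketch of a recommender-driven annotation loop for shred pairs.
def human_in_the_loop(pairs, compatibility, ask_user, budget):
    """pairs: list of (shred_i, shred_j); compatibility: dict pair -> score in [0, 1];
    ask_user(pair) -> True if the shreds are adjacent in the original document."""
    locked, forbidden = set(), set()
    # Least-confidence strategy: scores near 0.5 are the most ambiguous.
    ranked = sorted(pairs, key=lambda p: abs(compatibility[p] - 0.5))
    for pair in ranked[:budget]:
        if ask_user(pair):
            locked.add(pair)      # must stay adjacent in the solution
        else:
            forbidden.add(pair)   # must never be placed side by side
    return locked, forbidden
```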
•Paper shreds matching via self-supervised deep learning.
•Training with simulated cuts is effective for real-shredded documents.
•A new public dataset with 100 strip-shredded documents (2292 shreds).
•Accurate (over 90% accuracy) reconstruction of 100 mixed shredded documents.
The reconstruction of shredded documents consists of coherently arranging fragments of paper (shreds) to recover the original document(s). A great challenge in computational reconstruction is to properly evaluate the compatibility between the shreds. While traditional pixel-based approaches are not robust to real shredding, more sophisticated solutions significantly compromise time performance. The solution presented in this work extends our previous deep learning method for single-page reconstruction to a more realistic/complex scenario: the reconstruction of several mixed shredded documents at once. In our approach, the compatibility evaluation is modeled as a two-class (valid or invalid) pattern recognition problem. The model is trained in a self-supervised manner on samples extracted from simulated-shredded documents, which obviates manual annotation. Experimental results on three datasets – including a new collection of 100 strip-shredded documents produced for this work – show that the proposed method outperforms the competing ones in complex scenarios, achieving accuracy above 90%.
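The following sketch illustrates how self-supervised labels can be produced from simulated shredding: a scanned page is cut into vertical strips in software, borders of truly adjacent strips form the positive (valid) class, and borders of non-adjacent strips form the negative (invalid) class. The function name, strip count, and border width are assumptions.

```python
# Illustrative self-supervised sample generation from simulated shredding.
import numpy as np

def simulated_shred_pairs(page, n_strips=30, border_px=3):
    """page: H x W grayscale array. Returns (samples, labels) for a binary
    compatibility classifier: 1 = valid (adjacent) pair, 0 = invalid."""
    strips = np.array_split(page, n_strips, axis=1)
    samples, labels = [], []
    for i in range(n_strips):
        for j in range(n_strips):
            if i == j:
                continue
            # Right border of strip i placed next to left border of strip j.
            pair = np.hstack([strips[i][:, -border_px:], strips[j][:, :border_px]])
            samples.append(pair)
            labels.append(1 if j == i + 1 else 0)  # note: classes are imbalanced
    return samples, labels
```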
In general, proposed solutions for LiDAR-based localization in autonomous cars require expensive sensors and computationally expensive mapping processes. Moreover, global localization for autonomous driving is converging to the use of maps. Straightforward strategies to reduce costs are to produce simpler sensors and to use maps already available on the Internet. Here, an analysis is presented to show how simple a LiDAR sensor can be without degrading the accuracy of a localization method that uses road and satellite maps together to globally pose the car. Three characteristics of the sensor are evaluated: the number of range readings, the amount of noise in the LiDAR readings, and the frame rate, with the aim of finding the minimum number of LiDAR lines, the maximum acceptable noise, and the sensor frame rate needed to obtain an accurate position estimate. The analysis is performed using an autonomous car equipped with a Velodyne HDL-32E 3D LiDAR in complex field scenarios. Several experiments were conducted reducing the number of frames, reducing the number of scan lines per 3D point cloud, and artificially adding up to 15% error to the ray length. Among other results, we found that using only 4 vertical lines per scan, and with an artificial error of up to 15% of the ray length, the car was able to localize itself with an average error of 2.11 meters. All experimental results and the methodology followed are explained in detail herein.
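The sketch below shows how a 32-beam scan can be degraded along the lines of this analysis: keep only a few vertical lines (beams) and perturb each range by up to a given fraction of its length. The (ring, range) data layout and the uniform noise model are assumptions.

```python
# Illustrative degradation of a 32-beam LiDAR scan for the sensor analysis.
import numpy as np

def degrade_scan(ranges, rings, keep_lines=4, max_error=0.15, total_lines=32):
    """ranges: (N,) ray lengths in meters; rings: (N,) beam index of each ray."""
    # Keep only `keep_lines` evenly spaced beams out of the original 32.
    kept = np.linspace(0, total_lines - 1, keep_lines).astype(int)
    mask = np.isin(rings, kept)
    ranges, rings = ranges[mask], rings[mask]
    # Add up to +/- max_error (e.g. 15%) of the ray length as uniform noise.
    noise = np.random.uniform(-max_error, max_error, size=ranges.shape)
    return ranges * (1.0 + noise), rings
```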
Unsupervised domain adaptation for object detection addresses the adaptation of detectors trained in a source domain to work accurately in an unseen target domain. Recently, methods based on aligning intermediate features have proven promising, achieving state-of-the-art results. However, these methods are laborious to implement and hard to interpret. Although promising, there is still room for improvement to close the performance gap toward the upper bound (training with the target data). In this work, we propose a method to generate an artificial dataset in the target domain to train an object detector. We employ two unsupervised image translators (CycleGAN and an AdaIN-based model) using only annotated data from the source domain and non-annotated data from the target domain. Our key contribution is a less complex yet more effective method with improved interpretability. Results on real-world scenarios for autonomous driving show significant improvements, outperforming state-of-the-art methods in most cases and further closing the gap toward the upper bound.
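The dataset-generation step can be pictured as below: a pretrained unsupervised image translator (for example a CycleGAN generator) is run over the annotated source images, and the original bounding boxes are kept, since image-to-image translation preserves geometry. The `translator` interface and the dataset structure are assumptions for illustration.

```python
# Sketch: build an artificial target-domain detection dataset by translating
# annotated source images and carrying their boxes over unchanged.
import torch

@torch.no_grad()
def build_artificial_target_set(translator, source_dataset):
    """source_dataset yields (image_tensor, boxes); returns translated pairs."""
    translator.eval()
    artificial = []
    for image, boxes in source_dataset:
        fake_target = translator(image.unsqueeze(0)).squeeze(0)
        artificial.append((fake_target, boxes))  # labels carried over unchanged
    return artificial
```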
•A simple yet effective method for object detection under unsupervised domain adaptation.
•Artificially generated images are useful for unsupervised domain adaptation.
•An extensive comparison with the state of the art is provided.
•Experiments in three scenarios: synthetic data, adverse weather, and cross-camera.
•Localization with occupancy or reflectivity grid maps is more accurate.
•Semantic grid maps lead to stable and reasonably accurate localization.
•Localization with color grid maps failed due to changes in illumination.
•Entropy correlation coefficient is not a good metric for comparing color maps.
•The two-step mapping technique was successfully employed in all experiments.
The localization of self-driving cars is needed for several tasks such as keeping maps updated, tracking objects, and planning. Localization algorithms often take advantage of maps for estimating the car pose. Since maintaining and using several maps is computationally expensive, it is important to analyze which type of map is more adequate for each application. To contribute to this analysis, in this work we compare the accuracy of a particle filter localization when using occupancy, reflectivity, color, or semantic grid maps. To the best of our knowledge, such an evaluation is missing in the literature. For building semantic and color grid maps, point clouds from a Light Detection and Ranging (LiDAR) sensor are fused with images captured by a front-facing camera. Semantic information is extracted from images with the deep neural network DeepLabv3+. Experiments are performed in varied environments, under diverse conditions of illumination and traffic. Results show that occupancy grid maps lead to more accurate localization, followed by reflectivity grid maps. In most scenarios, localization with semantic grid maps kept the position tracking without catastrophic losses, but with errors 2 to 3 times larger than with the former maps. Color grid maps led to inaccurate and unstable localization in most scenarios, even when using a robust metric, the entropy correlation coefficient, to compare online data with the map.
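The map-matching step inside such a particle filter can be summarized as below: each particle's weight is scaled by the similarity between the online sensor grid and the stored map patch at the particle's pose, whatever the map type (occupancy, reflectivity, color, or semantic). The `patch_at` map interface and the `similarity` function are placeholders, not the paper's implementation.

```python
# Sketch of the measurement update of a map-based particle filter.
import numpy as np

def update_weights(particles, weights, online_grid, grid_map, similarity):
    """particles: (N, 3) array of (x, y, yaw); similarity(a, b) -> scalar >= 0."""
    for i, pose in enumerate(particles):
        map_patch = grid_map.patch_at(pose, online_grid.shape)  # assumed map API
        weights[i] *= similarity(online_grid, map_patch)
    weights /= weights.sum() + 1e-12   # renormalize
    return weights
```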
Kinematic reconstruction of lower-limb movements using electroencephalography (EEG) has been used in several rehabilitation systems. However, the nonlinear relationship between neural activity and limb movement may challenge decoders in real-time Brain-Computer Interface (BCI) applications. This paper proposes a nonlinear neural decoder using an Unscented Kalman Filter (UKF) to infer lower-limb kinematics from EEG signals during pedaling. The results demonstrated maximum decoding accuracy using slow cortical potentials in the delta band (0.1-4 Hz), with a Pearson's r of 0.33 and a signal-to-noise ratio (SNR) of 8. This opens the door to the development of closed-loop EEG-based BCI systems for kinematic monitoring during pedaling rehabilitation tasks.
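A UKF decoder of this kind can be sketched as follows, here using filterpy: the state holds the lower-limb kinematics (and its velocity), propagated by a simple constant-velocity model, while a fitted regressor `h` serves as the nonlinear observation model mapping the state to delta-band EEG features. The dimensions, the constant-velocity assumption, and the choice of sigma-point parameters are assumptions, not the paper's exact formulation.

```python
# Hedged sketch of a UKF-based EEG-to-kinematics decoder using filterpy.
import numpy as np
from filterpy.kalman import UnscentedKalmanFilter, MerweScaledSigmaPoints

def build_decoder(h, n_kin=3, n_eeg=60, dt=0.1):
    """h(state) -> expected EEG feature vector of length n_eeg (fitted offline)."""
    def fx(x, dt):                       # constant-velocity kinematic model
        pos, vel = x[:n_kin], x[n_kin:]
        return np.concatenate([pos + vel * dt, vel])

    points = MerweScaledSigmaPoints(n=2 * n_kin, alpha=0.1, beta=2.0, kappa=0.0)
    ukf = UnscentedKalmanFilter(dim_x=2 * n_kin, dim_z=n_eeg, dt=dt,
                                fx=fx, hx=h, points=points)
    # Process and measurement noise (ukf.Q, ukf.R) must still be tuned to the data.
    return ukf

# At run time: for each EEG feature vector z, call ukf.predict(); ukf.update(z)
# and read the kinematic estimate from ukf.x[:3].
```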