Event-based cameras are bioinspired vision sensors whose pixels work independently from each other and respond asynchronously to brightness changes, with microsecond resolution. Their advantages make it possible to tackle challenging scenarios in robotics, such as high-speed and high dynamic range scenes. We present a solution to the problem of visual odometry from the data acquired by a stereo event-based camera rig. Our system follows a parallel tracking-and-mapping approach, where novel solutions to each subproblem (three-dimensional (3-D) reconstruction and camera pose estimation) are developed with two objectives in mind: being principled and efficient, for real-time operation with commodity hardware. To this end, we seek to maximize the spatio-temporal consistency of stereo event-based data while using a simple and efficient representation. Specifically, the mapping module builds a semidense 3-D map of the scene by fusing depth estimates from multiple viewpoints (obtained by spatio-temporal consistency) in a probabilistic fashion. The tracking module recovers the pose of the stereo rig by solving a registration problem that naturally arises due to the chosen map and event data representation. Experiments on publicly available datasets and on our own recordings demonstrate the versatility of the proposed method in natural scenes with general 6-DoF motion. The system successfully leverages the advantages of event-based cameras to perform visual odometry in challenging illumination conditions, such as low-light and high dynamic range, while running in real time on a standard CPU. We release the software and dataset under an open source license to foster research in the emerging topic of event-based simultaneous localization and mapping.
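The probabilistic fusion of per-viewpoint depth estimates that the abstract mentions is commonly done as a product of Gaussians over inverse depth. The sketch below is a minimal illustration of that generic scheme, not the paper's actual filter; the function name and the Gaussian assumption are ours.

```python
def fuse_inverse_depth(mu_a, var_a, mu_b, var_b):
    """Fuse two Gaussian inverse-depth estimates (mean, variance)
    by the standard product-of-Gaussians rule: the fused variance
    shrinks and the fused mean is a variance-weighted average."""
    var = (var_a * var_b) / (var_a + var_b)
    mu = (mu_a * var_b + mu_b * var_a) / (var_a + var_b)
    return mu, var

# Two equally uncertain estimates average, and confidence doubles:
mu, var = fuse_inverse_depth(1.0, 0.04, 1.2, 0.04)  # -> (1.1, 0.02)
```

Repeating this update as new viewpoints arrive converges the map points toward consistent inverse-depth estimates with decreasing uncertainty.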
Multi-directional stereo vision sensors have received increasing attention in environment perception and 3D reconstruction owing to their wide field of view (FOV). Such sensor setups require accurately calibrated parameters, which remains a challenging problem because of the discontinuity of the camera field. This paper proposes a flexible calibration method using an unconstrained 3D target for a quad-directional stereo vision sensor (QSVS) composed of a single camera and a dual mirror pyramid. We build the measurement models of the local virtual binocular structures and the global virtual multi-camera based on the optical properties of the catadioptric elements, reducing the calibration steps for the mirror poses. On this basis, the paper resolves three problems. First, the unique intrinsic parameters of the virtual multi-camera are obtained through intrinsic calibration of the actual camera. Second, a sorting algorithm for the feature points determines the correspondence between the spatial feature points and the mirrored 2D points in the global calibration image. Third, the global structure parameters of the QSVS are determined from a single calibration image based on the designed unconstrained 3D target with 6-DOF, and the calibration parameters are optimized by flexibly adjusting the target pose. Simulation and physical experiments are conducted to demonstrate the performance of the proposed method, which also provides a new way to calibrate miniaturized multi-camera vision systems and omnidirectional vision sensors without an overlapping FOV.
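The "virtual camera" idea behind such catadioptric rigs rests on a standard geometric fact: a planar mirror maps a real camera to a virtual one by reflection across the mirror plane. A minimal sketch of that point reflection (the function name and plane parameterization are our own, not from the paper):

```python
import numpy as np

def reflect_point(p, n, d):
    """Reflect a 3-D point p across the plane {x : n.x = d},
    where n is the unit normal of the mirror plane.
    The virtual camera center is the reflection of the real one."""
    p = np.asarray(p, dtype=float)
    n = np.asarray(n, dtype=float)
    return p - 2.0 * (p @ n - d) * n

# Mirror in the plane z = 0: a camera at z = +3 has its
# virtual counterpart at z = -3.
virtual = reflect_point([1.0, 2.0, 3.0], [0.0, 0.0, 1.0], 0.0)
```

Orientations transform analogously with the Householder matrix I - 2nn^T, which is why a single physical camera plus a mirror pyramid behaves like a multi-camera system.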
Wind, as a clean and renewable energy source, has been used by humans for centuries. In recent years, however, the growing number and size of wind turbines has made their impact on avifauna a serious concern: researchers estimate that in the U.S. up to 500,000 birds die annually in collisions with wind turbines. This article proposes a system for mitigating bird mortality around wind farms. The solution is based on a stereo-vision system embedded in distributed computing and IoT paradigms. After a bird is detected in a defined zone, the decision-making system activates a collision-avoidance routine composed of light and sound deterrents and a turbine-stopping procedure. The development process applies a User-Driven Design approach along with component selection and heuristic adjustment. The proposal includes a bird detection method and a localization procedure; bird identification is carried out using artificial-intelligence algorithms. Validation tests with a fixed-wing drone, verified by ornithologists' observations, confirmed the system's reliability in detecting a bird with a wingspan over 1.5 m from at least 300 m, as well as its ability to classify the detected bird into one of three wingspan categories: small, medium, and large.
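Localizing a bird at 300 m with a stereo rig ultimately reduces to the textbook depth-from-disparity relation Z = f·B/d. The sketch below illustrates that relation only; the focal length and baseline values are hypothetical, not the system's actual configuration.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Range to a target from stereo disparity:
    Z = f * B / d, with f in pixels, baseline B in meters,
    and disparity d in pixels."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# Hypothetical rig: f = 2000 px, B = 1.5 m.
# A 10 px disparity then corresponds to a 300 m range.
z = depth_from_disparity(2000.0, 1.5, 10.0)  # -> 300.0
```

The relation also shows why long-range detection is hard: at large Z the disparity is tiny, so small matching errors translate into large range errors, motivating wide baselines and sub-pixel matching.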
•A dataset generation method in large depth of field is proposed.
•A 3D displacement measuring device is applied to measure the expansion of a boiler.
•The MOGA can improve the efficiency and precision of the ANN simultaneously.
•The PSF can reduce pixel coordinate errors by half compared to a model that does not include it.
In this paper, stereo vision calibration-point spread function (SVC-PSF) is designed to accurately represent image degradation in large depth of field (LDOF). The error of the dataset generated by SVC-PSF is 50% lower than that of the stereo vision calibration (SVC) method in LDOF. A three-dimensional point-coordinate reconstruction artificial neural network (ANN) is improved through a multi-objective genetic algorithm (MOGA) to satisfy both computational accuracy and real-time performance. The MOGA-improved ANN can reconstruct 1250 points within 0.069 s on an embedded platform. Finally, SVC-PSF is used to generate a dataset, and the MOGA-ANN model is trained on it. An embedded device, the stereo displacement measurement device (SDMD), is designed to run the SVC-PSF-MOGA-ANN algorithm and is applied to boiler-expansion monitoring in a thermal power plant. The real-time monitoring results show that when the boiler power increases, more working medium is needed to produce steam, so the boiler's own weight increases and the boiler expands downward under gravity; at low power, the required working medium and the corresponding expansion are both reduced.
Existing stereo matching networks with high accuracy generally demand high computational overhead, making them expensive to deploy and unsuitable for many real-time applications, especially on resource-constrained hardware platforms. Some recent works have tried to build lightweight stereo matching networks, but their accuracy and inference speed are far from satisfactory when deployed on resource-constrained GPUs. This paper aims to build a lightweight stereo matching network with high accuracy. First, we claim that a reasonable disparity upsampling strategy, whose importance is underestimated in existing works, is crucial for lightweight stereo matching to achieve high accuracy. Accordingly, an efficient and effective disparity upsampling strategy named Spatial Adaptive Disparity Shuffle (SADS) is proposed to meet the upsampling requirements of disparities within diverse regions. Second, a novel Channel-Disparity-Mixed Attention (CDMA) mechanism is proposed to regularize the high-resolution 4D compact cost volume. The attention values in CDMA are adaptive across channels, pixels and disparity candidates, making CDMA suitable for regularizing the compact cost volume. Based on the proposed methods, a lightweight stereo matching network (SADSNet) is designed that achieves real-time performance with high accuracy on resource-limited hardware platforms (e.g., the NVIDIA Jetson TX2). In addition, SADSNet can easily be scaled up or down to suit different application scenarios according to the available computational power. Extensive experimental results on multiple datasets show its clear superiority over SOTA lightweight stereo models and even many large accuracy-oriented models.
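The abstract does not spell out SADS, but the baseline it improves on is easy to state: when a disparity map predicted at 1/s resolution is upsampled by factor s, the disparity values themselves must also be multiplied by s, since disparity is measured in pixels of the output resolution. A sketch of that naive nearest-neighbor baseline (our own illustration, not the SADS algorithm):

```python
import numpy as np

def upsample_disparity_nn(disp_low, scale):
    """Naive disparity upsampling: nearest-neighbor spatial repeat,
    plus rescaling of the disparity *values* by the same factor
    (disparity is expressed in pixels of the target resolution).
    Learned strategies such as SADS replace the uniform repeat with
    spatially adaptive weights, but keep the value rescaling."""
    up = np.repeat(np.repeat(disp_low, scale, axis=0), scale, axis=1)
    return up * scale

disp_low = np.array([[2.0]])          # one coarse pixel, disparity 2
disp_full = upsample_disparity_nn(disp_low, 2)  # 2x2 block of 4.0
```

The uniform repeat blurs object boundaries, which is exactly the failure mode a spatially adaptive upsampler is meant to address.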
Binocular stereo vision (BSV) systems have been widely used in various fields, such as intelligent manufacturing and smart robotics. However, the location accuracy of current BSV systems still cannot fully satisfy industry requirements, owing to the lack of parameter optimization. In this paper, a high-accuracy BSV system is proposed. This is achieved by analyzing the seven parameters of the BSV system, which are classified into two groups: system structure parameters (SSPs) and camera calibration parameters (CCPs). For the SSPs, an improved analysis model is designed to expose the possible errors caused by three of the parameters, and a new correlation model is proposed to analyze the errors caused by their correlation. For the CCPs, an orthogonal experiment model is employed to select the optimal combination of the four calibration parameters, and the relative weights of the four parameters are analyzed to reduce errors. Finally, the effectiveness of the proposed method is demonstrated by extensive experiments. It provides a useful reference for BSV systems used in applied-optics research and application fields.
S-PTAM: Stereo Parallel Tracking and Mapping
Pire, Taihú; Fischer, Thomas; Castro, Gastón ...
Robotics and Autonomous Systems, Volume 93, July 2017
Journal Article · Peer-reviewed · Open access
This paper describes a real-time feature-based stereo SLAM system that is robust and accurate in a wide variety of conditions – indoors, outdoors, with dynamic objects, changing light conditions, fast robot motions and large-scale loops. Our system follows a parallel-tracking-and-mapping strategy: a tracking thread estimates the camera pose at frame rate, and a mapping thread updates a keyframe-based map at a lower frequency. The stereo constraints of our system allow a robust initialization – avoiding the well-known bootstrapping problem in monocular systems – and the recovery of the real scale. Both aspects are essential for its practical use in real robotic systems that interact with the physical world.
In this paper we provide the implementation details, an exhaustive evaluation of the system in public datasets and a comparison of most state-of-the-art feature detectors and descriptors on the presented system. For the benefit of the community, its code for ROS (Robot Operating System) has been released.
•Parallel nature of the SLAM problem is exploited, achieving real-time performance.
•Stereo constraints are used in the point initialization, mapping and tracking phases.
•Real-time loop detection and correction are included in the system.
•Local Bundle Adjustment runs in parallel to refine the local co-visible area.
•Wheel odometry can be used to feed the stereo SLAM system.
Active stereo vision has been used for real-time measurement in both academic and industrial fields. Currently, it is still very challenging for active stereo vision to achieve continuous and robust measurement of a dynamic object due to the lack of a robust pattern extraction and matching method. In this article, we introduce a novel red green blue (RGB) line pattern with coarse-to-fine features, in which the intervals between adjacent green lines are largest and the intervals between adjacent red lines are smallest. The large interval represents the coarse feature of the pattern and the small interval represents the fine feature. Accordingly, we propose a matched pixel difference modeling (MPDM) method to model the matching relationship between two camera views based on the designed pattern. The line pattern is extracted in the hue saturation value (HSV) color space by slope difference distribution (SDD)-based threshold selection. The coarse green lines in the two camera views are matched according to the minimum-distance principle after centroid-based alignment. The matching relationship is then modeled from the matched green lines by the least squares (LS) method. With the matching model, the fine blue lines and the fine red lines are matched successively. After stereo matching, the 3-D point on the measured object is computed by the ray intersection method. Experimental results showed that the proposed method robustly measures dynamic objects with a single shot.
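The ray intersection step the abstract ends on is commonly implemented as the midpoint of the closest points between the two back-projected rays, since noisy rays rarely intersect exactly. A minimal sketch of that standard midpoint triangulation (our own illustration of the generic method, not the paper's code):

```python
import numpy as np

def triangulate_midpoint(o1, d1, o2, d2):
    """Midpoint of the closest points between two rays
    p1(t) = o1 + t*d1 and p2(s) = o2 + s*d2.
    Solves the standard closest-point linear system in t and s."""
    o1, d1, o2, d2 = (np.asarray(v, dtype=float) for v in (o1, d1, o2, d2))
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    w = o1 - o2
    denom = a * c - b * b          # zero only for parallel rays
    t = (b * (d2 @ w) - c * (d1 @ w)) / denom
    s = (a * (d2 @ w) - b * (d1 @ w)) / denom
    return 0.5 * ((o1 + t * d1) + (o2 + s * d2))

# Two rays that meet exactly at (1, 0, 1):
p = triangulate_midpoint([0, 0, 0], [1, 0, 1], [2, 0, 0], [-1, 0, 1])
```

When the rays do intersect, the midpoint coincides with the intersection; under noise it gives the point minimizing the sum of squared distances to both rays.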
Learning-based stereo methods usually require a large-scale dataset with depth; however, obtaining accurate depth in the real domain is difficult, whereas ground-truth depth is readily available in the simulation domain. In this paper we propose a new framework, ActiveZero++, a mixed-domain learning solution for active stereovision systems that requires no real-world depth annotation. In the simulation domain, we use a combination of a supervised disparity loss and a self-supervised loss on a shape-primitives dataset. By contrast, in the real domain, we use only a self-supervised loss on a dataset that is out-of-distribution with respect to both the training simulation data and the test real data. To improve the robustness and accuracy of our reprojection loss in hard-to-perceive regions, our method introduces a novel self-supervised loss called temporal IR reprojection. Further, we propose a confidence-based depth completion module, which uses the confidence from the stereo network to identify and improve erroneous areas in the depth prediction through depth-normal consistency. Extensive qualitative and quantitative evaluations on real-world data demonstrate state-of-the-art results that can even outperform a commercial depth sensor. Furthermore, our method can significantly narrow the Sim2Real domain gap of depth maps for state-of-the-art learning-based 6D pose estimation algorithms.
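The self-supervised reprojection loss at the core of such frameworks compares the left image against the right image warped into the left view using the predicted disparity; a correct disparity makes the warped image match. A toy sketch of that generic loss with integer-shift warping and an L1 photometric error (our own simplification — real implementations use differentiable bilinear sampling):

```python
import numpy as np

def warp_right_to_left(right, disp):
    """Reconstruct the left view from the right image:
    warped(y, x) = right(y, x - disp(y, x)), with border clamping.
    Integer disparities only, for illustration."""
    h, w = right.shape
    xs = np.tile(np.arange(w, dtype=float), (h, 1))
    src = np.clip(np.round(xs - disp).astype(int), 0, w - 1)
    rows = np.repeat(np.arange(h)[:, None], w, axis=1)
    return right[rows, src]

def reprojection_loss(left, right, disp):
    """Mean absolute photometric error between the left image
    and the disparity-warped right image."""
    return float(np.mean(np.abs(left - warp_right_to_left(right, disp))))

# A left image that really is the right image shifted by 1 px
# yields zero loss for the correct constant disparity of 1.
right = np.array([[0.0, 1.0, 2.0, 3.0]])
left = np.array([[0.0, 0.0, 1.0, 2.0]])
loss = reprojection_loss(left, right, np.ones((1, 4)))  # -> 0.0
```

The "no depth annotation" property follows directly: the loss is computed purely from the image pair, so the network can train on unlabeled real captures.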