Abnormal behavior detection in crowd scenes remains a challenge in computer vision. To tackle this problem, this paper starts from a novel structure modeling of crowd behavior. We first propose an informative structural context descriptor (SCD) for describing the crowd individual, which for the first time introduces the potential energy function of inter-particle forces from solid-state physics to intuitively conduct visual contextual cueing. To compute the crowd SCD variation effectively, we then design a robust multi-object tracker to associate the targets in different frames, which employs the incremental analytical ability of the 3-D discrete cosine transform (DCT). By online spatial-temporal analysis of the SCD variation of the crowd, the abnormality is finally localized. Our contribution lies mainly in three aspects: 1) a new exploration of abnormality detection from structure modeling, where the motion difference between individuals is computed by a novel selective histogram of optical flow, which enables the proposed method to deal with more kinds of anomalies; 2) the SCD description, which can effectively represent the relationship among the individuals; and 3) the 3-D DCT multi-object tracker, which can robustly associate a limited number of (instead of all) targets, making tracking analysis in high-density crowd situations feasible. Experimental results on several publicly available crowd video datasets verify the effectiveness of the proposed method.
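As a rough illustration of the optical-flow histogram mentioned in the abstract above, the following minimal Python sketch builds a plain magnitude-weighted histogram of flow orientations. It is not the paper's selective histogram; the function name, the bin count, and the `min_magnitude` filter are assumptions made for illustration only.

```python
import numpy as np

def histogram_of_optical_flow(flow, n_bins=8, min_magnitude=0.0):
    """Magnitude-weighted orientation histogram of a dense flow field.

    flow: array of shape (..., 2) holding (u, v) displacement per pixel.
    The name, n_bins and min_magnitude are illustrative assumptions.
    """
    u = flow[..., 0].ravel()
    v = flow[..., 1].ravel()
    magnitude = np.hypot(u, v)
    # Map angles from (-pi, pi] to [0, 2*pi) so bins start at direction +x.
    angle = np.arctan2(v, u) % (2 * np.pi)
    keep = magnitude > min_magnitude  # drop (near-)static pixels
    bins = (angle[keep] / (2 * np.pi) * n_bins).astype(int)
    bins = np.minimum(bins, n_bins - 1)  # guard against rounding at 2*pi
    hist = np.bincount(bins, weights=magnitude[keep], minlength=n_bins)
    total = hist.sum()
    return hist / total if total > 0 else hist

# Toy 2x2 flow field: three vectors point right-and-up, one points up-and-left.
flow = np.array([[[1.0, 0.5], [1.0, 0.5]],
                 [[1.0, 0.5], [-0.5, 1.0]]])
hof = histogram_of_optical_flow(flow)
```

Comparing such histograms between neighboring individuals gives one simple measure of motion difference; the selection step described in the abstract is not reproduced here.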
Visual tracking is a central topic in computer vision. However, the accurate localization of the target object in extreme conditions (such as occlusion, scaling, illumination change, and shape transformation) remains a challenge. In this paper, we explore utilizing multi-cue information to ensure robust tracking. Optical flow, color, and depth cues are simultaneously incorporated in our framework. The optical flow provides a rough estimate of the target location. Then a part-based structure is adopted to establish the precise position, combining both color and depth statistics. To validate the robustness of the proposed method, we take four video sequences of different demanding situations and compare our method with five competitive ones representing the state of the art. Experiments demonstrate the effectiveness of the proposed method.
Tunnels cause many traffic accidents in China every year, which is related to inappropriate driving behaviors combined with reduced visual comfort when driving through tunnels. The most common phenomenon when driving through the inlet and outlet sections of a tunnel is abruptly varying visual brightness, which causes short-term blindness for the driver, known as the "black hole" and "white hole" effects, respectively. In this paper, we propose a tunnel brightness compensation model with spatial-temporal visual-content preservation. The highlights of this paper are twofold. 1) We analyze the visual brightness variation by introducing spatio-temporal orientation energy for a stable characterization of the visual comfort of drivers in view of the distinguishable visual content caused by a sequential brightness variation. 2) We construct a tunnel brightness compensation model that preserves the spatial-temporal visual content by visual-content matching with multiple frames. Our method can manifestly improve the brightness quality and maintain the scene content simultaneously. Extensive visual brightness compensation experiments on 60 visual clips of inlet and outlet sections of tunnels demonstrate that the proposed method achieves state-of-the-art performance.
Robust Superpixel Tracking via Depth Fusion
Yuan, Yuan; Fang, Jianwu; Wang, Qi
IEEE Transactions on Circuits and Systems for Video Technology, January 2014, Volume 24, Issue 1
Journal Article
Peer reviewed
Open access
Although numerous trackers have been designed to adapt to nonstationary image streams that change over time, it remains a challenging task to enable a tracker to accurately distinguish the target from the background in every frame. This paper proposes a robust superpixel-based tracker via depth fusion, which exploits the adequate structural information and great flexibility of mid-level features captured by superpixels, as well as the depth map's discriminative ability for separating the target from the background. By introducing graph-regularized sparse coding into the appearance model, the local geometrical structure of the data is considered, and the resulting appearance model has a more powerful discriminative ability. Meanwhile, the similarity of the target superpixels' neighborhoods in two adjacent frames is also incorporated into the refinement of the target estimation, which helps achieve a more accurate localization. Most importantly, the depth cue is fused into the superpixel-based target estimation so as to tackle cluttered backgrounds with an appearance similar to the target. To evaluate the effectiveness of the proposed tracker, four video sequences of different challenging situations are contributed by the authors. The comparison results demonstrate that the proposed tracker is more robust and accurate than seven state-of-the-art trackers.
Object-based change detection (CD) is an effective method of identifying detailed changes in land features by comparing high-resolution remote sensing images of the same areas at different times. Binarization is an important step in partitioning changed and unchanged classes in the unsupervised domain. We formulate a novel binarization technique based on the Weibull mixture model, where generated similarity measure images are modeled using a mixture of non-normal Weibull distributions. The parameters in the model are further globally estimated by employing a genetic algorithm. Two data sets with high-resolution remote sensing images are used to evaluate the effectiveness of the proposed method. Experimental results demonstrate that the method allows better and more robust unsupervised object-based CD than do state-of-the-art threshold-based and clustering-based methods. The advantage of the proposed method lies in modeling the relatively few data of the changed class, which follow a skewed, long-tailed distribution.
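To make the binarization step more concrete, here is a minimal Python sketch of how a two-component Weibull mixture can split similarity values into unchanged and changed classes once its parameters are known. The parameter values, function names, and the Bayes-style decision rule are assumptions for illustration; the genetic-algorithm parameter estimation described in the abstract is not shown.

```python
import numpy as np

def weibull_pdf(x, shape, scale):
    """Two-parameter Weibull density for x >= 0."""
    z = np.asarray(x, dtype=float) / scale
    return (shape / scale) * z ** (shape - 1) * np.exp(-(z ** shape))

def changed_mask(values, w_unchanged, unchanged, changed):
    """Label a value 'changed' when its weighted changed-class density wins.

    unchanged / changed: (shape, scale) tuples of the two mixture components.
    All parameters are hypothetical inputs, e.g. from a prior fitting step.
    """
    dens_unchanged = w_unchanged * weibull_pdf(values, *unchanged)
    dens_changed = (1.0 - w_unchanged) * weibull_pdf(values, *changed)
    return dens_changed > dens_unchanged

# Hypothetical mixture: most objects unchanged (small values), few changed.
values = np.array([0.1, 0.8])
mask = changed_mask(values, w_unchanged=0.9,
                    unchanged=(2.0, 0.2), changed=(4.0, 0.8))
```

Comparing weighted densities directly avoids dividing by a mixture density that can underflow far out in the tail.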
To analyze traffic anomalies in dashcam videos from the perspective of ego-vehicles, the agent should spatial-temporally localize the abnormal occasion and regions and give a semantic recounting of what happened. Most existing formulations concentrate on the former spatial-temporal aspect and mainly approach this goal by training normal pattern classifiers/regressors/dictionaries with large-scale labeled data. However, anomalies are context-related, and it is difficult to draw a clear margin between abnormal and normal. This paper proposes a progressive unsupervised driving anomaly detection and recounting (D&R) framework. The highlights are three-fold: (1) We formulate driving anomaly D&R as a temporal-spatial-semantic (TSS) model, which achieves a coarse-to-fine focusing and generates convincing driving anomaly D&R. (2) This work contributes an unsupervised D&R approach that requires no training data while achieving effective performance. (3) We introduce traffic saliency, isolation forest, and visual semantic causal relations of the driving scene to effectively construct the TSS model. Extensive experiments on a driving anomaly dataset with 106 video clips (carefully labeled by ourselves at the temporal, spatial, and semantic levels) demonstrate superior performance over existing techniques.
Visual tracking under occlusion, appearance change, or illumination change has been a challenging task for decades. Recently, some online trackers, based on the detection-by-classification framework, have achieved good performance. However, problems remain in at least one of three aspects: 1) tracking the target with a single region adapts poorly to occlusion, appearance change, or illumination change; 2) a lack of sample weight estimation, which may cause overfitting; and 3) an inadequate motion model for preventing the target from drifting. To tackle these problems, this paper makes the following contributions: 1) a novel part-based structure is utilized in online AdaBoost tracking; 2) attentional sample weighting and selection is tackled by introducing a weight relaxation factor, instead of treating the samples equally as traditional trackers do; and 3) a two-stage motion model, the multiple parts constraint, is proposed and incorporated into the part-based structure to ensure stable tracking. The effectiveness and efficiency of the proposed tracker are validated on several complex video sequences and compared with seven popular online trackers. The experimental results show that the proposed tracker can achieve increased accuracy with comparable computational cost.
Land cover (LC) information plays an important role in different geoscience applications such as land resources and ecological environment monitoring. Enhancing the degree of automation of LC classification and updating at a fine scale by remote sensing has become a key problem, as the capability of remote sensing data acquisition is constantly being improved in terms of spatial and temporal resolution. However, the present methods of generating LC information are relatively inefficient because training samples are selected manually among multitemporal observations, which is becoming the bottleneck of application-oriented LC mapping. Thus, the objective of this study is to improve the efficiency of LC information acquisition and update. This study proposes a rapid LC map updating approach at a geo-object scale for high-spatial-resolution (HSR) remote sensing. The challenge is to develop methodologies for rapid sampling. Hence, the core step of our proposed methodology is an automatic method of collecting samples from historical LC maps by combining change detection and label transfer. A data set with Chinese Gaofen-2 (GF-2) HSR satellite images is utilized to evaluate the effectiveness of our method for multitemporal updating of LC maps. Prior labels in a historical LC map are shown to be effective in an LC updating task, helping to improve the effectiveness of the LC map update by automatically generating a number of training samples for supervised classification. The experimental outcomes demonstrate that the proposed method enhances the degree of automation of LC map updating and allows for geo-object-based up-to-date LC mapping with high accuracy. The results indicate that the proposed method boosts the ability to update LC maps automatically and greatly reduces the complexity of visual sample acquisition. Furthermore, the accuracy of LC type and the fineness of polygon boundaries in the updated LC maps effectively reflect the characteristics of geo-object changes on the ground surface, which makes the proposed method suitable for many applications requiring refined LC maps.
Recently, deep learning methods, for example, convolutional neural networks (CNNs), have achieved high performance in hyperspectral image (HSI) classification. The limited training samples of HSIs make it hard to use deep learning methods with many layers and a large number of convolutional kernels as in large-scale imagery tasks, and CNN-based methods usually need a long training time. In this paper, we present a wide sliding window and subsampling network (WSWS Net) for HSI classification. It is based on layers of transform kernels with sliding windows and subsampling (WSWS). It can be extended in the wide direction to learn both spatial and spectral features more efficiently. The learned features are subsampled to reduce computational loads and to reduce memorization. Thus, layers of WSWS can learn higher level spatial and spectral features efficiently, and the proposed network can be trained easily by only computing linear weights with least squares. The experimental results show that the WSWS Net achieves excellent performance on different hyperspectral remote sensing datasets compared with other shallow and deep learning methods. The effects of the ratio of training samples, the sizes of image patches, and the visualization of features in WSWS layers are presented.
With the rapid development of research on machine learning models, especially deep learning, more and more effort has been devoted to designing new learning models with properties such as fast training with good convergence, and incremental learning to overcome catastrophic forgetting. In this paper, we propose a scalable wide neural network (SWNN), composed of multiple multi-channel wide RBF neural networks (MWRBF). Each MWRBF neural network focuses on a different region of the data, and nonlinear transformations can be performed with Gaussian kernels. The number of MWRBFs in the proposed SWNN is decided by the scale and difficulty of the learning task. The splitting and iterative least squares (SILS) training method is proposed to make the training process easy with large and high-dimensional data. Because the least squares method can find reasonably good weights during the first iteration, only a few succeeding iterations are needed to fine-tune the SWNN. Experiments were performed on different datasets including gray and colored MNIST data and hyperspectral remote sensing data (KSC, Pavia Center, Pavia University, and Salinas), and the results were compared with mainstream learning models. The results show that the proposed SWNN is highly competitive with the other models.
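The idea of combining Gaussian kernels with least-squares weights, as described in the abstract above, can be sketched in a few lines of Python. This is only a single-pass, ridge-regularized least-squares fit on toy data, not the SILS method or the SWNN architecture; the function names, the random choice of centers, and the `gamma` and `ridge` values are all assumptions for illustration.

```python
import numpy as np

def rbf_features(X, centers, gamma):
    """Gaussian kernel responses exp(-gamma * ||x - c||^2) for every pair."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def fit_output_weights(H, Y, ridge=1e-3):
    """Closed-form regularized least squares: (H^T H + ridge*I) W = H^T Y."""
    A = H.T @ H + ridge * np.eye(H.shape[1])
    return np.linalg.solve(A, H.T @ Y)

# Toy two-class problem that is not linearly separable (XOR-like quadrants).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
labels = (X[:, 0] * X[:, 1] > 0).astype(int)
Y = np.eye(2)[labels]  # one-hot targets

# Hypothetical choice: use 40 randomly picked training points as centers.
centers = X[rng.choice(len(X), size=40, replace=False)]
H = rbf_features(X, centers, gamma=2.0)
W = fit_output_weights(H, Y)

pred = (H @ W).argmax(axis=1)
train_accuracy = (pred == labels).mean()
```

Because the output weights come from one linear solve, no gradient-based training loop is needed; the iterative refinement mentioned in the abstract would repeat such solves rather than replace them.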