Abnormal event detection in video is a complex computer vision problem that has attracted significant attention in recent years. The complexity of the task arises from the commonly-adopted definition of an abnormal event, that is, a rarely occurring event that typically depends on the surrounding context. Following the standard formulation of abnormal event detection as outlier detection, we propose a background-agnostic framework that learns from training videos containing only normal events. Our framework is composed of an object detector, a set of appearance and motion auto-encoders, and a set of classifiers. Since our framework only looks at object detections, it can be applied to different scenes, provided that normal events are defined identically across scenes and that the single main factor of variation is the background. This makes our method background agnostic, as we rely strictly on objects that can cause anomalies, and not on the background. To overcome the lack of abnormal data during training, we propose an adversarial learning strategy for the auto-encoders. We create a scene-agnostic set of out-of-domain pseudo-abnormal examples, which are correctly reconstructed by the auto-encoders before applying gradient ascent on the pseudo-abnormal examples. We further utilize the pseudo-abnormal examples to serve as abnormal examples when training appearance-based and motion-based binary classifiers to discriminate between normal and abnormal latent features and reconstructions. Furthermore, to ensure that the auto-encoders focus only on the main object inside each bounding box image, we introduce a branch that learns to segment the main object. We compare our framework with the state-of-the-art methods on four benchmark data sets, using various evaluation metrics. Compared to existing methods, the empirical results indicate that our approach achieves favorable performance on all data sets.
In addition, we provide region-based and track-based annotations for two large-scale abnormal event detection data sets from the literature, namely ShanghaiTech and Subway.
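The two-phase training strategy above (first reconstruct normal data well, then apply gradient ascent on pseudo-abnormal examples so they reconstruct poorly) can be illustrated with a toy linear autoencoder; the data, dimensions, and learning rates below are invented for the sketch, and numpy stands in for a real deep auto-encoder:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: "normal" samples near the origin, "pseudo-abnormal" samples far away.
normal = rng.normal(loc=0.0, scale=0.1, size=(64, 4))
pseudo = rng.normal(loc=3.0, scale=0.1, size=(64, 4))

# Tiny linear autoencoder with a 2-d bottleneck: x_hat = x @ W_enc @ W_dec.
W_enc = 0.1 * rng.normal(size=(4, 2))
W_dec = 0.1 * rng.normal(size=(2, 4))

def recon_error(X):
    return float(np.mean((X - X @ W_enc @ W_dec) ** 2))

def grads(X):
    """Analytic MSE gradients for the linear autoencoder."""
    H = X @ W_enc
    E = H @ W_dec - X
    n = X.size
    return (2 / n) * X.T @ E @ W_dec.T, (2 / n) * H.T @ E

# Phase 1: ordinary gradient descent so normal data is reconstructed well.
err_normal_start = recon_error(normal)
for _ in range(500):
    g_enc, g_dec = grads(normal)
    W_enc -= 0.5 * g_enc
    W_dec -= 0.5 * g_dec
err_normal_end = recon_error(normal)

# Phase 2: gradient *ascent* on the pseudo-abnormal set, pushing the
# autoencoder to reconstruct out-of-domain inputs poorly.
err_pseudo_before = recon_error(pseudo)
for _ in range(10):
    g_enc, g_dec = grads(pseudo)
    W_enc += 0.01 * g_enc
    W_dec += 0.01 * g_dec
err_pseudo_after = recon_error(pseudo)

print(err_normal_end < err_normal_start, err_pseudo_after > err_pseudo_before)
```

The reconstruction-error gap between normal and pseudo-abnormal inputs is what the downstream binary classifiers then exploit.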
Social networking services are becoming increasingly popular in the daily lives of Internet users, especially since the advent of smart mobile devices with integrated utility modules such as 4G/WiFi connectivity, global positioning services, cameras, and heartbeat sensors. These devices allow users to share information at any time, whether by posting a photo, sharing a status, or narrating an event. This user behavior gives the flow of data (a social data stream) real-time characteristics: notifications about friends' posts arrive after only a short diffusion delay over the network. The data stream contains news pieces related to real social facts as well as unfocused information. In addition, important information (or events) attracts more public attention, as reflected in the number of relevant messages or communication interactions between people interested in specific topics. From a technical perspective, these characteristics provide an opportunity to construct a model that automatically determines the occurrence of events from a social data stream. In this study, we propose an approach to the problem of early event identification, which requires methods for processing incoming data that are appropriate in terms of both processing performance and data volume.
• We propose a real-time event detection method based on Twitter.
• Frequency-based analytics indicate that the method performs well using social streams.
• Online behavioral analysis of multiple users can be applied with big social data.
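As a toy illustration of the frequency-based idea — flagging a time bin whose message count spikes above the recent baseline — here is a minimal sliding-window burst detector; the window size, threshold, and per-minute counts are invented for the example:

```python
import statistics
from collections import deque

def detect_bursts(counts, window=6, z_thresh=3.0):
    """Flag time bins whose count exceeds mean + z_thresh * std of the
    trailing window of counts (a crude burst/event signal)."""
    history = deque(maxlen=window)
    bursts = []
    for t, c in enumerate(counts):
        if len(history) == window:
            mu = statistics.mean(history)
            sd = statistics.pstdev(history) or 1.0  # avoid a zero threshold
            if c > mu + z_thresh * sd:
                bursts.append(t)
        history.append(c)
    return bursts

# Per-minute counts of a keyword; the spike at index 6 mimics a breaking event.
counts = [5, 6, 4, 5, 7, 6, 40, 6, 5]
print(detect_bursts(counts))  # → [6]
```

Real systems add diffusion-delay handling and per-topic streams, but the spike-over-baseline test is the core of frequency-based detection.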
Social networks have recently been employed as a source of information for event detection, with particular reference to road traffic congestion and car accidents. In this paper, we present a real-time monitoring system for traffic event detection from Twitter stream analysis. The system fetches tweets from Twitter according to several search criteria, processes them by applying text mining techniques, and finally classifies them, assigning each tweet the appropriate class label according to whether it relates to a traffic event or not. The traffic detection system was employed for real-time monitoring of several areas of the Italian road network, allowing traffic events to be detected almost in real time, often before online traffic news web sites. We employed a support vector machine as the classification model and achieved an accuracy of 95.75% on the binary classification problem (traffic versus non-traffic tweets). We were also able to discriminate whether traffic was caused by an external event or not by solving a multiclass classification problem, obtaining an accuracy of 88.89%.
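The binary traffic/non-traffic setup can be sketched with a bag-of-words featurizer and a Pegasos-style linear SVM trained by hinge-loss subgradient descent; the tiny corpus, vocabulary, and hyperparameters below are invented stand-ins for the paper's pipeline:

```python
import numpy as np

# Toy corpus; label 1 = traffic-related, -1 = not (labels are illustrative).
tweets = [
    ("heavy traffic jam on the A1 near Milano", 1),
    ("accident blocking two lanes on the ring road", 1),
    ("queue at the toll booth, total standstill", 1),
    ("lovely sunny morning for a walk", -1),
    ("great pizza at the new place downtown", -1),
    ("watching the football match tonight", -1),
]

vocab = sorted({w for text, _ in tweets for w in text.split()})
idx = {w: i for i, w in enumerate(vocab)}

def featurize(text):
    """Binary bag-of-words vector; unknown words are ignored."""
    x = np.zeros(len(vocab))
    for w in text.split():
        if w in idx:
            x[idx[w]] = 1.0
    return x

X = np.array([featurize(t) for t, _ in tweets])
y = np.array([lab for _, lab in tweets], dtype=float)

# Pegasos-style subgradient descent on the regularized hinge loss (linear SVM).
w = np.zeros(len(vocab))
lam = 0.01
for t in range(1, 500):
    i = t % len(tweets)
    eta = 1.0 / (lam * t)
    margin = y[i] * (w @ X[i])
    grad = lam * w - (y[i] * X[i] if margin < 1 else 0)
    w -= eta * grad

pred = lambda text: 1 if w @ featurize(text) > 0 else -1
print(pred("traffic jam accident"))  # → 1
```

In practice one would use richer text-mining features and a tuned SVM, but the sign of a learned linear score is the same decision rule.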
This paper studies the fundamental dimensionality of synchrophasor data and proposes an online application for early event detection using the reduced dimensionality. First, the dimensionality of the phasor measurement unit (PMU) data under both normal and abnormal conditions is analyzed, suggesting an extremely low underlying dimensionality despite the large number of raw measurements. An early event detection algorithm based on the change of the core subspaces of the PMU data at the occurrence of an event is then proposed. Theoretical justification for the algorithm is provided using linear dynamical system theory. Numerical simulations using both synthetic and realistic PMU data are conducted to validate the proposed algorithm.
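The core-subspace-change idea can be illustrated with a PCA/principal-angle monitor on synthetic low-rank data; the channel counts, mode frequencies, and event injection below are invented, and plain SVD stands in for the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(1)

def core_subspace(window, k=2):
    """Top-k left singular vectors (spatial modes) of a (channels x time) window."""
    centered = window - window.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(centered, full_matrices=False)
    return U[:, :k]

def max_principal_angle(U1, U2):
    """Largest principal angle (radians) between two equal-dimension subspaces."""
    s = np.linalg.svd(U1.T @ U2, compute_uv=False)
    return float(np.arccos(np.clip(s.min(), -1.0, 1.0)))

# Synthetic stream: 20 channels driven by 2 latent oscillatory modes,
# mimicking the low underlying dimensionality of PMU data.
T, n = 400, 20
t = np.arange(T)
modes = np.stack([np.sin(0.05 * t), np.cos(0.03 * t)])  # (2, T) latent signals
mix = rng.normal(size=(n, 2))                           # channel loadings
X = mix @ modes + 0.01 * rng.normal(size=(n, T))

# An "event" at t = 250 injects a new dominant mode, rotating the core subspace.
X[:, 250:] += 3.0 * np.outer(rng.normal(size=n), np.sin(0.4 * np.arange(T - 250)))

ref = core_subspace(X[:, :100])
pre_dist = max_principal_angle(ref, core_subspace(X[:, 100:200]))   # before event
post_dist = max_principal_angle(ref, core_subspace(X[:, 300:400]))  # after event
print(pre_dist < 0.1 < post_dist)  # the subspace rotation flags the event
```

A streaming version would slide the window and raise an alarm when the angle to a reference subspace crosses a calibrated threshold.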
Pooling plays an important role in generating a discriminative video representation. In this paper, we propose a new semantic pooling approach for challenging event analysis tasks (e.g., event detection, recognition, and recounting) in long untrimmed Internet videos, especially when only a few shots/segments are relevant to the event of interest while many other shots are irrelevant or even misleading. Commonly adopted pooling strategies aggregate the shots indiscriminately in one way or another, resulting in a great loss of information. Instead, in this work we first define a novel notion of semantic saliency that assesses the relevance of each shot to the event of interest. We then prioritize the shots according to their saliency scores, since shots that are semantically more salient are expected to contribute more to the final event analysis. Next, we propose a new isotonic regularizer that is able to exploit the constructed semantic ordering information. The resulting nearly-isotonic support vector machine classifier exhibits higher discriminative power in event analysis tasks. Computationally, we develop an efficient implementation using the proximal gradient algorithm and derive new closed-form proximal steps. We conduct extensive experiments on three real-world video datasets and achieve promising improvements.
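The isotonic machinery can be made concrete with the classic pool-adjacent-violators (PAV) algorithm, which computes the Euclidean projection of a score sequence onto non-increasing sequences (matching the saliency ordering above). The paper's nearly-isotonic regularizer is a relaxation of this hard constraint inside the SVM, so this is only a simplified illustration, and the scores are invented:

```python
def pav_decreasing(y):
    """Pool Adjacent Violators: project y onto non-increasing sequences.
    Each block stores [sum, count]; violating adjacent blocks are averaged."""
    blocks = []
    for v in y:
        blocks.append([float(v), 1])
        # Merge while the ordering is violated (previous mean < current mean).
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    out = []
    for s, c in blocks:
        out.extend([s / c] * c)
    return out

# Shots sorted by semantic saliency; the raw scores violate monotonicity at index 2.
print(pav_decreasing([3.0, 1.0, 2.0, 0.5]))  # → [3.0, 1.5, 1.5, 0.5]
```

Proximal steps for the nearly-isotonic penalty reduce to variants of exactly this pooling computation.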
Sound events often occur in unstructured environments where they exhibit wide variations in their frequency content and temporal structure. Convolutional neural networks (CNNs) are able to extract higher-level features that are invariant to local spectral and temporal variations. Recurrent neural networks (RNNs) are powerful in learning the longer-term temporal context in audio signals. CNNs and RNNs as classifiers have recently shown improved performance over established methods in various sound recognition tasks. We combine these two approaches in a convolutional recurrent neural network (CRNN) and apply it to a polyphonic sound event detection task. We compare the performance of the proposed CRNN method with CNN, RNN, and other established methods, and observe a considerable improvement on four different datasets consisting of everyday sound events.
Sleep stage and the Apnea-Hypopnea Index (AHI) are the most important metrics in the diagnosis of sleep disorders. In previous studies, these two tasks were usually implemented separately, which is both time and resource consuming. In this work, we propose a novel single-EEG-based collaborative learning network (EEG-CLNet) for simultaneous sleep staging and obstructive sleep apnea (OSA) event detection through multi-task collaborative learning. The EEG-CLNet regards the different tasks as a common unit and extracts features within intra-groups via both local parameter sharing and cross-task knowledge distillation, rather than merely sharing parameters or shortening the distance between tasks. Our approach has been validated on two datasets, achieving performance equal to or better than other methods. The experimental results show that our method achieves a performance gain of 1%-5% over the baseline. Compared to previous works in which two or even more models were required to perform sleep staging and OSA event detection, the EEG-CLNet reduces the total number of model parameters and helps the model mine the hidden relationships between the semantic information of different tasks. More importantly, it effectively alleviates the task bias problem in hard parameter sharing. As a consequence, this approach has notable potential as a solution for lightweight wearable sleep monitoring systems in the future.
Detecting hot social events (e.g., political scandals, momentous meetings, natural hazards, etc.) from social messages is crucial, as it highlights significant happenings and helps people understand the real world. On account of the streaming nature of social messages, incremental social event detection models that acquire, preserve, and update messages over time have attracted great attention. However, existing event detection methods for streaming social messages are generally confronted with ambiguous event features, dispersive text contents, and multiple languages, and hence suffer from low accuracy and generalization ability. In this paper, we present a novel reinForced, incremental and cross-lingual social Event detection architecture, namely FinEvent, for streaming social messages. Concretely, we first model social messages as heterogeneous graphs integrating both rich meta-semantics and diverse meta-relations, and convert them into weighted multi-relational message graphs. Second, we propose a new reinforced weighted multi-relational graph neural network framework, using a multi-agent reinforcement learning algorithm to select optimal aggregation thresholds across different relations/edges when learning social message embeddings. To address the long-tail problem in social event detection, a contrastive learning mechanism guided by a balanced sampling strategy is designed for incremental social message representation learning. Third, a new deep-reinforcement-learning-guided density-based spatial clustering model is designed to select the optimal minimum number of samples required to form a cluster and the optimal minimum distance between two clusters in social event detection tasks. Finally, we implement incremental social message representation learning based on knowledge preservation in the graph neural network and achieve transferable cross-lingual social event detection.
We conduct extensive experiments to evaluate FinEvent on Twitter streams, demonstrating significant and consistent improvements in model quality, with 14%-118%, 8%-170%, and 2%-21% performance increases on offline, online, and cross-lingual social event detection tasks, respectively.
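FinEvent's final stage tunes the two key parameters of density-based spatial clustering (the minimum samples per cluster and the neighborhood distance) via deep reinforcement learning. A minimal DBSCAN with those two parameters fixed by hand shows what the clustering step itself does; the 2-d "message embeddings" and parameter values are invented:

```python
def dbscan(points, eps, min_samples):
    """Minimal DBSCAN over 2-d points; returns one label per point (-1 = noise)."""
    def neighbors(i):
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if (xi - xj) ** 2 + (yi - yj) ** 2 <= eps ** 2]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_samples:
            labels[i] = -1                  # provisionally noise
            continue
        cluster += 1                        # start a new cluster at this core point
        labels[i] = cluster
        seeds = list(nbrs)
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster         # border point reclaimed from noise
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors(j)) >= min_samples:
                seeds.extend(neighbors(j))  # j is a core point: keep expanding
    return labels

# Two dense groups of message embeddings (two events) plus one outlier.
events = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1),
          (5, 5), (5.1, 5), (5, 5.1), (10, 0)]
print(dbscan(events, eps=0.5, min_samples=3))  # → [0, 0, 0, 0, 1, 1, 1, -1]
```

In FinEvent the equivalents of `eps` and `min_samples` are not hand-set but selected by an RL policy per stream.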
Automatically detecting anomalies in videos is a challenging problem due to non-deterministic definitions of abnormal events and the lack of sufficient training data. To address these issues, we propose an autoencoder coupled with an attention model to discover normal patterns in videos via adversarial learning. Abnormal events are detected by their divergence from the normal patterns, as measured by the reconstruction error produced by the autoencoder. To this end, we build an end-to-end trainable adversarial attention-based autoencoder network, called Ada-Net, that makes the reconstructed frames indistinguishable from the original frames. Ada-Net combines an autoencoder network with a GAN model that enhances the reconstruction ability of the autoencoder. To further improve the reconstruction performance, we integrate an attention model into the decoder to dynamically select informative parts of the encoded features for decoding. The attention mechanism helps preserve important information for learning intrinsic normal patterns. Evaluations on four challenging datasets, including the Subway, the UCSD Pedestrian, the CUHK Avenue, and the ShanghaiTech datasets, demonstrate the effectiveness of the proposed method.
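The reconstruction-error scoring rule can be shown with a linear "autoencoder" (PCA) fitted on normal samples only; frames far from the learned normal manifold reconstruct poorly and exceed the threshold. PCA here is a deliberate stand-in for the deep attention-based autoencoder, and the feature dimensions, data, and threshold are invented:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Normal" samples lie near a 2-d plane in 10-d space (stand-in for frame features).
basis = rng.normal(size=(2, 10))
normal_train = rng.normal(size=(500, 2)) @ basis + 0.05 * rng.normal(size=(500, 10))

# Linear autoencoder via PCA: encode = project onto the top-k principal components.
mean = normal_train.mean(axis=0)
_, _, Vt = np.linalg.svd(normal_train - mean, full_matrices=False)
components = Vt[:2]                       # (k, 10) learned "decoder" rows

def recon_error(x):
    z = (x - mean) @ components.T         # encode
    x_hat = z @ components + mean         # decode
    return float(np.sum((x - x_hat) ** 2))

normal_test = rng.normal(size=(1, 2)) @ basis + 0.05 * rng.normal(size=10)
abnormal = 3.0 * rng.normal(size=10)      # off-manifold sample

threshold = 1.0  # illustrative; in practice calibrated on validation scores
print(recon_error(normal_test[0]) < threshold < recon_error(abnormal))
```

Normal inputs reconstruct with tiny residuals while the off-manifold sample does not, which is precisely the anomaly signal Ada-Net thresholds.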
Increasing demand for larger touch screen panels (TSPs) places a greater energy burden on mobile systems that use conventional sensing methods. To mitigate this problem, taking advantage of the sparsity of touch events, this paper proposes a novel TSP readout system that achieves substantial energy savings by turning off the readout circuits when none of the sensors are activated. To this end, a novel ultra-low-power always-on event and region-of-interest detection scheme based on lightweight compressed sensing is proposed. Exploiting the proposed event detector, we present a context-aware TSP readout system that improves energy efficiency by up to 42×.
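The gating idea — take a few cheap compressed measurements of the sparse sensor frame and wake the full readout only when one exceeds a noise threshold — can be sketched as follows; the sensor count, measurement patterns, noise level, and threshold are all invented for the illustration:

```python
import numpy as np

rng = np.random.default_rng(7)

n_sensors, m = 64, 8  # many touch nodes, only a few compressed measurements
# Lightweight compressed sensing: random {0,1} summing patterns replace a full
# scan; the first pattern sums every node so no touch can be missed entirely.
Phi = rng.integers(0, 2, size=(m, n_sensors)).astype(float)
Phi[0, :] = 1.0

def event_present(frame, tau=0.5):
    """Always-on detector: wake the full readout only when some compressed
    measurement rises above the noise threshold tau."""
    y = Phi @ frame
    return bool(np.max(np.abs(y)) > tau)

idle = 0.001 * rng.normal(size=n_sensors)  # no touch: sensor noise only
touch = idle.copy()
touch[10] += 1.0                           # a single activated node (sparse event)

print(event_present(idle), event_present(touch))  # → False True
```

The energy saving comes from evaluating only the `m` sums in the idle state instead of reading out all `n_sensors` nodes every frame.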