With a deeper understanding of the security issues in steganography, coverless steganography has become a hotspot because it makes no modification to the carriers. However, existing coverless video steganographic algorithms have considered only a few types of video attacks. In this paper, a robust coverless video steganography method based on the similarity of inter-frames is proposed. First, a public video database is selected and preprocessed to construct a Secret Communication Video Database (SCVD). The similarity score between the first and last frames is calculated for video sorting to utilize the temporal characteristics of videos. After that, the mapping table between the secret information and the SCVD is designed for both senders and receivers. Finally, each secret information segment can be represented by one video sequence in the SCVD according to the mapping table to accomplish the data hiding and extraction. Experimental results show that the proposed method performs much better in capacity, robustness, and security than the state-of-the-art methods. It is worth mentioning that the proposed method overcomes the security issue of transmitting a large amount of auxiliary information in coverless video steganographic algorithms.
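The sorting and mapping steps described in this abstract can be illustrated with a minimal sketch. The abstract does not state the similarity metric or the table layout, so everything here is an assumption: frames are flattened 8-bit grayscale lists, similarity is one minus the normalized mean absolute pixel difference, and the names `frame_similarity`, `build_mapping`, `hide`, and `extract` are hypothetical.

```python
from typing import Dict, List, Tuple

Frame = List[int]  # assumed representation: flattened 8-bit grayscale frame


def frame_similarity(a: Frame, b: Frame) -> float:
    """Assumed metric: 1 minus the normalized mean absolute pixel difference."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and of equal size")
    mean_abs_diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 - mean_abs_diff / 255.0


def build_mapping(
    videos: Dict[str, List[Frame]], bits: int
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Sort videos by first/last-frame similarity and give each rank a bit pattern."""
    if len(videos) < 2 ** bits:
        raise ValueError("need at least 2**bits videos")
    # Ties are broken by video id so sender and receiver get the same order.
    ranked = sorted(
        videos,
        key=lambda vid: (frame_similarity(videos[vid][0], videos[vid][-1]), vid),
    )
    seg_to_video = {
        format(rank, f"0{bits}b"): vid for rank, vid in enumerate(ranked[: 2 ** bits])
    }
    video_to_seg = {vid: seg for seg, vid in seg_to_video.items()}
    return seg_to_video, video_to_seg


def hide(secret_bits: str, seg_to_video: Dict[str, str], bits: int) -> List[str]:
    """Represent each fixed-length secret segment by one video id."""
    if len(secret_bits) % bits:
        raise ValueError("secret length must be a multiple of the segment size")
    return [
        seg_to_video[secret_bits[i : i + bits]]
        for i in range(0, len(secret_bits), bits)
    ]


def extract(video_ids: List[str], video_to_seg: Dict[str, str]) -> str:
    """Recover the secret by looking each received video up in the shared table."""
    return "".join(video_to_seg[vid] for vid in video_ids)
```

Because both parties derive the table from the same database, only the selected video sequences need to be transmitted in this sketch; no carrier is modified.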
High-frame-rate (HFR) video is emerging in popular gaming applications to enhance the smooth experience perceived by end users. However, it is challenging to guarantee the delivery quality of HFR video in mobile cloud gaming scenarios because of the high transmission rate and limited wireless resources. To address this critical problem, we develop a novel transmission scheduling framework dubbed AdaPtive HFR vIdeo Streaming (APHIS). The term adaptive indicates this scheme's capability in dynamically adjusting the video traffic load and forward error correction (FEC) coding. First, we propose an online video frame selection algorithm to minimize the total distortion based on the network status, input video data, and delay constraint. Second, we introduce an unequal FEC coding scheme to provide differentiated protection for Intra (I) and Predicted (P) frames with low-latency cost. The proposed APHIS framework is able to appropriately filter video frames and adjust data protection levels to optimize the quality of HFR video streaming. We conduct extensive emulations in Exata involving HFR video encoded with H.264 codec. Experimental results show that APHIS outperforms the reference transmission schemes in terms of video peak signal-to-noise ratio, end-to-end delay, and goodput. Therefore, we recommend APHIS for delivering HFR video streaming in mobile cloud gaming systems.
Video anomaly recognition in smart cities is an important computer vision task that plays a vital role in smart surveillance and public safety but is challenging due to its diverse, complex, and infrequent occurrence in real-time surveillance environments. Various deep learning models use significant amounts of training data without generalization abilities and with huge time complexity. To overcome these problems, in the current work, we present an efficient light-weight convolutional neural network (CNN)-based anomaly recognition framework that is functional in a surveillance environment with reduced time complexity. We extract spatial CNN features from a series of video frames and feed them to the proposed residual attention-based long short-term memory (LSTM) network, which can precisely recognize anomalous activity in surveillance videos. The representative CNN features with the residual blocks concept in LSTM for sequence learning prove to be effective for anomaly detection and recognition, validating our model's effective usage in smart cities video surveillance. Extensive experiments on the real-world benchmark UCF-Crime dataset validate the effectiveness of the proposed model within complex surveillance environments and demonstrate that our proposed model outperforms state-of-the-art models with a 1.77%, 0.76%, and 8.62% increase in accuracy on the UCF-Crime, UMN and Avenue datasets, respectively.
This study examined the relationship between problematic video game play (PVGP), video game usage, and attention deficit hyperactivity disorder (ADHD) traits in an adult population. A sample of 205 healthy adult volunteers completed the Adult ADHD Self-Report Scale (ASRS), a video game usage questionnaire, and the Problem Video Game Playing Test (PVGT). A significant positive correlation was found between the ASRS and the PVGT. More specifically, inattention symptoms and time spent playing video games were the best predictors of PVGP. No relationship was found between frequency and duration of play and ADHD traits. Hyperactivity symptoms were not associated with PVGP. Our results suggest that there is a positive relationship between ADHD traits and problematic video game play. In particular, adults with higher levels of self-reported inattention symptoms could be at higher risk of PVGP.
High-Efficiency Video Coding (HEVC) is one of the most widely studied coding standards. It still uses the block-based hybrid coding framework of Advanced Video Coding (AVC), and compared to AVC, it can double the compression ratio while maintaining the same quality of reconstructed video. The quantization module is an important module in video coding. In the process of quantization, the quantization parameter is an important factor in determining the bitrate, especially in the case of limited channel bandwidth. It is particularly important to select a reasonable quantization parameter to make the bitrate as close as possible to the target bitrate. Aiming at the problem of unreasonable selection of quantization parameters in codecs, this paper proposes using a differential evolution algorithm to assign quantization parameter values to the coding tree unit (CTU) in each frame of 360-degree panoramic video based on HEVC so as to strike a balance between bitrate and distortion. Firstly, the number of CTU rows in a 360-degree panoramic video frame is considered as the dimension of the optimization problem. Then, a trial vector is obtained by randomly selecting the vectors in the population for mutation and crossover. In the mutation step, the algorithm generates a new parameter vector by adding the weighted difference between two population vectors to a third vector. The elements in the new parameter vector are then selected according to the crossover rate. Finally, the trial vector is regarded as the quantization parameter of each CTU in the CTU row to encode, and the vector with the least rate distortion is selected. The algorithm will produce the optimal quantization parameter combination for the current video. The experimental results show that compared to the benchmark algorithm of HEVC reference software HM-16.20, the proposed algorithm can provide a bit saving of 1.86%, while the peak signal-to-noise ratio (PSNR) can be improved by 0.07 dB.
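The mutation, crossover, and selection steps described in this abstract follow the classic differential evolution pattern, which can be sketched as follows. This is a generic illustration under stated assumptions, not the paper's implementation: the `cost` callback stands in for an actual encoder run that would return a rate-distortion cost, and the function names, population size, weight, crossover rate, and generation count are all hypothetical defaults. One vector element represents the quantization parameter (QP) of one CTU row.

```python
import random
from typing import Callable, List

QP_MIN, QP_MAX = 0, 51  # HEVC QP range for 8-bit video


def clamp_qp(value: float) -> int:
    """Round a candidate value and keep it inside the valid QP range."""
    return max(QP_MIN, min(QP_MAX, int(round(value))))


def differential_evolution(
    cost: Callable[[List[int]], float],  # assumed stand-in for an encoder run
    dim: int,                            # number of CTU rows in the frame
    pop_size: int = 12,                  # must be at least 4
    weight: float = 0.5,
    crossover_rate: float = 0.9,
    generations: int = 60,
    seed: int = 0,
) -> List[int]:
    rng = random.Random(seed)
    population = [
        [rng.randint(QP_MIN, QP_MAX) for _ in range(dim)] for _ in range(pop_size)
    ]
    scores = [cost(individual) for individual in population]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct population vectors other than the target.
            r1, r2, r3 = rng.sample([j for j in range(pop_size) if j != i], 3)
            # Mutation: add the weighted difference of two vectors to a third.
            mutant = [
                clamp_qp(population[r1][d] + weight * (population[r2][d] - population[r3][d]))
                for d in range(dim)
            ]
            # Crossover: take each element from the mutant with probability
            # crossover_rate; one forced index guarantees some change.
            forced = rng.randrange(dim)
            trial = [
                mutant[d] if (rng.random() < crossover_rate or d == forced) else population[i][d]
                for d in range(dim)
            ]
            # Selection: keep the trial vector only if it is not worse.
            trial_score = cost(trial)
            if trial_score <= scores[i]:
                population[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return population[best]
```

In a real setting each `cost` evaluation would require encoding the frame with the trial QPs, so the population size and generation count directly determine the encoding overhead.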
We describe PRISM, a video coding paradigm based on the principles of lossy distributed compression (also called source coding with side information or Wyner-Ziv coding) from multiuser information theory. PRISM represents a major departure from conventional video coding architectures (e.g., the MPEGx, H.26x families) that are based on motion-compensated predictive coding, with the goal of addressing some of their architectural limitations. PRISM allows for two key architectural enhancements: (1) inbuilt robustness to "drift" between encoder and decoder and (2) the feasibility of a flexible distribution of computational complexity between encoder and decoder. Specifically, PRISM enables transfer of the computationally expensive video encoder motion-search module to the video decoder. Based on this capability, we consider in this paper an instance of PRISM corresponding to a near reversal in codec complexities with respect to today's codecs (leading to a novel light encoder and heavy decoder paradigm). We present encouraging preliminary results on real-world video sequences, particularly in the realm of transmission losses, where PRISM exhibits the characteristic of rapid recovery, in contrast to contemporary codecs. This makes PRISM an attractive candidate for wireless video applications.
There is an abundance of digital video content due to the cloud’s phenomenal growth and security footage; it is therefore essential to summarize these videos in data centers. This paper offers innovative approaches to the problem of key frame extraction for the purpose of video summarization. Our approach includes the extraction of feature variables from the bit streams of coded videos, followed by optional stepwise regression for dimensionality reduction. Once the features are extracted and their dimensionality is reduced, we apply innovative frame-level temporal subsampling techniques, followed by training and testing using deep learning architectures. The frame-level temporal subsampling techniques are based on cosine similarity and the PCA projections of feature vectors. We create three different learning architectures by utilizing LSTM networks, 1D-CNN networks, and random forests. The four most popular video summarization datasets, namely, TVSum, SumMe, OVP, and VSUMM, are used to evaluate the accuracy of the proposed solutions. This includes the precision, recall, F-score measures, and computational time. It is shown that the proposed solutions, when trained and tested on all subjective user summaries, achieved F-scores of 0.79, 0.74, 0.88, and 0.81, respectively, for the aforementioned datasets, showing clear improvements over prior studies.
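One plausible reading of cosine-similarity-based temporal subsampling is sketched below. The abstract does not give the exact rule, so this is an assumption: a frame is kept only when its feature vector differs enough from the last kept frame, and the function names and the threshold value are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def subsample_frames(
    features: Sequence[Sequence[float]], threshold: float = 0.95
) -> List[int]:
    """Return indices of frames whose features differ from the last kept frame."""
    if not features:
        return []
    kept = [0]  # always keep the first frame as the initial reference
    for idx in range(1, len(features)):
        # Skip frames that are nearly identical to the most recently kept one.
        if cosine_similarity(features[kept[-1]], features[idx]) < threshold:
            kept.append(idx)
    return kept
```

A higher threshold keeps more frames; a lower one subsamples more aggressively, which reduces the input size for the downstream learning architectures at the risk of dropping short events.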
Video affective content analysis has been an active research area in recent decades, since emotion is an important component in the classification and retrieval of videos. Video affective content analysis can be divided into two approaches: direct and implicit. Direct approaches infer the affective content of videos directly from related audiovisual features. Implicit approaches, on the other hand, detect affective content from videos based on an automatic analysis of a user's spontaneous response while consuming the videos. This paper first proposes a general framework for video affective content analysis, which includes video content, emotional descriptors, and users' spontaneous nonverbal responses, as well as the relationships between the three. Then, we survey current research in both direct and implicit video affective content analysis, with a focus on direct video affective content analysis. Lastly, we identify several challenges in this field and put forward recommendations for future research.
The application of 360° videos has attracted the attention of educators and researchers, as it appears to be an approachable option to mediate complete environments in educational settings. However, challenges emerge from the perspective of educational psychology. Learning-irrelevant cognitive strain might be imposed because it is necessary to navigate through spherical material. However, these potential downsides could be compensated for by using signaling techniques. In a two (macrolevel vs. no macrolevel signaling) × two (microlevel vs. no microlevel signaling) factorial between-subjects design plus control group, 215 fifth- and sixth-grade students will watch a 360° video about visual and behavioral characteristics of animals. Learning outcomes, cognitive load, disorientation, and presence will be investigated. It is expected that macrolevel signaling will enhance learning and presence and reduce cognitive load and disorientation. Microlevel signaling will have comparable advantages, but these effects will be more pronounced when macrolevel signaling is implemented.