Deep Reinforcement Learning (DRL) has been applied effectively in various complex environments, such as playing video games. In many game environments, DeepMind’s baseline Deep Q-Network (DQN) game agents performed at a level comparable to that of humans. However, these DRL models require many experience samples to learn and lack adaptability to environmental changes and increased complexity. In this study, we propose the Attention-Augmented Deep Q-Network (AADQN), which incorporates a combined top-down and bottom-up attention mechanism into the DQN game agent to highlight task-relevant features of the input. Our AADQN model uses particle-filter-based top-down attention that dynamically teaches an agent how to play a game by focusing on the most task-relevant information. Evaluating our agent across eight games of varying complexity in the Atari 2600 domain, we demonstrate that our model surpasses the baseline DQN agent. Notably, our model achieves greater flexibility and higher scores in fewer time steps. Across the eight game environments, AADQN achieved an average relative improvement of 134.93%. Pong and Breakout improved by 9.32% and 56.06%, respectively, while the more intricate SpaceInvaders and Seaquest showed even larger gains of 130.84% and 149.95%, respectively. This study reveals that AADQN is productive in complex environments and produces slightly better results in elementary contexts.
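The combined top-down/bottom-up weighting described above can be sketched in a few lines. Everything here is an illustrative assumption rather than the paper's actual architecture: the function names, the linear Q head, and the fixed mixing weight `alpha` are placeholders, and the particle-filter weights are taken as given.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_augmented_q(features, particle_weights, w_q, alpha=0.5):
    """Q-values from attention-weighted region features.
    features: (n_regions, d) per-region feature vectors
    particle_weights: (n_regions,) top-down relevance from a particle filter
    w_q: (d, n_actions) linear Q head; alpha mixes the two attention sources."""
    bottom_up = softmax(np.linalg.norm(features, axis=1))    # saliency-like mask
    top_down = particle_weights / particle_weights.sum()
    attn = alpha * top_down + (1 - alpha) * bottom_up        # combined attention
    context = attn @ features                                # weighted pooling
    return context @ w_q                                     # one Q-value per action
```

The key idea the sketch captures is that the top-down term reweights the same features a plain DQN would see, so task-irrelevant regions contribute less to the Q-estimate.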
Eye trackers are non-invasive devices that can be integrated into VR head-mounted displays, and the data they seamlessly provide can be instrumental in mitigating cybersickness. However, the connection of eye activity to cybersickness has not been studied in a broad sense, where the effects of different VR content factors causing cybersickness are examined together. Addressing this gap, we present an extensive investigation of the relationship between eye activity and cybersickness in response to three major cybersickness factors – navigation speed, scene complexity and stereoscopic rendering – simulated at varied severity. Our findings reveal multiple links between eye-activity features and user-reported discomfort, the most significant of which are associated with speed levels, highlighting the relationship between the feeling of vection and eye activity. The evaluation also established significant differences in eye-activity response across stimulus types and time spent in VR, suggesting an accumulation effect. Furthermore, regression analysis indicates that blink frequency can serve as a significant predictor of cybersickness, regardless of time spent in VR.
•Conjoint evaluation of the effects of speed, complexity and stereoscopic VR factors.•Cybersickness assessed by SSQ and single-question discomfort measures.•Established screen time and factor effect on eye-activity in case of cybersickness.•Blink frequency detected as a reliable cybersickness predictor.
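The final highlight, blink frequency as a cybersickness predictor, can be illustrated with a minimal one-feature logistic regression. The feature extraction, the gradient-descent fit, and all names below are hypothetical stand-ins for the study's actual regression analysis, shown only to make the predictor idea concrete.

```python
import numpy as np

def blink_frequency(blink_times, duration_s):
    """Blinks per minute over a recording of the given duration in seconds."""
    return 60.0 * len(blink_times) / duration_s

def fit_blink_predictor(bpm, sick, lr=0.5, steps=3000):
    """Fit a one-feature logistic regression (blinks/min -> probability of
    reported cybersickness) by plain gradient descent; returns a predictor."""
    bpm = np.asarray(bpm, float)
    sick = np.asarray(sick, float)
    mu, sd = bpm.mean(), bpm.std()
    z = (bpm - mu) / sd                        # standardize the feature
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w * z + b)))
        w -= lr * np.mean((p - sick) * z)      # gradient of the log-loss
        b -= lr * np.mean(p - sick)
    def predict(x):
        return 1.0 / (1.0 + np.exp(-(w * (np.asarray(x, float) - mu) / sd + b)))
    return predict
```

A single standardized feature keeps the model interpretable, which matters when the goal is to name one physiological marker rather than maximize accuracy.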
Cybersickness remains a major issue that can severely impact the user’s comfort, performance, and enjoyment of VR. While there are various approaches to combat cybersickness, only a few have been developed for real-time mitigation based on user biofeedback, and these do not aim to distinguish causal factors and apply mitigation accordingly. In this paper, we propose a novel real-time cybersickness detection and mitigation system (CDMS) that leverages a two-stage shallow convolutional network to detect cybersickness and identify the contributing factors from the user’s electroencephalogram (EEG) activity. Based on the output of the convolutional network, CDMS adaptively modifies the parameters of the identified factor in the generated virtual environment to mitigate the onset of cybersickness. For this, we conjointly consider three major content factors of cybersickness: navigation speed, scene complexity, and stereoscopic rendering. To train the network, we collected EEG data and self-reports of cybersickness from subjects by simulating these factors at varying degrees of severity. To evaluate the performance of CDMS, we conducted a user study comprising one CDMS session and two different control sessions. The results show that users experienced significantly less cybersickness after the CDMS session. CDMS also effectively avoided false positives that could otherwise degrade the VR experience.
•Introducing a real-time Cybersickness Detection and Mitigation System (CDMS).•Leveraging mobile EEG acquisition to capture brain activity.•Considering speed, scene complexity and stereoscopic factors of cybersickness.•CDMS identifies the contributing factor and adaptively adjusts its parameters.•Provides significant reduction in cybersickness and avoidance of false alarms.
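The two-stage detect-then-identify cascade can be sketched as a pipeline. The `detect` and `identify` callables below stand in for the paper's shallow convolutional networks, and the parameter-halving mitigation rule is a made-up placeholder; only the cascade structure (no stage 2 unless stage 1 fires, which is what avoids false alarms) follows the abstract.

```python
import numpy as np

FACTORS = ("speed", "complexity", "stereo")

def two_stage_cdms(eeg_window, detect, identify):
    """Stage 1 decides whether the EEG window indicates cybersickness;
    only then does stage 2 name the contributing content factor."""
    if detect(eeg_window) < 0.5:       # stage 1: no sickness detected
        return None                    # skip mitigation, avoiding false alarms
    scores = identify(eeg_window)      # stage 2: one score per factor
    return FACTORS[int(np.argmax(scores))]

def mitigate(factor, params):
    """Toy mitigation: scale back only the parameter of the identified factor."""
    if factor is not None:
        params = dict(params, **{factor: params[factor] * 0.5})
    return params
```

Keeping detection and factor identification separate also means stage 2 never has to learn a "no sickness" class, which simplifies its training data.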
Recognition of human actions using machine learning requires extensive datasets to develop robust models. Nevertheless, obtaining real-world data presents challenges due to the costly and time-consuming process involved. Additionally, existing datasets mostly contain indoor videos due to the challenges of capturing pose data outdoors. Synthetic data have been used to overcome these difficulties, yet the currently available synthetic datasets for human action recognition lack photorealism and diversity in their features. Addressing these shortcomings, we develop the NOVAction engine to generate highly diversified and photorealistic synthetic human action sequences. We use NOVAction to create the NOVAction23 dataset comprising 25,415 human action sequences with corresponding poses and labels (available at https://github.com/celikcan-cglab/NOVAction23). In NOVAction23, the performed motions and viewpoints are varied on the fly through procedural generation, to ensure that, for a given action class, each generated sequence features a distinct motion performed by one of the 1,105 synthetic humans captured from a unique viewpoint. Moreover, each synthetic human is unique in terms of body shape (height and weight), skin tone, gender, hair, facial hair, clothing, shoes and accessories. To further increase data diversity, the motion sequences are rendered under various weather conditions and at different times of day, across three outdoor and two indoor settings. We evaluate NOVAction23 by training three state-of-the-art recognizers on it, in addition to the NTU 120 dataset, and corroborating using real-world videos from YouTube. Our results confirm that the NOVAction23 dataset can improve the performance of state-of-the-art human action recognition.
•NOVAction engine automatically generates massively diverse human action data.•NOVAction23, created with NOVAction, is a novel dataset of human action data.•Features 25,415 unique synthetic human action sequences with poses and labels.•Performed by 1,105 synthetic humans, each captured from a unique viewpoint.•Improved human action recognition performance using state-of-the-art classifiers.
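The on-the-fly procedural variation the abstract describes amounts to sampling a sequence specification and rejecting repeats so each (performer, viewpoint) pair within a class stays unique. The sketch below is a hypothetical illustration of that sampling loop, not the NOVAction API; only the counts (1,105 humans) and the varied attributes (weather, time of day, viewpoint) come from the abstract.

```python
import random

WEATHER = ("clear", "rain", "snow", "fog")       # assumed condition names
TIMES = ("morning", "noon", "evening", "night")  # assumed time-of-day bins

def sample_sequence(action_class, used_pairs, n_humans=1105, seed=None):
    """Sample one action-sequence spec, rejecting any (human, viewpoint)
    combination already used for this class so every sequence is distinct."""
    rng = random.Random(seed)
    while True:
        human = rng.randrange(n_humans)
        viewpoint = (rng.uniform(0, 360), rng.uniform(-10, 60))  # azimuth, elevation
        key = (human, round(viewpoint[0], 1), round(viewpoint[1], 1))
        if key not in used_pairs:
            used_pairs.add(key)
            return {"class": action_class, "human": human,
                    "viewpoint": viewpoint,
                    "weather": rng.choice(WEATHER),
                    "time_of_day": rng.choice(TIMES)}
```

Rejection sampling over a shared `used_pairs` set is a simple way to get the per-class uniqueness guarantee without enumerating the whole combination space up front.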
Our paper introduces a novel approach for controlling stereo camera parameters in interactive 3D environments in a way that specifically addresses the interplay of binocular depth perception and the saliency of scene contents. Our proposed Dynamic Attention-Aware Disparity Control (DADC) method produces depth-rich stereo rendering that improves viewer comfort through joint optimization of stereo parameters. While constructing the optimization model, we consider the importance of scene elements, as well as their distance to the camera and the locus of attention on the display. Our method also optimizes the depth effect of a given scene by considering the individual user’s stereoscopic disparity range, and maintains a comfortable viewing experience by controlling the accommodation/convergence conflict. We validate our method in a formal user study, which also reveals its advantages, such as superior quality and practical relevance.
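The kind of joint stereo-parameter fitting described can be illustrated with the standard shifted-sensor disparity model d(z) = f·b·(1/c − 1/z). The closed-form solve below, which maps a salient depth range exactly onto a viewer's comfortable disparity budget, is a simplification for illustration; DADC additionally weighs element importance and attention, which this sketch omits.

```python
def stereo_params(z_near, z_far, d_min, d_max, f=1.0):
    """Solve for interaxial separation b and convergence distance c so the
    salient depth range [z_near, z_far] maps onto the comfort budget
    [d_min, d_max] (d_min < 0 < d_max, in image-plane units)."""
    b = (d_max - d_min) / (f * (1.0 / z_near - 1.0 / z_far))
    c = 1.0 / (d_max / (f * b) + 1.0 / z_far)
    return b, c

def disparity(z, b, c, f=1.0):
    """Screen disparity of a point at depth z under the shifted-sensor model:
    negative (crossed) in front of the convergence plane, positive behind it."""
    return f * b * (1.0 / c - 1.0 / z)
```

Because d(z) is monotone in z, pinning the two extreme depths to the budget endpoints guarantees every depth in between stays inside the comfort zone.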
This study evaluated the interplay between environmental cues in virtual reality (VR) and cybersickness as experienced by users of head-mounted displays (HMDs). Utilizing electroencephalogram (EEG) data and self-reported discomfort measures, the effects of three major VR cues – speed, scene complexity, and stereoscopic rendering – on cybersickness were examined, with the latter being of particular interest as it had not previously been studied explicitly in the context of VR-HMDs. Self-reported discomfort was assessed through in-VR single-item queries and post-VR simulator sickness questionnaires, accounting for immediate and persistent cybersickness, respectively, and over three experiment sessions, accounting for the effects of accumulation. Analysis revealed connections that indicate a relationship between EEG data and the presence of cybersickness for all three cue types. Significant differences were observed in EEG relative power changes between the trials where cybersickness was and was not reported. EEG relative power changes were also linked to both immediate and persistent cybersickness, especially in the theta and gamma frequency bands. The increase in immediate discomfort with the stereoscopic rendering cues over successive sessions suggests a decrease in tolerance to these effects over time.
•Evaluating the effects of speed, complexity and stereoscopic cues on cybersickness.•As reflected in brain activity feedback and self-reported measures.•Accounting for both immediate and persistent cybersickness.•Depicting the relationship between EEG markers and cybersickness.
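The EEG relative power changes analyzed above are typically computed per frequency band as band power divided by total power. A minimal single-channel periodogram version follows; the band edges are conventional choices and the 1–45 Hz normalization range is an assumption, not necessarily what the study used.

```python
import numpy as np

BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}

def relative_band_power(signal, fs):
    """Relative power per EEG band from one channel window: power in each
    band divided by total power across 1-45 Hz, via the raw periodogram."""
    freqs = np.fft.rfftfreq(len(signal), 1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2
    total = psd[(freqs >= 1) & (freqs <= 45)].sum()
    return {name: psd[(freqs >= lo) & (freqs < hi)].sum() / total
            for name, (lo, hi) in BANDS.items()}
```

Normalizing by total power is what makes the measure comparable across sessions and subjects, where absolute amplitudes vary with electrode impedance.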
Robust visual tracking plays a vital role in many areas such as autonomous cars, surveillance and robotics. Recent trackers were shown to achieve adequate results under normal tracking scenarios with clear weather, standard camera setups and standard lighting conditions. Yet, the performance of these trackers, whether correlation filter-based or learning-based, degrades under adverse weather conditions. The lack of videos with such weather conditions in the available visual object tracking datasets is the prime reason behind the low performance of learning-based tracking algorithms. In this work, we provide a new person tracking dataset of real-world sequences (PTAW172Real) captured under foggy, rainy and snowy weather conditions to assess the performance of current trackers. We also introduce a novel person tracking dataset of synthetic sequences (PTAW217Synth), procedurally generated by our NOVA framework, spanning the same weather conditions in varying severity to mitigate the problem of data scarcity. Our experimental results demonstrate that the performance of state-of-the-art deep trackers under adverse weather conditions can be boosted when the available real training sequences are complemented with our synthetically generated dataset during training.
•A novel real dataset for pedestrian tracking under adverse weather conditions•Showed state-of-the-art trackers perform poorly under adverse weather conditions•Procedurally generated a synthetic dataset covering adverse weather conditions•Our synthetic dataset boosts tracker performance on adverse weather videos
Extended exposure to virtual reality displays has been linked to the emergence of cybersickness, characterized by symptoms such as nausea, dizziness, fatigue, and disruptions in eye movements. The main objective of our study is to examine the effects of real-time fine-tuning of stereo parameters and blurriness in virtual reality on the discomfort level of users who are experiencing motion sickness triggered by the display. Our hypothesis is that by dynamically correcting the rendering settings, the symptoms of motion sickness can be relieved and the overall VR user experience improved. Our methodology commences with a prediction model for the comfort level of the viewer based on gaze parameters such as pupil diameter, blink count, gaze position, and fixation duration. We then propose a method to dynamically adapt the stereoscopic rendering parameters according to the predicted comfort level of the viewer.
Despite remarkable advances in virtual reality (VR) technologies, serious challenges remain in making extended VR sessions with head-mounted displays (HMDs) thoroughly comfortable. 3D stereo imagery can cause discomfort and eye fatigue due to poor stereo camera settings that result in extreme disparities and vergence-accommodation conflicts. The default stereoscopic parameters of consumer HMDs produce images with shallow depth to circumvent these issues. In this work, we propose a methodology that utilizes the gaze-directed and visual saliency-guided paradigms for automatic stereo camera control in real-time interactive VR by employing the basics of stereo grading. We evaluate these two approaches at different levels of interaction, first through a user study and then through a performance benchmark. The results show that the gaze-directed approach outperforms the saliency-guided approach in the VEs tested, and both methods convey a better overall depth feeling than the default HMD setting without hindering visual comfort. It is also shown that both approaches lead to a significant overall enhancement of the VR experience in the more interactive VE.
•A methodology for gaze-directed and saliency-guided stereo camera control in VR.•Gaze-directed approach outperforms the saliency-guided approach.•Both approaches convey a better overall depth feeling than the default HMD setting.•Both lead to a significantly improved experience in the more interactive VE.•Both can surpass 80 FPS using a modest setup.
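The core update of the gaze-directed paradigm, following the fixated depth smoothly rather than snapping to it, can be sketched as an exponential ease. The time constant `tau` and the 90 FPS frame time are assumed values for illustration, not figures from the paper.

```python
import math

def update_convergence(prev_c, gaze_depth, tau=0.2, dt=1.0 / 90.0):
    """One per-frame step of gaze-directed stereo grading: exponentially
    ease the convergence distance toward the depth of the fixated object
    (time constant tau seconds, 90 FPS frame time assumed) so the stereo
    grade never jumps when the gaze does."""
    k = 1.0 - math.exp(-dt / tau)          # frame-rate-independent blend factor
    return prev_c + k * (gaze_depth - prev_c)
```

Deriving the blend factor from `dt` keeps the easing speed consistent even if the frame rate fluctuates, which matters for the comfort goals described above.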
Preliminary studies have provided promising results on the feasibility of virtual reality (VR) interventions for Obsessive-Compulsive Disorder. The present study investigated whether VR scenarios that were developed for contamination concerns evoke anxiety, disgust, and the urge to wash in individuals with high (n = 33) and low (n = 33) contamination fear. In addition, the feasibility of VR exposure in inducing disgust was examined by testing the mediator role of disgust in the relationship between contamination anxiety and the urge to wash. Participants were immersed in virtual scenarios with varying degrees of dirtiness and rated their level of anxiety, disgust, and the urge to wash after performing the virtual tasks. Data were collected between September and December 2019. The participants with high contamination fear reported higher contamination-related ratings than those with low contamination fear. The significant main effect of dirtiness indicated that anxiety and disgust levels increased with the overall dirtiness of the virtual scenarios in both high and low contamination fear groups. Moreover, disgust elicited by VR mediated the relationship between contamination fear and the urge to wash. The findings demonstrate the feasibility of VR in eliciting the emotional responses necessary for conducting exposure in individuals with high contamination fear. In conclusion, VR can be used as an alternative exposure tool in the treatment of contamination-based OCD.