Mobile virtual reality (VR) is becoming increasingly popular and accessible to anyone who holds a smartphone. In particular, digital didactics can take advantage of natural interaction and immersion in virtual environments, starting from primary education. This paper investigates the problem of enhancing music learning in primary education through the use of mobile VR. To this end, technical and methodological frameworks were developed and tested with two classes in the last year of a primary school (10-year-old children). The classes were involved in an evaluation study on music genre identification and learning with a multi-platform mobile application called VR4EDU. Students were immersed in music performances of different genres (e.g., classical, country, jazz, and swing), navigating inside several musical rooms. The evaluation of the didactic protocol shows a statistically significant improvement in learning genre characterization (i.e., typical instruments and their spatial arrangements on stage) compared to traditional lessons with printed materials and passive listening. These results show that the use of mobile VR technologies in synergy with traditional teaching methodologies can improve the music learning experience in primary education, in terms of active listening, attention, and time. The inclusion of pupils with certified special needs strengthened our results.
Individual head-related transfer functions (HRTFs) are critical for binaural spatial audio rendering. In contrast to anthropometric parameters and pinnae images, 3D meshes allow for a more direct and comprehensive representation of the anthropometric structure, which provides highly effective inputs for modeling individualized HRTFs. This paper presents a neural network-based method for predicting individualized HRTFs in full space based on 3D meshes. Unlike many previous methods that estimate HRTF spectra at sampling grids or frequencies separately, the proposed model predicts the HRTF spectra of each vertical plane by considering the spectral correlation and continuity across adjacent sampling grids and frequencies. Evaluation results indicate that the proposed method enhances the prominence of peaks and notches in the obtained HRTF spectra and improves the speed and accuracy of HRTF individualization. The log spectral distortion of the proposed method is lower than that of state-of-the-art methods using anthropometric parameters and pinnae images. Further evaluation confirms that the proposed method requires significantly fewer points in 3D meshes when compared to numerical simulation methods. The evaluation based on localization models demonstrates that the HRTFs predicted by the proposed method are perceptually similar to the measured HRTFs.
•Performance of the neural network-based method for HRTF individualization using 3D meshes.
•The vertical-plane shared feature to represent the features of HRTFs across all vertical planes.
•The influence of varying mesh densities on performance.
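The log spectral distortion (LSD) used above as an objective measure can be illustrated with a minimal sketch. The function name, the averaging over all frequency bins, and the toy input values are illustrative assumptions, not the paper's exact evaluation protocol:

```python
import numpy as np

def log_spectral_distortion(h_pred, h_meas):
    """Log spectral distortion (in dB) between predicted and measured
    HRTF magnitude spectra sampled on the same frequency grid.

    h_pred, h_meas: arrays of linear magnitude values.
    """
    h_pred = np.asarray(h_pred, dtype=float)
    h_meas = np.asarray(h_meas, dtype=float)
    diff_db = 20.0 * np.log10(h_pred / h_meas)  # per-bin error in dB
    return np.sqrt(np.mean(diff_db ** 2))       # RMS over frequency

# Identical spectra give zero distortion; a uniform factor-of-2
# magnitude error corresponds to an LSD of about 6.02 dB.
print(log_spectral_distortion([1.0, 0.5, 2.0], [1.0, 0.5, 2.0]))  # 0.0
print(log_spectral_distortion([2.0, 1.0], [1.0, 0.5]))            # ~6.02
```

In practice the LSD is typically averaged over directions and subjects as well; the sketch shows only the per-direction core of the metric.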
This paper presents the progress made during the AURA project, funded by the Creative Europe programme with project partners from Germany, Italy, and Ukraine. The project aims to create auralized applications for three music venues, one in each of the project countries, namely the Konzerthaus Berlin, the Teatro del Maggio in Florence, and the Opera House Lviv. Each will be digitally recreated and auralized before being used to conduct case studies. This paper gives insights into current digitalization and auralization techniques. The results of a digital survey will be laid out, and the conception and implementation of a first auralized prototype using a hand-modeled 3D object from the Great Hall of the Konzerthaus Berlin will be demonstrated. Furthermore, the usage of auralization for touristic purposes will be investigated using artificial intelligence for an audience preference analysis. A conclusion will be drawn and a short outlook on the ongoing course of the AURA project will be given.
People often use audio-only communication to connect with others. Spatialization of audio has been previously found to improve immersion, presence, and social presence during conversations. We propose that spatial audio improves social connectedness between dyads. Participants engaged in three 8-min semi-structured conversations with an acquainted partner in three conditions: in-person communication, monaural audio communication, and spatial audio communication. Using Media Naturalness Theory as our theoretical framework, we found that the use of spatial audio benefited aspects of social connectedness. While in-person communication yielded the greatest social connectedness, spatial audio better facilitated social connectedness than traditional monaural communication. Spatial audio improved feelings of being physically in the same room and being on the same wavelength and produced more nonverbal behaviors associated with rapport building than monaural communication.
Real-life acoustic scenes may be recorded with microphone arrays for spatial audio applications, especially for the purpose of reproducing binaural signals for headphone listening. However, the presence of noise and interference may necessitate preprocessing to enhance the desired signal and improve the listener experience. Various methods have been developed to reduce noise while preserving the desired signal component with minimal distortion. The additional challenges posed by time-varying acoustic scenes are commonly addressed by segmenting the recorded signals into short time frames. The short-time Fourier transform (STFT) is then employed together with a multi-channel Wiener filter (MWF), assuming the multiplicative transfer function (MTF) approximation. This approximation may not hold in the presence of long reverberation times and/or short STFT frames, so alternative techniques are required. This paper explores MWF-based enhancement in time-varying acoustic scenes where the MTF approximation is inapplicable, both analytically and experimentally with normal-hearing listeners. The investigated scene comprises a single desired source in a reverberant environment, and the impact of frame length and acoustic parameters on the rank of the spatial covariance matrix is studied. It is revealed that superior results in terms of reduced distortion and improved listener experience are achieved when using a full-rank spatial covariance matrix.
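The MWF formulation discussed above can be sketched for a single STFT frequency bin. This is a minimal illustration with synthetic covariances; the variable names and the reference-channel formulation are assumptions, not the paper's exact implementation. The key point is that the desired-signal covariance R_s is used as-is, so the filter remains valid when R_s is full rank rather than rank 1:

```python
import numpy as np

def mwf(R_y, R_n, ref=0):
    """Multi-channel Wiener filter for one STFT frequency bin.

    R_y: (M, M) spatial covariance of the noisy microphone signals.
    R_n: (M, M) spatial covariance of the noise.
    ref: index of the reference microphone.
    """
    R_s = R_y - R_n                       # desired-signal covariance estimate
    e_ref = np.zeros(R_y.shape[0]); e_ref[ref] = 1.0
    # w = R_y^{-1} R_s e_ref : MMSE estimate of the desired signal
    # as observed at the reference microphone.
    return np.linalg.solve(R_y, R_s @ e_ref)

# Toy example: 3 microphones, one coherent source plus white sensor noise.
rng = np.random.default_rng(0)
d = rng.standard_normal(3) + 1j * rng.standard_normal(3)  # steering vector
R_s = np.outer(d, d.conj())   # rank-1 source covariance (no reverberation)
R_n = 0.1 * np.eye(3)         # white sensor noise
w = mwf(R_s + R_n, R_n)
print(w.shape)  # (3,)
```

With long reverberation or short frames, R_s acquires significant additional rank; nothing in the filter expression changes, which is what makes the full-rank formulation attractive in that regime.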
There is increasing effort to characterize the soundscapes around us so that we can design more compelling and immersive experiences. This review paper focuses on the challenges and opportunities around sound perception, with a particular focus on spatial sound perception in a virtual reality (VR) cityscape. We review how research on temporal aspects has recently been extended to evaluating spatial factors when designing soundscapes. In particular, we discuss key findings on the human capability of localizing and distinguishing spatial sound cues for different technical setups. We highlight studies carried out in both real-world and virtual reality settings to evaluate spatial sound perception. We conclude this review by highlighting the opportunities offered by VR technology and the remaining open questions for virtual soundscape designers, especially with the advances in spatial sound stimulation.
In this paper, we present a novel multi-modal attention guidance method designed to address the challenges of turn-taking dynamics in meetings and enhance group conversations within virtual reality (VR) environments. Recognizing the difficulties posed by a confined field of view and the absence of detailed gesture tracking in VR, our proposed method aims to mitigate the challenges of noticing new speakers attempting to join the conversation. This approach tailors attention guidance, providing a nuanced experience for highly engaged participants while offering subtler cues for those less engaged, thereby enriching the overall meeting dynamics. Through group interview studies, we gathered insights to guide our design, resulting in a prototype that employs light as a diegetic guidance mechanism, complemented by spatial audio. The combination creates an intuitive and immersive meeting environment, effectively directing users' attention to new speakers. An evaluation study, comparing our method to state-of-the-art attention guidance approaches, demonstrated significantly faster response times (p < 0.001), heightened perceived conversation satisfaction (p < 0.001), and preference (p < 0.001) for our method. Our findings contribute to the understanding of design implications for VR social attention guidance, opening avenues for future research and development.
The domain of spatial audio comprises methods for capturing, processing, and reproducing audio content that contains spatial information. Data-based methods are those that operate directly on the spatial information carried by audio signals. This is in contrast to model-based methods, which impose spatial information from, for example, metadata such as the intended position of a source onto signals that are otherwise free of spatial information. Signal processing has traditionally been at the core of spatial audio systems, and it continues to play a very important role. The advent of deep learning in many closely related fields has put the focus on the potential of learning-based approaches for the development of data-based spatial audio applications. This article reviews the most important application domains of data-based spatial audio, including well-established methods that employ conventional signal processing, while paying special attention to the most recent achievements that make use of machine learning. Our review is organized around the topology of the spatial audio pipeline, which consists of capture, processing/manipulation, and reproduction. The literature on the three stages of the pipeline is discussed, as well as on the spatial audio representations that are used to transmit the content between them, highlighting the key references and elaborating on the underlying concepts. We reflect on the literature based on a juxtaposition of the prerequisites that made machine learning successful in domains other than spatial audio with those found in the domain of spatial audio today. Based on this, we identify routes that may facilitate future advancement.
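A concrete example of a data-based method at the processing stage is estimating a source's direction of arrival directly from first-order Ambisonic (B-format) signals via the time-averaged acoustic intensity vector. The sketch below uses a synthetic plane wave; the signal names and the simplified encoding convention (omitting the W-channel scaling) are illustrative assumptions:

```python
import numpy as np

def intensity_doa(w, x, y):
    """Estimate the horizontal direction of arrival (radians) from
    first-order Ambisonic signals using the time-averaged active
    intensity vector, whose components are proportional to
    E[w * x] and E[w * y]."""
    ix = np.mean(w * x)
    iy = np.mean(w * y)
    return np.arctan2(iy, ix)

# Synthetic plane wave from azimuth 30 degrees, encoded with the
# simplified gains w = s, x = s*cos(az), y = s*sin(az).
az = np.deg2rad(30.0)
t = np.linspace(0.0, 1.0, 8000)
s = np.sin(2 * np.pi * 440 * t)
w_ch, x_ch, y_ch = s, s * np.cos(az), s * np.sin(az)
print(np.rad2deg(intensity_doa(w_ch, x_ch, y_ch)))  # ~30.0
```

The estimate operates purely on the spatial information already carried by the recorded signals, which is what distinguishes it from a model-based method that would instead impose a position taken from metadata.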
Navigating Hierarchical Menus Through Sound. Perry, Andrew C.; Terry, Lucas A.; Werner, Steffen. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 09/2023, Volume 67, Issue 1. Journal article, peer-reviewed.
Much of the current user experience of navigating digital information relies on visual displays. Providing auditory alternatives enables access for visually impaired and blind users. The current study evaluated hierarchical navigation in a novel spatialized auditory interface compared to a screen reader and a visual navigation mode. Past studies suggest that spatialized audio may provide performance improvements. Sixteen participants navigated menu structures of varying depth/breadth to select a target item with three different interface styles (spatial audio, screen reader, and visual). Time-to-completion, errors, and SUS scores were compared across interfaces. Results showed that spatial audio was significantly slower, more error-prone, and less usable than the other conditions. However, the problems of spatial audio might be overcome with simple changes in the interaction mode and optimization of display space. Our experience shows that designers must solve the problems of auditory clutter and spatial selection to achieve usable auditory navigation.