Head-related transfer function (HRTF) individualization can improve the perception of binaural sound. The interaural time difference (ITD) of the HRTF is a relevant cue for sound localization, especially in azimuth. Therefore, individualization of the ITD is likely to result in better spatial sound localization. A study of the ITD has been conducted from a perceptual point of view using data from individual HRTF measurements and subjective perceptual tests. Two anthropometric dimensions have been shown to be related to the ITD and to predict the subjective behavior of various subjects in a perceptual test. With this information, a method is proposed to individualize the ITD of a generic HRTF set by adapting it with a scale factor, which is obtained from a linear regression formula that depends on these two anthropometric dimensions. The method has been validated with both objective measures and another perceptual test. In addition, practical regression formula coefficients are provided for fitting the ITD of the generic HRTFs of the widely used Brüel & Kjær 4100 and Neumann KU100 binaural dummy heads.
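The ITD scaling idea described above can be illustrated with a minimal sketch. It assumes the regression has the form scale = c0 + c1·a + c2·b, where a and b are the two anthropometric dimensions. The abstract names neither the dimensions nor the coefficients, so `dim_a_cm`, `dim_b_cm`, `HYPOTHETICAL_COEFFS`, and the generic ITD values below are placeholders, not the published figures.

```python
def itd_scale_factor(dim_a_cm, dim_b_cm, coeffs):
    """Evaluate an assumed linear regression: c0 + c1*dim_a + c2*dim_b."""
    c0, c1, c2 = coeffs
    return c0 + c1 * dim_a_cm + c2 * dim_b_cm


def individualize_itd(generic_itd_us, scale):
    """Multiply every ITD (microseconds, keyed by azimuth in degrees) by one factor."""
    return {azimuth: itd * scale for azimuth, itd in generic_itd_us.items()}


# Hypothetical placeholder coefficients; NOT the values reported in the paper.
HYPOTHETICAL_COEFFS = (0.4, 0.02, 0.02)

# Hypothetical ITDs of a generic HRTF set at a few azimuths.
generic_itd_us = {0: 0.0, 30: 260.0, 60: 520.0, 90: 650.0}

scale = itd_scale_factor(15.0, 15.0, HYPOTHETICAL_COEFFS)
personal_itd_us = individualize_itd(generic_itd_us, scale)
```

A single multiplicative factor leaves the zero-ITD directions untouched and stretches or compresses the rest of the ITD curve uniformly, which is why one scalar per listener is enough in this formulation.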
Virtual reality provides the possibility for interactive visits to historic buildings and sites. The majority of current virtual reconstructions have focused on creating realistic virtual environments by concentrating on the visual component. However, by incorporating more authentic acoustical properties into visual models, a more realistic rendering of the studied venue is achieved. In historic auralizations, calibration of the studied building’s room acoustic simulation model is often necessary to arrive at a realistic representation of its acoustical environment. This paper presents a methodical procedure for calibrating models built in room acoustics prediction programs based on geometrical acoustics, in order to create realistic virtual audio realities, or auralizations. To develop this procedure, a small unfinished amphitheater was first chosen due to its general simplicity and considerable level of reverberation. A geometrical acoustics model was calibrated according to the results of acoustical measurements. Measures employed during the calibration of this model were analyzed to arrive at a methodical calibration procedure. The developed procedure was then applied to a more complex building, the abbey church Saint-Germain-des-Prés. A possible application of the presented procedure is to enable interactive acoustical visits of former configurations of buildings. A test case study was carried out for a typical seventeenth-century configuration of Saint-Germain-des-Prés.
In the literature, particle velocity has been introduced to improve the performance of spatial sound field reproduction systems. However, all existing work requires accurate particle velocity measurements at all of the discrete control points, which are difficult to obtain in real-world applications. In this work, we formulate continuous particle velocity expressions over space as a function of pressure coefficients in the modal domain that can be easily extracted by using a higher order microphone. The sound field within a target region is controlled by a weighted cost function that we built to optimize the continuous particle velocity, as well as the sound pressure, on the boundary of the region. In contrast to the conventional spatial sound field reproduction methods in the modal domain, the proposed method allows for non-uniform loudspeaker geometry with a limited number of loudspeakers, thus providing a flexible array arrangement. The performance of the proposed method is evaluated through numerical simulations in both a free field and a reverberant room. Finally, we validate the proposed method in an objective experiment with real-world measurements of room impulse responses.
The availability of robust and cheaper hardware tools in recent years has made it possible to integrate spatial sound effectively as an organic component of dance and music performance projects. Inspired by these recent developments, an original body-worn sound system was designed, implemented and acoustically optimized, aiming to effectively integrate performers' movements and amplified sounds on stage. An innovative acoustic measurement method was used in connection with tests and interviews with practitioners to assess various sonic and artistic features of the implemented sound vest prototype. Audience questionnaires were employed to assess the perceived acoustic performance of the system in a medium-sized dance studio. Survey results showed that the perceived acoustic performance of the sound vest is highly dependent on the types of sound materials radiated by the system, as well as on the position of the performer in a room. Future work will consider the implementation and assessment of an extended hybrid spatial system consisting of several mobile sound sources synchronized with a fixed multi-channel sound reproduction system.
Reproduction of high-quality spatial sound has gained considerable importance with the recent technological developments in the fields of virtual and augmented reality. Recently, the reproduction of binaural signals in the Spherical-Harmonics (SH) domain has been proposed. This is performed by using SH representations of the sound-field and the Head-Related Transfer Function (HRTF). These processes offer the flexibility to control the reproduced binaural signals by manipulating the sound-field or the HRTFs using algorithms that operate directly in the SH domain. However, in most practical cases, the binaural reproduction is order-limited, which introduces a truncation error that has a detrimental effect on the perception of the reproduced signals, mainly due to the truncation of the HRTF. A recent study showed that pre-processing of the HRTF by ear-alignment reduces its effective SH order, which may be beneficial for alleviating the above effect. In this paper, a method to incorporate the pre-processed ear-aligned HRTF into the binaural reproduction process is presented. The method uses an Ambisonics representation of the sound-field formulated at the two ears, and is denoted here as Bilateral Ambisonics. The proposed method leads to a significant reduction in errors due to the limited-order reproduction, which yields a substantial improvement in perceived binaural reproduction quality even with SH orders as low as first order.
Precise and timely evaluation of an individual's hearing loss plays an important role in determining appropriate treatment strategies, including medication and aural rehabilitation. However, currently available hearing assessment systems do not satisfy the need for an objective assessment tool with a simple and non-invasive procedure. In this paper, we propose a new method for pure-tone audiometry, which may potentially be used to assess an individual's hearing ability objectively and quantitatively, without the need for the user's active response. The proposed method is based on the auditory oculogyric reflex, where the eyes involuntarily rotate towards the source of a sound, in response to spatially moving pure-tone audio stimuli modulated at specific frequencies and intensities. We quantitatively analyzed horizontal electrooculograms (EOG) recorded with a pair of electrodes under two conditions: when pure-tone stimuli were (1) "inaudible" or (2) "audible" to a participant. Preliminary experimental results showed significantly increased EOG amplitude in the audible condition compared to the inaudible condition for all ten healthy participants. This demonstrates the potential use of the proposed method as a new non-invasive hearing assessment tool.
Spatial perception is based on the synthesis of information arriving from several senses. In perceiving space, blind people rely most on information obtained through hearing and touch. The aim of this study is to examine differences in the spatial localization of sound between blind and sighted people, under the assumption that blind people will show greater localization precision because they rely on hearing more often. Fifteen blind participants and 15 participants with normal vision took part in the study. Sound was played to the participants in 3 series from 10 different positions obtained by combining two distances (1 m and 3 m) and five directions (straight ahead of the participant, and 15° and 30° to the left and right). After hearing the sound, the participants' task was to say whether it came from the nearer or the farther distance and whether it was straight ahead, to the left, or to the right of them. The analyses showed that the accuracy of distance estimation depends on the distance of the stimulus: the farther the stimulus, the more certain participants are that it is farther away. The accuracy of direction estimation depends on the direction of the source: the more extreme the position of the stimulus relative to the observer, the more certain participants are of where it is. However, the accuracy of direction estimation also depends on whether the participant is blind: blind participants localize the direction of the sound source better. Thus, there are partial differences in the accuracy of sound source localization between blind participants and participants with normal vision, which suggests there are grounds for the claim that auditory sensitivity increases in blind people.
Omnidirectional videos (ODVs) with spatial audio enable viewers to perceive 360° directions of audio and visual signals during the consumption of ODVs with head-mounted displays (HMDs). By predicting salient audio-visual regions, ODV systems can be optimized to provide an immersive sensation of audio-visual stimuli with high quality. Despite the intense recent effort on ODV saliency prediction, the current literature still does not consider the impact of auditory information in ODVs. In this work, we propose an audio-visual saliency (AVS360) model that incorporates 360° spatial-temporal visual representation and spatial auditory information in ODVs. The proposed AVS360 model is composed of two 3D residual networks (ResNets) to encode visual and audio cues. The first one is embedded with a spherical representation technique to extract 360° visual features, and the second one extracts the features of audio using the log mel-spectrogram. We emphasize sound source locations by integrating an audio energy map (AEM) generated from a spatial audio description (i.e., ambisonics) and equator viewing behavior with an equator center bias (ECB). The audio and visual features are combined and fused with the AEM and ECB via an attention mechanism. Our experimental results show that the AVS360 model significantly outperforms five state-of-the-art saliency models. To the best of our knowledge, it is the first work that develops an audio-visual saliency model for ODVs. The code will be publicly available to foster future research on audio-visual saliency in ODVs.
Traditional spatial sound acquisition aims at capturing a sound field with multiple microphones such that at the reproduction side a listener can perceive the sound image as it was at the recording location. Standard techniques for spatial sound acquisition usually use spaced omnidirectional microphones or coincident directional microphones. Alternatively, microphone arrays and spatial filters can be used to capture the sound field. From a geometric point of view, the perspective of the sound field is fixed when using such techniques. In this paper, a geometry-based spatial sound acquisition technique is proposed to compute virtual microphone signals that manifest a different perspective of the sound field. The proposed technique uses a parametric sound field model that is formulated in the time-frequency domain. It is assumed that each time-frequency instant of a microphone signal can be decomposed into one direct and one diffuse sound component. It is further assumed that the direct component is the response of a single isotropic point-like source (IPLS) of which the position is estimated for each time-frequency instant using distributed microphone arrays. Given the sound components and the position of the IPLS, it is possible to synthesize a signal that corresponds to a virtual microphone at an arbitrary position and with an arbitrary pick-up pattern.
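As a rough illustration of the last step above, the sketch below synthesizes only the direct component at a virtual microphone for one time-frequency bin. It assumes free-field spherical spreading from the IPLS (amplitude proportional to 1/distance, phase given by the propagation delay) and a first-order pick-up pattern of the form alpha + (1 - alpha)·cos(theta). These modelling choices, the function names, and the default parameters are assumptions made for illustration and are not taken from the paper; the diffuse component is omitted.

```python
import cmath
import math


def first_order_gain(alpha, mic_pos, look_dir, src_pos):
    """Pick-up gain alpha + (1 - alpha) * cos(theta); alpha = 0.5 is a cardioid."""
    to_src = [s - m for s, m in zip(src_pos, mic_pos)]
    dist = math.hypot(*to_src)        # assumes the source is not at the microphone
    look_norm = math.hypot(*look_dir)
    cos_theta = sum(a * b for a, b in zip(to_src, look_dir)) / (dist * look_norm)
    return alpha + (1.0 - alpha) * cos_theta


def virtual_mic_direct(p_ref, ref_pos, vm_pos, src_pos, freq_hz,
                       look_dir, alpha=0.5, c=343.0):
    """Map the direct sound observed at a reference microphone to a virtual one."""
    d_ref = math.dist(src_pos, ref_pos)
    d_vm = math.dist(src_pos, vm_pos)
    k = 2.0 * math.pi * freq_hz / c   # wavenumber in rad/m
    # Ratio of two point-source responses: 1/d amplitude decay plus extra phase lag.
    propagation = (d_ref / d_vm) * cmath.exp(-1j * k * (d_vm - d_ref))
    return p_ref * propagation * first_order_gain(alpha, vm_pos, look_dir, src_pos)


# Source at the origin, reference mic 1 m away, virtual cardioid 2 m away facing the source.
facing = virtual_mic_direct(1.0 + 0.0j, (1.0, 0.0), (2.0, 0.0), (0.0, 0.0),
                            1000.0, look_dir=(-1.0, 0.0))
# Same virtual microphone turned away from the source (cardioid null).
turned_away = virtual_mic_direct(1.0 + 0.0j, (1.0, 0.0), (2.0, 0.0), (0.0, 0.0),
                                 1000.0, look_dir=(1.0, 0.0))
```

In this toy geometry the virtual microphone is twice as far from the source as the reference, so the direct sound is halved in magnitude when the cardioid faces the source and vanishes when it faces away.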
The accuracy and perception of soundfields produced by loudspeaker arrays are strongly influenced by the inherent characteristics of commercial loudspeakers. This paper analyzes such characteristics of loudspeakers by deriving equivalent theoretical models and by studying their impact on soundfield reproduction. A number of acoustic models are investigated, including plane wave decomposition, point source decomposition and mixed source decomposition. Each proposed model employs three effective sparse decomposition algorithms for optimized solutions: iteratively reweighted least squares (IRLS), matching pursuit (MP) and the least absolute shrinkage and selection operator (LASSO). A successful model should enable the prediction of the soundfield outside the original recording region. Therefore, we validate the effectiveness of the models by comparing the simulated soundfield with secondary measurements obtained beyond the original area. Experimental results have confirmed that both the plane wave and mixed source models achieve promising performance with respect to the proposed metrics.
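Of the three sparse solvers named above, matching pursuit is the simplest to sketch. The example below is a generic, real-valued version over a toy dictionary of unit-norm atoms. In a soundfield setting the atoms would instead be complex-valued plane wave or point source responses sampled at the microphone positions, so the dictionary, the signal, and the function names here are illustrative assumptions rather than the paper's setup.

```python
import math


def dot(u, v):
    """Inner product of two equal-length real vectors."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(signal, atoms, n_iter=10, tol=1e-9):
    """Greedy sparse decomposition of `signal` over unit-norm `atoms`.

    Returns (coeffs, residual) with signal ~= sum(coeffs[i] * atoms[i]) + residual.
    """
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        # Correlate the residual with every atom and pick the strongest match.
        corrs = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(corrs[i]))
        if abs(corrs[best]) < tol:
            break  # nothing left that the dictionary can explain
        coeffs[best] += corrs[best]
        residual = [r - corrs[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual


# Toy overcomplete dictionary: the standard basis plus one diagonal atom.
inv_sqrt2 = 1.0 / math.sqrt(2.0)
atoms = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [inv_sqrt2, inv_sqrt2, 0.0],
]
signal = [3.0, 0.0, -1.0]
coeffs, residual = matching_pursuit(signal, atoms)
```

Each iteration removes the projection of the residual onto the best-matching atom, so the residual energy never increases. Because the selection is greedy, the result is not guaranteed to be the sparsest possible representation.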