Spatial Sound Modulators
In article number 2301459, James Hardwick and co‐workers introduce spatial sound modulators (SSMs), novel devices that combine traditional acoustic metasurface pixel units into custom‐shaped segmented elements. An SSM fabrication method is detailed using off‐the‐shelf 3D printers and bespoke control electronics, completing an end‐to‐end methodology from conception to realization. Further enhancements to the technique are explored through hybrid SSM devices with both static and dynamic elements.
Precise control of ultrasonic acoustic waves with frequencies f ≳ 20 kHz is useful in a range of applications, from ultrasonic scanners to nondestructive testing and consumer haptic devices. A spatial sound modulator (SSM) is the acoustic analog of the spatial light modulator (SLM) in optics and is highly sought after by acoustics researchers. An SSM is, however, constrained by distinct practical conditions: it must be a reconfigurable device that modulates sound arbitrarily, decoupled from the source. Here, a reflective phase-modulating device is realized whose local units can be tuned to imprint a phase signature onto an incoming wave. It is manually reconfigurable and consists of 1024 rigidly ended square waveguides with sliding bottom surfaces that provide variable phase delays. Experiments demonstrate the ability of this device to focus ultrasonic waves in air at different points in space, generate accurate pressure landscapes, and perform multiplane holography. Moreover, thanks to the subwavelength nature of the unit cells, the device outperforms state‐of‐the‐art phased‐array transducers of the same size in the quality and energy distribution of generated acoustic holographic images. These results pave the way for the construction of electronically controlled reflective SSMs.
In this work, a manually reconfigurable spatial sound modulator for ultrasonic waves in air is presented. This device can locally modulate the phase of incoming waves due to its individually tuned unit cells. Experiments demonstrate the ability of this device to focus ultrasonic waves in air at different points in space, generate accurate pressure landscapes, and perform multiplane holography.
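The focusing experiment described above can be sketched numerically: each unit cell must imprint a phase that cancels its own path-length phase to the focal point, so all reflections arrive in phase. The following is a minimal illustration, not the authors' code; the 40 kHz carrier and half-wavelength cell pitch are assumptions for the sake of the example.

```python
import numpy as np

# Illustrative sketch (not the authors' implementation): per-cell phase
# delays that focus a reflected ultrasonic wave at a target point in air,
# assuming a 32 x 32 grid (1024 cells) with half-wavelength pitch.
C_AIR = 343.0           # speed of sound in air, m/s
FREQ = 40e3             # assumed ultrasonic carrier frequency, Hz
WAVELENGTH = C_AIR / FREQ
PITCH = WAVELENGTH / 2  # assumed subwavelength cell pitch

def focusing_phase_map(focal_point, n=32):
    """Phase each cell must imprint so all reflections arrive in phase."""
    k = 2 * np.pi / WAVELENGTH
    # Cell centres on the reflector plane z = 0, centred at the origin.
    coords = (np.arange(n) - (n - 1) / 2) * PITCH
    x, y = np.meshgrid(coords, coords)
    d = np.sqrt((x - focal_point[0])**2
                + (y - focal_point[1])**2
                + focal_point[2]**2)
    # Each cell cancels its own path-length phase (modulo 2*pi).
    return np.mod(-k * d, 2 * np.pi)

phases = focusing_phase_map((0.0, 0.0, 0.10))  # focus 10 cm above the centre
```

In a sliding-waveguide realization, a phase of φ maps to a plunger depth of φλ/(4π), since the wave traverses the extra depth twice on reflection.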
This paper reviews the current state of loudspeaker-based spatial sound reproduction methods from both technical and perceptual perspectives. A nomenclature is developed that allows for a strict separation between these two perspectives. The physical fundamentals, practical realization, and results from perceptual studies are discussed for a number of well-established and emerging reproduction techniques. Further, the paper outlines novel approaches to evaluating spatial sound in terms of perceived quality and provides a comparison of current approaches.
Spatial audio reproduction using a loudspeaker array introduces a curvature effect that distorts the listening experience when the listener is in the near field. In the near field, the loudspeakers are approximated as point sources (spherical waves), which amplifies the mode vectors. The problem becomes more challenging for irregular loudspeaker arrangements, which cause an uneven energy distribution in the reproduction region. In this context, a near-field compensation is applied to the encoded ambisonics coefficients. An optimization problem is formulated such that the loudspeaker gains, encoded with spherical-harmonic basis coefficients, match the target ambisonics coefficients. Further, the in-phase and quadrature components of the energy localization vector are imposed as constraints to direct maximum energy into the reproduction region. The optimization problem is solved using a derivative-free solver. The performance of the proposed methods is evaluated for ITU-R recommended loudspeaker layouts using technical and perceptual evaluation attributes.
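The core of such a formulation is mode matching: find loudspeaker gains whose spherical-harmonic encoding reproduces a target set of ambisonics coefficients. The sketch below is an unconstrained least-squares baseline only, without the paper's energy-vector constraints, near-field compensation, or derivative-free solver; the first-order encoding convention and the five-loudspeaker layout are illustrative assumptions.

```python
import numpy as np

# Minimal mode-matching sketch (a least-squares baseline, not the paper's
# constrained formulation): find loudspeaker gains whose first-order
# ambisonic encoding matches a target plane-wave encoding.
def encode_foa(azimuth, elevation):
    """First-order ambisonic (B-format-style) encoding of a direction."""
    return np.array([
        1.0,                                  # W (omnidirectional)
        np.cos(elevation) * np.cos(azimuth),  # X
        np.cos(elevation) * np.sin(azimuth),  # Y
        np.sin(elevation),                    # Z
    ])

# Hypothetical 5-loudspeaker horizontal layout (azimuths in degrees).
spk_az = np.deg2rad([0.0, 30.0, -30.0, 110.0, -110.0])
Y = np.stack([encode_foa(a, 0.0) for a in spk_az], axis=1)  # 4 x 5 matrix

target = encode_foa(np.deg2rad(15.0), 0.0)   # virtual source at 15 degrees
gains, *_ = np.linalg.lstsq(Y, target, rcond=None)
reproduced = Y @ gains
```

The paper replaces this closed-form step with a constrained problem, imposing the in-phase and quadrature components of the energy localization vector, which is why a derivative-free solver is needed there.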
Microperforated panel (MPP) absorbers have been widely used in noise reduction and are regarded as a promising alternative to traditional porous materials. However, their flat, panel-like shape restricts their practical use in actual rooms and buildings. To overcome this limitation, three-dimensional MPP spatial sound absorbers have been proposed, but MPPs are mostly made of metal and are costly: cylinders and cubes are relatively easy to fabricate, while more complex structures are difficult. In view of this, this paper proposes a non-woven material that can replace the microperforated panel. This non-woven material is low-cost, flexible, easy to mould, and has good sound absorption performance. Impedance-tube experiments were employed to study the normal-incidence sound absorption coefficient of the non-woven fabric, and the results show that its absorption properties are similar to those of an ultra-micro MPP. Three kinds of spatial sound absorbers were made from the non-woven material: hollow cylindrical, fan-shaped, and honeycomb-like. Reverberation-chamber measurements show that the honeycomb-like spatial sound absorber absorbs better than the hollow cylindrical and fan-shaped ones, and that adding a non-woven cylindrical boundary layer further improves the absorption performance of the fan-shaped and honeycomb-like absorbers.
To realize 3D spatial sound rendering over two-channel headphones, one needs head-related transfer functions (HRTFs) tailored to the specific user. However, measuring HRTFs requires a tedious and expensive procedure. To address this, we propose a fully perceptual HRTF-fitting method for individual users based on machine learning. During calibration, the user only needs to answer pairwise comparisons of test signals presented by the system, which reduces the effort required to obtain individualized HRTFs. Technically, we present a novel adaptive variational autoencoder with a convolutional neural network. In training, this autoencoder analyzes publicly available HRTF datasets and identifies, in a nonlinear space, the factors that depend on user individuality. In calibration, the autoencoder generates high-quality HRTFs fitted to the specific user by blending these factors. We validate the feasibility of our method through several quantitative experiments and a user study.
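The calibration loop can be pictured as a search in the learned latent space driven only by A/B answers. The toy sketch below replaces the trained VAE decoder with an identity map and the listener with a simulated oracle; the update rule, step schedule, and all names are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

# Toy sketch of perceptual calibration by pairwise comparison. The paper's
# VAE decoder is replaced by an identity map and the user by a simulated
# oracle; the accept-if-preferred random search is illustrative only.
rng = np.random.default_rng(0)
TRUE_Z = np.array([0.7, -0.3])   # hidden "individual" latent factors

def user_prefers(a, b):
    """Simulated listener: picks the candidate closer to their true HRTF."""
    return np.linalg.norm(a - TRUE_Z) < np.linalg.norm(b - TRUE_Z)

def calibrate(n_rounds=200, step0=1.0):
    z = np.zeros(2)              # start from the dataset mean
    for i in range(n_rounds):
        step = step0 / (1 + 0.05 * i)         # shrinking proposal scale
        candidate = z + step * rng.standard_normal(2)
        if user_prefers(candidate, z):        # keep the A/B-test winner
            z = candidate
    return z

z_hat = calibrate()
```

In the actual system, each latent candidate would be decoded into an HRTF by the trained decoder and the two resulting renders auditioned over headphones before the user answers.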
Visually guided spatial sound generation (VGSSG) is a well-suited multimodal learning method for recorded videos. However, existing methods are difficult to apply directly to spatial sound generation for movie clips, mainly because (1) movies employ a Cinematic Audiovisual Language (CAL), which makes it difficult to construct spatial sound mapping models through purely data-driven methods, and (2) model performance is inadequate owing to the large heterogeneous gap between audio and visual modal information. To solve these problems, we propose a VGSSG method based on CAL decision-making and hierarchical feature encoding and decoding, which effectively generates spatial sound according to the CAL of a movie. Specifically, to model the CAL, a multimodal information-guided movie audio rendering decision maker is established, which decides the rendering strategy from the CAL of the current clip. To narrow the heterogeneous gap that hinders the fusion of audiovisual data, we propose a codec structure based on hierarchical fusion of audiovisual features and full-scale skip connections, which improves the efficiency of utilizing both modalities and demonstrates the effectiveness of shallow features in the VGSSG task. We integrate both 2-channel and 6-channel spatial audio generation into a unified framework. In addition, we establish a movie audiovisual bimodal dataset with hand-crafted CAL annotations. Experimentally, we demonstrate that our method outperforms existing methods in reducing generation distortion.
• The existence of CAL makes it difficult to construct spatial sound rendering models through data-driven methods.
• Neglecting shallow structural features limits spatial sound generation performance.
• A codec structure based on hierarchical fusion of audiovisual features and full-scale skip connections can narrow the heterogeneous gap.
• A multimodal information-guided movie audio rendering decision maker is proposed to solve the problem of CAL modeling.
The aim of spatial sound, or spatial audio, is to reproduce the spatial information of sound so as to recreate the desired spatial auditory events or perceptions. Spatial sound has recently become a hot topic in acoustics, signal processing, and communication, and a series of spatial sound techniques have been developed and applied across scientific research, engineering, and entertainment. This article comprehensively reviews the history, principles, progress, and applications of spatial sound techniques. In particular, spatial sound techniques based on different principles are united within the framework of the spatial sampling and reconstruction theorem of the sound field. The challenges and prospects of spatial sound are also addressed.
Augmented Reality (AR) technologies are increasingly utilized to create immersive experiences for cultural site visitors, mainly through visual superimposition of interactive digital elements onto the physical world. Recent research has investigated the use of Audio AR (AAR) in heritage sites, wherein visitors listen to spatially registered sound that can be attributed to 'talking' physical artefacts. A parallel trend in the audience engagement programs of cultural institutions is the employment of AI chatbots that engage in dialogues with followers or visitors to provide meaningful responses to user questions. Herein, we present Exhibot, an intelligent audio guide system aiming to enhance the user experience of cultural site visitors. Exhibot combines AAR and chatbot technologies to enable natural visitor-exhibit interaction, while also leveraging IoT devices to contextualize the delivered information. The key contribution of the proposed system lies in the interplay of AAR, chatbot, and IoT technologies to create immersive learning experiences in the context of an integrated cultural guide system. Exhibot has undergone field trials to validate its usability and utility in realistic operational conditions. As a case study, we chose the statue of a prominent politician situated at a central square in Heraklion, Greece. The evaluation results indicated a very positive attitude among users, attributed both to the sense of immersion evoked by the AAR-powered storytelling and to the natural, human-like conversation enabled by the chatbot.