In recent years, substantial progress has been made in the field of reverberant speech signal processing, including both single- and multichannel dereverberation techniques and automatic speech recognition (ASR) techniques that are robust to reverberation. In this paper, we describe the REVERB challenge, which is an evaluation campaign that was designed to evaluate such speech enhancement (SE) and ASR techniques to reveal the state-of-the-art techniques and obtain new insights regarding potential future research directions. Even though most existing benchmark tasks and challenges for distant speech processing focus on the noise robustness issue and sometimes only on a single-channel scenario, a particular novelty of the REVERB challenge is that it is carefully designed to test robustness against reverberation, based on both real, single-channel, and multichannel recordings. This challenge attracted 27 papers, which represent 25 systems specifically designed for SE purposes and 49 systems specifically designed for ASR purposes. This paper describes the problems dealt with in the challenge, provides an overview of the submitted systems, and scrutinizes them to clarify what current processing strategies appear effective in reverberant speech processing.
Inference of Room Geometry From Acoustic Impulse Responses
Antonacci, F.; Filos, J.; Thomas, M. R. P. ...
IEEE Transactions on Audio, Speech, and Language Processing, 12/2012, Volume: 20, Issue: 10
Journal Article, Peer reviewed, Open access
Acoustic scene reconstruction is a process that aims to infer characteristics of the environment from acoustic measurements. We investigate the problem of locating planar reflectors in rooms, such as walls and furniture, from signals obtained using distributed microphones. Specifically, localization of multiple two-dimensional (2-D) reflectors is achieved by estimation of the time of arrival (TOA) of reflected signals by analysis of acoustic impulse responses (AIRs). The estimated TOAs are converted into elliptical constraints about the location of the line reflector, which is then localized by combining multiple constraints. When multiple walls are present in the acoustic scene, an ambiguity problem arises, which we show can be addressed using the Hough transform. Additionally, the Hough transform significantly improves the robustness of the estimation for noisy measurements. The proposed approach is evaluated using simulated rooms under a variety of different controlled conditions where the floor and ceiling are perfectly absorbing. Results using AIRs measured in a real environment are also given. Additionally, results showing the robustness to additive noise in the TOA information are presented, with particular reference to the improvement achieved through the use of the Hough transform.
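The conversion from a TOA to an elliptical constraint can be illustrated with a minimal sketch. It assumes a single first-order reflection in 2-D, a known source and microphone position, and a speed of sound of 343 m/s. The function name `ellipse_from_toa` and its return layout are hypothetical and are not taken from the paper. The reflection point then lies on an ellipse whose foci are the source and the microphone and whose major axis equals the reflected path length, and the line reflector is tangent to that ellipse.

```python
import math


def ellipse_from_toa(source, mic, toa, c=343.0):
    """Return (center, semi_major, semi_minor, angle) of the TOA ellipse.

    Assumption: the reflected path length is c * toa, so every candidate
    reflection point p satisfies |source - p| + |p - mic| = c * toa.
    """
    path = c * toa                      # total reflected path length (m)
    dx = mic[0] - source[0]
    dy = mic[1] - source[1]
    focal = math.hypot(dx, dy)          # distance between the two foci
    if path <= focal:
        raise ValueError("reflected path must exceed the direct path")
    semi_major = path / 2.0
    semi_minor = math.sqrt(semi_major ** 2 - (focal / 2.0) ** 2)
    center = ((source[0] + mic[0]) / 2.0, (source[1] + mic[1]) / 2.0)
    angle = math.atan2(dy, dx)          # orientation of the major axis
    return center, semi_major, semi_minor, angle


# Illustrative geometry: source at (0, 0), microphone at (4, 0), wall y = 1.5.
# The image source is at (0, 3), so the reflected path is 5 m long.
center, a, b, angle = ellipse_from_toa((0.0, 0.0), (4.0, 0.0), 5.0 / 343.0)
```

In this geometry the semi-minor axis comes out at 1.5 m, so the wall y = 1.5 touches the ellipse at its topmost point. Each source and microphone pair contributes one such ellipse, and the reflector is the common tangent of all of them.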
Noise fields encountered in real-life scenarios can often be approximated as spherical or cylindrical noise fields. The characteristics of the noise field can be described by a spatial coherence function. For simulation purposes, researchers in the signal processing community often require sensor signals that exhibit a specific spatial coherence function. In addition, they often require a specific type of noise such as temporally correlated noise, babble speech that comprises a mixture of mutually independent speech fragments, or factory noise. Existing algorithms are unable to generate sensor signals such as babble speech and factory noise observed in an arbitrary noise field. In this paper, an efficient algorithm is developed that generates multisensor signals under a predefined spatial coherence constraint. The benefit of the developed algorithm is twofold. Firstly, there are no restrictions on the spatial coherence function. Secondly, to generate M sensor signals the algorithm requires only M mutually independent noise signals. The performance evaluation shows that the developed algorithm is able to generate a more accurate spatial coherence between the generated sensor signals compared to the so-called image method that is frequently used in the signal processing community.
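The idea of imposing a spatial coherence on independent noise signals can be sketched for two sensors at a single frequency bin. The abstract does not state the mixing rule, so the Cholesky-style mixing below is an assumption and may differ from the algorithm developed in the paper. The sinc-shaped coherence of a spherically isotropic field, the 343 m/s speed of sound, and all function names are likewise illustrative.

```python
import math
import random


def spherical_coherence(freq_hz, dist_m, c=343.0):
    """Coherence of a spherically isotropic field between two omni sensors."""
    x = 2.0 * math.pi * freq_hz * dist_m / c
    return 1.0 if x == 0.0 else math.sin(x) / x


def mix_pair(n1, n2, gamma):
    """Mix two independent signals so that their coherence becomes gamma.

    Uses the Cholesky factor [[1, 0], [gamma, sqrt(1 - gamma^2)]] of the
    target 2x2 coherence matrix (an assumed, textbook construction).
    """
    g = math.sqrt(1.0 - gamma * gamma)
    x1 = list(n1)
    x2 = [gamma * a + g * b for a, b in zip(n1, n2)]
    return x1, x2


def estimate_coherence(x1, x2):
    """Real part of the normalized cross-power of two complex sequences."""
    cross = sum(a * b.conjugate() for a, b in zip(x1, x2))
    p1 = sum(abs(a) ** 2 for a in x1)
    p2 = sum(abs(b) ** 2 for b in x2)
    return (cross / math.sqrt(p1 * p2)).real


# Illustrative use: spectral coefficients of one bin at 1 kHz, sensors 5 cm apart.
rng = random.Random(0)
n1 = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(5000)]
n2 = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(5000)]
gamma = spherical_coherence(1000.0, 0.05)
x1, x2 = mix_pair(n1, n2, gamma)
estimated = estimate_coherence(x1, x2)  # close to gamma for long sequences
```

In a full generator the same mixing would be applied per frequency bin of a short-time Fourier transform, with an M x M coherence matrix in place of the 2 x 2 case. Because only the mixing depends on the coherence, the input signals can be any mutually independent noise, such as babble or factory recordings.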
In general, the signal-to-noise ratio as well as the signal-to-reverberation ratio of speech received by a microphone decrease when the distance between the talker and microphone increases. Dereverberation and noise reduction algorithms are essential for many applications such as videoconferencing, hearing aids, and automatic speech recognition to improve the quality and intelligibility of the received desired speech that is corrupted by reverberation and noise. In the last decade, researchers have aimed at estimating the reverberant desired speech signal as received by one of the microphones. Although this approach has led to practical noise reduction algorithms, the spatial diversity of the received desired signal is not exploited to dereverberate the speech signal. In this paper, a two-stage beamforming approach is presented for dereverberation and noise reduction. In the first stage, a signal-independent beamformer is used to generate a reference signal which contains a dereverberated version of the desired speech signal as received at the microphones and residual noise. In the second stage, the filtered microphone signals and the noisy reference signal are used to obtain an estimate of the dereverberated desired speech signal. In this stage, different signal-dependent beamformers can be used depending on the desired operating point in terms of noise reduction and speech distortion. The presented performance evaluation demonstrates the effectiveness of the proposed two-stage approach.
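As a rough illustration of what a signal-independent first stage can look like, the sketch below implements a time-domain delay-and-sum beamformer with integer sample delays. This is an assumed stand-in: the abstract does not say which signal-independent beamformer the paper uses, and the function name `delay_and_sum` is hypothetical.

```python
def delay_and_sum(signals, delays):
    """Align each channel by its integer delay (in samples) and average.

    signals: list of equal-rate sample lists, one per microphone.
    delays:  arrival delay of the desired source at each microphone,
             in samples, assumed known (for example from the geometry).
    """
    length = min(len(x) - d for x, d in zip(signals, delays))
    return [
        sum(x[d + n] for x, d in zip(signals, delays)) / len(signals)
        for n in range(length)
    ]


# Illustrative use: the same pulse reaches three microphones 0, 1, and 2
# samples late, so aligning and averaging restores the pulse.
pulse = [0.0, 1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 0.0]
channels = [
    pulse + [0.0, 0.0],
    [0.0] + pulse + [0.0],
    [0.0, 0.0] + pulse,
]
reference = delay_and_sum(channels, [0, 1, 2])
```

The weights depend only on the assumed delays, not on the observed signals, which is what makes the beamformer signal-independent. The direct-path component adds coherently, while reverberation and noise that arrive with other delays are attenuated by the averaging.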
Researchers in the signal processing community often require sensor signals that result from a spherically or cylindrically isotropic noise field for simulation purposes. Although it has been shown that these signals can be generated using a number of uncorrelated noise sources that are uniformly spaced on a sphere or cylinder, this method is seldom used in practice. In this paper, algorithms that generate sensor signals of an arbitrary one- and three-dimensional array that result from a spherically or cylindrically isotropic noise field are developed. Furthermore, the influence of the number of noise sources on the accuracy of the generated sensor signals is investigated.
Time difference of arrival (TDOA) based indoor ultrasound localization systems are prone to multiple disruptions and demand reliable and resilient position accuracy during operation. In this challenging context, a missing link to evaluate the performance of such systems is a simulation approach to test their robustness in the presence of disruptions. This approach can not only replace experiments in early phases of development but could also be used to study susceptibility, robustness, response, and recovery in case of disruptions. The paper presents a simulation framework for a TDOA-based indoor ultrasound localization system and ways to introduce different types of disruptions. This framework can be used to test the performance of TDOA-based localization algorithms in the presence of disruptions. Resilience quantification results are presented for representative disruptions. Based on these quantities, it is found that localization with the arc-tangent cost function is approximately 30% more resilient than with the linear cost function. The simulation approach is shown to apply to resilience engineering and can be used to increase the efficiency and quality of indoor localization methods.
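The contrast between the two cost functions can be sketched with a toy 2-D TDOA setup. Everything below is an assumption made for illustration rather than the paper's framework: the receiver layout, the residual scale of 0.1 ms, the speed of sound of 343 m/s, the loss forms (identity versus arc-tangent of the squared scaled residual), and the brute-force grid search.

```python
import math

C = 343.0  # assumed speed of sound in air, m/s


def tdoas(pos, receivers, c=C):
    """TDOAs of receivers 1..N-1 relative to receiver 0 for an emitter at pos."""
    d = [math.dist(pos, r) for r in receivers]
    return [(di - d[0]) / c for di in d[1:]]


def cost(pos, receivers, measured, loss="linear", scale=1e-4):
    """Sum of losses over squared TDOA residuals, scaled by `scale` seconds."""
    total = 0.0
    for pred, meas in zip(tdoas(pos, receivers), measured):
        z = ((pred - meas) / scale) ** 2
        # "linear" keeps z unchanged; "arctan" bounds each term by pi / 2.
        total += math.atan(z) if loss == "arctan" else z
    return total


def grid_localize(receivers, measured, loss, size=(5.0, 4.0), step=0.25):
    """Brute-force search over a regular grid covering the room."""
    nx = int(size[0] / step) + 1
    ny = int(size[1] / step) + 1
    grid = [(i * step, j * step) for i in range(nx) for j in range(ny)]
    return min(grid, key=lambda p: cost(p, receivers, measured, loss))


# Illustrative use: five receivers, emitter at (2, 1), one disrupted TDOA.
receivers = [(0.0, 0.0), (5.0, 0.0), (5.0, 4.0), (0.0, 4.0), (2.5, 4.5)]
true_pos = (2.0, 1.0)
measured = tdoas(true_pos, receivers)
measured[0] += 1e-3  # a 1 ms outlier on the first TDOA
estimate_linear = grid_localize(receivers, measured, "linear")
estimate_arctan = grid_localize(receivers, measured, "arctan")
```

With the linear loss, a single disrupted measurement contributes a cost that grows with the square of its error and can dominate the sum. With the arc-tangent loss, each measurement contributes at most pi/2, so an outlier has limited influence on where the minimum falls.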