Speech Dereverberation gathers together an overview, a mathematical formulation of the problem and the state-of-the-art solutions for dereverberation. Speech Dereverberation presents current approaches to the problem of reverberation. It provides a review of topics in room acoustics and also describes performance measures for dereverberation. The algorithms are then explained with mathematical analysis and examples that enable the reader to see the strengths and weaknesses of the various techniques, as well as giving an understanding of the questions still to be addressed. Techniques rooted in speech enhancement are included, in addition to a treatment of multichannel blind acoustic system identification and inversion. The TRINICON framework is shown in the context of dereverberation to be a generalization of the signal processing for a range of analysis and enhancement techniques. Speech Dereverberation is suitable for students at masters and doctoral level, as well as established researchers. TOC: Models, Measurement and Evaluation.- Speech Dereverberation Using Statistical Reverberation Models.- Dereverberation Using LPC-based Approaches.- Multi-Microphone Speech Dereverberation Using Eigen-decomposition.- Adaptive Blind Multichannel System Identification.- Subband Inversion of Multichannel Acoustic Systems.- Bayesian Single Channel Blind Dereverberation of Speech from a Moving Talker.- Inverse Filtering for Speech Dereverberation Without the Use of Room Acoustics Information.- TRINICON for Dereverberation of Speech and Audio Signals.
This is the first book to provide a cutting-edge reference to the fascinating topic of blind source separation (BSS) for convolved speech mixtures. Through contributions by the foremost experts on the subject, the book provides an up-to-date account of research findings, explains the underlying theory, and discusses potential applications. The individual chapters are designed to be tutorial in nature with specific emphasis on an in-depth treatment of state-of-the-art techniques. Blind Speech Separation is divided into three parts: Part 1 presents overdetermined or critically determined BSS. Here the main technology is independent component analysis (ICA). ICA is a statistical method for extracting mutually independent sources from their mixtures. This approach utilizes spatial diversity to discriminate between desired and undesired components, i.e., it reduces the undesired components by forming a spatial null towards them. It is, in fact, a blind adaptive beamformer realized by unsupervised adaptive filtering. Part 2 addresses underdetermined BSS, where there are fewer microphones than source signals. Here, the sparseness of speech sources is very useful: we can utilize time-frequency diversity, where sources are active in different regions of the time-frequency plane. Part 3 presents monaural BSS, where there is only one microphone. Here, we can separate a mixture by using the harmonicity and temporal structure of the sources. We can build a probabilistic framework by assuming a source model, and separate a mixture by maximizing the a posteriori probability of the sources.
Noise is everywhere, and in most applications that are related to audio and speech, such as human-machine interfaces, hands-free communications, voice over IP (VoIP), hearing aids, teleconferencing/telepresence/telecollaboration systems, and so many others, the signal of interest (usually speech) that is picked up by a microphone is generally contaminated by noise. As a result, the microphone signal has to be cleaned up with digital signal processing tools before it is stored, analyzed, transmitted, or played out. This cleaning process is often called noise reduction, and this topic has attracted a considerable amount of research and engineering attention for several decades. One of the objectives of this book is to present in a common framework an overview of the state of the art of noise reduction algorithms in the single-channel (one microphone) case. The focus is on the most useful approaches, i.e., filtering techniques (in different domains) and spectral enhancement methods. The other objective of Noise Reduction in Speech Processing is to derive all these well-known techniques in a rigorous way and prove many fundamental and intuitive results often taken for granted. This book is especially written for graduate students and research engineers who work on noise reduction for speech and audio applications and want to understand the subtle mechanisms behind each approach. Many new and interesting concepts are presented in this text that we hope the readers will find useful and inspiring.
Multilingual Speech Processing / Tanja Schultz, Katrin Kirchhoff
2006, 2006-04-21T00:00:00, 2006-12-31
eBook
Tanja Schultz and Katrin Kirchhoff have compiled a comprehensive overview of speech processing from a multilingual perspective. By taking this all-inclusive approach to speech processing, the editors have included theories, algorithms, and techniques that are required to support spoken input and output in a large variety of languages. Multilingual Speech Processing presents a comprehensive introduction to research problems and solutions, from both a theoretical and a practical perspective, and highlights technology that addresses the increasing need for multilingual applications in our global community. Current challenges of speech processing and the feasibility of sharing data and system components across different languages guide contributors in their discussions of trends, prognoses and open research issues. This includes automatic speech recognition and speech synthesis, but also speech-to-speech translation, dialog systems, automatic language identification, and handling non-native speech. The book is complemented by an overview of multilingual resources, important research trends, and actual speech processing systems that are being deployed in multilingual human-human and human-machine interfaces. Researchers and developers in industry and academia with different backgrounds but a common interest in multilingual speech processing will find an excellent overview of research problems and solutions detailed from theoretical and practical perspectives.
* State-of-the-art research with a global perspective by authors from the USA, Asia, Europe, and South Africa
* The only comprehensive introduction to multilingual speech processing currently available
* Detailed presentation of technological advances integral to security, financial, cellular and commercial applications
Stanley Kubrick's 1968 film 2001: A Space Odyssey famously featured HAL, a computer with the ability to hold lengthy conversations with his fellow space travelers. More than forty years later, we have advanced computer technology that Kubrick never imagined, but we do not have computers that talk and understand speech as HAL did. Is it a failure of our technology that we have not gotten much further than an automated voice that tells us to "say or press 1"? Or is there something fundamental in human language and speech that we do not yet understand deeply enough to be able to replicate in a computer? In The Voice in the Machine, Roberto Pieraccini examines six decades of work in science and technology to develop computers that can interact with humans using speech and the industry that has arisen around the quest for these technologies. He shows that although the computers today that understand speech may not have HAL's capacity for conversation, they have capabilities that make them usable in many applications today and are on a fast track of improvement and innovation. Pieraccini describes the evolution of speech recognition and speech understanding processes from waveform methods to artificial intelligence approaches to statistical learning and modeling of human speech based on a rigorous mathematical model, specifically Hidden Markov Models (HMMs). He details the development of dialog systems, the ability to produce speech, and the process of bringing talking machines to the market. Finally, he asks a question that only the future can answer: will we end up with HAL-like computers or something completely unexpected?