Journal Article

The Lombard reflex and its role on human listeners and automatic speech recognizers.

Jean-Claude Junqua1
01 Jan 1993-Journal of the Acoustical Society of America (Acoustical Society of America)-Vol. 93, Iss: 1, pp 510-524
TL;DR: Both acoustic and perceptual analyses suggest that the influence of the Lombard effect on male and female speakers is different, and they bring to light that, even if some tendencies across speakers can be observed consistently, the Lombard reflex is highly variable from speaker to speaker.
Abstract: Automatic speech recognition experiments show that, depending on the task performed and how speech variability is modeled, automatic speech recognizers are more or less sensitive to the Lombard reflex. To gain an understanding about the Lombard effect with the prospect of improving performance of automatic speech recognizers, (1) an analysis was made of the acoustic‐phonetic changes occurring in Lombard speech, and (2) the influence of the Lombard effect on speech perception was studied. Both acoustic and perceptual analyses suggest that the influence of the Lombard effect on male and female speakers is different. The analyses also bring to light that, even if some tendencies across speakers can be observed consistently, the Lombard reflex is highly variable from speaker to speaker. Based on the results of the acoustic and perceptual studies, some ways of dealing with Lombard speech variability in automatic speech recognition are also discussed.
Citations
Book
01 Jan 2000
TL;DR: This book takes an empirical approach to language processing, based on applying statistical and other machine-learning algorithms to large corpora, to demonstrate how the same algorithm can be used for speech recognition and word-sense disambiguation.
Abstract: From the Publisher: This book takes an empirical approach to language processing, based on applying statistical and other machine-learning algorithms to large corpora. Methodology boxes are included in each chapter. Each chapter is built around one or more worked examples to demonstrate the main idea of the chapter. Covers the fundamental algorithms of various fields, whether originally proposed for spoken or written language, to demonstrate how the same algorithm can be used for speech recognition and word-sense disambiguation. Emphasis on web and other practical applications. Emphasis on scientific evaluation. Useful as a reference for professionals in any of the areas of speech and language processing.

3,794 citations

Posted Content
TL;DR: Deep Speech, a state-of-the-art speech recognition system developed using end-to-end deep learning, outperforms previously published results on the widely studied Switchboard Hub5'00, achieving 16.0% error on the full test set.
Abstract: We present a state-of-the-art speech recognition system developed using end-to-end deep learning. Our architecture is significantly simpler than traditional speech systems, which rely on laboriously engineered processing pipelines; these traditional systems also tend to perform poorly when used in noisy environments. In contrast, our system does not need hand-designed components to model background noise, reverberation, or speaker variation, but instead directly learns a function that is robust to such effects. We do not need a phoneme dictionary, nor even the concept of a "phoneme." Key to our approach is a well-optimized RNN training system that uses multiple GPUs, as well as a set of novel data synthesis techniques that allow us to efficiently obtain a large amount of varied data for training. Our system, called Deep Speech, outperforms previously published results on the widely studied Switchboard Hub5'00, achieving 16.0% error on the full test set. Deep Speech also handles challenging noisy environments better than widely used, state-of-the-art commercial speech systems.

1,761 citations


Cites background from "The Lombard reflex and its role on ..."

  • ...Using a combination of collected and synthesized data, our system learns robustness to realistic noise and speaker variation (including Lombard Effect [20])....

Book
01 Jan 1994
TL;DR: In this article, technology and applications for the rendering of virtual acoustic spaces are reviewed, including applications to computer workstations, communication systems, aeronautics and space, and sonic arts.
Abstract: Technology and applications for the rendering of virtual acoustic spaces are reviewed. Chapter 1 deals with acoustics and psychoacoustics. Chapters 2 and 3 cover cues to spatial hearing and review psychoacoustic literature. Chapter 4 covers signal processing and systems overviews of 3-D sound systems. Chapter 5 covers applications to computer workstations, communication systems, aeronautics and space, and sonic arts. Chapter 6 lists resources. This TM is a reprint of the 1994 book from Academic Press.

960 citations

Book Chapter
TL;DR: This chapter reviews recent advancements in studies of vocal adaptations to interference by background noise and relates these to fundamental issues in sound perception in animals and humans.
Abstract: Publisher Summary: Environmental noise can affect acoustic communication by limiting the broadcast area, or active space, of a signal through decreased signal-to-noise ratios at the position of the receiver. At the same time, noise is ubiquitous in all habitats and is, therefore, likely to disturb animals, as well as humans, under many circumstances. However, both animals and humans have evolved diverse solutions to the background noise problem, and this chapter reviews recent advancements in studies of vocal adaptations to interference by background noise and relates these to fundamental issues in sound perception. The chapter starts with a discussion of the sender's side by considering potential evolutionary shaping of species-specific signal characteristics and individual short-term adjustments of signal features. Subsequently, it focuses on the receivers of signals, reviews their sensory capacities for signal detection, recognition, and discrimination, and relates these issues to auditory scene analysis and the ecological concept of signal space. Data from studies on insects, anurans, birds, and mammals, including humans, and to a lesser extent the available work on fish and reptiles, are also discussed in the chapter.

845 citations

Journal Article
TL;DR: The survey indicates that the essential points in noisy speech recognition consist of incorporating time and frequency correlations, giving more importance to high SNR portions of speech in decision making, exploiting task-specific a priori knowledge both of speech and of noise, using class-dependent processing, and including auditory models in speech processing.

712 citations