Historical Perspective of the Field of ASR/NLU

doi:10.1007/978-3-540-49127-9_26

Book Chapter

- pp 521-538

Abstract:

The quest for a machine that can recognize and understand speech, from any speaker, and in any environment has been the holy grail of speech recognition research for more than 70 years. Although we have made great progress in understanding how speech is produced and analyzed, and although we have made enough advances to build and deploy in the field a number of viable speech recognition systems, we still remain far from the ultimate goal of a machine that communicates naturally with any human being. It is the goal of this section to document the history of research in speech recognition and natural language understanding, and to point out areas where great progress has been made, along with the challenges that remain to be solved in the future.

Citations

Journal Article

Far-Field Automatic Speech Recognition

Reinhold Haeb-Umbach, +5 more

TL;DR: This tutorial article gives an account of the algorithms used to enable accurate speech recognition from a distance, and shows that a clever combination of deep learning with traditional signal processing can lead to surprisingly effective solutions.

Posted Content

Far-Field Automatic Speech Recognition

Reinhold Haeb-Umbach, +5 more

- 20 Sep 2020 -

arXiv: Audio and Speech Processing

TL;DR: Far-field automatic speech recognition (ASR), the machine recognition of speech spoken at a distance from the microphones, has received a significant increase in attention in science and industry, which caused or was caused by an equally significant improvement in recognition accuracy.

Journal Article

Word Play: A History of Voice Interaction in Digital Games

Fraser Allison, +2 more

- 01 Mar 2020 -

Games and Culture

TL;DR: Voice interaction in digital games has a long and varied history of experimentation but has never achieved sustained, widespread success. This article reviews that history.

Journal Article

Kernel power flow orientation coefficients for noise-robust speech recognition

Branislav Gerazov, +1 more

- 01 Feb 2015 -

IEEE Transactions on Audio, Speech, and ...

TL;DR: KPOCs are a novel feature set based on spectro-temporal analysis that uses a bank of 2D kernels to extract the dominant orientation of the power flow at each point in the auditory spectrogram of the speech signal, which is innately resistant to the spectral masking introduced by the presence of noise and reverberation.

Journal Article

Revisiting Popular Speech Recognition Software for ESL Speech

Shannon McCrocklin, +1 more

- 29 Oct 2020 -

TESOL Quarterly

References

Journal Article

A tutorial on hidden Markov models and selected applications in speech recognition

Lawrence R. Rabiner

TL;DR: This tutorial provides an overview of the basic theory of hidden Markov models (HMMs) as originated by L.E. Baum and T. Petrie (1966) and gives practical details on methods of implementing the theory, along with a description of selected applications of HMMs to distinct problems in speech recognition.

Journal Article

A logical calculus of the ideas immanent in nervous activity

Warren S. McCulloch, +1 more

- 01 Jan 1990 -

Bulletin of Mathematical Biology

TL;DR: In this article, it is shown that many particular choices among possible neurophysiological assumptions are equivalent, in the sense that for every net behaving under one assumption, there exists another net which behaves under another and gives the same results, although perhaps not in the same time.

Journal Article

An Algorithm for Vector Quantizer Design

Y. Linde, +2 more

- 01 Jan 1980 -

IEEE Transactions on Communications

TL;DR: An efficient and intuitive algorithm is presented for the design of vector quantizers based either on a known probabilistic model or on a long training sequence of data.

Book

Vector Quantization and Signal Compression

Allen Gersho, +1 more

TL;DR: This book covers the theory and design of scalar and vector quantizers for signal compression. Its treatment of linear prediction includes the Levinson-Durbin algorithm.

Journal Article

Dynamic programming algorithm optimization for spoken word recognition

H. Sakoe, +1 more

- 01 Feb 1978 -

IEEE Transactions on Acoustics, Speech, ...

TL;DR: This paper reports on an optimum dynamic programming (DP) based time-normalization algorithm for spoken word recognition, in which the warping function slope is restricted so as to improve discrimination between words in different categories.

Related Papers (5)

State of the art in continuous speech recognition

Toward speech as a knowledge resource

Double Layer Architectures for Automatic Speech Recognition Using HMM

Dynamic speech models

Implementation of speech recognition system for Bangla