Showing papers on "Dynamic time warping" published in 1979

Journal Article

Two-level DP-matching--A dynamic programming-based pattern matching algorithm for connected word recognition

H. Sakoe

01 Dec 1979-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: A general principle of connected word recognition is given, based on pattern matching between unknown continuous speech and artificially synthesized connected reference patterns. Computation time and memory requirements are both proved to be within reasonable limits.

Abstract: This paper reports a pattern matching approach to connected word recognition. First, a general principle of connected word recognition is given based on pattern matching between unknown continuous speech and artificially synthesized connected reference patterns. Time-normalization capability is allowed by use of dynamic programming-based time-warping technique (DP-matching). Then, it is shown that the matching process is efficiently carried out by breaking it down into two steps. The derived algorithm is extensively subjected to recognition experiments. It is shown in a talker-adapted recognition experiment that digit data (one to four digits) connectedly spoken by five persons are recognized with as high as 99.6 percent accuracy. Computation time and memory requirement are both proved to be within reasonable limits.
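The time-normalization step named in this abstract can be illustrated with a minimal sketch of single-level dynamic programming time warping. This is not the two-level algorithm of the paper: it aligns one reference pattern with one input, and the function name `dtw_distance`, the scalar features, the absolute-difference local distance, and the three-way step rule are all assumptions chosen for illustration.

```python
import math


def dtw_distance(ref, test):
    """Minimum cumulative |ref[i] - test[j]| cost over monotonic alignments.

    Hypothetical helper for illustration: scalar features and a simple
    three-way step rule (vertical, horizontal, diagonal) are assumed.
    """
    n, m = len(ref), len(test)
    # cost[i][j] holds the best cumulative cost of aligning ref[:i] with test[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(ref[i - 1] - test[j - 1])
            # Extend the cheapest of the three allowed predecessor cells.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]
```

With these assumptions, a test sequence that merely lingers on one value, such as `[1, 2, 2, 3]` against the reference `[1, 2, 3]`, aligns at zero cost, which is the time-normalization effect the abstract refers to.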

289 citations

Journal Article

Speaker-independent recognition of isolated words using clustering techniques

Lawrence R. Rabiner¹, Stephen E. Levinson², Aaron E. Rosenberg², Jay G. Wilpon²

Bell Labs¹, Alcatel-Lucent²

01 Aug 1979-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: A speaker-independent isolated word recognition system is described which is based on the use of multiple templates for each word in the vocabulary, and shows error rates that are comparable to, or better than, those obtained with speaker-trained isolated word recognition systems.

Abstract: A speaker-independent isolated word recognition system is described which is based on the use of multiple templates for each word in the vocabulary. The word templates are obtained from a statistical clustering analysis of a large database consisting of 100 replications of each word (i.e., once by each of 100 talkers). The recognition system, which accepts telephone quality speech input, is based on an LPC analysis of the unknown word, dynamic time warping of each reference template to the unknown word (using the Itakura LPC distance measure), and the application of a K-nearest neighbor (KNN) decision rule. Results for several test sets of data are presented. They show error rates that are comparable to, or better than, those obtained with speaker-trained isolated word recognition systems.
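The final decision stage described here, a K-nearest neighbor rule over multiple templates per word, can be sketched as follows. The abstract does not give the exact form of the rule, so this sketch assumes one common formulation: each word is scored by the average of its K smallest template distances, and the word with the lowest score wins. The function name `knn_decide` and the input layout are hypothetical, and the distances are taken as already computed by the time-warping stage.

```python
def knn_decide(distances, k=2):
    """Pick the word whose k nearest templates are closest on average.

    distances: dict mapping each vocabulary word to a non-empty list of
    template-to-unknown distances (assumed to be precomputed).
    Returns (best_word, best_score).
    """
    best_word, best_score = None, float("inf")
    for word, dists in distances.items():
        # Keep only the k smallest distances for this word.
        nearest = sorted(dists)[:k]
        score = sum(nearest) / len(nearest)
        if score < best_score:
            best_word, best_score = word, score
    return best_word, best_score


# Illustrative (made-up) distances: "two" has one unusually close template.
example = {"one": [0.9, 0.2, 0.4], "two": [0.1, 0.8, 0.9]}
word_k1, _ = knn_decide(example, k=1)  # decided by the single closest template
word_k2, _ = knn_decide(example, k=2)  # decided by the two closest on average
```

In this made-up example, K = 1 selects "two" on the strength of a single close template, while K = 2 selects "one" because its two nearest templates are closer on average. Averaging over several templates is what makes the rule less sensitive to one outlying match.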

245 citations

Proceedings Article

Speaker independent recognition of isolated words using clustering techniques

Lawrence R. Rabiner¹, Stephen E. Levinson², Aaron E. Rosenberg², Jay G. Wilpon²

Bell Labs¹, Alcatel-Lucent²

01 Apr 1979

TL;DR: A speaker-independent isolated word recognition system is proposed, based on the use of multiple templates for each word in the vocabulary. The templates are obtained from a statistical clustering analysis of a large database consisting of 100 replications of each word (i.e., once by each of 100 talkers).

Abstract: A speaker independent, isolated word recognition system is proposed which is based on the use of multiple templates for each word in the vocabulary. The word templates are obtained from a statistical clustering analysis of a large data base consisting of 100 replications of each word (i.e. once by each of 100 talkers). The recognition system, which uses telephone recordings, is based on an LPC analysis of the unknown word, dynamic time warping of each reference template to the unknown word (using the Itakura LPC distance measure), and the application of a K-nearest neighbor (KNN) decision rule to lower the probability of error. Results are presented on two test sets of data which show error rates that are comparable to, or better than, those obtained with speaker trained, isolated word recognition systems.

120 citations

Two-level DP-matching algorithm-a dynamic programming based pattern matching algorithm for continuous speech recognition

H. Sakoe

01 Jan 1979

27 citations

Journal Article

Performance trade‐offs in dynamic time warping algorithms for isolated word recognition

C. S. Myers, Lawrence R. Rabiner, Aaron E. Rosenberg

01 Nov 1979-Journal of the Acoustical Society of America

TL;DR: The purpose of this investigation is to study how variations in path constraints, distance weighting, and normalization affect the performance of different time warping algorithms on a realistic speech database. The performance index is based on speed of operation, memory requirements, and recognition accuracy.

Abstract: The technique of dynamic programming for time registration of a reference and a test utterance has found widespread use in the area of discrete word recognition. Recently a number of variations on the basic time warping algorithms have been proposed by Sakoe and Chiba, and Rabiner, Rosenberg, and Levinson. These algorithms all assume the test input is an isolated word whose endpoints are known (at least approximately). The major differences among the methods are the global path constraints (i.e., the region of possible paths), the local continuity constraints on the path, and the distance weighting and normalization used to give the overall minimum distance. The purpose of this investigation is to study the effects of such variations on the performance of different algorithms for a realistic speech database. The performance index is based on speed of operation, memory requirements, and recognition accuracy of the algorithm. Preliminary results indicate, in most cases, only small differences in performance among the various methods.
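Two of the design choices this abstract compares, the global path constraint and the normalization of the final distance, can be made concrete with a small sketch. It does not reproduce any specific algorithm from the study: the fixed-width band `|i - j| <= band`, the three-way local step rule, and the division by `len(ref) + len(test)` are assumptions picked as simple representatives of each choice, and `banded_dtw` is a hypothetical name.

```python
import math


def banded_dtw(ref, test, band):
    """Time-warping distance restricted to a band around the diagonal.

    Assumptions for illustration: scalar features, absolute-difference
    local distance, and normalization by len(ref) + len(test).
    """
    n, m = len(ref), len(test)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        # Global path constraint: only visit cells with |i - j| <= band.
        for j in range(max(1, i - band), min(m, i + band) + 1):
            local = abs(ref[i - 1] - test[j - 1])
            # Local continuity constraint: vertical, horizontal, or diagonal step.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    # Normalize so that utterances of different lengths are comparable.
    return cost[n][m] / (n + m)
```

Narrowing the band reduces the number of cells computed, which is the speed and memory side of the trade-off. The cost is that some alignments become unreachable: with `band=0` and sequences of unequal length no path reaches the end point, and the function returns infinity.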

14 citations

Journal Article

On time warping and the random delay channel

M. Blanco, F. Hill

01 Mar 1979-IEEE Transactions on Information Theory

TL;DR: It is shown that a first-in first-out (FIFO) assumption for channels that produce time warping, or delay modulation, in signals passing through them is compelling on physical grounds and vastly simplifies ensuing analysis.

Abstract: Channels (i.e., operators) are studied that produce time warping, or delay modulation, in signals passing through them, and many interesting properties of these channels are developed. It is shown that a first-in first-out (FIFO) assumption for such channels is compelling on physical grounds and vastly simplifies ensuing analysis. Two descriptions of the channel, the "send-delay" and "receive-delay" functions, are compared, and it is shown that one is precisely the shape needed to equalize or unwarp signals warped by the other. A series expansion for time-warped signals is developed, and the unitary nature of the warp operators is exploited to generate rich sets of orthonormal signals. The random time-warp channel is then analyzed, and certain statistics such as the autocorrelation function of the output signals are developed, along with conditions on their stationarity. Finally, optimum linear filters for extracting a signal from a noisy and time-warped version are derived and compared with some previous results.

7 citations

Proceedings Article

An approach to speaker normalization in an automatic speech recognition system

J. Jaschul

01 Apr 1979

TL;DR: With the study currently restricted to vowel spectra, adaptation methods based on spectral amplitude weighting and on spectral shifting are investigated, and a special method makes it possible to adapt test spectra class-specifically.

Abstract: An automatic speech recognition system based on the reference set of a single speaker can be extended for use by several speakers by applying appropriate preprocessing transformations. These transformations adapt the incoming patterns of a new speaker to the patterns of the reference set. With the study currently restricted to vowel spectra, adaptation methods based on spectral amplitude weighting and on spectral shifting are investigated. A special method makes it possible to adapt test spectra class-specifically.

4 citations

Journal Article

New techniques for automatic speaker verification using telephone speech

Sadaoki Furui

01 Nov 1979-Journal of the Acoustical Society of America

TL;DR: Time functions obtained from acoustic analysis of a fixed, sentence-long utterance are expanded by orthogonal polynomial representations and compared with stored reference functions.

Abstract: This paper describes new techniques for automatic speaker verification using telephone speech. The operation of the system is based on a set of functions of time obtained from acoustic analysis of a fixed, sentence‐long utterance. These time functions are expanded by orthogonal polynomial representations and compared with stored reference functions. After dynamic time warping, a decision is made to accept or reject an identity claim. Three sets of experimental utterances were used for the evaluation of the system. The first and second sets each comprises 50 utterances by 10 customers each and a single utterance by 40 imposters recorded over a conventional telephone connection. The third set comprises 26 utterances by 21 customers each and a single utterance by 55 imposters recorded over a high quality microphone. The first and third sets were uttered by male speakers, whereas the second set was uttered by female speakers. Reference functions and decision thresholds were updated for each customer. The eval...

2 citations