TL;DR: The authors reviewed what is currently known about voice identification by human listeners and concluded that the caution and suspicion currently accorded to visual identification must be extended also, and perhaps more so, to voice identification.

Abstract: This paper reviews what is currently known about voice identification by human listeners. Our own experimental data from a four-year research program into this topic is used to elucidate, support, and in some cases to contradict published work into the effects on voice identification of such factors as speech sample size and quality, voice disguise, delay in holding voice identification sessions, incidental as opposed to intentional memory for voices, the effects of the age of the witness, training in specific modes of encoding voices, and the relationship between objective accuracy and subjective feelings of certainty of correctness. It is concluded that the caution and suspicion currently accorded to visual identification must be extended also, and perhaps more so, to voice identification.

81 citations

Proceedings Article•DOI•

An investigation of the use of dynamic time warping for word spotting and connected speech recognition

[...]

C. S. Myers¹, Lawrence R. Rabiner, Aaron E. Rosenberg•Institutions (1)

Bell Labs¹

01 Apr 1980

TL;DR: It is shown that, in several simple performance evaluations, the local minimum method performed considerably better than the fixed range method.

Abstract: Several variations on algorithms for dynamic time warping have been proposed for speech processing applications. In this paper two general algorithms that have been proposed for word spotting and connected word recognition are studied. These algorithms are called the fixed range method and the local minimum method. The characteristics and properties of these algorithms are discussed. It is shown that, in several simple performance evaluations, the local minimum method performed considerably better than the fixed range method. Explanations of this behavior are given and an optimized method of applying the local minimum algorithm to word spotting and connected word recognition is described.
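
For readers unfamiliar with dynamic time warping, the following is a minimal sketch of the textbook algorithm on 1-D sequences. It is not the fixed range or local minimum method studied in the paper; the function name `dtw_distance` and the absolute-difference local cost are assumptions chosen for illustration.

```python
def dtw_distance(a, b):
    """Textbook DTW between two 1-D sequences (illustrative sketch only).

    Local cost is |x - y|; allowed steps are insertion, deletion, and match.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the best accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# A template and a time-stretched copy of it align at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

The warping lets a repeated frame in one sequence map onto a single frame in the other, which is why speaking-rate differences do not by themselves raise the distance.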

44 citations

Journal Article•DOI•

Recognition of unaspirated plosives--A statistical approach

[...]

A. K. Datta¹, N. Ganguli, S. Ray•Institutions (1)

Indian Statistical Institute¹

01 Feb 1980-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: Back vowels as targets have been found to give improved classification of the preceding consonants, and a comparison of the machine recognition results with published results on perception tests has been included.

Abstract: In this paper the results of a study of the computer recognition of unaspirated plosives in commonly used polysyllabic words uttered by three different informants are presented. The onglide transitions of the first two formants and their durations have been found to be an effective set of features for the recognition of unaspirated plosives. The rates of transition of these two formants as a feature set have been found to be significantly inferior to the features mentioned earlier. The maximum likelihood method, under the assumption of a normal distribution for the feature set, provides an adequate tool for classification. The assumption of both intergroup and intragroup independence of the features reduces recognition scores. A prior knowledge of target vowels is found necessary for attaining reasonable efficiency. A prior knowledge of voicing manner improves classification efficiency to some extent. The physiological factors responsible for the variation of the recognition score for the various plosives are discussed. For labials and velars the recognition score is very high, nearly 90 percent. An attempt to correlate the dynamics of tongue-body motion with the variations in recognition scores has been made. Back vowels as targets have been found to give improved classification of the preceding consonants. A comparison of the machine recognition results with published results on perception tests has been included. The results are found to be of the same order.
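
As a rough illustration of maximum likelihood classification under a normal assumption, the sketch below fits one 2-D Gaussian with full covariance per class and assigns a point to the class with the highest log-likelihood. The two-feature setup, the class labels, the sample values, and all function names are assumptions for illustration; the paper's actual feature set (formant transitions and durations) and data are not reproduced here.

```python
import math


def fit_gaussian_2d(samples):
    """Estimate mean and full 2x2 covariance (ML estimate) from (x, y) pairs."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples) / n
    syy = sum((y - my) ** 2 for _, y in samples) / n
    sxy = sum((x - mx) * (y - my) for x, y in samples) / n
    return (mx, my), (sxx, sxy, syy)


def log_likelihood(point, params):
    """Log density of a 2-D normal; the covariance must be non-singular."""
    (mx, my), (sxx, sxy, syy) = params
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mx, point[1] - my
    # Quadratic form using the closed-form inverse of the 2x2 covariance.
    maha = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * (maha + math.log(det)) - math.log(2.0 * math.pi)


def classify(point, models):
    """Return the label whose Gaussian gives the point the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(point, models[label]))


# Hypothetical training data for two made-up classes.
models = {
    "class_a": fit_gaussian_2d([(0, 0), (1, 0), (0, 1), (1, 1)]),
    "class_b": fit_gaussian_2d([(5, 5), (6, 5), (5, 6), (6, 6)]),
}
print(classify((0.4, 0.6), models))
```

Keeping the off-diagonal term `sxy` corresponds to not assuming independence between features, which the abstract reports as the better-scoring choice.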

29 citations

Proceedings Article•DOI•

A comparison of four techniques for automatic speaker recognition

[...]

R. Wohlford, E. Wrench, B. Landell

01 Apr 1980

TL;DR: The results of this study indicate that LPC derived parameters perform better than do those derived from cepstral and spectral data.

Abstract: Four automatic speaker recognition techniques were investigated with a common speech data base to determine their effectiveness in a text-independent mode. These four techniques used the correlation of short and long term spectral averages, cepstral measurements of long term spectral averages, orthogonal linear prediction of the speech waveform, and long term average LPC reflection coefficients combined with pitch and overall power. The results of this study indicate that LPC derived parameters perform better than do those derived from cepstral and spectral data. Recognition accuracies of 95% and 93% were obtained for LPC based techniques with 13 seconds of unknown speech. The corresponding recognition accuracies for the cepstral and spectral based systems were 79% and 54% respectively.

22 citations

Journal Article•DOI•

Speech recognition and control system for the severely disabled.

[...]

Arnon D. Cohen¹, Daniel Graupe²•Institutions (2)

Ben-Gurion University of the Negev¹, Colorado State University²

01 Apr 1980-Journal of Biomedical Engineering

TL;DR: A microprocessor based speech recognition system for the voice control of wheelchair, touch-tone phone, typewriter and environmental control unit, which exhibits less than one percent substitutions and eleven percent rejections with the ten digit set.

21 citations

Book Chapter•DOI•

Pattern Recognition Techniques for Speech Recognition

[...]

J. S. Bridle

01 Jan 1980

TL;DR: An approach to speech recognition which tries to avoid the problems of using a phoneme level of description and treats larger units such as words as patterns with a time axis is described.

Abstract: This is an overview of techniques which have been developed for automatic pattern recognition, with an indication of their relevance to automatic speech recognition. The first part is concerned with data transformations, distance measures, cluster analysis and other aspects of what could be called ‘classic’ mathematical pattern recognition. The second part is more directly concerned with speech, and the term ‘pattern recognition’ is used to denote an approach to speech recognition which tries to avoid the problems of using a phoneme level of description and treats larger units such as words as patterns with a time axis.

12 citations

Patent•

Voice pattern recognition system

[...]

Naoki Ishii, Riyouhei Nakatsu

16 Jun 1980

9 citations

Proceedings Article•DOI•

A speech recognition machine for connected words

[...]

R. Nakatsu

01 Apr 1980

TL;DR: In this machine, a new method for connected word recognition, namely inverse dynamic programming (DP) matching, is adopted, and the recognition rate of 99.3% is obtained.

Abstract: Construction and performance of a machine for recognizing spoken connected words are described. In this machine, a new method for connected word recognition, namely inverse dynamic programming (DP) matching, is adopted. Two kinds of DP matching techniques are used in the inverse DP matching, one of which is the usual DP matching and the other is matching performed in a time reverse mode, starting from the end of speech. Combining the similarities obtained by these two kinds of matching, the similarities between input speech and word sequences are computed. Also a technique for rejecting candidates is used in the machine to reduce computation amount. The machine performance is tested by 1400 samples of connected digits. The recognition rate of 99.3% is obtained.

5 citations

Book Chapter•DOI•

Self-Organized Continuous Speech Recognition

[...]

Frederick Jelinek¹•Institutions (1)

IBM¹

01 Feb 1980

TL;DR: Current efforts to recognize continuous (or “connected”) speech are aimed at constructing a voice-excited “typewriter” that automatically transcribes natural speech into ordinary (e.g. English) written form.

Abstract: Current efforts to recognize continuous (or “connected”) speech are aimed at constructing a voice-excited “typewriter” that automatically transcribes natural speech into ordinary (e.g. English) written form. So far, however, only very restricted speech has been recognized. The sentences that are spoken must either be prescribed a priori by an artificial grammar which the experimenter has designed, or else limited by a vocabulary and a restricted area of discourse such as that used in business letters, book reviews, or airline reservation systems. These latter so-called natural tasks are generally much more difficult than the artificial ones (given a fixed vocabulary).

Proceedings Article•DOI•

Frequency-axis warping to improve automatic word recognition

[...]

E. Neuburg

01 Apr 1980

TL;DR: Experiments show that parameters derived from casual speech improve vowel recognition markedly, and that method e) appears strongest.

Abstract: Frequency normalization of talkers remains a problem in word recognition, especially where new talkers cannot be asked to provide samples (of their vowels, for example) in advance. Several methods were investigated; for each, parameters were derived by calculating their effect on formant histograms derived from casual speech. Methods tried were a) uniform multiplication of frequencies ("stretching" the vocal tract); b) "stretching" each formant region by a different amount; c) combined shift and stretch (affine mapping); d) different affine mappings for different formants (this includes warping each formant as a function of its range); e) warping each formant non-linearly as a function of its distribution. Experiments show that parameters derived from casual speech improve vowel recognition markedly, and that method e) appears strongest.
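
The simpler mappings listed in the abstract, methods a), c), and d), can be sketched as follows. The function names, parameter names, and formant values are hypothetical, and the sketch does not show how the paper derives its parameters from formant histograms; method e) is omitted because the abstract does not specify the non-linear function.

```python
def uniform_stretch(formants_hz, factor):
    """Method a): multiply every formant frequency by one factor."""
    return [f * factor for f in formants_hz]


def affine_warp(formants_hz, scale, shift_hz):
    """Method c): one combined stretch and shift for all formants."""
    return [scale * f + shift_hz for f in formants_hz]


def per_formant_affine(formants_hz, params):
    """Method d): a separate (scale, shift_hz) pair for each formant."""
    return [scale * f + shift for f, (scale, shift) in zip(formants_hz, params)]


# Hypothetical F1 and F2 values in Hz for a single vowel token.
talker = [500.0, 1500.0]
print(uniform_stretch(talker, 1.5))
print(affine_warp(talker, 0.5, 100.0))
print(per_formant_affine(talker, [(1.0, 50.0), (0.5, 0.0)]))
```

Each successive method adds free parameters, so method d) can reproduce a) and c) as special cases.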

Journal Article•DOI•

Words into action III: A commercial system: Dynamic programming handles varying word lengths in this pioneer continuous-speech recognizer

[...]

Y. Kato

01 Jun 1980-IEEE Spectrum

TL;DR: The DP-100 design overcomes two serious handicaps which cause inaccuracies in automatic speech recognition systems, namely the variation in the rate at which words are spoken and the general problem of continuous speech recognition.

Abstract: Considers the Nippon Electric Co.'s DP-100 automatic continuous speech recognition system having an identification capability of approximately 100 words and aimed at application such as routing and inventory control in warehouses. The DP-100 design overcomes two serious handicaps which cause inaccuracies in automatic speech recognition systems, namely the variation in the rate at which words are spoken and the general problem of continuous speech recognition. The author gives details of the design and how these problems are overcome.

Journal Article•DOI•

Words into action II: A task-oriented system: Harpy is an experimental, continuous-speech recognition system that exploits a low-cost minicomputer

[...]

R. Reddy¹•Institutions (1)

Carnegie Mellon University¹

01 Jun 1980-IEEE Spectrum

TL;DR: Discusses the Harpy experimental system using a low-cost minicomputer which is capable of automatic speech recognition with up to 98% accuracy when the vocabulary is restricted to 1011 words, and sentence structure limited to that used in the retrieval of abstracts of documents relating to computer technology.

Abstract: Discusses the Harpy experimental system using a low-cost minicomputer which is capable of automatic speech recognition with up to 98% accuracy when the vocabulary is restricted to 1011 words, and sentence structure limited to that used in the retrieval of abstracts of documents relating to computer technology. The recognition process of the system is described.

Dissertation•

Auditory speaker recognition: a theoretical and experimental study

[...]

Roger S. Brown

01 Jan 1980

Proceedings Article•DOI•

A speech/Speaker recognition and response system

[...]

G. Edwards

01 Apr 1980

TL;DR: The realization of a speech analyzer plus an LPC synthesizer in a single chip signal processing microprocessor that is able to process both algorithms in real time to create an interactive voice analyzer/response system operating under the control of a microprocessor and with the LPC speech data stored in a ROM.

Abstract: The realization of a speech analyzer plus an LPC synthesizer in a single chip signal processing microprocessor is described. The chip is able to process both algorithms in real time to create an interactive voice analyzer/response system operating under the control of a microprocessor and with the LPC speech data stored in a ROM. The chip is a 16 bit microprocessor specially architectured for signal processing. It features all single cycle instructions with a 300nsec cycle time, and a 12 × 12 bit parallel multiplier pipelined to operate in a single cycle. It can be programmed to perform a wide variety of signal processing functions including speech processing.

Patent•

Voice identification system using electroparatograph

[...]

Hori Seiji

18 Sep 1980

Proceedings Article•DOI•

Task-Oriented Approaches to Automatic Speech Recognition

[...]

Aaron E. Rosenberg¹, James L. Flanagan¹, Stephen E. Levinson¹, Lawrence R. Rabiner¹•Institutions (1)

Bell Labs¹

01 Feb 1980

Journal Article•DOI•

Access control by means of automatic speaker verification

[...]

M H Kuhn

01 Jan 1980-Journal of Physics E: Scientific Instruments

TL;DR: An access control system verifies a claimed identity from speech samples using a pattern recognition algorithm implemented on a minicomputer; the per-speaker reference is small enough that magnetic identity cards carried by the user can serve as its memory.

Abstract: An access control system is described using speech samples for an automatic verification of a claimed identity. A hardware speech analysis processor extracts spectral information from the utterance and compares these features with a reference stored under the claimed identity using a suitable pattern recognition algorithm implemented on a minicomputer. This reference can be reduced in storage size per speaker such that magnetic identity cards can be used as memory to be carried around by the user. Thus the number of users is not limited by the system and no large memory is required in the minicomputer. The results of first experiments are reported.

Patent•

Speaker or similar article

[...]

Lester M. Barcus, John F. Berry

19 Aug 1980

Book Chapter•DOI•

A Self-Adaptive Fuzzy Recognition System for Speech Sounds

[...]

Sankar K. Pal¹, D. Dutta Majumder¹•Institutions (1)

Indian Statistical Institute¹

01 Jan 1980

TL;DR: The paper presents part of the authors' investigations into methods and applications of pattern recognition and image analysis, dealing with a class of problems in automatic computer recognition of speech sounds using fuzzy logic.

Abstract: The paper presents a part of the investigations being carried out by the authors on methods and applications of pattern recognition and image analysis. A class of problems on automatic computer recognition of speech sounds using fuzzy logic is being dealt with. The input patterns are usually given as deterministic data although they may contain some fuzziness, and the output decision is also deterministic but the process of classification is fuzzy in nature.

Book Chapter•DOI•

A Case of Speaker Identification

[...]

M. Koukias¹, G. Kokkinakis¹•Institutions (1)

University of Patras¹

01 Jan 1980

TL;DR: The possibility of identifying a speaker is examined, when strong differences exist, both in the type of speech and in the voice-recording conditions.

Abstract: The possibility of identifying a speaker is examined, when strong differences exist, both in the type of speech and in the voice-recording conditions.
