Showing papers on "Speaker recognition" published in 1972


Journal ArticleDOI
TL;DR: In this article, the authors describe an approach to selecting acoustic parameters that are closely related to voice characteristics that distinguish speakers, which are motivated by known relations between the voice signal and vocal tract shapes and gestures.
Abstract: In a scheme for the mechanical recognition of speakers, it is desirable to use acoustic parameters that are closely related to voice characteristics that distinguish speakers. This paper describes an investigation of an efficient approach to selecting such parameters, which are motivated by known relations between the voice signal and vocal‐tract shapes and gestures. Rather than general measurements over the extent of an utterance, only significant features of selected segments are used. A simulation of a speaker recognition system was performed by manually locating speech events within utterances and using parameters measured at these locations to classify the speakers. Useful parameters were found in fundamental frequency, features of vowel and nasal consonant spectra, estimation of glottal source spectrum slope, word duration, and voice onset time. These parameters were tested in speaker recognition paradigms using simple linear classification procedures. When only 17 such parameters were used, no errors were made in speaker identification from a set of 21 adult male speakers. Under the same conditions, speaker verification errors of the order of 2% were also obtained.

223 citations


Journal ArticleDOI
TL;DR: In this article, a speaker identification experiment was performed in a population of 10 female speakers, where the acoustical information was represented by sets of 12 predictor coefficients obtained by minimizing the mean-squared prediction error over speech segments 50 msec in duration.
Abstract: In automatic speaker recognition methods, the speaker to be recognized is usually required to speak the same utterance which was used to obtain the reference pattern for that speaker. However, such a restriction is not generally necessary for speaker recognition by humans. Is reliable automatic speaker recognition similarly possible? A speaker identification experiment was performed in a population of 10 female speakers. The acoustical information was represented by sets of 12 predictor coefficients obtained by minimizing the mean‐squared prediction error over speech segments 50 msec in duration. Each set of these predictor coefficients was represented by a 12‐dimensional vector. New sets of coordinates which minimized the intraspeaker variance were determined by linear transformation of the original vector space. The segment of the utterance used to identify an individual was different from the segments used to form the reference pattern for that individual. For each segment, the unknown vector was correlated with the reference vectors and the correlations were averaged over a number of segments—the speaker with the largest correlation was identified as the unknown speaker. The over‐all identification accuracy was 93% for 40 speech segments. These results suggest that successful automatic speaker identification is possible independently of the spoken text.

15 citations
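
The decision rule in the abstract above (correlate each unknown segment vector with every reference vector, average over segments, pick the largest) can be sketched roughly as follows. This is only an illustration under assumptions: the abstract does not define "correlation", so a normalized inner product is assumed here, and the predictor-coefficient extraction and the variance-minimizing linear transformation are left out. The names `correlation`, `identify_speaker`, and the toy 3-dimensional vectors are hypothetical (the paper uses 12-dimensional vectors).

```python
import math

def correlation(u, v):
    """Normalized inner product of two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def identify_speaker(segments, references):
    """Average each speaker's correlation over all segments and
    return (best_speaker, scores_by_speaker)."""
    scores = {}
    for name, ref in references.items():
        total = sum(correlation(seg, ref) for seg in segments)
        scores[name] = total / len(segments)
    return max(scores, key=scores.get), scores

# Toy data: one reference vector per speaker, two segments from an unknown voice.
references = {"A": [1.0, 0.0, 0.0], "B": [0.0, 1.0, 0.0]}
segments = [[0.9, 0.1, 0.0], [0.8, 0.3, 0.1]]
best, scores = identify_speaker(segments, references)
```

Averaging over more segments is what lets the decision tolerate segments that individually resemble the wrong speaker; the abstract reports its 93% figure for 40 segments.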


Journal ArticleDOI
TL;DR: Three ordering methods are presented that appear intuitively reasonable for minimizing the miss probability and are applied to the problem of verifying the purported identity of a speaker from a sample of the speaker's voice.
Abstract: A nonparametric classification procedure based on distribution-free tolerance regions is presented. Without knowledge of the class probability distributions, the procedure gives information about the expected performance of the classifier through use of only one sample of statistically independent observations from each class. With this procedure, a two-class discriminant can be designed for a given expected false alarm probability or for a given confidence that the false alarm probability is less than a given amount. Three ordering methods are presented that appear intuitively reasonable for minimizing the miss probability. Even though the methods do not, in general, meet this objective, they are easily implemented on a computer and can give good results. A procedure for obtaining a measure of the miss probability is also presented. These methods are applied to the problem of verifying the purported identity of a speaker from a sample of the speaker's voice.

12 citations
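
The idea of designing a discriminant for a given expected false alarm probability without knowing the class distributions can be illustrated in one dimension. The sketch below is not the paper's procedure (its three ordering methods are not described in the abstract); it relies only on the standard order-statistic fact that, for n independent samples from a continuous distribution, a new sample from the same distribution exceeds the k-th largest with probability k/(n+1). It assumes scalar scores where higher means "more like the claimed speaker" and that a false alarm means accepting an impostor. The function name and data are hypothetical.

```python
import math

def threshold_for_expected_false_alarm(impostor_scores, target):
    """Return (threshold, expected_rate): accepting scores strictly above
    the threshold gives an expected false-alarm rate of k / (n + 1),
    which is the largest such rate not exceeding `target`."""
    n = len(impostor_scores)
    k = math.floor(target * (n + 1))
    if k < 1:
        raise ValueError("too few samples to meet this target rate")
    ordered = sorted(impostor_scores, reverse=True)
    return ordered[k - 1], k / (n + 1)

# Toy data: 19 impostor scores, evenly spaced between 0.05 and 0.95.
impostor_scores = [i / 20 for i in range(1, 20)]
threshold, expected_rate = threshold_for_expected_false_alarm(impostor_scores, 0.25)
```

The guarantee concerns the rate averaged over possible training samples, not the rate for any one sample, which is why the abstract separately mentions designing for a given confidence level.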



01 Mar 1972
TL;DR: Departures from visual pattern recognition techniques are introduced and proven effective and a relation to the human physiology is maintained through an elementary model.
Abstract: Speech recognition is accomplished by off-line machine processes based on visual pattern recognition techniques. The fundamental system uses digitized data output from a KY-585 Vocoder, and two-dimensional discrete Fourier transforms with spatial frequency filters. Two male speakers generated data for the computer processes which include a speaker adaptation routine. A relation to the human physiology is maintained through an elementary model. For a 39 word vocabulary, recognition rates reached 92% for the single speaker process, and 79% for an either-of-two-speaker process. Departures from visual pattern recognition techniques are introduced and proven effective. (Author)

1 citation
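
As a rough illustration of the "two-dimensional discrete Fourier transform with spatial frequency filters" mentioned in the abstract above, the sketch below computes a direct 2-D DFT of a small grid (for example, vocoder channel by time frame) and zeroes the high spatial frequencies. The abstract does not say which filters were used, so the low-pass mask, the function names, and the toy 4x4 input are all assumptions made for illustration.

```python
import cmath

def dft2(x):
    """Direct 2-D DFT of a rectangular list of lists (fine for small grids)."""
    rows, cols = len(x), len(x[0])
    out = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0j
            for m in range(rows):
                for n in range(cols):
                    angle = -2 * cmath.pi * (u * m / rows + v * n / cols)
                    total += x[m][n] * cmath.exp(1j * angle)
            out[u][v] = total
    return out

def low_pass(spectrum, cutoff):
    """Zero every bin whose wrapped frequency index exceeds `cutoff`
    along either axis (an assumed, very simple spatial-frequency filter)."""
    rows, cols = len(spectrum), len(spectrum[0])
    return [
        [
            spectrum[u][v]
            if min(u, rows - u) <= cutoff and min(v, cols - v) <= cutoff
            else 0j
            for v in range(cols)
        ]
        for u in range(rows)
    ]

# Toy input: a constant 4x4 pattern, whose energy sits entirely in the DC bin.
pattern = [[1.0] * 4 for _ in range(4)]
spectrum = dft2(pattern)
filtered = low_pass(spectrum, 1)
```

The direct quadruple loop is used only to keep the example dependency-free; for grids of realistic size a fast Fourier transform routine would be the usual choice.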