
Showing papers on "Speaker diarisation" published in 1972


Journal Article
TL;DR: The authors describe an approach to selecting acoustic parameters closely related to the voice characteristics that distinguish speakers; the parameters are motivated by known relations between the voice signal and vocal-tract shapes and gestures.
Abstract: In a scheme for the mechanical recognition of speakers, it is desirable to use acoustic parameters that are closely related to voice characteristics that distinguish speakers. This paper describes an investigation of an efficient approach to selecting such parameters, which are motivated by known relations between the voice signal and vocal‐tract shapes and gestures. Rather than general measurements over the extent of an utterance, only significant features of selected segments are used. A simulation of a speaker recognition system was performed by manually locating speech events within utterances and using parameters measured at these locations to classify the speakers. Useful parameters were found in fundamental frequency, features of vowel and nasal consonant spectra, estimation of glottal source spectrum slope, word duration, and voice onset time. These parameters were tested in speaker recognition paradigms using simple linear classification procedures. When only 17 such parameters were used, no errors were made in speaker identification from a set of 21 adult male speakers. Under the same conditions, speaker verification errors of the order of 2% were also obtained.

223 citations
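
The abstract above mentions "simple linear classification procedures" without naming one. As a rough illustration only, the sketch below uses a nearest-centroid (minimum-distance) rule, which produces linear decision boundaries between speakers. The choice of classifier, the function names, the two toy features (fundamental frequency in Hz, voice onset time in seconds), and all numeric values are assumptions made for this example and are not taken from the paper.

```python
from math import sqrt

def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def train(samples):
    """samples maps a speaker label to a list of feature vectors."""
    return {speaker: centroid(vectors) for speaker, vectors in samples.items()}

def identify(model, features):
    """Identification: pick the speaker whose centroid is closest."""
    return min(model, key=lambda speaker: distance(model[speaker], features))

def verify(model, claimed, features, threshold):
    """Verification: accept the claim if the distance is within a threshold."""
    return distance(model[claimed], features) <= threshold

# Made-up toy data: [fundamental frequency (Hz), voice onset time (s)].
samples = {
    "speaker_a": [[118.0, 0.30], [122.0, 0.32], [120.0, 0.31]],
    "speaker_b": [[142.0, 0.26], [138.0, 0.24], [140.0, 0.25]],
}
model = train(samples)
unknown = [121.0, 0.33]

print(identify(model, unknown))
print(verify(model, "speaker_a", unknown, threshold=5.0))
print(verify(model, "speaker_b", unknown, threshold=5.0))
```

In practice the features would need to be scaled to comparable ranges before distances are computed; that step is left out here to keep the sketch short.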


Journal Article
TL;DR: In this article, a speaker identification experiment was performed in a population of 10 female speakers, where the acoustical information was represented by sets of 12 predictor coefficients obtained by minimizing the mean-squared prediction error over speech segments 50 msec in duration.
Abstract: In automatic speaker recognition methods, the speaker to be recognized is usually required to speak the same utterance which was used to obtain the reference pattern for that speaker. However, such a restriction is not generally necessary for speaker recognition by humans. Is reliable automatic speaker recognition similarly possible? A speaker identification experiment was performed in a population of 10 female speakers. The acoustical information was represented by sets of 12 predictor coefficients obtained by minimizing the mean‐squared prediction error over speech segments 50 msec in duration. Each set of these predictor coefficients was represented by a 12‐dimensional vector. New sets of coordinates which minimized the intraspeaker variance were determined by linear transformation of the original vector space. The segment of the utterance used to identify an individual was different from the segments used to form the reference pattern for that individual. For each segment, the unknown vector was correlated with the reference vectors and the correlations were averaged over a number of segments—the speaker with the largest correlation was identified as the unknown speaker. The over‐all identification accuracy was 93% for 40 speech segments. These results suggest that successful automatic speaker identification is possible independently of the spoken text.

15 citations
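
The pipeline in the abstract above (predictor coefficients per segment, correlation against each reference vector, averaging over segments, choosing the largest) might be sketched as follows. This is a minimal sketch under stated assumptions: the autocorrelation method with the Levinson-Durbin recursion is used to obtain the predictor coefficients, and Pearson correlation is used as the similarity measure; the abstract specifies neither. The linear transformation that minimizes intraspeaker variance is omitted. All function names and the toy vectors are hypothetical.

```python
from math import sqrt

def autocorr(x, max_lag):
    """Autocorrelation of x for lags 0..max_lag."""
    return [sum(x[i] * x[i + k] for i in range(len(x) - k))
            for k in range(max_lag + 1)]

def lpc(x, order):
    """Predictor coefficients a[1..order] with x[n] ~ sum_k a[k] * x[n-k],
    found by the Levinson-Durbin recursion (autocorrelation method)."""
    r = autocorr(x, order)
    a = [0.0] * (order + 1)
    err = r[0]
    if err == 0.0:
        return a[1:]          # silent segment: nothing to predict
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err         # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k    # remaining mean-squared prediction error
    return a[1:]

def correlation(u, v):
    """Pearson correlation between two equal-length vectors."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    num = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    den = sqrt(sum((p - mu) ** 2 for p in u) * sum((q - mv) ** 2 for q in v))
    return num / den if den else 0.0

def identify(references, segments):
    """Average each speaker's correlation over all segments of the unknown
    utterance and return the best-scoring speaker with all scores."""
    scores = {
        speaker: sum(correlation(seg, ref) for seg in segments) / len(segments)
        for speaker, ref in references.items()
    }
    return max(scores, key=scores.get), scores

# Toy 3-dimensional vectors standing in for 12-dimensional coefficient sets.
references = {"s1": [1.0, -0.5, 0.2], "s2": [-0.3, 0.8, 0.1]}
segments = [[0.9, -0.4, 0.25], [1.1, -0.6, 0.1]]
best, scores = identify(references, segments)
print(best, scores)
```

In a fuller version, each entry of `segments` would come from calling `lpc(samples, 12)` on a 50 msec stretch of speech.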


Journal Article
TL;DR: In this article, the authors evaluated the susceptibility to mimicry of the speaker-verification technique discussed in the paper "Automatic Speaker Verification Using Phoneme Spectra" with the assistance of a professional performer specializing in impersonations.
Abstract: The susceptibility to mimicry of the speaker‐verification technique discussed in the paper “Automatic Speaker Verification Using Phoneme Spectra” was evaluated with the assistance of a professional performer specializing in impersonations. Subjects to be mimicked were six speakers selected from the previously recorded population. After becoming familiar with each speaker, the impersonator was permitted to mimic the speaker immediately after his utterance of each selected word. The subjects' utterances and the impersonator's mimic attempts were then processed for speaker verification on a single‐phoneme basis and using combinations of several phonemes. Spectral analysis of individual phoneme segments revealed that some increase in similarity was accomplished by the mimic for certain speakers and phonemes. However, when the verification procedure was applied using features from five phonemes, the impersonator was unsuccessful in all mimic attempts.

5 citations


Journal Article
TL;DR: This paper describes an on-line implementation of a previously reported speaker verification scheme, running on a Honeywell DDP-516 computer with microphone and keyboard input, disk storage, and graphic output.
Abstract: The implementation is described of an on‐line version of the scheme for speaker verification reported previously. A Honeywell DDP‐516 computer is used, with microphone and keyboard input, disk storage, and graphic output. Utterances are converted to pitch and gain contours and compared with similar functions fetched from disk that represent past vocal behavior of the speaker whose identity is claimed. The comparison process includes automatic temporal registration. Special features of the implementation include the ability to update stored reference patterns at will so as to incorporate the features of new utterances, and a graphic display that effectively illustrates which portions of a test utterance are within the programmed limits of acceptability.

3 citations
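
The abstract above says that test contours are compared with stored reference contours after "automatic temporal registration" but does not say how the registration is done. Purely as an illustration, the sketch below substitutes dynamic time warping (DTW), a common alignment technique that is not claimed to be the method used in the paper. The function names, the threshold, and the pitch values are hypothetical.

```python
def dtw_distance(test, reference):
    """Length-normalized DTW cost between two contours (lists of floats).
    DTW stretches or compresses time so that similar shapes line up."""
    inf = float("inf")
    n, m = len(test), len(reference)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(test[i - 1] - reference[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # test runs slower
                                     cost[i][j - 1],      # test runs faster
                                     cost[i - 1][j - 1])  # same pace
    return cost[n][m] / (n + m)

def verify(test, reference, threshold):
    """Accept the identity claim if the aligned contours are close enough."""
    return dtw_distance(test, reference) <= threshold

# Made-up pitch contours in Hz: the same shape spoken at two speeds,
# and a contour with a different shape.
reference = [110.0, 120.0, 130.0, 120.0]
slower = [110.0, 110.0, 120.0, 120.0, 130.0, 130.0, 120.0, 120.0]
different = [130.0, 120.0, 110.0, 100.0]

print(dtw_distance(slower, reference))
print(verify(slower, reference, threshold=2.0))
print(verify(different, reference, threshold=2.0))
```

A gain contour could be compared in the same way and the two costs combined before thresholding.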


Journal Article
TL;DR: Four professional mimics were engaged to deliberately imitate each of eight “true” speakers in a listener-based speaker verification task; the results are compared with a previous experiment using 32 “casual” impostors and with an automatic speaker verification system.
Abstract: In a previous experiment [Rosenberg, J. Acoust. Soc. Amer. 50, 106(A) (1971)] listener performance was evaluated in a speaker verification task in which 32 “casual” impostors were pitted against eight “true” speakers. A “casual” impostor is one who makes no attempt to mimic the “true” speaker but simply repeats the same test utterance in his own natural voice. In the present experiment four professional mimics were engaged to deliberately imitate each of the eight “true” speakers. After intensive training their recorded utterances were used in an experiment in which 10 listeners participated. Each stimulus presentation was a paired comparison consisting of a challenge and a reference utterance. The reference utterance was one of the “true” speaker utterances while the challenge was, with equal likelihood, a mimic utterance of the reference speaker, a natural utterance of one of the mimics, or another utterance from the reference speaker. The results of the present experiment are compared with the results of the previous experiment cited and the performance of an automatic system for speaker verification [Lummis, J. Acoust. Soc. Amer. 50, 106(A) (1971)].

2 citations


Journal Article

1 citation