Showing papers on "Speaker diarisation" published in 1981

Patent•

Speaker identification system using word recognition templates

John E. Holmgren¹, Aaron E. Rosenberg¹, John W. Upton¹

27 Mar 1981

TL;DR: A speaker identification arrangement stores acoustic feature templates for reference words, recognizes the reference word sequence in an unknown speaker's utterance, and identifies the speaker by comparing how the utterance corresponds to those templates with how the utterances of previously identified speakers correspond to them.

Abstract: In a speaker recognition and verification arrangement, acoustic feature templates are stored for predetermined reference words. Each template is a standardized set of acoustic features for one word, formed for example by averaging the values of acoustic features from a plurality of speakers. Responsive to the utterances of identified speakers, a set of signals representative of the correspondence of the identified speaker's features with said feature templates of said reference words is generated. An utterance of an unknown speaker is analyzed and the reference word sequence of the utterance is identified. A set of signals representative of the correspondence of the unknown speaker's utterance features and the stored templates for the recognized words is generated. The unknown speaker is identified jointly responsive to the correspondence signals of the identified speakers and unknown speaker.
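
The flow described in this abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not state: features are fixed-length vectors, the "correspondence signals" are Euclidean distances to the word templates, and speakers are compared by summed absolute difference of those distances. All names and data are hypothetical.

```python
import math

def average_template(examples):
    """Average equal-length feature vectors from several speakers into one template."""
    n = len(examples)
    return [sum(values) / n for values in zip(*examples)]

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def correspondence(features_by_word, templates):
    """Per-word distance between a speaker's features and the word templates (assumed measure)."""
    return {w: distance(f, templates[w]) for w, f in features_by_word.items()}

def identify(unknown_scores, enrolled_scores):
    """Return the enrolled speaker whose per-word scores are closest to the unknown's."""
    def mismatch(speaker):
        ref = enrolled_scores[speaker]
        return sum(abs(unknown_scores[w] - ref[w]) for w in unknown_scores)
    return min(enrolled_scores, key=mismatch)

# Toy 2-dimensional features (hypothetical) for two reference words.
population = {
    "zero": [[1.0, 2.0], [3.0, 2.0], [8.0, 2.0]],
    "one": [[4.0, 1.0], [4.0, 5.0], [4.0, 3.0]],
}
templates = {w: average_template(vs) for w, vs in population.items()}

# Correspondence signals for two identified (enrolled) speakers.
enrolled = {
    "alice": correspondence({"zero": [1.0, 2.0], "one": [4.0, 1.0]}, templates),
    "bob": correspondence({"zero": [3.0, 2.0], "one": [4.0, 5.0]}, templates),
}

# The unknown utterance, after its word sequence has been recognized.
unknown = correspondence({"zero": [1.5, 2.0], "one": [4.0, 1.5]}, templates)
best = identify(unknown, enrolled)
```

Scalar distances discard the direction of the deviation from the template, so a practical system would likely need a richer correspondence measure; the sketch only shows how stored templates, enrolled speakers, and the unknown utterance relate.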

65 citations

Patent•DOI•

Method and apparatus for generating speech pattern templates

Frank Christopher Pirz¹, Lawrence R. Rabiner¹, Jay G. Wilpon¹

Bell Labs¹

19 Nov 1981-Journal of the Acoustical Society of America

TL;DR: A system generates reference demisyllable templates from a first speaker using both manual and automatic analysis; the analysis for a second speaker is then simplified and automated by comparison with the first speaker's templates.

Abstract: A system is described for generating speech pattern templates for use with either speech recognition or speech synthesis. Reference demisyllable templates are first generated from a reference first speaker using both manual and automatic analysis. The analysis for a second speaker is simplified and automated by comparing with the first speaker's templates. The second speaker speaks the same words at a rate time-warped to match the first speaker's rate and template. We define a demisyllable as each of the two halves of a syllable, assuming a syllable starts and ends with a noisy consonant, and the syllable is split at its vowel center, thereby simplifying concatenation and comparison. Key features of the invention include generating a set of signals representative of the time alignment between the first and second speakers' templates, and the time-of-occurrence boundaries of each syllable in a word.
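
The abstract does not say how the time alignment between the two speakers is computed. Dynamic time warping is one standard way to obtain such an alignment, and the sketch below uses it purely as an assumed stand-in, with one-dimensional features and an absolute-difference local cost chosen for brevity.

```python
def dtw(reference, candidate):
    """Return (total cost, warping path) aligning two 1-D feature sequences."""
    n, m = len(reference), len(candidate)
    inf = float("inf")
    # cost[i][j] is the best cost of aligning the first i reference frames
    # with the first j candidate frames.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(reference[i - 1] - candidate[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # reference advances
                                     cost[i][j - 1],      # candidate advances
                                     cost[i - 1][j - 1])  # both advance
    # Backtrack from the end to recover which frames were paired.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path

# The candidate holds its middle frame longer than the reference does.
total, path = dtw([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0])
```

Each `(i, j)` pair in `path` maps a frame of the first speaker's template to a frame of the second speaker's utterance, so a boundary marked in the first speaker's template can be carried over to the second speaker by reading off its partner frame.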

28 citations

Proceedings Article•DOI•

Speaker identification and verification combined with speaker independent word recognition

Aaron E. Rosenberg¹, K. Shipley

Bell Labs¹

01 Apr 1981

TL;DR: In this study it is hypothesized that distributions of template distance scores are reasonably consistent for individual speakers and vary characteristically from speaker to speaker.

Abstract: One method for providing speaker independent word recognition capability is to construct a small set of templates for each vocabulary word that typifies and spans individual speaker word reference templates over a large population of speakers. Word recognition decision functions are based on combinations of template distance scores obtained by processing an unknown input utterance and comparing it with the ensemble of reference templates. In this study it is hypothesized that distributions of template distance scores are reasonably consistent for individual speakers and vary characteristically from speaker to speaker. This property is exploited to provide a speaker recognition capability in combination with word recognition. It is shown that good speaker recognition performance depends on the input of a sequence of distinct words. For a 20-speaker population, on the average, the correct speaker is in the top 1% of the candidates in the identification made over a sequence of seven distinct words.

10 citations

Proceedings Article•DOI•

A realtime implementation of a text independent speaker recognition system

E. Wrench

01 Apr 1981

TL;DR: The results of this investigation clearly show that Markel's technique is superior for applications using very short speech segments for both the speaker models and the recognition trials.

Abstract: This paper describes the design and implementation of a realtime speaker recognition system. The system performs text independent, closed set speaker recognition with up to 30 talkers in realtime. In addition, the reference speech used to characterize the 30 talkers can be extracted from as little as 10 seconds of speech from each talker, and the actual recognition performed with less than one minute of speech from the unknown talker. Two speaker recognition algorithms previously developed by Markel and Pfeifer were investigated for use in the realtime system. The results of this investigation clearly show that Markel's technique is superior for applications using very short speech segments for both the speaker models and the recognition trials. Markel's technique was implemented in realtime in a high speed programmable signal processor. A test of this implementation with a set of 30 male speakers resulted in recognition accuracies of 93-100% for models generated with only 10 seconds of speech, and recognition trials using only 10 seconds of unknown speech.

7 citations

Proceedings Article•DOI•

Speaker adaptation for phoneme recognition

Yves Grenier, L. Miclet, J. Maurin, H. Michel

01 Apr 1981

TL;DR: A procedure for adapting a phonetic recognition system to a new speaker, in which the reference for each phoneme (the cepstrum of the autoregressive model) is transformed linearly to optimally fit the new speaker's utterances.

Abstract: A procedure for the adaptation of a phonetic recognition system to a new speaker is described. The reference for each phoneme (cepstrum of the autoregressive model) is transformed linearly in order to fit optimally the utterances of a new speaker. In a first step, samples uttered by the new speaker are mapped onto corresponding utterances (reference) through a dynamic comparison. In a second step, the linear transformation is computed through canonical correlation analysis of the samples. Experimental simulations provided satisfactory results.

6 citations

Journal Article•DOI•

Effect of reference set selection on speaker dependent speech recognition

Zongge Li, F. Alleva, Raj Reddy¹

Carnegie Mellon University¹

01 May 1981-Journal of the Acoustical Society of America

TL;DR: For a speaker-dependent system, an algorithm chooses a reference template for each vocabulary word from a set of N exemplars so that the resulting reference set minimizes the worst matching behavior and the total error over the N sets of exemplars.

Abstract: Presented here, for a speaker dependent system, is an algorithm that chooses a reference template for each word in the vocabulary from a set of N exemplars. The goal of the algorithm is to produce a reference set that minimizes the worst matching behavior and total error over the N sets of exemplars. The results of the experiments presented here show a reduction in the average error rate from 16.4% to 10.2% over a set of 4 male speakers and 4 female speakers.
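
The abstract states the selection goal but not the procedure. One plausible reading for a single vocabulary word, offered only as an assumption, is a minimax choice: pick the exemplar whose worst-case distance to the other exemplars is smallest, using total distance to break ties. The function name, the scalar features, and the distance function below are all hypothetical.

```python
def select_reference(exemplars, distance):
    """Return the index of the exemplar with the smallest worst-case distance
    to the other exemplars; ties are broken by the smallest total distance."""
    def score(i):
        dists = [distance(exemplars[i], other)
                 for k, other in enumerate(exemplars) if k != i]
        return (max(dists), sum(dists))
    return min(range(len(exemplars)), key=score)

# Toy example: four utterances of one word, reduced to a single number each.
exemplars = [1.0, 2.0, 2.5, 6.0]
chosen = select_reference(exemplars, lambda a, b: abs(a - b))
```

In this toy case the third exemplar (index 2) is chosen, because its largest distance to any other exemplar, 3.5, is smaller than that of the alternatives. In a real recognizer `distance` would be the same template-matching score the recognizer uses, and the selection would be repeated for every word in the vocabulary.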

5 citations