Showing papers on "Speaker diarisation" published in 1982

Patent•DOI•

Method and apparatus for text-independent speaker recognition

03 Nov 1982-Journal of the Acoustical Society of America

TL;DR: In this article, a method and apparatus for recognizing an unknown speaker from a plurality of speaker candidates is presented, where portions of speech from the speaker candidates and from the unknown speaker are sampled and digitized.

Abstract: A method and apparatus for recognizing an unknown speaker from a plurality of speaker candidates. Portions of speech from the speaker candidates and from the unknown speaker are sampled and digitized. The digitized samples are converted into frames of speech, each frame representing a point in an LPC-12 multi-dimensional speech space. Using a character covering algorithm, a set of frames of speech is selected, called characters, from the frames of speech of all speaker candidates. The speaker candidates' portions of speech are divided into smaller portions called segments. A smaller plurality of model characters for each speaker candidate is selected from the character set. For each set of model characters the distance from each speaker candidate's frame of speech to the closest character in the model set is determined and stored in a model histogram. When a model histogram is completed for a segment, a distance D is found whereby at least a majority of frames have distances greater than D. The mean distance value of D and its variance across all segments for both speaker and imposter are then calculated. These values are added to the set of model characters to form the speaker model. To perform recognition, the frames of the unknown speaker are buffered as they are received and compared with the sets of model characters to form model histograms for each speaker. A likelihood ratio is formed. The speaker candidate with the highest likelihood ratio is chosen as the unknown speaker.

44 citations

Journal Article•DOI•

Speaker independent connected word recognition using a syntax-directed dynamic programming procedure

C. Myers¹, Stephen E. Levinson²•Institutions (2)

Massachusetts Institute of Technology¹, Alcatel-Lucent²

01 Aug 1982-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: A method for speaker independent connected word recognition is described, in which speaker independence is achieved by clustering isolated word utterances of a 100 speaker population and a syntax-directed dynamic programming algorithm matches the resulting isolated word templates to sentence length utterances.

Abstract: A method for speaker independent connected word recognition is described. Speaker independence is achieved by clustering isolated word utterances of a 100 speaker population. Connected word recognition is based on a syntax-directed dynamic programming algorithm which matches the isolated word templates to sentence length utterances. The method has been tested on an artificial task-oriented language based on a 127 word vocabulary. Four subjects, two men and two women, spoke a total of 209 sentences comprising 1750 words. At an average speaking rate of 171 words/min over dialed-up telephone lines, a correct word recognition rate of 97 percent was observed.

27 citations

Proceedings Article•DOI•

Speaker adaptation by a linear transformation with optimised parameters

J. Jaschul¹•Institutions (1)

Technische Universität München¹

01 May 1982

TL;DR: Speaker adaptation by a general linear matrix transformation, with parameters determined by mean square error optimisation, is described. Because phoneme class-specific adaptation is costly, a grouping of phonemes is proposed so that one adaptation parameter set is used for all phonemes that belong to any one group.

Abstract: Speaker dependence of automatic speech recognition systems can be reduced by applying speaker-specific transformations to adapt the speech signal of a new speaker to that of the reference speaker. Initial investigations showed that speaker adaptation can be performed by transformations using spectral weighting and spectral warping. These heuristic methods can be substituted by a general linear matrix transformation, the parameters of which are determined by mean square error optimisation. The improvement of the recognition rate achievable by this matrix transformation is very high, but the method needs a large learning set. This can be reduced by restriction of the matrix to a band including the main diagonal in the middle. This banded matrix yields results close to those of the general matrix. Adaptation can be performed speaker-specifically as well as speaker- and class-specifically. As the cost of phoneme class-specific adaptation is very high, a grouping of phonemes is proposed so that one adaptation parameter set is used for all phonemes that belong to any one group.
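
The banded linear transformation described in this abstract can be illustrated with a minimal sketch. It is not the author's implementation: the function names (`solve`, `fit_banded`, `transform`), the toy three-channel data, and the use of plain least squares via normal equations are all assumptions made for illustration. Each row of the matrix is fitted independently, using only the input channels within `half_width` of the main diagonal.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]  # assumes a non-singular system
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_banded(xs, ys, half_width):
    """Least-squares matrix mapping xs to ys, zero outside the band |i - j| <= half_width."""
    dim = len(xs[0])
    matrix = [[0.0] * dim for _ in range(dim)]
    for i in range(dim):
        cols = range(max(0, i - half_width), min(dim, i + half_width + 1))
        # Normal equations for output channel i, restricted to the band columns.
        gram = [[sum(x[p] * x[q] for x in xs) for q in cols] for p in cols]
        rhs = [sum(x[p] * y[i] for x, y in zip(xs, ys)) for p in cols]
        for j, coeff in zip(cols, solve(gram, rhs)):
            matrix[i][j] = coeff
    return matrix


def transform(matrix, x):
    """Apply the adaptation matrix to one feature vector."""
    return [sum(a * v for a, v in zip(row, x)) for row in matrix]


# Hypothetical toy data: new-speaker vectors xs and reference-speaker vectors ys.
true_matrix = [[2.0, 0.5, 0.0], [0.0, 1.0, -1.0], [0.0, 0.25, 3.0]]
xs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 2.0, 3.0]]
ys = [transform(true_matrix, x) for x in xs]
fitted = fit_banded(xs, ys, half_width=1)
```

Restricting each row to its band reduces the number of free parameters per output channel from `dim` to at most `2 * half_width + 1`, which is one way to read the abstract's remark that the band restriction reduces the learning set required.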

15 citations

Journal Article•DOI•

Report of the Speaker

Lawrence Hartmann

01 Oct 1982-American Journal of Psychiatry

13 citations

Journal Article•DOI•

Text‐independent speaker recognition with short utterances

K. P. Li, E. H. Wrench

01 Nov 1982-Journal of the Acoustical Society of America

TL;DR: A new approach to text‐independent speaker recognition, developed to perform with short unknown utterances, models the spectral traits of a speaker with multiple sub‐models rather than using a single statistical distribution as done with previous approaches.

Abstract: This paper presents a new approach to text‐independent speaker recognition. The technique, developed to perform with short unknown utterances, models the spectral traits of a speaker with multiple sub‐models rather than using a single statistical distribution as done with previous approaches. The recognition is based on the statistical distribution of the distances between the unknown speaker and each of the speaker models. Only frames that are close to one of the speaker's sub‐models are considered in the recognition decision, so that speech events not encountered in the training data do not bias the recognition. The technique has been tested on a conversational data base. Models were generated using 100 s of speech from each of 11 male talkers. Unknown speech was obtained one week after the model data. Recognition accuracies of 96%, 87%, and 79% were obtained for unknown speech durations of 10, 5, and 3 s, respectively. The use of multiple sub‐models to characterize spectral traits results in improved discrimination between speakers, particularly when short speech segments are recognized. [Work supported by U. S. Air Force, Rome Air Development Center.]
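
The idea of scoring only those frames that lie close to one of a speaker's sub-models can be sketched as follows. This is a simplified illustration, not the authors' method: the Euclidean distance, the fixed `max_distance` threshold, the mean of the retained distances as a score, and all names and data are assumptions. The abstract bases the decision on the statistical distribution of the distances, which this sketch does not model.

```python
import math


def nearest_distance(frame, sub_models):
    """Distance from one frame to the closest sub-model centre."""
    return min(math.dist(frame, centre) for centre in sub_models)


def score(frames, sub_models, max_distance):
    """Mean nearest-sub-model distance over frames within max_distance.

    Frames farther than max_distance from every sub-model are ignored, so
    speech events absent from the training data do not affect the score.
    Returns infinity when no frame is close enough.
    """
    distances = (nearest_distance(f, sub_models) for f in frames)
    kept = [d for d in distances if d <= max_distance]
    return sum(kept) / len(kept) if kept else math.inf


def identify(frames, speaker_models, max_distance):
    """Return the speaker whose sub-models give the lowest score."""
    return min(
        speaker_models,
        key=lambda name: score(frames, speaker_models[name], max_distance),
    )


# Hypothetical two-dimensional sub-model centres for two speakers.
models = {
    "A": [(0.0, 0.0), (4.0, 0.0)],
    "B": [(0.0, 5.0), (4.0, 5.0)],
}
# Two frames near speaker A's sub-models and one outlier far from every model.
frames = [(0.1, 0.2), (3.9, 0.1), (10.0, 10.0)]
best = identify(frames, models, max_distance=1.0)
```

In this toy case the outlier frame is discarded for both speakers, and the remaining frames match only speaker A's sub-models.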

12 citations

Journal Article•DOI•

Text-independent speaker recognition: A review and some new results

Malayappan Shridhar¹, N. Mohankrishnan¹•Institutions (1)

University of Windsor¹

01 Dec 1982-Speech Communication

TL;DR: The development of a high accuracy (about 99%) text-independent speaker recognition system is discussed in this paper, and any two parameter sets of the first stage tests are combined logically to obtain a significantly higher recognition accuracy than is possible with any single speaker-sensitive parameter set.

12 citations

Book Chapter•DOI•

Speaker Recognition: A Survey

Patrick Corsi¹•Institutions (1)

IBM¹

01 Jan 1982

TL;DR: This paper presents a unified discussion of the scientific and practical issues in the field of speaker recognition, and distinguishes between the Verification and Identification tasks.

Abstract: This paper presents a unified discussion of the scientific and practical issues in the field of speaker recognition. Besides some background on speaker recognition by listening and visual analysis of spectrograms, we survey the computer recognition methods, and briefly discuss some technical aspects of various speaker recognizers. Methods for selecting an efficient set of features, and examples of results of experimental studies, are also presented. We then differentiate between the Verification and Identification tasks.

7 citations

Book Chapter•DOI•

Speaker Independent Connected Word Recognition

Stephen E. Levinson¹•Institutions (1)

Bell Labs¹

01 Jan 1982

Abstract: A method for speaker independent connected word recognition is described. Speaker independence is achieved by clustering isolated word utterances of a 100 speaker population. Connected word recognition is based on a syntax-directed dynamic programming algorithm which matches the isolated word templates to sentence length utterances. The method has been tested on a task oriented English-like language based on a 127 word vocabulary. Four subjects, two men and two women, spoke a total of 209 sentences comprising 1750 words. At an average speaking rate of 171 words per minute over dialed-up telephone lines, a correct word recognition rate of 97% was observed.

5 citations

Trying for speaker independence in the use of speaker dependent voice recognition equipment

Entner Roland, B. Jay Martin, N. D. Schwalm, G. K. Poock

01 Dec 1982

TL;DR: An experiment to determine the possibilities of obtaining some speaker independence using speaker dependent voice recognition equipment revealed about 99% accuracy when the user's speech templates were in memory along with those of four other users.

Abstract: This report discusses the results of an experiment to determine the possibilities of obtaining some speaker independence using speaker dependent voice recognition equipment. The results revealed about 99% accuracy when the user's speech templates were in memory along with those of four other users. If the user's voice patterns were not in memory but those of the four other users still were in memory, recognition accuracy still hovered around 95%. (Author)

4 citations

Journal Article•DOI•

Automatic speaker recognition using time alignment of spectrograms

Hermann Ney¹•Institutions (1)

Philips¹

01 Jan 1982-Speech Communication

TL;DR: New techniques for automatic speaker recognition from telephone speech are described, based on spectral analysis of fixed sentence-long utterances, which is carried out by a dynamic programming algorithm which minimizes timing differences between corresponding speech events.

3 citations

Journal Article•DOI•

What is speaker recognition

Roger W. Brown

01 Jun 1982-Journal of the International Phonetic Association

Journal Article•DOI•

Ergonomic Aspects for Improving Recognition Performance of Voice Input Systems

H. Mutschler¹•Institutions (1)

Fraunhofer Society¹

01 Sep 1982-IFAC Proceedings Volumes

TL;DR: In a series of experiments, 7 parameters (speaker training, long-term speech consistency, system training, speaker sex, vocabulary phonetics, vocabulary size, background noise) were tested in a simple word input task.
