Frame pruning for automatic speaker identification

Open Access Proceedings Article

- pp 1-4

TLDR

Frame pruning requires a prior frame-level likelihood normalization to make comparison between frames meaningful. Validation of the pruning procedure on 567 speakers leads to a significant improvement on TIMIT and NTIMIT (up to 30% error rate reduction on TIMIT).

Abstract:

In this paper, we propose a frame selection procedure for text-independent speaker identification. Instead of averaging the frame likelihoods along the whole test utterance, some of these are rejected (pruning) and the final score is computed with a limited number of frames. This pruning stage requires a prior frame-level likelihood normalization in order to make comparison between frames meaningful. This normalization procedure alone leads to a significant performance enhancement. As far as pruning is concerned, the optimal number of frames pruned is learned on a tuning data set for normal and telephone speech. Validation of the pruning procedure on 567 speakers leads to a significant improvement on TIMIT and NTIMIT (up to 30% error rate reduction on TIMIT).
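
The procedure described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state which normalization is used or which frames are rejected, so the choices here are assumptions made only for illustration: each frame is normalized by subtracting the log-sum-exp of its log-likelihoods over all speaker models, and for each speaker the lowest-scoring frames are discarded before averaging. All function names and the toy data are hypothetical, and the tuning step is a simple search over candidate pruning fractions rather than the authors' method.

```python
import math

def normalize_frames(loglik):
    """loglik[t][s] is the log-likelihood of frame t under speaker model s.
    Assumed normalization: subtract the per-frame log-sum-exp over speakers,
    so every frame is expressed on a comparable scale."""
    normalized = []
    for frame in loglik:
        peak = max(frame)
        lse = peak + math.log(sum(math.exp(v - peak) for v in frame))
        normalized.append([v - lse for v in frame])
    return normalized

def pruned_scores(loglik, prune_fraction):
    """Average the normalized frame scores per speaker after discarding
    the lowest-scoring fraction of frames (assumed pruning criterion)."""
    normalized = normalize_frames(loglik)
    n_frames = len(normalized)
    n_keep = max(1, n_frames - int(prune_fraction * n_frames))
    scores = []
    for s in range(len(normalized[0])):
        best_first = sorted((frame[s] for frame in normalized), reverse=True)
        scores.append(sum(best_first[:n_keep]) / n_keep)
    return scores

def identify(loglik, prune_fraction=0.0):
    """Return the index of the speaker model with the highest pruned score."""
    scores = pruned_scores(loglik, prune_fraction)
    return max(range(len(scores)), key=scores.__getitem__)

def tune_prune_fraction(tuning_set, candidates):
    """tuning_set holds (loglik, true_speaker_index) pairs. Return the
    candidate fraction with the fewest identification errors."""
    def errors(fraction):
        return sum(identify(ll, fraction) != truth for ll, truth in tuning_set)
    return min(candidates, key=errors)

# Toy utterance: speaker 0 fits three frames better, but one corrupted
# frame scores very badly under speaker 0's model.
frames = [[-1.0, -2.0], [-1.0, -2.0], [-1.0, -2.0], [-30.0, -1.0]]
print(identify(frames, 0.0), identify(frames, 0.25))
print(tune_prune_fraction([(frames, 0)], [0.0, 0.25]))
```

In this toy case the single corrupted frame dominates the unpruned average, while pruning a quarter of the frames lets the three consistent frames decide the result.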

Citations

On the use of score pruning in speaker verification for speaker dependent threshold estimation.

Javier R. Saeta, +1 more

TL;DR: Score pruning removes outliers before the threshold is estimated, which improves subsequent estimates; to address the problem of impostor data, a speaker-dependent threshold estimation that uses only client data is suggested.

Proceedings Article

Reliable Speaker Identification Using Multiple Microphones in Ubiquitous Robot Companion Environment

Mikyong Ji, +4 more

TL;DR: The speaker identification system can provide human-robot interaction with a reliable basic interface with high classification accuracy and improve the recognition performance of speaker identification with multiple microphones on the robot side in adverse distant-talking environments.

Decision threshold estimation and model quality evaluation techniques for speaker verification.

Javier Rodríguez Saeta

TL;DR: This doctoral thesis focuses on the training and decision stages of a speaker verification system.

Journal Article

The use of adaptive frame for speech recognition

Sam Kwong, +1 more

- 01 Jun 2001 -

EURASIP Journal on Advances in Signal Pr...

TL;DR: Word recognition experiments on TIMIT and NON-TIMIT with discrete Hidden Markov Models (HMMs) and continuous-density HMMs showed a steady performance improvement for open-set testing, supporting the effectiveness of the proposed adaptive frame length feature extraction scheme.

Proceedings Article

Text-Independent Speaker Identification using Soft Channel Selection in a Multi-Microphone Environment

Mikyong Ji, +3 more

TL;DR: A text-independent speaker identification system is proposed for multi-microphone environments; it applies soft channel selection before combining the identification results obtained from the individual microphones.

References

Journal Article

Statistical Pattern Recognition

Alex M. Andrew

- 01 Apr 2000 -

Kybernetes

TL;DR: Introduction to statistical pattern recognition and nonlinear discriminant analysis - statistical methods.

Book

Statistical pattern recognition

Keinosuke Fukunaga

Journal Article

Text-independent speaker identification

H. Gish, +1 more

- 01 Oct 1994 -

IEEE Signal Processing Magazine

TL;DR: A robust speaker-identification system is presented that was able to deal with various forms of anomalies that are localized in time, such as spurious noise events and crosstalk.

Proceedings Article

NTIMIT: a phonetically balanced, continuous speech, telephone bandwidth speech database

Charles Jankowski, +3 more

TL;DR: The creation of the network TIMIT (NTIMIT) database, which is the result of transmitting the TIMIT database over the telephone network, is described, including characteristics useful for speech analysis and recognition.

Journal Article

Second-order statistical measures for text-independent speaker identification

Frédéric Bimbot, +2 more

- 01 Aug 1995 -

Speech Communication

TL;DR: The use of some of the proposed measures as a reference benchmark to evaluate the intrinsic complexity of a given database under a given protocol is suggested as a conclusion to this work.

Related Papers (5)

Frame pruning for speaker recognition

Normalizations and selection of speech segments for speaker recognition scoring

Text Independent Speaker Recognition and Speaker Independent Speech Recognition Using Iterative Clustering Approach

Joint Speaker Counting, Speech Recognition, and Speaker Identification for Overlapped Speech of Any Number of Speakers.

On speaker-independent, speaker-dependent, and speaker-adaptive speech recognition