Topic

Speaker recognition

About: Speaker recognition is a research topic. Over its lifetime, 14,990 publications have been published within this topic, receiving 310,061 citations.


Papers
Proceedings Article
01 Sep 2003
TL;DR: This paper replaces the HMM with pure example-based recognition built on DTW, and shows how the resulting explosion of the search space in large vocabulary recognition can be tackled with a data-driven approach that selects appropriate speech examples as candidates for DTW alignment.
Abstract: The dominant acoustic modeling methodology based on Hidden Markov Models is known to have certain weaknesses. Partial solutions to these flaws have been presented, but the fundamental problem remains: compression of the data to a compact HMM discards useful information such as time dependencies and speaker information. In this paper, we look at pure example-based recognition as a solution to this problem. By replacing the HMM with the underlying examples, all information in the training data is retained. We show how information about speaker and environment can be used, introducing a new interpretation of adaptation. The basis for the recognizer is the well-known DTW algorithm, which has often been used for small tasks. However, large vocabulary speech recognition introduces new demands, resulting in an explosion of the search space. We show how this problem can be tackled using a data-driven approach which selects appropriate speech examples as candidates for DTW alignment.

66 citations
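
For readers unfamiliar with the DTW algorithm named in the abstract above, the following is a minimal sketch of textbook dynamic time warping plus a nearest-example lookup. It is not the paper's recognizer: the function names `dtw_distance` and `nearest_example`, the use of 1-D numeric sequences, and the absolute-difference local cost are all assumptions made for illustration, and the data-driven candidate selection the paper describes is not reproduced here.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW distance between two 1-D numeric sequences.

    Assumption: local cost is the absolute difference between samples;
    real recognizers would compare feature vectors instead.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Allowed steps: advance in a only, in b only, or in both.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def nearest_example(query, examples):
    """Return the label of the stored (label, sequence) pair closest to query.

    Hypothetical helper: an exhaustive search over all examples, which is
    exactly what becomes too expensive for large vocabularies.
    """
    return min(examples, key=lambda ex: dtw_distance(query, ex[1]))[0]
```

The exhaustive loop in `nearest_example` costs one full DTW alignment per stored example, which gives a sense of why the abstract speaks of an explosion of the search space and of selecting candidates before alignment.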

01 Jan 1999
TL;DR: A new method for automatic speech recognition is developed where the natural statistical properties of speech are used to determine the probabilistic model, and can be seen as a general discriminative structure-learning procedure for Bayesian networks.
Abstract: The performance of state-of-the-art speech recognition systems is still far worse than that of humans. This is partly caused by the use of poor statistical models. In a general statistical pattern classification task, the probabilistic models should represent the statistical structure unique to and distinguishing those objects to be classified. In many cases, however, model families are selected without verification of their ability to represent vital discriminative properties. For example, Hidden Markov Models (HMMs) are frequently used in automatic speech recognition systems even though they possess conditional independence properties that might cause inaccuracies when modeling and classifying speech signals. In this work, a new method for automatic speech recognition is developed where the natural statistical properties of speech are used to determine the probabilistic model. Starting from an HMM, new models are created by adding dependencies only if they are not already well captured by the HMM, and only if they increase the model's ability to distinguish one object from another. Based on conditional mutual information, a new measure is developed and used for dependency selection. If dependencies are selected to maximize this measure, then the class posterior probability is better approximated leading to a lower Bayes classification error. The method can be seen as a general discriminative structure-learning procedure for Bayesian networks. In a large-vocabulary isolated-word speech recognition task, test results have shown that the new models can result in an appreciable word-error reduction relative to comparable HMM systems.

66 citations

Patent
23 Oct 1998
TL;DR: This patent covers the selection of superwords, meaning word combinations that are spoken so often that they are recognized as units or should have models to reflect them in the language model.
Abstract: This invention is directed to the selection of superwords based on a criterion relevant to speech recognition and understanding. Superwords are used to refer to those word combinations which are so often spoken that they are recognized as units or should have models to reflect them in the language model. The selected superwords are placed in a lexicon along with selected meaningful phrases. The lexicon is then used by a speech recognizer to improve recognition of input speech utterances for the proper routing of a user's task objectives.

66 citations

Proceedings ArticleDOI
04 May 2020
TL;DR: In this paper, the authors extend real-valued learned and parameterized filterbanks into complex-valued analytic filterbanks and define a set of corresponding representations and masking strategies, and evaluate these filterbanks on a newly released noisy speech separation dataset.
Abstract: Single-channel speech separation has recently made great progress thanks to learned filterbanks as used in ConvTasNet. In parallel, parameterized filterbanks have been proposed for speaker recognition where only center frequencies and bandwidths are learned. In this work, we extend real-valued learned and parameterized filterbanks into complex-valued analytic filterbanks and define a set of corresponding representations and masking strategies. We evaluate these filterbanks on a newly released noisy speech separation dataset (WHAM). The results show that the proposed analytic learned filterbank consistently outperforms the real-valued filterbank of ConvTasNet. Also, we validate the use of parameterized filterbanks and show that complex-valued representations and masks are beneficial in all conditions. Finally, we show that the STFT achieves its best performance for 2 ms windows.

66 citations

Proceedings Article
01 Jan 1999
TL;DR: A method for constructing "semantic similarities" between words and hence estimating a confidence is described, based on the construction of "metamodels," which generate alternative word hypotheses for an utterance.

66 citations


Network Information
Related Topics (5)
Feature vector
48.8K papers, 954.4K citations
83% related
Recurrent neural network
29.2K papers, 890K citations
82% related
Feature extraction
111.8K papers, 2.1M citations
81% related
Signal processing
73.4K papers, 983.5K citations
81% related
Decoding methods
65.7K papers, 900K citations
79% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    165
2022    468
2021    283
2020    475
2019    484
2018    420