Feature combination using linear discriminant analysis and its pitfalls.

Open Access Proceedings Article

TLDR

It is shown that the combination of acoustic features using LDA does not consistently lead to improvements in word error rate, although relative improvements in word error rate of up to 5% were observed for LDA-based combination of multiple acoustic features.

Abstract:

In this paper, Linear Discriminant Analysis (LDA) is investigated with respect to the combination of different acoustic features for automatic speech recognition. It is shown that the combination of acoustic features using LDA does not consistently lead to improvements in word error rate. A detailed analysis of the recognition results on the Verbmobil (VM II) and on the English portion of the European Parliament Plenary Sessions (EPPS) corpus is given. This includes an independent analysis of the effect of the dimension of the input to LDA, the effect of strongly correlated input features, as well as a detailed numerical analysis of the generalized eigenvalue problem underlying LDA. Relative improvements in word error rate of up to 5% were observed for LDA-based combination of multiple acoustic features.
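
The generalized eigenvalue problem mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `lda_transform`, the `ridge` parameter, and the synthetic data are assumptions made for illustration. It solves S_b v = λ S_w v (between-class and within-class scatter) through a Cholesky reduction and reports the condition number of S_w, which grows sharply when two nearly identical feature streams are concatenated.

```python
import numpy as np

def lda_transform(X, y, n_components, ridge=0.0):
    """Hypothetical helper: solve S_b v = lambda S_w v and return
    (projection matrix W, leading eigenvalues, condition number of S_w)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape
    global_mean = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        centered = Xc - class_mean
        Sw += centered.T @ centered
        diff = (class_mean - global_mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    Sw /= n
    Sb /= n
    # Optional ridge term (an assumption, not from the paper) to stabilise S_w.
    Sw_reg = Sw + ridge * np.eye(d)
    cond = np.linalg.cond(Sw_reg)
    # Reduce to a symmetric standard problem using S_w = L L^T.
    L = np.linalg.cholesky(Sw_reg)
    L_inv = np.linalg.inv(L)
    M = L_inv @ Sb @ L_inv.T
    M = (M + M.T) / 2.0  # enforce symmetry against round-off
    eigvals, U = np.linalg.eigh(M)
    order = np.argsort(eigvals)[::-1][:n_components]
    W = L_inv.T @ U[:, order]  # map back: v = L^{-T} u
    return W, eigvals[order], cond

# Synthetic three-class data; stream_b is an almost exact copy of stream_a,
# standing in for two strongly correlated acoustic feature streams.
rng = np.random.default_rng(0)
y = rng.integers(0, 3, size=600)
class_means = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
stream_a = class_means[y] + rng.normal(size=(600, 2))
stream_b = stream_a + 1e-4 * rng.normal(size=(600, 2))

W_single, ev_single, cond_single = lda_transform(stream_a, y, 2)
W_comb, ev_comb, cond_comb = lda_transform(np.hstack([stream_a, stream_b]), y, 2)

print("condition number of S_w, single stream:   ", cond_single)
print("condition number of S_w, combined streams:", cond_comb)
```

In this toy setting the combined input adds almost no discriminative information, yet S_w becomes far worse conditioned, which is one way the numerical side of LDA-based feature combination can go wrong.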

Citations

Proceedings Article

Gammatone Features and Feature Combination for Large Vocabulary Speech Recognition

Ralf Schlüter, +3 more

TL;DR: The gammatone features presented here lead to competitive results on the EPPS English task, and considerable improvements were obtained by subsequently combining them with a number of standard acoustic features, i.e. MFCC, PLP, MF-PLP, and VTLN plus voicedness.

Book

Handbook of Natural Language Processing and Machine Translation

Joseph Olive, +2 more

TL;DR: This comprehensive handbook, written by leading experts in the field, details the research conducted under the GALE (Global Autonomous Language Exploitation) program of the Defense Advanced Research Projects Agency (DARPA), while placing it in the context of previous research in the fields of natural language and signal processing, artificial intelligence and machine translation.

Dissertation

A log-linear discriminative modeling framework for speech recognition.

Georg Heigold, +1 more

TL;DR: A log-linear modeling framework is established in the context of discriminative training criteria, with examples from continuous speech recognition, part-of-speech tagging, and handwriting recognition, and the focus will be on the theoretical and experimental comparison of different training algorithms.

Journal Article

Combining Spectral Representations for Large-Vocabulary Continuous Speech Recognition

Giulia Garau, +1 more

01 Mar 2008

IEEE Transactions on Audio, Speech, and ...

TL;DR: The results indicate that combining conventional and pitch-synchronous acoustic feature sets using HLDA results in a consistent, significant decrease in word error rate across all three LVCSR tasks.

Proceedings Article

Hierarchical Bottle Neck Features for LVCSR

Christian Plahl, +2 more

TL;DR: Even though hierarchical and bottle neck processing perform equally well, the combination of both topologies improves the system by 5% relative, and the MFCC baseline system is improved by up to 20% relative.

References

Book

Matrix computations

Gene H. Golub

Journal Article

Numerical Recipes in C: The Art of Scientific Computing

Mary C. Seiler, +1 more

01 Sep 1989

Risk Analysis

Journal Article

Perceptual linear predictive (PLP) analysis of speech

Hynek Hermansky

01 Apr 1990

Journal of the Acoustical Society of Ame...

TL;DR: A new technique for the analysis of speech, the perceptual linear predictive (PLP) technique, is presented; it uses three concepts from the psychophysics of hearing to derive an estimate of the auditory spectrum and yields a low-dimensional representation of speech.

Proceedings Article

Linear discriminant analysis for improved large vocabulary continuous speech recognition

Reinhold Haeb-Umbach, +1 more

TL;DR: The interaction of linear discriminant analysis (LDA) and a modeling approach using continuous Laplacian mixture density HMM is studied experimentally and the largest improvements in speech recognition could be obtained when the classes for the LDA transform were defined to be sub-phone units.

Journal Article

The generalized Schur decomposition of an arbitrary pencil A – λB: robust software with error bounds and applications. Part I: theory and algorithms

James Demmel, +1 more

01 Jun 1993

ACM Transactions on Mathematical Softwar...

TL;DR: Robust software with error bounds for computing the generalized Schur decomposition of an arbitrary matrix pencil A – λB (regular or singular) is presented.

Speech Communication

Tandem connectionist feature extraction for conventional HMM systems

Hynek Hermansky, +2 more

Related Papers (5)

A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER)

Gammatone Features and Feature Combination for Large Vocabulary Speech Recognition

Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition

Tandem connectionist feature extraction for conventional HMM systems

Linear discriminant analysis for improved large vocabulary continuous speech recognition