Optimization of speech parameter weighting for CDHMM word recognition

Open Access Proceedings Article

Francisco Javier Hernando Pericás, +2 more

- pp 105-108

TLDR

The paper proposes a method to automatically estimate the optimum weighting of static and dynamic features in a speech recognition system based on Continuous-Density Hidden Markov Modelling (CDHMM).

Abstract:

Dynamic speech features are routinely used in current speech recognition systems in combination with short-term (static) spectral features. This paper proposes a method to automatically estimate the optimum weighting of static and dynamic features in a speech recognition system. The recognition system considered is based on Continuous-Density Hidden Markov Modelling (CDHMM), which is widely used in speech recognition. The approach consists of 1) adding two new parameters to each state of each model that weight the two kinds of speech features, and 2) estimating those parameters with a discriminative training algorithm that minimizes the recognition error using the recently proposed Generalized Probabilistic Descent (GPD) method. Experimental results in speaker-independent digit recognition show a notable increase in recognition accuracy.
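The two steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact weighting formula, so the code assumes the weights act as exponents on the static and dynamic stream likelihoods (multipliers in the log domain). For brevity it also assumes one state per word model, a sigmoid loss, and a single best competitor. All function names and the `lr` and `gamma` parameters are hypothetical.

```python
import math

def diag_gauss_loglik(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def weighted_state_loglik(static_ll, dynamic_ll, w_static, w_dynamic):
    """Assumed form: w_static * log b_static + w_dynamic * log b_dynamic."""
    return w_static * static_ll + w_dynamic * dynamic_ll

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gpd_step(weights, stream_ll, correct, lr=0.1, gamma=1.0):
    """One GPD-style gradient step on the stream weights.

    weights:   dict word -> [w_static, w_dynamic] (one state per word here)
    stream_ll: dict word -> (static_ll, dynamic_ll) for one training token
    correct:   label of the training token
    Returns the updated weights and the loss before the update.
    """
    scores = {
        c: weighted_state_loglik(*stream_ll[c], *weights[c]) for c in weights
    }
    # Best competing word model (highest score among the wrong ones).
    rival = max((c for c in scores if c != correct), key=scores.get)
    # Misclassification measure: positive when the rival wins.
    d = scores[rival] - scores[correct]
    loss = sigmoid(gamma * d)
    slope = gamma * loss * (1.0 - loss)  # derivative of the loss w.r.t. d
    new = {c: list(w) for c, w in weights.items()}
    for k in range(2):
        # d(score)/d(weight k) is the log-likelihood of stream k.
        new[correct][k] += lr * slope * stream_ll[correct][k]
        new[rival][k] -= lr * slope * stream_ll[rival][k]
    return new, loss

if __name__ == "__main__":
    weights = {"one": [1.0, 1.0], "two": [1.0, 1.0]}
    stream_ll = {"one": (-10.0, -4.0), "two": (-9.0, -6.0)}
    updated, loss_before = gpd_step(weights, stream_ll, correct="one")
    _, loss_after = gpd_step(updated, stream_ll, correct="one")
    print(loss_before, loss_after)
```

In a full CDHMM system the per-frame weighted scores would be accumulated along the best state path of each word model, with a separate weight pair per state. The sketch collapses that to a single score per word to keep the gradient step visible.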

Citations

Journal Article

Recent advances in the automatic recognition of audiovisual speech

Gerasimos Potamianos, +4 more

TL;DR: The main components of audiovisual automatic speech recognition (ASR) are reviewed and novel contributions in two main areas are presented: first, the visual front-end design, based on a cascade of linear image transforms of an appropriate video region of interest, and subsequently, audiovisual speech integration.

Audio-Visual Automatic Speech Recognition: An Overview

Gerasimos Potamianos, +3 more

TL;DR: Novel, non-traditional approaches that use sources of information orthogonal to the acoustic input are needed to achieve ASR performance that is closer to the level of human speech perception and robust enough to be deployed in field applications.

Journal Article

Pattern recognition using a family of design algorithms based upon the generalized probabilistic descent method

Shigeru Katagiri, +2 more

TL;DR: This paper provides a comprehensive introduction to a novel approach to pattern recognition which is based on the generalized probabilistic descent method (GPD) and its related design algorithms.

Proceedings Article

Discriminative training of HMM stream exponents for audio-visual speech recognition

Gerasimos Potamianos, +1 more

TL;DR: The use of discriminative training by means of the generalized probabilistic descent (GPD) algorithm to estimate hidden Markov model (HMM) stream exponents for audio-visual speech recognition is proposed.

Proceedings Article

Maximum likelihood weighting of dynamic speech features for CDHMM speech recognition

Javier Hernando

TL;DR: This paper proposes a method to automatically estimate an optimum state-dependent stream weighting in a continuous density hidden Markov model (CDHMM) recognition system by means of a maximum-likelihood based training algorithm.


Related Papers (5)

Performance of HMM-based speech recognizers with discriminative state-weights

Evaluation of soft segment modeling on a phoneme recognition system

A minimum error rate pattern recognition approach to speech recognition

Discriminative template training for dynamic programming speech recognition

An improved training algorithm in HMM-based speech recognition