Open Access, Proceedings Article

Optimization of speech parameter weighting for CDHMM word recognition

TLDR
This paper proposes a method to automatically estimate the optimum weighting of static and dynamic features in a speech recognition system based on Continuous-Density Hidden Markov Modelling (CDHMM), which is widely used in speech recognition.
Abstract
Dynamic speech features are routinely used in current speech recognition systems in combination with short-term (static) spectral features. The aim of this paper is to propose a method to automatically estimate the optimum weighting of static and dynamic features in a speech recognition system. The recognition system considered in this paper is based on Continuous-Density Hidden Markov Modelling (CDHMM), which is widely used in speech recognition. Our approach consists of 1) adding two new parameters to each state of each model that weight the two kinds of speech features, and 2) estimating those parameters by means of a discriminative training algorithm that minimizes the recognition error using the recently proposed Generalized Probabilistic Descent (GPD) method. Experimental results in speaker-independent digit recognition show an important increase in recognition accuracy.
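The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give the exact formulas. The sketch assumes that the two per-state weights act as exponents on the static and dynamic stream likelihoods (that is, as multipliers in the log domain), that the misclassification measure compares the correct model against a single best rival, and that the loss is a sigmoid. All function names, the step size `step`, and the slope `gamma` are hypothetical.

```python
import math

def diag_gauss_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def weighted_loglik(w, stream_ll):
    """Assumed form: w_static * log b_static + w_dynamic * log b_dynamic."""
    return w[0] * stream_ll[0] + w[1] * stream_ll[1]

def sigmoid_loss(d, gamma=1.0):
    """Smooth stand-in for the 0/1 recognition error (d > 0 means an error)."""
    return 1.0 / (1.0 + math.exp(-gamma * d))

def gpd_step(w_correct, w_rival, ll_correct, ll_rival, step=0.01, gamma=1.0):
    """One gradient-descent update of the stream weights on the sigmoid loss.

    w_* are (static, dynamic) weight pairs; ll_* are the matching pairs of
    per-stream log-likelihoods for the correct model and its best rival.
    """
    # Misclassification measure: rival score minus correct score (assumption).
    d = weighted_loglik(w_rival, ll_rival) - weighted_loglik(w_correct, ll_correct)
    loss = sigmoid_loss(d, gamma)
    slope = gamma * loss * (1.0 - loss)  # derivative of the sigmoid loss w.r.t. d
    # dd/dw_correct = -ll_correct and dd/dw_rival = +ll_rival, so descend on both.
    new_correct = tuple(w + step * slope * ll for w, ll in zip(w_correct, ll_correct))
    new_rival = tuple(w - step * slope * ll for w, ll in zip(w_rival, ll_rival))
    return new_correct, new_rival, loss
```

Under these assumptions, each update lowers the misclassification measure for the training token it was computed on, which is the sense in which the weights are trained to reduce recognition error rather than to maximize likelihood.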


Citations
Journal Article

Recent advances in the automatic recognition of audiovisual speech

TL;DR: The main components of audiovisual automatic speech recognition (ASR) are reviewed and novel contributions in two main areas are presented: first, the visual front-end design, based on a cascade of linear image transforms of an appropriate video region of interest, and subsequently, audiovisual speech integration.

Audio-Visual Automatic Speech Recognition: An Overview

TL;DR: Novel, non-traditional approaches that use sources of information orthogonal to the acoustic input are needed to achieve ASR performance that is closer to the level of human speech perception and robust enough to be deployable in field applications.
Journal Article

Pattern recognition using a family of design algorithms based upon the generalized probabilistic descent method

TL;DR: This paper provides a comprehensive introduction to a novel approach to pattern recognition which is based on the generalized probabilistic descent method (GPD) and its related design algorithms.
Proceedings Article

Discriminative training of HMM stream exponents for audio-visual speech recognition

TL;DR: The use of discriminative training by means of the generalized probabilistic descent (GPD) algorithm to estimate hidden Markov model (HMM) stream exponents for audio-visual speech recognition is proposed.
Proceedings Article

Maximum likelihood weighting of dynamic speech features for CDHMM speech recognition

TL;DR: This paper proposes a method to automatically estimate an optimum state-dependent stream weighting in a continuous density hidden Markov model (CDHMM) recognition system by means of a maximum-likelihood based training algorithm.