A Generative Modeling Framework for Structured Hidden Speech Dynamics

Open Access Proceedings Article

TLDR

A structured speech model is outlined, equipped with long-contextual-span capabilities that are missing in the HMM approach, and the pros and cons of the structured generative modeling approach in comparison with the structured discriminative classification approach are discussed.

Abstract:

We outline a structured speech model, as a special and perhaps extreme form of probabilistic generative modeling. The model is equipped with long-contextual-span capabilities that are missing in the HMM approach. Compact (and physically meaningful) parameterization of the model is made possible by the continuity constraint in the hidden vocal tract resonance (VTR) domain. The target-directed VTR dynamics jointly characterize coarticulation and incomplete articulation (reduction). Preliminary evaluation results are presented on the standard TIMIT phonetic recognition task, showing the best result in this task reported in the literature without using many heterogeneous classifier combinations. The pros and cons of our structured generative modeling approach, in comparison with the structured discriminative classification approach, are discussed.
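As a rough illustration of what "target-directed" hidden dynamics can mean, the following sketch simulates a first-order recursion in which a hidden resonance value moves a fixed fraction of the way toward the current segment's target at every frame. The recursion, the `rate` value, the frame counts, and the target frequencies are all assumptions chosen for illustration; the abstract does not give the model's equations, so this is not the authors' formulation.

```python
def target_directed_trajectory(targets, durations, rate=0.8, start=500.0):
    """Simulate z[t] = rate * z[t-1] + (1 - rate) * target, segment by segment.

    The recursion and its parameters are illustrative assumptions only.
    """
    trajectory = []
    z = start
    for target, n_frames in zip(targets, durations):
        for _ in range(n_frames):
            # Each frame closes a fixed fraction of the remaining gap to the target.
            z = rate * z + (1.0 - rate) * target
            trajectory.append(z)
    return trajectory


# Hypothetical resonance targets (Hz) for three consecutive segments.
targets = [700.0, 1200.0, 700.0]

# Same targets, but the middle segment is much shorter in the second case.
slow = target_directed_trajectory(targets, [30, 30, 30])
fast = target_directed_trajectory(targets, [30, 4, 30])

peak_slow = max(slow)  # long middle segment: trajectory nearly reaches 1200 Hz
peak_fast = max(fast)  # short middle segment: trajectory falls well short of 1200 Hz
print(f"peak with long middle segment:  {peak_slow:.1f} Hz")
print(f"peak with short middle segment: {peak_fast:.1f} Hz")
```

In this toy setting, the value at the start of each segment depends on where the previous segment left off (a simple analogue of coarticulation), and a short segment ends before its target is reached (a simple analogue of reduction), so both effects come from one continuous trajectory.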

Citations

Journal Article

Discriminative learning in sequential pattern recognition

Xiaodong He, +2 more

- 26 Sep 2008 -

IEEE Signal Processing Magazine

TL;DR: The main goal of this article is to provide an underlying foundation for MMI, MCE, and MPE/MWE at the objective function level to facilitate the development of new parameter optimization techniques and to incorporate other pattern recognition concepts, e.g., discriminative margins [66], into the current discriminative learning paradigm.

Book Chapter

Phoneme Recognition on the TIMIT Database

Carla Lopes, +1 more

TL;DR: Speech recognition based on phones is very attractive since it is inherently free from vocabulary limitations. However, the performance of large-vocabulary ASR systems depends on the quality of the phone recognizer, so research teams continue developing phone recognizers in order to enhance their performance as much as possible.

Book

Discriminative learning for speech recognition

Xiaodong He, +1 more

TL;DR: This book introduces the background and mainstream methods of probabilistic modeling and discriminative parameter optimization for speech recognition and includes technical details on the derivation of the parameter optimization formulas for exponential-family distributions, discrete hidden Markov models (HMMs), and continuous-density HMMs in discriminative learning.

Posted Content

Phoneme recognition in TIMIT with BLSTM-CTC

Santiago Fernández, +2 more

- 15 Apr 2008 -

arXiv: Computation and Language

TL;DR: The performance of a recurrent neural network is compared with the best results published so far on phoneme recognition in the TIMIT database and a single recurrent network is applied to the same task.

Patent

Minimum classification error training with growth transformation optimization

Xiaodong He, +1 more

TL;DR: In this paper, the hidden Markov model (HMM) parameters are updated using update equations based on growth transformation optimization of a minimum classification error objective function using the list of N-best word sequences obtained by decoding the training data with the current-iteration HMM parameters.

References

Book

Fundamentals of speech recognition

Lawrence R. Rabiner, +1 more

TL;DR: This book presents a meta-modelling framework for speech recognition that automates the labor-intensive, time-consuming, and therefore expensive process of manually modeling speech.

Journal Article

From HMM's to segment models: a unified view of stochastic modeling for speech recognition

Mari Ostendorf, +2 more

- 01 Sep 1996 -

IEEE Transactions on Speech and Audio Pr...

TL;DR: A general stochastic model is described that encompasses most of the models proposed in the literature for speech recognition, pointing out similarities in terms of correlation and parameter tying assumptions, and drawing analogies between segment models and HMMs.

Proceedings Article

Hidden conditional random fields for phone classification.

Asela Gunawardana, +3 more

TL;DR: This paper presents results on the TIMIT phone classification task and shows that HCRFs outperform comparable ML and CML/MMI trained HMMs and can handle complex features without any change in training procedure.

Journal Article

Structured language modeling

Ciprian Chelba, +1 more

- 01 Oct 2000 -

Computer Speech & Language

TL;DR: An attempt at using syntactic structure in natural language to improve language models for speech recognition, based on an original probabilistic parameterization of a shift-reduce parser.

Journal Article

A probabilistic framework for segment-based speech recognition

James Glass

- 01 Apr 2003 -

Computer Speech & Language

TL;DR: This work examines a maximum a posteriori decoding strategy for feature-based recognizers and develops a normalization criterion useful for a segment-based speech recognizer.
