Book Chapter
Audio-to-Visual Conversion Using Hidden Markov Models
Soonkyu Lee, Dongsuk Yook +1 more
- pp 563-570
TLDR
Two approaches to using hidden Markov models (HMMs) to convert audio signals to a sequence of visemes are compared, and it is found that the error rates can be reduced to 20.5% and 13.9%, respectively.
Citations
Journal Article
Lipreading With Local Spatiotemporal Descriptors
TL;DR: Local spatiotemporal descriptors are presented to represent and recognize isolated spoken phrases based solely on visual input; the approach offers local processing and robustness to monotonic gray-scale changes.
Journal Article
Audiovisual Fusion: Challenges and New Approaches
TL;DR: This review addresses issues in AV fusion in the context of AV speech processing, especially speech recognition, where one of the issues is that the modalities interact but also sometimes appear to desynchronize from each other.
Proceedings Article
Decision level combination of multiple modalities for recognition and analysis of emotional expression
TL;DR: This work models face, voice and head movement cues for emotion recognition and fuses classifiers using a Bayesian framework; the results suggest a positive correlation between the number of classifiers that performed well and the perceptual salience of the expressed emotion.
Journal Article
A coupled HMM approach to video-realistic speech animation
Lei Xie, Zhi-Qiang Liu +1 more
TL;DR: The proposed coupled hidden Markov model (CHMM) approach to video-realistic speech animation indicates that explicitly modelling audio-visual speech is promising for speech animation.
Proceedings Article
Phoneme-to-viseme mapping for visual speech recognition
Luca Cappelletta, Naomi Harte +1 more
TL;DR: These initial experiments demonstrate that the choice of visual unit requires more careful attention in audio-visual speech recognition system development, and the best visual-only recognition on the VidTIMIT database is achieved using a linguistically motivated viseme set.
References
Journal Article
A tutorial on hidden Markov models and selected applications in speech recognition
TL;DR: In this paper, the authors provide an overview of the basic theory of hidden Markov models (HMMs) as originated by L.E. Baum and T. Petrie (1966) and give practical details on methods of implementation of the theory along with a description of selected applications of HMMs to distinct problems in speech recognition.
Journal Article
Error bounds for convolutional codes and an asymptotically optimum decoding algorithm
TL;DR: The upper bound is obtained for a specific probabilistic nonsequential decoding algorithm which is shown to be asymptotically optimum for rates above R_{0} and whose performance bears certain similarities to that of sequential decoding algorithms.
Book
Phoneme recognition using time-delay neural networks
TL;DR: The authors present a time-delay neural network (TDNN) approach to phoneme recognition which is characterized by two important properties: using a three-layer arrangement of simple computing units, a hierarchy can be constructed that allows for the formation of arbitrary nonlinear decision surfaces, which the TDNN learns automatically using error backpropagation.
Journal Article
Phoneme recognition using time-delay neural networks
TL;DR: In this article, the authors present a time-delay neural network (TDNN) approach to phoneme recognition, which is characterized by two important properties: (1) using a three-layer arrangement of simple computing units, a hierarchy can be constructed that allows for the formation of arbitrary nonlinear decision surfaces, which the TDNN learns automatically using error backpropagation; and (2) the time-delay arrangement enables the network to discover acoustic-phonetic features and the temporal relationships between them independently of position in time, so that they are not blurred by temporal shifts in the input.