Proceedings Article
An improved automatic lipreading system to enhance speech recognition
E. Petajan, B. Bischoff, David Bodoff, N. M. Brooke +3 more
- pp 19-25
TLDR
An improved version of a previously described automatic lipreading system has been developed which uses vector quantization, dynamic time warping, and a new heuristic distance measure to improve acoustic speech recognition.
Abstract:
Current acoustic speech recognition technology performs well with very small vocabularies in noise or with large vocabularies in very low noise. Accurate acoustic speech recognition in noise with vocabularies over 100 words has yet to be achieved. Humans frequently lipread the visible facial speech articulations to enhance speech recognition, especially when the acoustic signal is degraded by noise or hearing impairment. Automatic lipreading has been found to significantly improve acoustic speech recognition and could be advantageous in noisy environments such as offices, aircraft and factories.
An improved version of a previously described automatic lipreading system has been developed which uses vector quantization, dynamic time warping, and a new heuristic distance measure. This paper presents visual speech recognition results from multiple speakers under optimal conditions. Results from combined acoustic and visual speech recognition are also presented, which show significantly improved performance compared to the acoustic recognition system alone.
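The abstract names vector quantization and dynamic time warping as the system's building blocks but does not spell them out. The following is a minimal sketch of how the two techniques are commonly combined for template-based word recognition, not the authors' implementation. The function names, the toy codebook, and the templates are hypothetical, and plain Euclidean distance stands in for the paper's heuristic distance measure, which the abstract does not describe.

```python
import math

def quantize(frames, codebook):
    """Vector quantization: map each feature vector to the index of its
    nearest codebook entry (Euclidean distance, an assumption)."""
    return [
        min(range(len(codebook)), key=lambda k: math.dist(f, codebook[k]))
        for f in frames
    ]

def dtw_distance(a, b, cost):
    """Dynamic time warping between sequences a and b.

    Uses the basic step pattern (insertion, deletion, match) with no slope
    constraint; cost(x, y) is the local distance between two symbols.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = cost(a[i - 1], b[j - 1])
            d[i][j] = c + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

def recognize(frames, templates, codebook):
    """Return the template word whose code sequence is closest under DTW."""
    codes = quantize(frames, codebook)
    # Local cost between two codes: distance between their codebook vectors.
    cost = lambda p, q: math.dist(codebook[p], codebook[q])
    return min(templates, key=lambda w: dtw_distance(codes, templates[w], cost))

# Toy data (hypothetical): 2-D features, three codebook entries, two words.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
templates = {"a": [0, 1, 1, 0], "b": [0, 2, 2, 0]}
frames = [(0.1, 0.0), (0.9, 0.1), (0.1, 0.0)]
best = recognize(frames, templates, codebook)
```

In this toy case the three input frames quantize to the codes 0, 1, 0, and time warping lets them align to the longer template for "a" at zero cost, so "a" is chosen even though the sequence lengths differ.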
Citations
Book
Survey of the State of the Art in Human Language Technology
TL;DR: In this article, the authors present a glossary for language analysis and understanding in the context of spoken language input and output technologies, and evaluate their work with a set of annotated corpora.
Journal Article
Extraction of visual features for lipreading
TL;DR: Three methods for parameterizing lip image sequences for recognition using hidden Markov models are compared; two are top-down approaches that fit a model of the inner and outer lip contours and derive lipreading features from a principal component analysis of shape, or of shape and appearance, respectively.
Journal Article
Motion-based recognition: a survey
Claudette Cédras, Mubarak Shah +1 more
TL;DR: A review of recent developments in the computer vision aspect of motion-based recognition is presented, and several methods for the recognition of objects and motions are reported, including cyclic motion detection and recognition, lipreading, hand gesture interpretation, motion verb recognition and temporal texture classification.
Proceedings Article
"Eigenlips" for robust speech recognition
Christoph Bregler, Yochai Konig +1 more
TL;DR: This study improves the performance of a hybrid connectionist speech recognition system by incorporating visual information about the corresponding lip movements, using a new visual front end and an alternative architecture for combining the visual and acoustic information.
Proceedings Article
CUAVE: A new audio-visual database for multimodal human-computer interface research
TL;DR: A new audio-visual database is introduced that is flexible and fairly comprehensive, yet easily available to researchers on one DVD; it includes pairs of simultaneous speakers, the first documented database of this kind.
References
Journal Article
Dynamic programming algorithm optimization for spoken word recognition
TL;DR: This paper reports on an optimum dynamic programming (DP) based time-normalization algorithm for spoken word recognition, in which the warping function slope is restricted so as to improve discrimination between words in different categories.
Automatic lipreading to enhance speech recognition (speech reading)
TL;DR: An automatic lipreading system has been developed, and the combination of the acoustic and visual recognition candidates is shown to yield a final recognition accuracy which greatly exceeds the acoustic recognition accuracy alone.
Journal Article
Vector quantization: A pattern-matching technique for speech coding
Allen Gersho, Vladimir Cuperman +1 more
TL;DR: Recent results obtained in waveform coding of speech with vector quantization are reviewed, with vector quantization appearing to be a suitable coding technique which caters to this dual requirement of effective speech coding.
Journal Article
Coding of Two-Tone Images
TL;DR: The concepts and techniques of efficient coding for the transmission or storage of two-tone images, such as business documents and weather maps, are reviewed.
Proceedings Article
Coding Of Two-Tone Images
TL;DR: This work gives a brief overview of efficient coding methods for two-tone images, especially white block skipping and run-length coding.