Topic

Viseme

About: Viseme is a research topic. Over the lifetime, 865 publications have been published within this topic receiving 17889 citations.


Papers
Proceedings ArticleDOI
06 Sep 2009
TL;DR: The probability of observing a context-dependent symbol is decomposed into the product of probabilities of observing the symbol and its contexts to allow wider contexts to be modelled without greatly compromising the model complexity.
Abstract: Recently, a Probabilistic Phone Mapping (PPM) model was proposed to facilitate cross-lingual automatic speech recognition using a foreign phonetic system. Under this framework, discrete hidden Markov models (HMMs) are used to map a foreign phone sequence to a target phone sequence. Context-sensitive mapping is made possible by expanding the discrete observation symbols to include the contexts of the foreign phones in which they appear in the sequence. Unfortunately, modelling the context dependencies jointly results in a dramatic increase in model parameters as wider contexts are used. In this paper, the probability of observing a context-dependent symbol is decomposed into the product of probabilities of observing the symbol and its contexts. This allows wider contexts to be modelled without greatly compromising the model complexity. It can be modelled conveniently using a multiple-stream discrete HMM system where the contexts are treated as independent streams. Experimental results are reported on the TIMIT English phone recognition task using the Czech, Hungarian and Russian foreign phone recognisers.
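The decomposition described above can be illustrated with a small sketch. The numbers and distributions below are made up for illustration; the point is the parameter count: a joint model over (left, centre, right) triples needs |phones|^3 entries per state, while the factorised multi-stream form needs only 3 × |phones|.

```python
# Hypothetical sketch of the context decomposition described above:
# instead of one joint distribution over (left, phone, right) triples,
# each HMM state keeps three independent discrete streams.
from itertools import product

phones = ["a", "b", "c"]

# Joint model: |phones|^3 parameters per state.
joint_size = len(phones) ** 3

# Factorised multi-stream model: 3 * |phones| parameters per state.
stream_size = 3 * len(phones)

def joint_prob(streams, left, centre, right):
    """P(left, centre, right) ~= P(left) * P(centre) * P(right)."""
    return streams["left"][left] * streams["centre"][centre] * streams["right"][right]

# Illustrative (made-up) per-stream distributions for one state.
uniform = {p: 1.0 / len(phones) for p in phones}
streams = {"left": uniform, "centre": uniform, "right": uniform}

# The factorised probabilities still sum to 1 over all context triples.
total = sum(joint_prob(streams, l, c, r) for l, c, r in product(phones, repeat=3))
print(joint_size, stream_size, round(total, 6))  # 27 9 1.0
```

With a realistic phone set of ~40 symbols the gap is 64,000 joint entries versus 120 per-stream entries per state, which is why the factorisation lets wider contexts be modelled cheaply.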

14 citations

Journal ArticleDOI
TL;DR: A significant increase of recognition effectiveness and processing speed were noted during tests – for properly selected CHMM parameters and an adequate codebook size, besides the use of the appropriate fusion of audio-visual characteristics.
Abstract: This paper focuses on combining audio-visual signals for Polish speech recognition under conditions of a highly disturbed audio speech signal. Recognition of audio-visual speech was based on combined hidden Markov models (CHMM). The described methods were developed for single isolated commands; nevertheless, their effectiveness indicates that they would also work similarly in continuous audio-visual speech recognition. The problem of visual speech analysis is very difficult and computationally demanding, mostly because of the extreme amount of data that needs to be processed. Therefore, the audio-video speech recognition method is used only while the audio speech signal is exposed to a considerable level of distortion. The authors' own methods of lip edge detection and visual characteristic extraction are proposed in this paper. Moreover, a method of fusing speech characteristics for an audio-video signal was proposed and tested. A significant increase in recognition effectiveness and processing speed was noted during tests, given properly selected CHMM parameters and an adequate codebook size, besides the use of an appropriate fusion of audio-visual characteristics. The experimental results were very promising and close to those achieved by leading scientists in the field of audio-visual speech recognition.
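One common way to fuse audio and visual streams in multi-stream/coupled HMM systems is a weighted sum of per-stream log-likelihoods, with the audio weight lowered as the audio channel degrades. The sketch below uses that generic scheme with made-up scores; it is not the paper's specific fusion method, only an illustration of the idea.

```python
def fused_log_likelihood(audio_ll, visual_ll, audio_weight):
    """Weighted sum of per-stream log-likelihoods, a common fusion
    scheme for multi-stream HMMs. The weight can be lowered when the
    audio channel is heavily distorted."""
    return audio_weight * audio_ll + (1.0 - audio_weight) * visual_ll

# Two candidate commands scored by each modality (made-up numbers).
scores = {
    "start": {"audio": -120.0, "visual": -80.0},
    "stop":  {"audio": -100.0, "visual": -110.0},
}

def recognise(scores, audio_weight):
    return max(scores, key=lambda w: fused_log_likelihood(
        scores[w]["audio"], scores[w]["visual"], audio_weight))

print(recognise(scores, 0.9))  # audio trusted: "stop" wins
print(recognise(scores, 0.2))  # distorted audio, visual stream decides: "start"
```

Note how the same scores yield different winners depending on the stream weight, which mirrors the paper's strategy of leaning on the visual channel only when the audio signal is considerably distorted.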

13 citations

Proceedings Article
01 Sep 1997
TL;DR: This paper presents a method for extracting articulatory parameters directly from raw images of the lips; the parameters are evaluated with an HMM-based visual speech recogniser, and the recognition scores obtained are compared to reference scores.
Abstract: This paper presents a method for the extraction of articulatory parameters from direct processing of raw images of the lips. The system architecture is made of three independent parts. First, a new greyscale mouth image is centred and downsampled. Second, the image is aligned and projected onto a basis of artificial images. These images are the eigenvectors computed from a PCA applied to a set of 23 reference lip shapes. Then, a multilinear interpolation predicts articulatory parameters from the image projection coefficients onto the eigenvectors. In addition, the projection coefficients and the predicted parameters were evaluated by an HMM-based visual speech recogniser. Recognition scores obtained with our method are compared to reference scores and discussed.
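The projection-then-predict pipeline can be sketched in a few lines. This is a toy with a 4-pixel "image" and two made-up orthonormal eigenvectors, and it substitutes a plain linear map for the paper's multilinear interpolation; it only shows the shape of the computation, not the actual system.

```python
def project(image, mean, eigenvectors):
    """Project a (flattened, greyscale) mouth image onto the PCA basis:
    coefficient c_k = <image - mean, e_k>."""
    centred = [p - m for p, m in zip(image, mean)]
    return [sum(c * e for c, e in zip(centred, vec)) for vec in eigenvectors]

def predict_parameters(coeffs, weights, bias):
    """Linear stand-in for the multilinear interpolation step: each
    articulatory parameter is a weighted sum of projection coefficients."""
    return [b + sum(w * c for w, c in zip(row, coeffs))
            for row, b in zip(weights, bias)]

# Toy 4-pixel "image" and 2 orthonormal eigenvectors (made-up values).
mean = [0.5, 0.5, 0.5, 0.5]
eigenvectors = [[0.5, 0.5, -0.5, -0.5],
                [0.5, -0.5, 0.5, -0.5]]
image = [1.0, 0.0, 1.0, 0.0]

coeffs = project(image, mean, eigenvectors)
params = predict_parameters(coeffs, [[2.0, 0.0], [0.0, 3.0]], [0.1, 0.2])
print(coeffs, params)
```

In the real system the eigenvectors come from a PCA over the 23 reference lip shapes and the images are full downsampled frames, but the per-frame computation reduces to exactly this: centre, project, then map coefficients to articulatory parameters.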

13 citations

Proceedings ArticleDOI
09 Dec 2002
TL;DR: An overview of the large scale national project entitled "Spontaneous speech: corpus and processing technology" in Japan is given and the major results of experiments that have been conducted so far are reported, including spontaneous presentation speech recognition, automatic speech summarization, and message-driven speech recognition.
Abstract: How to recognize and understand spontaneous speech is one of the most important issues in state-of-the-art speech recognition technology. In this context, a five-year large scale national project entitled "Spontaneous speech: corpus and processing technology" started in Japan in 1999. This paper gives an overview of the project and reports on the major results of experiments that have been conducted so far at Tokyo Institute of Technology, including spontaneous presentation speech recognition, automatic speech summarization, and message-driven speech recognition. The paper also discusses the most important research problems to be solved in order to achieve ultimate spontaneous speech recognition systems.

13 citations

Patent
05 Feb 2015
TL;DR: In this article, a method and apparatus for speech recognition, for generating a speech recognition engine, and the speech recognition engine itself are presented, in which the engine obtains a phoneme sequence from the speech input and provides the recognition result based on the phonetic distance of the phoneme sequence.
Abstract: A method and apparatus for speech recognition and for generation of speech recognition engine, and a speech recognition engine are provided. The method of speech recognition involves receiving a speech input, transmitting the speech input to a speech recognition engine, and receiving a speech recognition result from the speech recognition engine, in which the speech recognition engine obtains a phoneme sequence from the speech input and provides the speech recognition result based on a phonetic distance of the phoneme sequence.
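A "phonetic distance" between phoneme sequences is often realised as an edit distance; the sketch below uses plain Levenshtein distance with hypothetical phoneme labels and a two-word toy lexicon. The patent's actual metric may weight substitutions by phonetic similarity rather than counting them uniformly.

```python
def phoneme_distance(seq_a, seq_b):
    """Levenshtein (edit) distance between two phoneme sequences,
    computed row by row with O(len(seq_b)) memory."""
    prev = list(range(len(seq_b) + 1))
    for i, a in enumerate(seq_a, 1):
        cur = [i]
        for j, b in enumerate(seq_b, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (a != b)))   # substitution
        prev = cur
    return prev[-1]

# Match a recognised phoneme sequence against a small made-up lexicon.
lexicon = {"viseme": ["v", "ih", "z", "iy", "m"],
           "listen": ["l", "ih", "s", "ah", "n"]}
recognised = ["v", "ih", "s", "iy", "m"]
best = min(lexicon, key=lambda w: phoneme_distance(recognised, lexicon[w]))
print(best, phoneme_distance(recognised, lexicon[best]))  # viseme 1
```

Picking the lexicon entry with the smallest distance to the recognised sequence is the simplest form of the "result based on phonetic distance" idea described in the abstract.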

13 citations


Network Information
Related Topics (5)
Vocabulary
44.6K papers, 941.5K citations
78% related
Feature vector
48.8K papers, 954.4K citations
76% related
Feature extraction
111.8K papers, 2.1M citations
75% related
Feature (computer vision)
128.2K papers, 1.7M citations
74% related
Unsupervised learning
22.7K papers, 1M citations
73% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    7
2022    12
2021    13
2020    39
2019    19
2018    22