Topic

Viseme

About: Viseme is a research topic. Over the lifetime, 865 publications have been published within this topic receiving 17889 citations.


Papers
01 Jan 2003
TL;DR: This paper addresses the problem of animating a talking figure, such as an avatar, using speech input only and shows that it is indeed possible to obtain visually relevant speech segmentation data directly from the purely acoustic speech signal.
Abstract: This paper addresses the problem of animating a talking figure, such as an avatar, using speech input only. The system that was developed is based on hidden Markov models for the acoustic observation vectors of the speech sounds that correspond to each of 16 visually distinct mouth shapes (visemes). The acoustic variability with context was taken into account by building acoustic viseme models that are dependent on the left and right viseme contexts. Our experimental results show that it is indeed possible to obtain visually relevant speech segmentation data directly from the purely acoustic speech signal.

9 citations
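The paper's contribution is acoustic modeling, but the step it shares with most viseme work is the many-to-one grouping of phonemes into visemes. The short Python sketch below shows that grouping with an illustrative mapping; the class names and memberships are assumptions, not the 16-viseme inventory used in the paper, and the context-dependent HMM modeling itself is not reproduced here.

# Illustrative phoneme-to-viseme grouping (assumed; not the paper's 16 classes).
PHONEME_TO_VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "t": "alveolar", "d": "alveolar", "s": "alveolar", "z": "alveolar",
    "aa": "open", "ae": "open",
    "uw": "rounded", "ow": "rounded",
}

def phonemes_to_visemes(phonemes):
    """Map a phoneme sequence to viseme labels and merge consecutive repeats,
    since neighbouring phonemes that share a mouth shape form one visual segment."""
    visemes = []
    for p in phonemes:
        v = PHONEME_TO_VISEME.get(p, "neutral")
        if not visemes or visemes[-1] != v:
            visemes.append(v)
    return visemes

print(phonemes_to_visemes(["b", "aa", "m", "p"]))  # ['bilabial', 'open', 'bilabial']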

Proceedings ArticleDOI
18 Sep 2006
TL;DR: This paper introduces statistical methods for speech recognition that extract the formants of the speech signal and analyze their behavior, a novel approach to speech recognition.
Abstract: Automatic Speech Recognition (ASR) is one of the most actively developing fields of modern science, with a wide range of applications. The purpose of this paper is to introduce statistical methods for speech recognition by extracting the formants of the speech signal and analyzing their behavior. It is a novel approach to speech recognition. The entire analysis is based on the first five formants of the speech. The method has been tested on Urdu speech, and the results obtained are of high accuracy. The algorithm produces 80 checkpoints before Urdu speech is recognized. The method can also be applied to speech recognition in other languages.

9 citations
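The recognition scheme above hinges on formant extraction. A standard way to estimate formants, sketched below with NumPy only, is to fit an LPC model to a windowed frame and read resonance frequencies from the angles of the prediction-polynomial roots; the model order, windowing and frequency cutoff are illustrative choices, not the paper's exact procedure.

import numpy as np

def lpc_coefficients(frame, order):
    """LPC via the autocorrelation (Yule-Walker) method: solve R a = r."""
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1:n + order]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])
    return np.concatenate(([1.0], -a))      # prediction polynomial A(z)

def formants(frame, sample_rate, order=12, n_formants=5):
    """Return the lowest n_formants resonance frequencies (Hz) of one frame."""
    a = lpc_coefficients(frame * np.hamming(len(frame)), order)
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]       # one root per complex-conjugate pair
    freqs = np.sort(np.angle(roots) * sample_rate / (2.0 * np.pi))
    freqs = freqs[freqs > 90.0]             # drop near-DC roots that are not formants
    return freqs[:n_formants].tolist()

A common rule of thumb is to set the LPC order to roughly 2 + sample_rate / 1000 so there are enough pole pairs to cover the formants of interest.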

Patent
03 Mar 2017
TL;DR: In this article, a system and method for animated lip synchronization is presented, which includes capturing speech input, parsing the speech input into phonemes, aligning the phonemes to the corresponding portions of the speech input, mapping the phonemes to visemes, synchronizing the visemes into viseme action units, and outputting the viseme action units.
Abstract: A system and method for animated lip synchronization. The method includes: capturing speech input; parsing the speech input into phonemes; aligning the phonemes to the corresponding portions of the speech input; mapping the phonemes to visemes; synchronizing the visemes into viseme action units, the viseme action units comprising jaw and lip contributions for each of the phonemes; and outputting the viseme action units.

9 citations
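The patented pipeline ends in viseme action units carrying jaw and lip contributions, and a small data model makes that flow concrete. In the Python sketch below, the viseme names and jaw/lip weights are hypothetical placeholders (the abstract does not disclose specific values), and alignment is assumed to have already produced timed phonemes.

from dataclasses import dataclass
from typing import List

@dataclass
class TimedPhoneme:
    phoneme: str
    start: float          # seconds into the captured speech
    end: float

@dataclass
class VisemeActionUnit:
    viseme: str
    start: float
    end: float
    jaw: float            # jaw-opening contribution, 0..1 (hypothetical scale)
    lip: float            # lip-shape contribution, 0..1 (hypothetical scale)

# Hypothetical lookup tables for illustration only.
PHONEME_TO_VISEME = {"AA": "aa", "B": "p", "M": "p", "F": "f", "V": "f", "S": "s"}
JAW_LIP_WEIGHTS = {"aa": (0.9, 0.3), "p": (0.1, 1.0), "f": (0.2, 0.8), "s": (0.3, 0.5)}

def to_action_units(phonemes: List[TimedPhoneme]) -> List[VisemeActionUnit]:
    """Map aligned phonemes to visemes and attach jaw/lip contributions."""
    units = []
    for p in phonemes:
        v = PHONEME_TO_VISEME.get(p.phoneme, "sil")
        jaw, lip = JAW_LIP_WEIGHTS.get(v, (0.0, 0.0))
        units.append(VisemeActionUnit(v, p.start, p.end, jaw, lip))
    return units

print(to_action_units([TimedPhoneme("B", 0.00, 0.08), TimedPhoneme("AA", 0.08, 0.24)]))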

Proceedings ArticleDOI
04 Oct 2004
TL;DR: The paper proposes an adaptive Mel-LPC (AMLPC) analysis method; Mel-LPC is an efficient time-domain technique for estimating warped predictors directly from input speech, and the adaptive variant yields a significant improvement in recognition accuracy over conventional LPC analysis and an improvement in error rate of about 10% over Mel-LPC analysis.
Abstract: This paper describes a new speech analysis method, the adaptive Mel-LPC (AMLPC) analysis method, which exploits human auditory characteristics. The Mel-LPC analysis method that we previously proposed is an efficient time-domain technique for estimating the warped predictors directly from input speech. However, the frequency resolution of the spectrum obtained by Mel-LPC analysis is constant regardless of the characteristics of the input speech at each analysis frame. In AMLPC analysis, it is possible to estimate the spectral coefficients with optimal frequency resolution according to the characteristics of the phoneme at each analysis frame, because the spectral slope and formant structure differ across phonemes (vowels, fricatives and so on). The recognition performance of mel-cepstrum parameters obtained by AMLPC analysis was compared with that of mel-cepstrum parameters obtained by conventional LPC analysis and by Mel-LPC analysis, through gender-dependent phoneme and word recognition. The results show that the proposed method leads to a significant improvement in recognition accuracy over conventional LPC analysis and an improvement in error rate of about 10% over the Mel-LPC analysis.

9 citations
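Mel-LPC style analysis rests on warping the frequency axis with a first-order all-pass filter so that more of the model's resolution is spent at low frequencies, as the ear does; the AMLPC contribution is choosing that resolution per frame. The Python sketch below only computes the standard all-pass warping curve; the warping constant 0.42 is a common choice for 16 kHz speech, and the adaptive, phoneme-dependent selection described in the abstract is not modeled.

import numpy as np

def warped_frequency(omega, alpha=0.42):
    """First-order all-pass frequency warping of a normalized frequency in [0, pi].
    alpha ~ 0.42 roughly matches the mel scale at a 16 kHz sampling rate."""
    return omega + 2.0 * np.arctan(alpha * np.sin(omega) / (1.0 - alpha * np.cos(omega)))

# The expansion factor falls with frequency: low frequencies claim a larger share
# of the warped axis, i.e. finer spectral resolution where hearing is most acute.
for hz in (250, 1000, 4000):
    omega = 2.0 * np.pi * hz / 16000.0      # normalized frequency at 16 kHz
    print(f"{hz} Hz -> expansion factor {warped_frequency(omega) / omega:.2f}")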

Journal ArticleDOI
TL;DR: The proposed research integrated emotions, based on the Ekman model and Plutchik's wheel, with emotive eye movements implemented via the Emotional Eye Movements Markup Language (EEMML), to produce a realistic 3D face model.
Abstract: Lip synchronization of a 3D face model is now being used in a multitude of important fields. It brings a more human, social and dramatic reality to computer games, films and interactive multimedia, and is growing in use and importance. A high level of realism is demanded in applications such as computer games and cinema, yet authoring lip syncing with complex and subtle expressions is still difficult and fraught with problems in terms of realism. This research proposes a lip-syncing method for a realistic, expressive 3D face model. Animating lips requires a 3D face model capable of representing the myriad shapes the human face assumes during speech, and a method to produce the correct lip shape at the correct time. The paper presents a 3D face model designed to support lip syncing aligned with an input audio file. It deforms using a Raised Cosine Deformation (RCD) function that is grafted onto the input facial geometry. The face model is based on the MPEG-4 Facial Animation (FA) standard. The paper also proposes a method to animate the 3D face model over time to create animated lip syncing, using a canonical set of visemes for all pairwise combinations of a reduced phoneme set called ProPhone. The research integrates emotions, based on the Ekman model and Plutchik's wheel, with emotive eye movements implemented via the Emotional Eye Movements Markup Language (EEMML), to produce a realistic 3D face model.

9 citations
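The Raised Cosine Deformation mentioned above is, in essence, a smooth cosine falloff of vertex displacement around a control point on the mesh. The Python sketch below is a minimal reading of that idea with assumed parameters (influence radius, displacement vector); the paper's exact RCD formulation and its MPEG-4 FA integration are not reproduced.

import numpy as np

def raised_cosine_weights(vertices, center, radius):
    """Per-vertex weight: 0.5 * (1 + cos(pi * d / radius)) inside the radius, 0 outside."""
    d = np.linalg.norm(vertices - center, axis=1)
    w = 0.5 * (1.0 + np.cos(np.pi * np.minimum(d / radius, 1.0)))
    return np.where(d <= radius, w, 0.0)

def deform(vertices, center, radius, displacement):
    """Push vertices toward a viseme target, attenuated by the raised-cosine falloff."""
    w = raised_cosine_weights(vertices, center, radius)
    return vertices + w[:, None] * np.asarray(displacement)

# Tiny example: three vertices, deformation centred on the first one.
verts = np.array([[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [2.0, 0.0, 0.0]])
print(deform(verts, center=np.array([0.0, 0.0, 0.0]), radius=1.0, displacement=[0.0, 0.1, 0.0]))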


Network Information
Related Topics (5)
Vocabulary: 44.6K papers, 941.5K citations (78% related)
Feature vector: 48.8K papers, 954.4K citations (76% related)
Feature extraction: 111.8K papers, 2.1M citations (75% related)
Feature (computer vision): 128.2K papers, 1.7M citations (74% related)
Unsupervised learning: 22.7K papers, 1M citations (73% related)
Performance Metrics
No. of papers in the topic in previous years

Year  Papers
2023  7
2022  12
2021  13
2020  39
2019  19
2018  22