Topic

Viseme

About: Viseme is a research topic. Over the lifetime, 865 publications have been published within this topic receiving 17889 citations.


Papers
01 Jan 2003
TL;DR: This paper addresses the problem of animating a talking figure, such as an avatar, using speech input only and shows that it is indeed possible to obtain visually relevant speech segmentation data directly from the purely acoustic speech signal.
Abstract: This paper addresses the problem of animating a talking figure, such as an avatar, using speech input only. The system that was developed is based on hidden Markov models for the acoustic observation vectors of the speech sounds that correspond to each of 16 visually distinct mouth shapes (visemes). The acoustic variability with context was taken into account by building acoustic viseme models that are dependent on the left and right viseme contexts. Our experimental results show that it is indeed possible to obtain visually relevant speech segmentation data directly from the purely acoustic speech signal.

9 citations
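The paper's contribution is acoustic modeling, but the step it shares with most viseme work is the many-to-one grouping of phonemes into visemes. The short Python sketch below shows that grouping with an illustrative mapping; the class names and memberships are assumptions, not the 16-viseme inventory used in the paper, and the context-dependent HMM modeling itself is not reproduced here.

# Illustrative phoneme-to-viseme grouping (assumed; not the paper's 16 classes).
PHONEME_TO_VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "t": "alveolar", "d": "alveolar", "s": "alveolar", "z": "alveolar",
    "aa": "open", "ae": "open",
    "uw": "rounded", "ow": "rounded",
}

def phonemes_to_visemes(phonemes):
    """Map a phoneme sequence to viseme labels and merge consecutive repeats,
    since neighbouring phonemes that share a mouth shape form one visual segment."""
    visemes = []
    for p in phonemes:
        v = PHONEME_TO_VISEME.get(p, "neutral")
        if not visemes or visemes[-1] != v:
            visemes.append(v)
    return visemes

print(phonemes_to_visemes(["b", "aa", "m", "p"]))  # ['bilabial', 'open', 'bilabial']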

Proceedings ArticleDOI
18 Sep 2006
TL;DR: This paper introduces statistical methods for speech recognition that extract the formants of the speech signal and analyze their behavior, a novel approach to speech recognition.
Abstract: Automatic Speech Recognition (ASR) is one of the most actively developing fields of modern science, with a wide range of applications. The purpose of this paper is to introduce statistical methods for speech recognition by extracting the formants of the speech signal and analyzing their behavior. It is a novel approach to speech recognition. The entire analysis is based on the first five formants of the speech. The method has been tested on Urdu speech, and the results obtained are of high accuracy. The algorithm produces 80 checkpoints before Urdu speech is recognized. The method can also be applied to speech recognition in other languages.

9 citations
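The recognition scheme above hinges on formant extraction. A standard way to estimate formants, sketched below with NumPy only, is to fit an LPC model to a windowed frame and read resonance frequencies from the angles of the prediction-polynomial roots; the model order, windowing and frequency cutoff are illustrative choices, not the paper's exact procedure.

import numpy as np

def lpc_coefficients(frame, order):
    """LPC via the autocorrelation (Yule-Walker) method: solve R a = r."""
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1:n + order]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])
    return np.concatenate(([1.0], -a))      # prediction polynomial A(z)

def formants(frame, sample_rate, order=12, n_formants=5):
    """Return the lowest n_formants resonance frequencies (Hz) of one frame."""
    a = lpc_coefficients(frame * np.hamming(len(frame)), order)
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]       # one root per complex-conjugate pair
    freqs = np.sort(np.angle(roots) * sample_rate / (2.0 * np.pi))
    freqs = freqs[freqs > 90.0]             # drop near-DC roots that are not formants
    return freqs[:n_formants].tolist()

A common rule of thumb is to set the LPC order to roughly 2 + sample_rate / 1000 so there are enough pole pairs to cover the formants of interest.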

Patent
03 Mar 2017
TL;DR: In this article, a system and method for animated lip synchronization is presented, which includes capturing speech input, parsing the speech input into phonemes, aligning the phonemes to the corresponding portions of the speech input, mapping the phonemes to visemes, synchronizing the visemes into viseme action units, and outputting the viseme action units.
Abstract: A system and method for animated lip synchronization. The method includes: capturing speech input; parsing the speech input into phonemes; aligning the phonemes to the corresponding portions of the speech input; mapping the phonemes to visemes; synchronizing the visemes into viseme action units, the viseme action units comprising jaw and lip contributions for each of the phonemes; and outputting the viseme action units.

9 citations
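The patented pipeline ends in viseme action units carrying jaw and lip contributions, and a small data model makes that flow concrete. In the Python sketch below, the viseme names and jaw/lip weights are hypothetical placeholders (the abstract does not disclose specific values), and alignment is assumed to have already produced timed phonemes.

from dataclasses import dataclass
from typing import List

@dataclass
class TimedPhoneme:
    phoneme: str
    start: float          # seconds into the captured speech
    end: float

@dataclass
class VisemeActionUnit:
    viseme: str
    start: float
    end: float
    jaw: float            # jaw-opening contribution, 0..1 (hypothetical scale)
    lip: float            # lip-shape contribution, 0..1 (hypothetical scale)

# Hypothetical lookup tables for illustration only.
PHONEME_TO_VISEME = {"AA": "aa", "B": "p", "M": "p", "F": "f", "V": "f", "S": "s"}
JAW_LIP_WEIGHTS = {"aa": (0.9, 0.3), "p": (0.1, 1.0), "f": (0.2, 0.8), "s": (0.3, 0.5)}

def to_action_units(phonemes: List[TimedPhoneme]) -> List[VisemeActionUnit]:
    """Map aligned phonemes to visemes and attach jaw/lip contributions."""
    units = []
    for p in phonemes:
        v = PHONEME_TO_VISEME.get(p.phoneme, "sil")
        jaw, lip = JAW_LIP_WEIGHTS.get(v, (0.0, 0.0))
        units.append(VisemeActionUnit(v, p.start, p.end, jaw, lip))
    return units

print(to_action_units([TimedPhoneme("B", 0.00, 0.08), TimedPhoneme("AA", 0.08, 0.24)]))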

Proceedings ArticleDOI
04 Oct 2004
TL;DR: The paper proposes an adaptive Mel-LPC (AMLPC) analysis method; Mel-LPC is an efficient time-domain technique for estimating warped predictors directly from input speech, and the adaptive variant yields a significant improvement in recognition accuracy over conventional LPC analysis and an improvement in error rate of about 10% over Mel-LPC analysis.
Abstract: This paper describes a new speech analysis method, the adaptive Mel-LPC (AMLPC) analysis method, which exploits human auditory characteristics. The Mel-LPC analysis method that we previously proposed is an efficient time-domain technique for estimating the warped predictors directly from input speech. However, the frequency resolution of the spectrum obtained by Mel-LPC analysis is constant regardless of the characteristics of the input speech at each analysis frame. In AMLPC analysis, it is possible to estimate the spectral coefficients with optimal frequency resolution according to the characteristics of the phoneme at each analysis frame, because the spectral slope and formant structure differ across phonemes (vowels, fricatives and so on). The recognition performance of mel-cepstrum parameters obtained by AMLPC analysis was compared with that of mel-cepstrum parameters obtained by conventional LPC analysis and by Mel-LPC analysis, through gender-dependent phoneme and word recognition. The results show that the proposed method leads to a significant improvement in recognition accuracy over conventional LPC analysis and an improvement in error rate of about 10% over the Mel-LPC analysis.

9 citations
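Mel-LPC style analysis rests on warping the frequency axis with a first-order all-pass filter so that more of the model's resolution is spent at low frequencies, as the ear does; the AMLPC contribution is choosing that resolution per frame. The Python sketch below only computes the standard all-pass warping curve; the warping constant 0.42 is a common choice for 16 kHz speech, and the adaptive, phoneme-dependent selection described in the abstract is not modeled.

import numpy as np

def warped_frequency(omega, alpha=0.42):
    """First-order all-pass frequency warping of a normalized frequency in [0, pi].
    alpha ~ 0.42 roughly matches the mel scale at a 16 kHz sampling rate."""
    return omega + 2.0 * np.arctan(alpha * np.sin(omega) / (1.0 - alpha * np.cos(omega)))

# The expansion factor falls with frequency: low frequencies claim a larger share
# of the warped axis, i.e. finer spectral resolution where hearing is most acute.
for hz in (250, 1000, 4000):
    omega = 2.0 * np.pi * hz / 16000.0      # normalized frequency at 16 kHz
    print(f"{hz} Hz -> expansion factor {warped_frequency(omega) / omega:.2f}")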

Journal ArticleDOI
TL;DR: The proposed research integrated emotions, based on the Ekman model and Plutchik's wheel, with emotive eye movements implemented via the Emotional Eye Movements Markup Language (EEMML), to produce a realistic 3D face model.
Abstract: Lip synchronization of a 3D face model is now being used in a multitude of important fields. It brings a more human, social and dramatic reality to computer games, films and interactive multimedia, and is growing in use and importance. A high level of realism is demanded in applications such as computer games and cinema, yet authoring lip syncing with complex and subtle expressions is still difficult and fraught with problems in terms of realism. This research proposes a lip-syncing method for a realistic, expressive 3D face model. Animating lips requires a 3D face model capable of representing the myriad shapes the human face assumes during speech, and a method to produce the correct lip shape at the correct time. The paper presents a 3D face model designed to support lip syncing aligned with an input audio file. It deforms using a Raised Cosine Deformation (RCD) function that is grafted onto the input facial geometry. The face model is based on the MPEG-4 Facial Animation (FA) standard. The paper also proposes a method to animate the 3D face model over time to create animated lip syncing, using a canonical set of visemes for all pairwise combinations of a reduced phoneme set called ProPhone. The research integrates emotions, based on the Ekman model and Plutchik's wheel, with emotive eye movements implemented via the Emotional Eye Movements Markup Language (EEMML), to produce a realistic 3D face model.

9 citations
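The Raised Cosine Deformation mentioned above is, in essence, a smooth cosine falloff of vertex displacement around a control point on the mesh. The Python sketch below is a minimal reading of that idea with assumed parameters (influence radius, displacement vector); the paper's exact RCD formulation and its MPEG-4 FA integration are not reproduced.

import numpy as np

def raised_cosine_weights(vertices, center, radius):
    """Per-vertex weight: 0.5 * (1 + cos(pi * d / radius)) inside the radius, 0 outside."""
    d = np.linalg.norm(vertices - center, axis=1)
    w = 0.5 * (1.0 + np.cos(np.pi * np.minimum(d / radius, 1.0)))
    return np.where(d <= radius, w, 0.0)

def deform(vertices, center, radius, displacement):
    """Push vertices toward a viseme target, attenuated by the raised-cosine falloff."""
    w = raised_cosine_weights(vertices, center, radius)
    return vertices + w[:, None] * np.asarray(displacement)

# Tiny example: three vertices, deformation centred on the first one.
verts = np.array([[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [2.0, 0.0, 0.0]])
print(deform(verts, center=np.array([0.0, 0.0, 0.0]), radius=1.0, displacement=[0.0, 0.1, 0.0]))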


Network Information
Related Topics (5)
Vocabulary: 44.6K papers, 941.5K citations (78% related)
Feature vector: 48.8K papers, 954.4K citations (76% related)
Feature extraction: 111.8K papers, 2.1M citations (75% related)
Feature (computer vision): 128.2K papers, 1.7M citations (74% related)
Unsupervised learning: 22.7K papers, 1M citations (73% related)
Performance Metrics
No. of papers in the topic in previous years

Year  Papers
2023  7
2022  12
2021  13
2020  39
2019  19
2018  22