Topic

Viseme

About: Viseme is a research topic. Over the lifetime, 865 publications have been published within this topic, receiving 17,889 citations.


Papers
Book Chapter (DOI)
01 Jan 2016
TL;DR: The aim of this chapter is to give a comprehensive overview of current state-of-the-art parametric methods for realistic facial modelling and animation.
Abstract: Facial modelling is a fundamental technique in a variety of applications in computer graphics, computer vision, and pattern recognition. As 3D technologies have evolved over the years, the quality of facial modelling has greatly improved. To further enhance modelling quality and controllability, parametric methods, which represent or manipulate facial attributes (e.g. identity, expression, viseme) with a set of control parameters, have been proposed in recent years. The aim of this chapter is to give a comprehensive overview of current state-of-the-art parametric methods for realistic facial modelling and animation.
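The core of such parametric methods can be pictured as blending a neutral mesh with weighted attribute offsets. Below is a minimal sketch of a linear (blendshape-style) parametric face model; the array shapes, random placeholder data, and the synthesize helper are illustrative assumptions, not the chapter's actual formulation.

```python
import numpy as np

# Minimal sketch of a linear parametric (blendshape-style) face model.
# All shapes and data here are hypothetical placeholders, not the chapter's.

rng = np.random.default_rng(0)
n_vertices = 5000                        # vertices in the face mesh
n_params = 3                             # e.g. identity, expression, viseme axes

neutral = rng.standard_normal((n_vertices, 3))           # rest-pose mesh
deltas = rng.standard_normal((n_params, n_vertices, 3))  # per-parameter offsets

def synthesize(weights):
    """Blend the neutral mesh with weighted attribute offsets."""
    return neutral + np.tensordot(weights, deltas, axes=1)

face = synthesize(np.array([0.2, 0.7, 0.1]))  # e.g. mild identity shift,
                                              # strong expression, slight viseme
print(face.shape)  # (5000, 3)
```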
Journal Article (DOI)
30 Dec 2017
TL;DR: A model of an audiovisual system based on hidden Markov models is proposed that allows speech to be recognized in real time, providing a recognition tool that can be used in conditions where other means may not be possible, for example, in the absence of an audio component.
Abstract: A model of an audiovisual system based on hidden Markov models is proposed that allows speech to be recognized in real time. The model provides a recognition tool that can be used in conditions where other means may not be possible, for example, in the absence of an audio component. The model is studied and tested on the example of digit recognition, and the expected results are obtained.
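For context, decoding in an HMM-based recognizer typically reduces to finding the most likely hidden state sequence for an observation stream with the Viterbi algorithm. The toy two-state model below is a self-contained sketch of that step, not the paper's actual audiovisual system; its discrete observation symbols merely stand in for quantized audio and viseme features.

```python
import numpy as np

# Minimal Viterbi decoding sketch for a discrete-observation HMM.
# The two-state model below is an illustrative stand-in, not the
# paper's actual audiovisual recognizer.

log_pi = np.log([0.6, 0.4])                # initial state probabilities
log_A = np.log([[0.7, 0.3], [0.4, 0.6]])   # state transition matrix
log_B = np.log([[0.9, 0.1], [0.2, 0.8]])   # emission probs per state

def viterbi(obs):
    """Most likely state sequence for a list of discrete observations."""
    T, N = len(obs), len(log_pi)
    delta = np.zeros((T, N))               # best log-score ending in each state
    psi = np.zeros((T, N), dtype=int)      # backpointers
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A   # scores[from, to]
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):          # trace backpointers
        path.append(int(psi[t][path[-1]]))
    return path[::-1]

print(viterbi([0, 0, 1, 1, 0]))  # -> [0, 0, 1, 1, 0]
```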
01 Jan 2009
TL;DR: This article investigated older and younger control participants' susceptibility to an audio-visual speech illusion known as the McGurk effect, which occurs when an incongruent phoneme [ba] and viseme [ga] are perceived as a new fused percept [da], and found that older participants were indeed more susceptible to the illusion, supporting the hypothesis that older persons integrate information from different senses more than younger persons.
Abstract: In the present work we investigated older and younger control participants' susceptibility to an audio-visual speech illusion known as the McGurk effect, which occurs when an incongruent phoneme [ba] and viseme [ga] are perceived as a new fused percept [da]. We hypothesized that if older persons integrate information from different senses more than younger persons, their susceptibility to the illusion should be higher. The results confirmed this hypothesis. We suggest that difficulty in focusing on one channel (audition) while simultaneously perceiving inputs from other channels (vision) is the reason for this enhanced integration.
Journal Article (DOI)
TL;DR: In this article, the authors used facial motion capture technology to obtain dynamic lip viseme feature data during the stop's forming block, continuing block, removing block, and co-articulation with vowels in the CV structure.
Abstract: In the study of articulatory phonetics, lip shape and tongue position are the focus of linguists. In order to reveal the physiological characteristics of lip shape during pronunciation, the author takes the Tibetan Xiahe dialect as the research object and defines the speaker's facial parameter feature points according to the MPEG-4 international standard. Most importantly, the author uses facial motion capture technology to obtain dynamic lip viseme feature data during the stop's forming block, continuing block, removing block, and co-articulation with vowels in the CV structure. Through research and analysis, it is found that the characteristics of lip-shape change during the stop's forming block differ across places of articulation. In co-articulation with [a], the reverse effect is greater than the forward effect, which is consistent with conclusions that scholars have obtained for many other languages through other experimental methods. The study also found that, in the process of pronunciation, the movement of each speaker's lip physiological characteristics is random to a certain extent, but when different speakers pronounce the same sound, the changing trend of the lip shape characteristics remains consistent.
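As an illustration of the kind of dynamic lip features such motion capture yields, the sketch below computes per-frame mouth width and opening from tracked 2D feature points. The landmark indices and random trajectories are hypothetical placeholders; MPEG-4 defines a much richer set of facial feature points and animation parameters.

```python
import numpy as np

# Sketch of extracting simple lip-viseme features (mouth width and opening)
# from tracked 2D landmark trajectories. Landmark indices and the random
# trajectories are assumed placeholders, not the paper's capture data.

# frames: (n_frames, n_points, 2) tracked lip feature points per frame
frames = np.random.default_rng(1).uniform(size=(100, 8, 2))

LEFT_CORNER, RIGHT_CORNER = 0, 4   # assumed indices of the mouth corners
UPPER_MID, LOWER_MID = 2, 6        # assumed indices of the mid upper/lower lip

def lip_features(frames):
    """Per-frame mouth width and opening as an (n_frames, 2) array."""
    width = np.linalg.norm(frames[:, RIGHT_CORNER] - frames[:, LEFT_CORNER], axis=1)
    opening = np.linalg.norm(frames[:, LOWER_MID] - frames[:, UPPER_MID], axis=1)
    return np.stack([width, opening], axis=1)

feats = lip_features(frames)
print(feats.shape)  # (100, 2): one (width, opening) pair per frame
```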
Posted Content (DOI)
15 Jan 2023
TL;DR: In this paper, a parametric viseme fitting algorithm is proposed to extract viseme parameters from speech videos; the extracted curves correlate better with phonemes and are thus more controllable and friendly to animators.
Abstract: We present a novel audio-driven facial animation approach that can generate realistic lip-synchronized 3D facial animations from input audio. Our approach learns viseme dynamics from speech videos, produces animator-friendly viseme curves, and supports multilingual speech inputs. The core of our approach is a novel parametric viseme fitting algorithm that utilizes phoneme priors to extract viseme parameters from speech videos. With the guidance of phonemes, the extracted viseme curves correlate better with phonemes and are thus more controllable and friendly to animators. To support multilingual speech inputs and generalize to unseen voices, we take advantage of deep audio feature models pretrained on multiple languages to learn the mapping from audio to viseme curves. Our audio-to-curves mapping achieves state-of-the-art performance even when the input audio suffers from distortions of volume, pitch, speed, or noise. Lastly, a viseme scanning approach for acquiring high-fidelity viseme assets is presented for efficient speech animation production. We show that the predicted viseme curves can be applied to different viseme-rigged characters to yield various personalized animations with realistic and natural facial motions. Our approach is artist-friendly and can be easily integrated into typical animation production workflows, including blendshape- or bone-based animation.
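A minimal sketch of the retargeting idea, blending each character's viseme targets by shared per-frame curve values, is shown below. The rigs, curves, and helper names are hypothetical stand-ins under the paper's stated claim that predicted curves transfer across viseme-rigged characters, not its learned outputs or pipeline.

```python
import numpy as np

# Sketch of retargeting one set of viseme curves to differently rigged
# characters. All rigs, curves, and names are hypothetical placeholders.

rng = np.random.default_rng(2)
n_frames, n_visemes = 120, 4

# curves[f, v]: activation of viseme v at frame f (e.g. predicted from audio)
curves = np.clip(rng.standard_normal((n_frames, n_visemes)), 0, 1)

def make_rig(n_vertices):
    """A toy blendshape rig: neutral mesh plus one target per viseme."""
    return {"neutral": np.zeros((n_vertices, 3)),
            "targets": rng.standard_normal((n_visemes, n_vertices, 3))}

characters = {"stylized": make_rig(300), "realistic": make_rig(900)}

def animate(rig, frame):
    """Blend the rig's viseme targets by the shared curve values at one frame."""
    return rig["neutral"] + np.tensordot(curves[frame], rig["targets"], axes=1)

for name, rig in characters.items():
    print(name, animate(rig, 60).shape)  # the same curves drive both rigs
```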

Network Information
Related Topics (5)
Vocabulary: 44.6K papers, 941.5K citations, 78% related
Feature vector: 48.8K papers, 954.4K citations, 76% related
Feature extraction: 111.8K papers, 2.1M citations, 75% related
Feature (computer vision): 128.2K papers, 1.7M citations, 74% related
Unsupervised learning: 22.7K papers, 1M citations, 73% related
Performance
Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    7
2022    12
2021    13
2020    39
2019    19
2018    22