Topic

Viseme

About: Viseme is a research topic. Over its lifetime, 865 publications have been published within this topic, receiving 17,889 citations.


Papers
Proceedings ArticleDOI
25 May 2003
TL;DR: The results indicate that the two-channel training method provides better accuracy in separating similar visemes than conventional Baum-Welch estimation.
Abstract: A novel two-channel algorithm is proposed in this paper for the discriminative training of Hidden Markov Models (HMMs). It adjusts the symbol emission coefficients of an existing HMM to maximize the separable distance between a pair of confusable training samples. The method is applied to identifying the visemes of visual speech. The results indicate that the two-channel training method provides better accuracy in separating similar visemes than conventional Baum-Welch estimation.

3 citations
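The abstract does not spell out the two-channel update, so the sketch below only illustrates the underlying objective: adjust the emission matrix of a discrete HMM so that the log-likelihood gap between a pair of confusable viseme observation sequences widens. The finite-difference ascent used here is a hypothetical stand-in for the paper's actual training rule.

```python
# A minimal sketch, assuming a discrete-observation HMM with stochastic
# emission matrix B. The finite-difference ascent on the log-likelihood gap
# is an illustrative stand-in for the paper's two-channel update, which the
# abstract does not fully specify.
import numpy as np

def forward_loglik(A, B, pi, obs):
    """Scaled forward algorithm: log P(obs | HMM)."""
    alpha = pi * B[:, obs[0]]
    c = alpha.sum()
    loglik = np.log(c)
    alpha /= c
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        c = alpha.sum()
        loglik += np.log(c)
        alpha /= c
    return loglik

def widen_gap(A, B, pi, obs_own, obs_rival, lr=0.1, eps=1e-4, steps=10):
    """Nudge B so the model scores obs_own higher relative to obs_rival."""
    B = B.copy()
    gap = lambda M: (forward_loglik(A, M, pi, obs_own)
                     - forward_loglik(A, M, pi, obs_rival))
    for _ in range(steps):
        grad = np.zeros_like(B)
        base = gap(B)
        for i in range(B.shape[0]):
            for k in range(B.shape[1]):
                Bp = B.copy()
                Bp[i, k] += eps
                Bp[i] /= Bp[i].sum()            # keep the row stochastic
                grad[i, k] = (gap(Bp) - base) / eps
        B = np.clip(B + lr * grad, 1e-6, None)
        B /= B.sum(axis=1, keepdims=True)       # re-project onto the simplex
    return B

# Toy usage: two confusable viseme observation sequences.
rng = np.random.default_rng(0)
A = np.full((3, 3), 1 / 3)
pi = np.full(3, 1 / 3)
B = rng.dirichlet(np.ones(4), size=3)
own, rival = np.array([0, 1, 2, 1]), np.array([0, 1, 3, 1])
B_tuned = widen_gap(A, B, pi, own, rival)
```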

01 Jan 1998
TL;DR: A method for learning a mapping between signals is introduced and used to drive facial animation directly from vocal cues; the resulting facial control parameters are suitable for driving many kinds of animation, from photo-realistic image warps to 3D cartoon characters.
Abstract: We introduce a method for learning a mapping between signals, and use this to drive facial animation directly from vocal cues. Instead of depending on heuristic intermediate representations such as phonemes or visemes, the system learns its own representation, which includes dynamical and contextual information. In principle, this allows the system to make optimal use of context to handle ambiguity and relatively long-lasting facial co-articulation effects. The output is a series of facial control parameters, suitable for driving many different kinds of animation ranging from photo-realistic image warps to 3D cartoon characters.

3 citations
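The abstract does not specify the learned mapping, so the following is a minimal sketch of the core idea under a deliberately simple assumption: give the regressor a window of neighbouring acoustic frames so that context and co-articulation can influence each output frame, then fit a closed-form ridge regression from stacked audio features to facial control parameters. The feature dimensions, variable names, and the linear model itself are illustrative stand-ins for the paper's learned representation.

```python
# Hedged sketch: regress facial control parameters from a window of acoustic
# feature frames so the mapping can absorb context and co-articulation.
# Windowed ridge regression is an assumed baseline, not the paper's method.
import numpy as np

def stack_context(audio_feats, radius=5):
    """Stack +/- radius neighbouring frames onto each frame (edge-padded)."""
    T, _ = audio_feats.shape
    padded = np.pad(audio_feats, ((radius, radius), (0, 0)), mode="edge")
    return np.hstack([padded[i:i + T] for i in range(2 * radius + 1)])

def fit_ridge(X, Y, lam=1e-2):
    """Closed-form ridge regression from stacked audio to face parameters."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

# Toy usage with random stand-ins for MFCC frames and face parameters.
rng = np.random.default_rng(0)
audio = rng.normal(size=(200, 13))    # 200 frames of 13-dim acoustic features
face = rng.normal(size=(200, 16))     # 16 facial control parameters per frame
X = stack_context(audio)
W = fit_ridge(X, face)
predicted_face = X @ W                # drives the face rig frame by frame
```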

Proceedings ArticleDOI
07 Dec 2020
TL;DR: A marker-less approach for facial motion capture based on multi-view video is presented, which learns a neural representation of facial expressions, which is used to seamlessly concatenate facial performances during the animation procedure.
Abstract: Creating realistic animations of human faces with computer graphic models is still a challenging task. It is often solved either with tedious manual work or with motion-capture-based techniques that require specialised and costly hardware. Example-based animation approaches circumvent these problems by re-using captured data of real people. This data is split into short motion samples that can be looped or concatenated in order to create novel motion sequences. The obvious advantages of this approach are its simplicity of use and its high realism, since the data exhibits only real deformations. Rather than tuning the weights of a complex face rig, the animation task is performed on a higher level by arranging typical motion samples so that the desired facial performance is achieved. Two difficulties with example-based approaches, however, are high memory requirements and the creation of artefact-free, realistic transitions between motion samples. We solve these problems by combining the realism and simplicity of example-based animation with the advantages of neural face models. Our neural face model is capable of synthesising high-quality 3D face geometry and texture from a compact latent parameter vector. This latent representation reduces memory requirements by a factor of 100 and helps create seamless transitions between concatenated motion samples. In this paper, we present a marker-less approach to facial motion capture based on multi-view video. Based on the captured data, we learn a neural representation of facial expressions, which is used to seamlessly concatenate facial performances during the animation procedure. We demonstrate the effectiveness of our approach by synthesising mouthings for Swiss-German sign language from viseme query sequences.

3 citations
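The paper's neural face model is specific to its pipeline, but the concatenation step it motivates can be sketched generically: if each motion sample is a trajectory of compact latent vectors, a transition between samples amounts to cross-fading latent codes over a short overlap before decoding. The 64-dimensional latent size, the frame counts, and the linear cross-fade are assumptions for illustration.

```python
# Hedged sketch of the concatenation step: motion samples are trajectories of
# compact latent vectors, and transitions are smoothed by cross-fading the
# latent codes of adjacent samples before decoding. The decoder itself is
# specific to the paper and is not shown here.
import numpy as np

def concat_with_crossfade(sample_a, sample_b, overlap=8):
    """Concatenate two latent trajectories, blending `overlap` frames."""
    head, tail = sample_a[:-overlap], sample_b[overlap:]
    w = np.linspace(0.0, 1.0, overlap)[:, None]   # fade-in weights per frame
    blend = (1 - w) * sample_a[-overlap:] + w * sample_b[:overlap]
    return np.concatenate([head, blend, tail], axis=0)

# Toy usage: two viseme motion samples as 64-dim latent trajectories.
rng = np.random.default_rng(0)
viseme_a = rng.normal(size=(30, 64))
viseme_b = rng.normal(size=(25, 64))
sequence = concat_with_crossfade(viseme_a, viseme_b)
# Each latent frame would then be decoded to 3D geometry and texture by the
# learned neural face model.
```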

Proceedings ArticleDOI
TL;DR: The animation tool proposes a good speech-based face animation as a point of departure for animators, who are then supported by the system in making further changes as desired.
Abstract: Efficient, realistic face animation is still a challenge. A system is proposed that yields realistic animations for speech. It starts from real 3D face dynamics, observed at a frame rate of 25 fps for thousands of points on the faces of speaking actors. When asked to animate a face, it replicates the visemes that it has learned and adds the necessary coarticulation effects. The speech animation can be based on as few as 16 modes, extracted through Independent Component Analysis from the observed face dynamics. Faces for which only a static, neutral 3D model is available can also be animated. Rather than animating by verbatim copying of other faces’ deformation fields, the visemes are adapted to the shape of the new face. By localising this face in a Face Space, in which the locations of the example faces are also known, visemes are adapted automatically according to the face’s relative distances to these examples. The animation tool proposes a good speech-based face animation as a point of departure for animators, who are then supported by the system in making further changes as desired.

3 citations
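As a minimal sketch of the adaptation-by-distance idea, assume each face has coordinates in a Face Space and each example face contributes a per-vertex deformation field for a given viseme; the new face's viseme can then be blended from the examples with weights that fall off with Face Space distance. Inverse-distance weighting, the 16-dimensional Face Space, and all names here are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch: blend the example faces' viseme deformation fields with
# weights derived from proximity in Face Space, instead of copying any one
# face's deformation verbatim. Inverse-distance weighting is an assumed rule.
import numpy as np

def adapt_viseme(new_face_coords, example_coords, example_deforms, p=2.0):
    """Blend example viseme deformation fields by proximity in Face Space."""
    d = np.linalg.norm(example_coords - new_face_coords, axis=1)
    w = 1.0 / np.maximum(d, 1e-9) ** p
    w /= w.sum()
    # example_deforms: (n_examples, n_vertices, 3) per-vertex displacements
    return np.tensordot(w, example_deforms, axes=1)

# Toy usage: 4 example faces in a 16-dim Face Space (e.g. 16 ICA modes), each
# with a deformation field for the same viseme over 1000 mesh vertices.
rng = np.random.default_rng(0)
coords = rng.normal(size=(4, 16))
deforms = rng.normal(size=(4, 1000, 3))
new_face = rng.normal(size=16)
adapted = adapt_viseme(new_face, coords, deforms)   # shape (1000, 3)
```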


Network Information

Related Topics (5)

Topic                        Papers    Citations   Related
Vocabulary                   44.6K     941.5K      78%
Feature vector               48.8K     954.4K      76%
Feature extraction           111.8K    2.1M        75%
Feature (computer vision)    128.2K    1.7M        74%
Unsupervised learning        22.7K     1M          73%
Performance Metrics

No. of papers in the topic in previous years:

Year    Papers
2023    7
2022    12
2021    13
2020    39
2019    19
2018    22