Audiovisual Speech Synthesis

doi:10.1023/A:1025700715107

Journal Article

Gérard Bailly, +3 more

- 01 Oct 2003 -

International Journal of Speech Technolo...

- Vol. 6, Iss: 4, pp 331-346

Abstract:

This paper presents the main approaches used to synthesize talking faces, and provides greater detail on a handful of these approaches. An attempt is made to distinguish between facial synthesis itself (i.e. the manner in which facial movements are rendered on a computer screen), and the way these movements may be controlled and predicted using phonetic input. The two main synthesis techniques (model-based vs. image-based) are contrasted and presented by a brief description of the most illustrative existing systems. The challenging issues—evaluation, data acquisition and modeling—that may drive future models are also discussed and illustrated by our current work at ICP.

Citations

Monograph

Text-to-Speech Synthesis

Paul Taylor

TL;DR: Text-to-Speech Synthesis provides an in-depth explanation of all aspects of current speech synthesis technology, and is designed for graduate students in electrical engineering, computer science, and linguistics.

Journal Article

Audio-driven facial animation by joint end-to-end learning of pose and emotion

Tero Karras, +4 more

- 20 Jul 2017 -

ACM Transactions on Graphics

TL;DR: This work presents a machine learning technique for driving 3D facial animation by audio input in real time and with low latency, and simultaneously discovers a compact, latent code that disambiguates the variations in facial expression that cannot be explained by the audio alone.

Posted Content

The mesh-matching algorithm: an automatic 3D mesh generator for Finite element structures

Béatrice Couteau, +2 more

- 27 Jun 2006 -

arXiv: Medical Physics

TL;DR: A new patient-specific method for automatic 3D mesh generation of structures as complex as bone is investigated. Called the mesh-matching (M-M) algorithm, it automatically generates customized 3D meshes of anatomical structures from an already existing model.

The equilibrium point hypothesis and its application to speech motor control

Pascal Perrier, +2 more

TL;DR: It is suggested that, even when no account is taken of upcoming context, apparent anticipatory changes in movement amplitude and duration may arise due to dynamics, and that simple linear control signals may underlie smooth articulatory trajectories.

Proceedings Article

Task planning for human-robot interaction

Rachid Alami, +4 more

TL;DR: This paper intends to develop and experiment with various task planners and interaction schemes that will allow the robot to select and perform its tasks while explicitly taking into account the constraints imposed by the presence of humans, their needs and preferences.

References

Journal Article

Eigenfaces for recognition

Matthew Turk, +1 more

- 01 Jan 1991 -

Journal of Cognitive Neuroscience

TL;DR: A near-real-time computer system that can locate and track a subject's head, and then recognize the person by comparing characteristics of the face to those of known individuals, and that is easy to implement using a neural network architecture.

Journal Article

Active appearance models

Timothy F. Cootes, +2 more

- 01 Jun 2001 -

IEEE Transactions on Pattern Analysis an...

Abstract: We describe a new method of matching statistical models of appearance to images. A set of model parameters control modes of shape and gray-level variation learned from a training set. We construct an efficient iterative matching algorithm by learning the relationship between perturbations in the model parameters and the induced image errors.

Journal Article

Hearing lips and seeing voices

Harry McGurk, +1 more

- 01 Dec 1976 -

Nature

TL;DR: The study reported here demonstrates a previously unrecognised influence of vision upon speech perception: on being shown a film of a young woman's talking head in which repeated utterances of the syllable [ba] had been dubbed on to lip movements for [ga], normal adults reported hearing [da].

Book Chapter

Active Appearance Models

Timothy F. Cootes, +2 more

TL;DR: A novel method of interpreting images using an Active Appearance Model (AAM), a statistical model of the shape and grey-level appearance of the object of interest which can generalise to almost any valid example.

Book

Unmasking the face

Paul Ekman

Related Papers (5)

Hearing lips and seeing voices

Trainable videorealistic speech animation

Video Rewrite: driving visual speech with audio

Modeling Coarticulation in Synthetic Visual Speech

Visual contribution to speech intelligibility in noise