Journal ArticleDOI
Audiovisual speech synthesis
TL;DR
The paper discusses the evaluation of audiovisual speech synthesizers, elaborates on the hardware requirements for performing visual speech synthesis, and describes important future directions that should stimulate the use of audiovisual speech synthesis technology in real-life applications.
About:
This article was published in Speech Communication on 2015-02-01. It has received 60 citations to date. The article focuses on the topics: Speech processing & Speech corpus.
Citations
Journal ArticleDOI
Synthesizing Obama: learning lip sync from audio
TL;DR: Given audio of President Barack Obama, a high quality video of him speaking with accurate lip sync is synthesized, composited into a target video clip, and a recurrent neural network learns the mapping from raw audio features to mouth shapes to produce photorealistic results.
Journal ArticleDOI
Audio-driven facial animation by joint end-to-end learning of pose and emotion
TL;DR: This work presents a machine learning technique for driving 3D facial animation by audio input in real time and with low latency, and simultaneously discovers a compact, latent code that disambiguates the variations in facial expression that cannot be explained by the audio alone.
Journal ArticleDOI
JALI: an animator-centric viseme model for expressive lip synchronization
TL;DR: A system is presented that, given an input audio soundtrack and speech transcript, automatically generates expressive lip-synchronized facial animation that is amenable to further artistic refinement and is comparable with both performance capture and professional animator output.
Book ChapterDOI
MEAD: A Large-Scale Audio-Visual Dataset for Emotional Talking-Face Generation
Kaisiyuan Wang, Qianyi Wu, Linsen Song, Zhuoqian Yang, Wayne Wu, Chen Qian, Ran He, Yu Qiao, Chen Change Loy, et al.
TL;DR: The Multi-view Emotional Audio-visual Dataset (MEAD) is built, a talking-face video corpus featuring 60 actors and actresses talking with eight different emotions at three different intensity levels that could benefit a number of different research fields including conditional generation, cross-modal understanding and expression recognition.
References
Journal ArticleDOI
Least squares quantization in PCM
TL;DR: In this article, the authors derived necessary conditions for any finite number of quanta and associated quantization intervals of an optimum finite quantization scheme to achieve minimum average quantization noise power.
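The two necessary conditions summarized above — a nearest-neighbor partition of the signal range and quanta placed at the conditional mean of each cell — can be alternated to train a quantizer. A minimal 1-D sketch follows; the function name `lloyd_max` and the quantile-based initialization are illustrative choices, not from the paper:

```python
import numpy as np

def lloyd_max(samples, k, iters=50):
    """1-D Lloyd-Max quantizer sketch: alternate the two necessary
    conditions -- nearest-neighbor partition (thresholds at midpoints
    between quanta) and centroid quanta (each level moved to the mean
    of its cell) -- to reduce average quantization noise power."""
    samples = np.asarray(samples, dtype=float)
    # initialize the k quanta at evenly spaced sample quantiles
    levels = np.quantile(samples, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        # decision thresholds: midpoints between adjacent quanta
        edges = (levels[:-1] + levels[1:]) / 2
        cells = np.searchsorted(edges, samples)  # cell index per sample
        # centroid condition: move each quantum to its cell mean
        for j in range(k):
            members = samples[cells == j]
            if members.size:
                levels[j] = members.mean()
    return levels

# example: 4-level quantizer for a uniform source on [0, 1]
rng = np.random.default_rng(0)
q = lloyd_max(rng.uniform(0.0, 1.0, 10000), k=4)
```

For a uniform source the fixed point is the evenly spaced quantizer, so the learned levels should lie near 0.125, 0.375, 0.625, and 0.875.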
Journal ArticleDOI
Determining optical flow
TL;DR: In this paper, a method for finding the optical flow pattern is presented which assumes that the apparent velocity of the brightness pattern varies smoothly almost everywhere in the image; an iterative implementation is shown which successfully computes the optical flow for a number of synthetic image sequences.
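The smoothness assumption above leads to a pair of coupled update equations for the flow field (u, v), iterated from local averages. A compact sketch under simplifying assumptions (forward frame difference for the temporal gradient, periodic 4-neighbor averaging; parameter names are illustrative):

```python
import numpy as np

def horn_schunck(I1, I2, alpha=1.0, iters=200):
    """Horn-Schunck optical flow sketch: brightness constancy plus a
    global smoothness term, solved by iterating the coupled updates
    u = ubar - Ix*t, v = vbar - Iy*t with
    t = (Ix*ubar + Iy*vbar + It) / (alpha^2 + Ix^2 + Iy^2)."""
    I1 = I1.astype(float)
    I2 = I2.astype(float)
    Ix = np.gradient(I1, axis=1)   # spatial gradients of frame 1
    Iy = np.gradient(I1, axis=0)
    It = I2 - I1                   # simple temporal difference

    def local_avg(f):
        # 4-neighbor average with periodic boundary (illustrative choice)
        return (np.roll(f, 1, 0) + np.roll(f, -1, 0)
                + np.roll(f, 1, 1) + np.roll(f, -1, 1)) / 4

    u = np.zeros_like(I1)
    v = np.zeros_like(I1)
    for _ in range(iters):
        ubar, vbar = local_avg(u), local_avg(v)
        t = (Ix * ubar + Iy * vbar + It) / (alpha**2 + Ix**2 + Iy**2)
        u = ubar - Ix * t
        v = vbar - Iy * t
    return u, v

# synthetic test: a horizontal sinusoid shifted one pixel to the right
N = 32
x = np.arange(N)
I1 = np.tile(np.sin(2 * np.pi * x / N), (N, 1))
I2 = np.roll(I1, 1, axis=1)
u, v = horn_schunck(I1, I2)
```

On this synthetic pair the recovered horizontal flow should be positive on average, drifting toward the true one-pixel shift as the iterations proceed.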
Journal ArticleDOI
LIII. On lines and planes of closest fit to systems of points in space
TL;DR: This paper is concerned with constructing the line or plane of closest fit to a system of points in space, i.e. the line or plane that minimizes the sum of squared perpendicular distances from the points.
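The plane of closest fit described above passes through the centroid of the points, with its normal along the direction of least variance — the eigenvector of the covariance matrix with the smallest eigenvalue. A minimal sketch, assuming 3-D points as rows of an array (the function name is illustrative):

```python
import numpy as np

def plane_of_closest_fit(points):
    """Best-fit plane through a point cloud: passes through the
    centroid, normal to the covariance eigenvector with the smallest
    eigenvalue, minimizing total squared perpendicular distance."""
    centroid = points.mean(axis=0)
    centered = points - centroid
    cov = centered.T @ centered / len(points)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues ascending
    normal = eigvecs[:, 0]                  # least-variance direction
    return centroid, normal

# example: points lying exactly in the z = 0 plane
pts = np.array([[0.0, 0, 0], [1, 0, 0], [0, 1, 0], [2, 3, 0], [3, 1, 0]])
c, n = plane_of_closest_fit(pts)
```

For the example the fitted normal is the z-axis (up to sign), since the points have zero variance in z. This construction is the geometric core of principal component analysis.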
Book
Fundamentals of speech recognition
TL;DR: This book presents a meta-modelling framework for speech recognition that automates the labor-intensive, and therefore expensive, process of manually modeling speech.