
Viseme

About: Viseme is a research topic. Over the lifetime, 865 publications have been published within this topic receiving 17889 citations.


Papers
Proceedings ArticleDOI
08 Sep 2016
TL;DR: This paper examines methods to improve visual speech synthesis from a text input using a deep neural network (DNN), and reveals the importance of frame-level information, which avoids discontinuities in the visual feature sequence and produces a smooth, realistic output.
Abstract: This paper examines methods to improve visual speech synthesis from a text input using a deep neural network (DNN). Two representations of the input text are considered: phoneme sequences and dynamic viseme sequences. From these sequences, contextual features are extracted that include information at varying linguistic levels, from the frame level to the utterance level. These are extracted over a broad sliding window that captures context, and the resulting features are input to the DNN to estimate visual features. Experiments first compare the accuracy of these visual features against an HMM baseline, establishing that both the phoneme and dynamic viseme systems perform better, with the best performance obtained by a combined phoneme-dynamic viseme system. An investigation into the features then reveals the importance of frame-level information, which avoids discontinuities in the visual feature sequence and produces a smooth, realistic output.
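The window-and-regress pipeline the abstract describes can be sketched as follows. The dimensions, edge padding, and the toy one-hidden-layer network are illustrative assumptions, not the authors' configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

n_frames, feat_dim = 20, 8   # per-frame linguistic features (illustrative)
half_win = 2                 # context frames on each side of the current frame
vis_dim = 4                  # visual feature dimension (illustrative)

frames = rng.normal(size=(n_frames, feat_dim))

# Sliding window: pad the sequence at both ends, then stack the
# 2*half_win + 1 consecutive frames into one context vector per frame.
padded = np.pad(frames, ((half_win, half_win), (0, 0)), mode="edge")
context = np.stack(
    [padded[t:t + 2 * half_win + 1].ravel() for t in range(n_frames)]
)

# A toy one-hidden-layer network standing in for the paper's DNN regressor.
W1 = rng.normal(size=(context.shape[1], 16)) * 0.1
W2 = rng.normal(size=(16, vis_dim)) * 0.1
hidden = np.tanh(context @ W1)
visual = hidden @ W2         # estimated visual features, one row per frame

print(context.shape, visual.shape)   # (20, 40) (20, 4)
```

The per-frame context vector is what lets the network produce outputs that vary smoothly from frame to frame, which is the effect the paper attributes to frame-level information.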

3 citations

01 Jan 2007
TL;DR: This poster presents a novel technique to automatically synthesise lip motion trajectories given some text and speech, using a time series stochastic model called the "trajectory hidden Markov model".
Abstract: Lip synchronisation is essential to make character animation believable. In this poster we present a novel technique to automatically synthesise lip motion trajectories given some text and speech. Our work distinguishes itself from other work by not using visemes (visual counterparts of phonemes). The lip motion trajectories are directly modelled using a time series stochastic model called the "trajectory hidden Markov model". Its parameter generation algorithm can produce motion trajectories that are used to drive control points on the lips directly.
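The parameter generation step of a trajectory HMM can be illustrated with an MLPG-style sketch: the model supplies per-frame means and precisions for static and delta (velocity) features, and solving one linear system yields a static trajectory consistent with both. All values and the delta window below are illustrative assumptions, not the poster's model:

```python
import numpy as np

T = 10
mu_static = np.linspace(0.0, 1.0, T)   # per-frame static means (toy values)
mu_delta = np.zeros(T)                 # per-frame velocity means
prec_static = np.full(T, 1.0)          # inverse variances (precisions)
prec_delta = np.full(T, 4.0)

# W stacks the static window (identity) and a central-difference delta window.
I = np.eye(T)
D = np.zeros((T, T))
for t in range(T):
    D[t, max(t - 1, 0)] -= 0.5
    D[t, min(t + 1, T - 1)] += 0.5
W = np.vstack([I, D])

mu = np.concatenate([mu_static, mu_delta])
P = np.diag(np.concatenate([prec_static, prec_delta]))

# Maximum-likelihood trajectory: solve (W' P W) c = W' P mu for c.
c = np.linalg.solve(W.T @ P @ W, W.T @ P @ mu)
print(np.round(c, 3))
```

The delta precisions penalise abrupt frame-to-frame changes, which is how this family of models produces smooth control-point trajectories rather than a sequence of independent per-frame estimates.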

3 citations

Proceedings ArticleDOI
20 Oct 2011
TL;DR: The main idea is to extract and capture a viseme from video of a human talking and the phonemic scripts inside this video, and to generate a talking-head animation video by synchronizing the timestamp of each phoneme to the concatenated visemes.
Abstract: We consider the problem of making lip movement for an animated talking character, which consumes workload and cost during the animation development process. The main idea is to extract and capture a viseme from the video of a human talking and the phonemic scripts inside this video. After that, we generate a talking-head animation video by synchronizing the timestamp of each phoneme to the concatenated visemes. The results of experimental tests are reported, indicating good accuracy.
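The synchronization step (mapping each time-stamped phoneme to a viseme class and concatenating the results into a track) might look like the following sketch; the phoneme-to-viseme table and the merging rule are hypothetical, not the paper's mapping:

```python
# Hypothetical phoneme-to-viseme table: phonemes with similar visual
# appearance share one viseme class.
PHONEME_TO_VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "aa": "open", "iy": "spread", "uw": "rounded",
}

def build_viseme_track(phoneme_script):
    """phoneme_script: list of (phoneme, start_sec, end_sec) tuples."""
    track = []
    for phoneme, start, end in phoneme_script:
        viseme = PHONEME_TO_VISEME.get(phoneme, "neutral")
        # Merge consecutive identical visemes so the mouth holds its shape.
        if track and track[-1][0] == viseme:
            track[-1] = (viseme, track[-1][1], end)
        else:
            track.append((viseme, start, end))
    return track

script = [("p", 0.00, 0.08), ("aa", 0.08, 0.25), ("m", 0.25, 0.33)]
print(build_viseme_track(script))
# [('bilabial', 0.0, 0.08), ('open', 0.08, 0.25), ('bilabial', 0.25, 0.33)]
```

Each output segment carries the phoneme's original timing, so the captured viseme clips can simply be played back over those intervals.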

3 citations

DOI
13 Dec 2016
TL;DR: This work defines an Indonesian viseme set and the associated mouth shapes via a text-input segmentation system, takes one affection type as an additional input, and uses trigram HMMs to generate a viseme sequence for natural speech of Indonesian sentences with affection.
Abstract: In communication using text input, visemes (visual phonemes) are derived from groups of phonemes with similar visual appearances. The hidden Markov model (HMM) is a popular mathematical approach for sequence classification tasks such as speech recognition. For speech emotion recognition, an HMM is trained for each emotion and an unknown sample is classified according to the model that best explains the derived feature sequence. The Viterbi algorithm is used with an HMM to find the most probable sequence of hidden states behind the observable states. In the first stage of this work, we define an Indonesian viseme set and the associated mouth shapes through a text-input segmentation system. In the second stage, one affection type is chosen as an input to the system. In the last stage, we experiment with trigram HMMs to generate the viseme sequence used to synchronize mouth shapes and lip movements. The whole system is interconnected in a sequence and produces a viseme sequence for natural speech of Indonesian sentences with affection. Experiments show that the proposed approach yields about an 82.19% relative improvement in classification accuracy.
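The Viterbi decoding the abstract refers to is the textbook dynamic program below: given transition and emission probabilities, it recovers the most probable hidden-state sequence for an observation sequence. The two-state toy model and its probabilities are illustrative, not the paper's trigram viseme model:

```python
import numpy as np

states = ["closed", "open"]                 # toy hidden viseme states
start = np.log([0.6, 0.4])
trans = np.log([[0.7, 0.3], [0.4, 0.6]])    # trans[i, j] = P(next=j | cur=i)
emit = np.log([[0.9, 0.1], [0.2, 0.8]])     # emit[i, o] = P(obs=o | state=i)
obs = [0, 1, 1]                             # observed symbol indices

T, N = len(obs), len(states)
delta = np.full((T, N), -np.inf)            # best log-prob ending in state j
back = np.zeros((T, N), dtype=int)          # backpointers
delta[0] = start + emit[:, obs[0]]
for t in range(1, T):
    for j in range(N):
        scores = delta[t - 1] + trans[:, j]
        back[t, j] = int(np.argmax(scores))
        delta[t, j] = scores[back[t, j]] + emit[j, obs[t]]

# Backtrack from the best final state.
path = [int(np.argmax(delta[-1]))]
for t in range(T - 1, 0, -1):
    path.append(back[t, path[-1]])
path.reverse()
print([states[i] for i in path])            # ['closed', 'open', 'open']
```

Working in log probabilities avoids numerical underflow on long sequences, which matters when decoding a full sentence's worth of frames.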

3 citations

Proceedings Article
01 Jan 2002
TL;DR: This study investigates the extent to which a localist-distributive hybrid formal model of human memory replicates observed behavioral patterns in perception and recognition of appropriately coded language data.
Abstract: This study investigates the extent to which a localist-distributive hybrid formal model of human memory replicates observed behavioral patterns in perception and recognition of appropriately coded language data. Extending previous research that considered for modeled memorization only items with uniform, undefined randomly generated featural specifications, a MINERVA2 simulation was trained to recognize linguistic events and categories at both acoustic-phonetic and phonological-featural processing levels. Results of both test conditions parallel two important effects observed in behavioral data and are discussed with respect to speech perception as well as human memory research.
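In Hintzman's MINERVA 2 formulation, which the simulation builds on, retrieval activates each stored trace by the cube of its similarity to the probe and sums the activations into an echo intensity used for recognition decisions. The toy feature vectors below are illustrative, not the study's coded language data:

```python
import numpy as np

rng = np.random.default_rng(1)
memory = rng.choice([-1, 0, 1], size=(50, 20))   # 50 traces, 20 features
probe = memory[3].copy()                          # probe matching a stored item

def similarity(trace, probe):
    # Normalise by features that are nonzero in either vector.
    relevant = (trace != 0) | (probe != 0)
    n = relevant.sum()
    return (trace * probe).sum() / n if n else 0.0

# Activation is the cubed similarity; cubing makes close matches dominate.
acts = np.array([similarity(t, probe) ** 3 for t in memory])
echo_intensity = acts.sum()      # basis for an old/new recognition judgement
echo_content = acts @ memory     # activation-weighted composite of all traces
print(round(float(echo_intensity), 3))
```

Because the probe is an exact copy of trace 3, that trace's activation is 1.0 and dominates both the intensity and the retrieved echo content.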

3 citations


Network Information
Related Topics (5)
Vocabulary: 44.6K papers, 941.5K citations (78% related)
Feature vector: 48.8K papers, 954.4K citations (76% related)
Feature extraction: 111.8K papers, 2.1M citations (75% related)
Feature (computer vision): 128.2K papers, 1.7M citations (74% related)
Unsupervised learning: 22.7K papers, 1M citations (73% related)
Performance Metrics
Number of papers in the topic in previous years:

Year  Papers
2023       7
2022      12
2021      13
2020      39
2019      19
2018      22