Topic
Viseme
About: Viseme is a research topic. Over its lifetime, the topic has accumulated 865 publications and 17889 citations.
Papers
22 Aug 1999
TL;DR: This paper presents a system that automatically combines emotional cues with phonemes to generate emotional visual speech on a synthetic human face.
Abstract: The animation of a three-dimensional synthetic human face has been the object of much research in the past few years. Many systems now exist for this purpose, and they rely on the artistic and technical skills of animators. Methods for generating lip movements to accompany a speech soundtrack have also been developed. These systems extract phonemes from the speech signal and convert them to "visemes", or visual lip shapes, for a synthetic human face. Methods for generating human emotional expressions have also been developed in the recent past. This paper combines some of these developments to present a system that automatically combines emotional cues with phonemes to generate emotional visual speech on a synthetic human face.
2 citations
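The phoneme-to-viseme conversion described in the abstract above can be illustrated with a minimal sketch. The mapping table, the viseme class names, and the helper functions `phonemes_to_visemes` and `tag_with_emotion` are all hypothetical and chosen for illustration; they are not taken from the paper, which does not specify its viseme inventory or how emotional cues are blended.

```python
# Hypothetical many-to-one phoneme-to-viseme table (illustrative only,
# not the inventory used in the paper).
PHONEME_TO_VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "aa": "open", "ae": "open",
    "uw": "rounded", "ow": "rounded",
    "sil": "neutral",
}


def phonemes_to_visemes(phonemes, default="neutral"):
    """Map a phoneme sequence to visemes, merging consecutive duplicates."""
    visemes = []
    for ph in phonemes:
        v = PHONEME_TO_VISEME.get(ph.lower(), default)
        # Several phonemes share one lip shape, so repeated visemes collapse.
        if not visemes or visemes[-1] != v:
            visemes.append(v)
    return visemes


def tag_with_emotion(visemes, emotion):
    """Pair each viseme with an emotion label (assumed representation)."""
    return [(v, emotion) for v in visemes]


if __name__ == "__main__":
    shapes = phonemes_to_visemes(["m", "aa", "p"])
    print(tag_with_emotion(shapes, "happy"))
```

Because the mapping is many-to-one, distinct phonemes such as "p" and "b" yield the same lip shape, which is why the sketch merges adjacent duplicates before attaching an emotion label.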
TL;DR: In this article, two experiments with rotated and inverted talker faces show that temporal cues aid speech processing through binding, an early multisensory integration mechanism, while linguistic (viseme) cues aid comprehension through traditional late integration.
Abstract: When listening is difficult, seeing the face of the talker aids speech comprehension. Faces carry both temporal (low-level physical correspondence of mouth movement and auditory speech) and linguistic (learned physical correspondences of mouth shape (viseme) and speech sound (phoneme)) cues. Listeners participated in two experiments investigating how these cues may be used to process sentences when maskers are present. In Experiment I, faces were rotated to disrupt linguistic but not temporal cue correspondence. Listeners suffered a deficit in speech comprehension when the faces were rotated, indicating that visemes are processed in a rotation-dependent manner, and that linguistic cues aid comprehension. In Experiment II, listeners were asked to detect pitch modulation in the target speech with upright and inverted faces that either matched the target or masker speech such that performance differences could be explained by binding, an early multisensory integration mechanism distinct from traditional late integration. Performance in this task replicated previous findings that temporal integration induces binding, but there was no behavioral evidence for a role of linguistic cues in binding. Together these experiments point to temporal cues providing a speech processing benefit through binding and linguistic cues providing a benefit through late integration.
2 citations
TL;DR: A new lip synchronization algorithm for realistic applications is proposed, which can generate facial movements synchronized with audio from natural speech or from a text-to-speech engine.
Abstract: Speech is one of the most important methods of interaction between humans, so much avatar research focuses on this area. Creating animated speech requires a facial model capable of representing the myriad shapes the human face takes during speech, as well as a method to produce the correct shape at the correct time. One of the main challenges is to create precise lip movements for the avatar and synchronize them with recorded audio. This paper proposes a new lip synchronization algorithm for realistic applications, which can generate facial movements synchronized with audio from natural speech or from a text-to-speech engine. The method requires an animator to construct animations, using a canonical set of visemes, for all pairwise combinations of a reduced phoneme set. These animations are then stitched together smoothly to construct the final animation.
2 citations
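The pairwise stitching idea in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: each animation clip is reduced to a list of mouth-opening values per frame, the clip library `pair_clips` and the `reduction` table are hypothetical, and the linear cross-fade is a stand-in for whatever smoothing the paper actually uses.

```python
def reduce_phonemes(phonemes, reduction):
    """Collapse phonemes onto a reduced set (hypothetical reduction table)."""
    return [reduction.get(p, p) for p in phonemes]


def stitch(clips, blend=2):
    """Concatenate clips, linearly cross-fading up to `blend` frames per junction."""
    out = list(clips[0]) if clips else []
    for clip in clips[1:]:
        n = min(blend, len(out), len(clip))
        start = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight shifts toward the incoming clip
            out[start + i] = (1 - w) * out[start + i] + w * clip[i]
        out.extend(clip[n:])  # the overlapped frames are already blended in
    return out


def animate(phonemes, reduction, pair_clips, blend=2):
    """Look up one pre-built clip per adjacent phoneme pair and stitch them."""
    reduced = reduce_phonemes(phonemes, reduction)
    clips = [pair_clips[(a, b)] for a, b in zip(reduced, reduced[1:])]
    return stitch(clips, blend)


if __name__ == "__main__":
    reduction = {"b": "p"}  # assumed: "b" and "p" share one lip shape
    pair_clips = {("p", "a"): [0.0, 1.0], ("a", "p"): [1.0, 0.0]}
    print(animate(["b", "a", "p"], reduction, pair_clips, blend=1))
```

Keying the clip library by phoneme pairs rather than single phonemes lets each clip capture the transition between two lip shapes, at the cost of a library whose size grows quadratically with the reduced phoneme set.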