Topic
Viseme
About: Viseme is a research topic. Over its lifetime, 865 publications have been published within this topic, receiving 17,889 citations.
Papers
15 Feb 2018
TL;DR: A Text-to-Audio-Visual system for Indonesian visualizes the pronunciation of Indonesian sentences synchronized with the speech signal; evaluation shows that the agreement between the visualized syllable pronunciation and the spoken voice is good.
Abstract: This paper aims to develop a Text-to-Audio-Visual system for Indonesian that supports learning Indonesian pronunciation, based on a syllable-based speech database. The system visualizes the pronunciation of Indonesian sentences synchronized with speech signals. The research proceeds in several stages: forming the Indonesian viseme models, creating the syllable-based speech database, converting the text into syllables, and synchronizing. The synchronization process combines the viseme models and the speech signal according to the input text. The system was evaluated by 30 respondents, who rated it on the basis of "lip-reading". Each respondent assessed 10 Indonesian sentences for the level of agreement between the syllable visualization and the speech produced from the text input. The Mean Opinion Score (MOS) method is used to calculate the average of the respondents' ratings. The resulting MOS is 4.24, which shows that the agreement between the visualized syllable pronunciation and the spoken voice is good.
1 citation
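The MOS evaluation described in this abstract amounts to averaging every rating from every respondent. A minimal Python sketch follows; the 1 to 5 rating scale, the function name `mean_opinion_score`, and the sample ratings are assumptions for illustration, not details taken from the paper.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average all scores across respondents.

    `ratings` is a list with one inner list per respondent, holding that
    respondent's score for each sentence. A 1-5 scale is assumed here;
    the abstract does not state the scale.
    """
    flat = [score for respondent in ratings for score in respondent]
    if not flat:
        raise ValueError("no ratings supplied")
    if any(not 1 <= score <= 5 for score in flat):
        raise ValueError("scores must lie in the assumed 1-5 range")
    return mean(flat)


# Hypothetical data: 2 respondents rating 2 sentences each
# (the study itself used 30 respondents and 10 sentences).
sample = [[4, 5], [3, 4]]
print(mean_opinion_score(sample))
```

With the study's design, `ratings` would hold 30 inner lists of 10 scores each.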
•
23 Apr 2009
TL;DR: A lip sync animation creating apparatus that includes a viseme sequence creating apparatus and automatically sets keyframes and blending ratios so as to obtain a smooth and natural animation.
Abstract: PROBLEM TO BE SOLVED: To provide an animation creating apparatus capable of automatically setting a keyframe and blending ratio so as to obtain a smooth and natural animation. SOLUTION: The lip sync animation creating apparatus 200 includes a viseme sequence creating apparatus 230 for obtaining visemes from speech data 152 by use of an acoustic model 170, a mapping definition 176 and a transcription 154, adding a default blended rate, and creating a viseme sequence 180; a keyframe deleter 236 for deleting keyframes, in descending order of face model change speed among adjacent keyframes within a keyframe sequence comprising keyframes defined in the viseme sequence 180; an adjusting section 244 for reducing a blended rate when the speech power in a keyframe is small; an adjusting section 250 for reducing the blended rate when the change of the image speed is high; and a blending processing section 256 for creating face image animation by blending between keyframes. COPYRIGHT: (C)2009,JPO&INPIT
1 citation
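Two of the steps named in this abstract, reducing the blended rate where speech power is small and blending between adjacent keyframes, can be illustrated with a short Python sketch. Everything concrete here is an assumption for illustration: the `Keyframe` fields, the `power_floor` threshold, the damping `factor`, and the use of linear interpolation are not specified in the abstract.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Keyframe:
    time: float   # seconds from the start of the utterance
    viseme: str   # viseme label (hypothetical labels in the example)
    blend: float  # blended rate in [0, 1]
    power: float  # speech power at this keyframe, arbitrary units


def damp_quiet_keyframes(keyframes, power_floor=0.1, factor=0.5):
    """Scale down the blended rate of keyframes whose speech power is low.

    The threshold and factor are placeholder values, not from the source.
    """
    return [
        replace(k, blend=k.blend * factor) if k.power < power_floor else k
        for k in keyframes
    ]


def blend_at(k0, k1, t):
    """Linearly interpolate viseme weights between two keyframes at time t."""
    if k1.time <= k0.time or not k0.time <= t <= k1.time:
        raise ValueError("t must lie between two keyframes in time order")
    alpha = (t - k0.time) / (k1.time - k0.time)
    weights = {k0.viseme: (1.0 - alpha) * k0.blend}
    # Accumulate in case both keyframes carry the same viseme label.
    weights[k1.viseme] = weights.get(k1.viseme, 0.0) + alpha * k1.blend
    return weights


frames = [Keyframe(0.0, "A", 1.0, 0.5), Keyframe(1.0, "O", 1.0, 0.05)]
damped = damp_quiet_keyframes(frames)
print(blend_at(damped[0], damped[1], 0.5))
```

The keyframe deletion step and the adjustment for high image-change speed are omitted; they would operate on the same keyframe list before blending.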