
Viseme

About: Viseme is a research topic. Over the lifetime, 865 publications have been published within this topic receiving 17889 citations.


Papers
Patent
08 Jul 2014
TL;DR: A speech recognition device controls one or more pieces of equipment by speech recognition. It comprises a speech acquisition unit that acquires speech uttered by a user, a speech recognition processing unit that converts the acquired speech into character information, and a recognition result determination unit that decides, on the basis of that character information, whether the utterance is addressed to the equipment.
Abstract: A speech recognition device controls one or a plurality of pieces of equipment by speech recognition and is characterized by being provided with: a speech acquisition unit that acquires speech information showing speech uttered by a user; a speech recognition processing unit that recognizes, as character information, speech information acquired by the speech acquisition unit; and a recognition result determination unit that determines whether or not an utterance is to the equipment on the basis of the character information recognized by the speech recognition processing unit.
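The three-unit architecture in this abstract (acquisition, recognition, determination) can be sketched as a simple pipeline. This is a hypothetical illustration, not code from the patent: the keyword table, function names, and the stubbed acquisition/recognition steps are all assumptions.

```python
# Illustrative keyword table: which words mark an utterance as addressed
# to a given piece of equipment (an assumption, not from the patent).
EQUIPMENT_KEYWORDS = {"light": ["light", "lamp"], "tv": ["tv", "television"]}

def acquire_speech() -> bytes:
    # Stand-in for the speech acquisition unit (e.g. a microphone buffer).
    return b"..."

def recognize(audio: bytes) -> str:
    # Stand-in for the speech recognition processing unit, which turns
    # the acquired speech into character information.
    return "turn on the light"

def target_equipment(text: str):
    # Recognition result determination unit: decide from the recognized
    # character information whether the utterance targets some equipment.
    for device, words in EQUIPMENT_KEYWORDS.items():
        if any(w in text.lower() for w in words):
            return device
    return None

print(target_equipment(recognize(acquire_speech())))  # prints: light
```

The separation mirrors the claim structure: each unit has one responsibility, so the determination step can be swapped (keyword match here, a classifier in practice) without touching acquisition or recognition.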

3 citations

Proceedings ArticleDOI
15 Aug 2008
TL;DR: Reviews prior work in digital speech signal processing and shows how existing speech processing techniques can be applied to the proposed algorithms for speech-driven lip motion animation in Japanese-style anime.
Abstract: In the production of Japanese-style anime, lip movement during speech is usually reduced to a simple 'open' and 'close' of the mouth because of the high production cost. In this paper we present an approach to speech-driven lip animation for Japanese-style anime. We first review previous work in digital speech signal processing and show how existing speech processing techniques apply to our task. We then propose our algorithms for speech-driven lip motion animation. Finally, experimental results are presented.
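Since the paper targets a binary 'open'/'close' mouth driven by speech, one minimal realization is short-time energy thresholding on the audio. This is a sketch under assumed parameters (frame length, threshold, sample rate); the paper's actual algorithm is not specified here.

```python
import math

def mouth_states(samples, frame_len=160, threshold=0.02):
    """Return an 'open'/'close' label per frame from RMS energy.

    samples: audio samples in [-1, 1]; frame_len of 160 corresponds to
    20 ms at 8 kHz (an illustrative choice, as is the threshold).
    """
    states = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / frame_len)
        states.append("open" if rms > threshold else "close")
    return states

# Silence followed by a loud vowel-like burst:
signal = [0.0] * 160 + [0.5 * math.sin(2 * math.pi * 220 * t / 8000)
                        for t in range(160)]
print(mouth_states(signal))  # prints: ['close', 'open']
```

In practice the frame rate would be matched to the animation's frame rate, with hysteresis or smoothing to avoid the mouth flickering between states.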

3 citations

Proceedings Article
01 Jan 1998
TL;DR: SIVHA, a high-quality Spanish speech synthesis system for severely disabled persons, is controlled by their eye movements: it follows the patient's eye-gaze across the screen and constructs the text from the selected words.
Abstract: This paper presents SIVHA, a high-quality Spanish speech synthesis system for severely disabled persons, controlled by their eye movements. The system follows the patient's eye-gaze across the screen and constructs the text from the selected words. When the user considers the message complete, its synthesis can be ordered. The system is divided into three modules: the first determines the point on the screen the user is looking at, the second is an interface for constructing the sentences, and the third is the synthesizer itself.

3 citations

Journal ArticleDOI
TL;DR: Describes a speech–text synchroniser intended to teach the deaf the frequency and intensity characteristics of syllables, so that they can relate speech to text and thus make sense of otherwise confusing lip reading.
Abstract: Describes a speech–text synchroniser intended to teach the deaf the frequency and intensity characteristics of syllables, so that they can relate speech to text and thus make sense of otherwise confusing lip reading.

3 citations

Proceedings ArticleDOI
10 Nov 2003
TL;DR: According to the frame rate to be rendered, intermediate frames are interpolated between key frames so that the animation looks more natural and realistic than results driven by text or speech alone.
Abstract: Creating a realistic 3D talking face has long been a challenge. Previous work drives the talking face with either text or speech alone, and the resulting animation is not very realistic or natural-looking. In the proposed approach, text and speech jointly drive the 3D talking face. The text is translated into a sequence of viseme transcriptions, and a timing vector for the sequence is extracted from the corresponding speech after it is segmented into a phonetic sequence. A muscle-based viseme vector is defined for each static viseme; then, using the timing vector and the static viseme sequence, dynamic visemes are generated through a time-related dominance function. Finally, according to the frame rate to be rendered, intermediate frames are interpolated between key frames, making the animation look more natural and realistic than results driven by text or speech alone.
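The dominance-function step in this abstract can be sketched as follows: each static viseme contributes a bell-shaped weight centred on its time in the utterance, and the animated parameters at any instant are the normalized weighted blend. The Gaussian shape, the width value, and the two-parameter viseme vectors below are illustrative assumptions, not the paper's exact formulation.

```python
import math

def dominance(t, center, width=0.08):
    # Bell-shaped dominance of a viseme around its centre time (seconds).
    return math.exp(-((t - center) / width) ** 2)

def blend(t, visemes):
    """visemes: list of (center_time, param_vector); returns the blended
    parameter vector at time t, weights normalized to sum to 1."""
    weights = [dominance(t, c) for c, _ in visemes]
    total = sum(weights) or 1.0
    dim = len(visemes[0][1])
    return [sum(w * v[i] for w, (_, v) in zip(weights, visemes)) / total
            for i in range(dim)]

# Two static visemes (e.g. jaw-open, lip-round) centred at 0.0 s and 0.2 s,
# as a timing vector extracted from the speech might place them:
seq = [(0.0, [1.0, 0.0]), (0.2, [0.0, 1.0])]
fps = 25
# Intermediate frames interpolated between the key frames at the render rate:
frames = [blend(i / fps, seq) for i in range(6)]  # covers 0.0 .. 0.2 s
```

Because neighbouring dominance curves overlap, each frame is influenced by both surrounding visemes, which is what produces coarticulation-like transitions instead of abrupt switches between key poses.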

3 citations


Network Information
Related Topics (5)
Vocabulary
44.6K papers, 941.5K citations
78% related
Feature vector
48.8K papers, 954.4K citations
76% related
Feature extraction
111.8K papers, 2.1M citations
75% related
Feature (computer vision)
128.2K papers, 1.7M citations
74% related
Unsupervised learning
22.7K papers, 1M citations
73% related
Performance Metrics
No. of papers in the topic in previous years

Year  Papers
2023  7
2022  12
2021  13
2020  39
2019  19
2018  22