Topic
Viseme
About: Viseme is a research topic. Over its lifetime, 865 publications on this topic have been published, receiving 17,889 citations.
Papers published on a yearly basis
Papers
TL;DR: In this article, various aspects of the mechanisms of speech are studied, including the perception of speech (speech sounds and word recognition) and articulation (the pronunciation of words and speech sounds).
Abstract: Various aspects of the mechanisms of speech are studied. One series of studies has concentrated on the perception of speech, the sounds of speech, and word recognition. Various models for speech recognition have been created. Another set of studies has focused on articulation, the pronunciation of words and the sounds of speech. This area has also been explored in considerable detail.
10 Jun 2021
TL;DR: In this paper, a method for generating a head model animation from a voice signal using an artificial intelligence model, and an electronic device implementing it, are presented. The method comprises the steps of: acquiring characteristic information from the voice signal; using the artificial intelligence model to acquire, from that information, a phoneme stream corresponding to the voice signal and a viseme stream corresponding to the phoneme stream; and generating a head model animation by applying an animation curve to the visemes of the merged phoneme and viseme streams.
Abstract: Disclosed are: a method for generating a head model animation from a voice signal using an artificial intelligence model; and an electronic device for implementing same. The disclosed method for generating a head model animation from a voice signal, carried out by the electronic device, comprises the steps of: acquiring characteristics information of a voice signal from the voice signal; by using the artificial intelligence model, acquiring, from the characteristics information, a phoneme stream corresponding to the voice signal, and a viseme stream corresponding to the phoneme stream; by using the artificial intelligence model, acquiring an animation curve of visemes included in the viseme stream; merging the phoneme stream with the viseme stream; and generating a head model animation by applying the animation curve to the visemes of the merged phoneme and viseme stream.
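The pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the phoneme-to-viseme table, the timing format, and the rise-and-fall curve shape are all invented for the example.

```python
# Sketch: phoneme stream -> viseme stream -> per-viseme animation curves.
# The mapping table and curve shape below are illustrative assumptions.

# A common many-to-one mapping: several phonemes share one mouth shape.
PHONEME_TO_VISEME = {
    "p": "BMP", "b": "BMP", "m": "BMP",
    "f": "FV", "v": "FV",
    "aa": "AA", "ae": "AA",
    "iy": "EE",
}

def phonemes_to_visemes(phoneme_stream):
    """Map each (phoneme, start, end) entry to a (viseme, start, end) entry."""
    return [(PHONEME_TO_VISEME.get(p, "REST"), s, e)
            for p, s, e in phoneme_stream]

def animation_curve(start, end, steps=5):
    """Simple rise-and-fall blend-weight curve over the viseme's duration."""
    dur = end - start
    return [(start + dur * i / (steps - 1),
             1.0 - abs(2.0 * i / (steps - 1) - 1.0))
            for i in range(steps)]

# Toy phoneme stream with (label, start_sec, end_sec) timings.
phonemes = [("m", 0.0, 0.1), ("aa", 0.1, 0.3), ("p", 0.3, 0.4)]
visemes = phonemes_to_visemes(phonemes)
curves = [(v, animation_curve(s, e)) for v, s, e in visemes]
```

Driving a head model then amounts to sampling each viseme's curve and blending the corresponding mouth shape by the curve's weight at each frame.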
TL;DR: This paper proposes a new speech-synthesis strategy that uses intermediate-sized units corresponding to half syllables, called 'demisyllables', to produce computer-generated speech.
Abstract: Synthesis of English speech by computer can be accomplished in several different ways, depending on the size of the speech units that are used to produce voice output. The most widely used units for speech synthesis are phonemes (i.e., small speech units corresponding to individual phonetic items). An alternate method of producing computer-generated speech is to concatenate entire words of English, a method called 'word-concatenation' synthesis. A third strategy, the one described in this paper, is to use intermediate-sized units corresponding to half syllables, called 'demisyllables'.
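The demisyllable idea can be sketched as splitting each syllable at its vowel into an initial and a final half, then chaining the halves. A minimal illustration, assuming the syllable and vowel segmentation is given; in a real synthesizer each unit would be a recorded waveform, not a string:

```python
# Illustrative sketch of demisyllable concatenation. The segmentation
# and unit names are invented; real systems store acoustic units.

def split_syllable(syllable, vowel):
    """Split a syllable into (initial, final) demisyllables at its vowel."""
    i = syllable.index(vowel)
    # Initial demisyllable: onset + first half of the vowel;
    # final demisyllable: second half of the vowel + coda.
    return syllable[:i + 1], syllable[i:]

def synthesize(syllables, vowels):
    """Concatenate the demisyllable units for a sequence of syllables."""
    units = []
    for syl, v in zip(syllables, vowels):
        units.extend(split_syllable(syl, v))
    return units

# "viseme" -> syllables "vi" + "seme" (illustrative segmentation).
units = synthesize(["vi", "seme"], ["i", "e"])
# units: ["vi", "i", "se", "eme"]
```

The appeal of demisyllables is that the cut falls inside the vowel, where the signal is relatively steady, so concatenation joins are smoother than joins between phonemes.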
01 Jan 2020
TL;DR: This paper presents a system that recognizes lip movements for a lip-reading system, using the Viola–Jones algorithm to detect the mouth region and DCT to extract mouth features.
Abstract: This paper presents a system that recognizes lip movements for lip reading. Four lip gestures are recognized: rounded open, wide open, small open, and closed. These gestures are used to describe speech visually. First, the mouth region is detected in each frame using the Viola–Jones algorithm. Then, DCT is used to extract mouth features. Recognition is performed by an HMM, which achieves a high recognition rate of 84.99%.
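The feature-extraction step can be sketched as follows. This assumes the mouth region has already been located (the paper uses Viola–Jones detection; here the region is simply given as a small grayscale patch), and implements a naive 2-D DCT by hand purely for illustration. The 4x4 "image" is invented.

```python
import math

# Sketch of the DCT feature step for lip reading: apply a 2-D DCT to
# the detected mouth region and keep the low-frequency coefficients
# as the feature vector fed to the HMM.

def dct2(block):
    """Naive 2-D DCT-II of a square block (O(N^4); fine for a sketch)."""
    n = len(block)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                    * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                    for x in range(n) for y in range(n))
            cu = math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)
            cv = math.sqrt(1 / n) if v == 0 else math.sqrt(2 / n)
            out[u][v] = cu * cv * s
    return out

def mouth_features(region, k=2):
    """Keep the k x k low-frequency DCT coefficients as the feature vector."""
    coeffs = dct2(region)
    return [coeffs[u][v] for u in range(k) for v in range(k)]

# Toy 4x4 grayscale mouth patch (bright interior on a dark background).
region = [[10, 10, 10, 10],
          [10, 50, 50, 10],
          [10, 50, 50, 10],
          [10, 10, 10, 10]]
features = mouth_features(region)
```

Keeping only the low-frequency coefficients compresses the mouth patch into a short vector that captures its overall shape while discarding pixel-level noise, which is what makes DCT features a common front end for HMM classifiers.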
01 Jan 2004
TL;DR: This work describes a maximum a posteriori decoding strategy for feature-based recognizers and derives two normalization criteria useful for a segment-based Viterbi or A* search.
Abstract: Most speech recognizers use an observation space which is based on a temporal sequence of spectral “frames.” There is another class of recognizer which further processes these frames to produce a segment-based network, and represents each segment by a fixed-dimensional “feature.” In such feature-based recognizers the observation space takes the form of a temporal graph of feature vectors, so that any single segmentation of an utterance will use a subset of all possible feature vectors. In this work we describe a maximum a posteriori decoding strategy for feature-based recognizers and derive two normalization criteria useful for a segment-based Viterbi or A* search. We show how a segment-based recognizer is able to obtain good results on the tasks of phonetic and word recognition.
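The segment-graph search described above can be sketched as a dynamic program over segment end times. This is an illustrative toy, not the paper's method: the segments, labels, and log-scores are invented, and the paper's MAP criterion additionally normalizes for the fact that different segmentations use different subsets of the feature vectors.

```python
# Viterbi-style dynamic program over a segment graph. Each candidate
# segment is (start_frame, end_frame, label, log_score); the best
# segmentation is the highest-scoring chain covering all frames.

def best_segmentation(n_frames, segments):
    """Return (total_log_score, label_path) for the best covering chain."""
    best = {0: (0.0, [])}  # end frame -> (score, label path so far)
    for t in range(1, n_frames + 1):
        for s, e, label, score in segments:
            if e == t and s in best:
                cand = (best[s][0] + score, best[s][1] + [label])
                if t not in best or cand[0] > best[t][0]:
                    best[t] = cand
    return best.get(n_frames)

# Toy segment graph over a 5-frame utterance: two competing
# segmentations of the same span.
segments = [
    (0, 3, "f", -1.0), (0, 2, "v", -2.5),
    (3, 5, "iy", -0.5), (2, 5, "iy", -1.8),
]
score, labels = best_segmentation(5, segments)
# labels: ["f", "iy"] with score -1.5
```

The key point the abstract makes is that competing paths through such a graph score different feature subsets, so raw path scores are not directly comparable; the paper's normalization criteria are what make the comparison valid in a Viterbi or A* search.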