Topic

Viseme

About: Viseme is a research topic. Over its lifetime, 865 publications have been published on this topic, receiving 17,889 citations.


Papers
Journal ArticleDOI
TL;DR: A novel method is proposed for generating acoustic models for viseme recognition from speech via transformations of trained phoneme acoustic models; the method is language-independent and needs only the available speech resources.
Abstract: Viseme recognition from speech is one of the methods needed to operate a talking head system, which can be used in various areas, such as mobile services and applications, gaming, the entertainment industry, and so on. This paper proposes a novel method for generating acoustic models for viseme recognition from speech. The viseme acoustic models were generated using transformations from trained phoneme acoustic models. The proposed transformation method is language-independent; only the available speech resources are needed. The viseme sequence with corresponding time information was produced as a result of recognition using context-dependent acoustic models. The evaluation of the proposed acoustic models’ transformation method was carried out on a test scenario with phonetically balanced words, in which the results were compared to the baseline viseme recognition system. The improvement in viseme accuracy was statistically significant when using the proposed method for transforming acoustic models. DOI: http://dx.doi.org/10.5755/j01.eee.19.9.5657

2 citations
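The paper does not spell out the transformation itself, but any such approach starts from a many-to-one phoneme-to-viseme table applied to the trained phoneme models. A minimal Python sketch under that assumption; the mapping and the group_phoneme_models helper are illustrative, not the paper's actual method:

```python
# Hypothetical many-to-one phoneme-to-viseme table and a grouping helper;
# the mapping below is illustrative, not the one used in the paper.
PHONEME_TO_VISEME = {
    "p": "V_bilabial", "b": "V_bilabial", "m": "V_bilabial",
    "f": "V_labiodental", "v": "V_labiodental",
    "t": "V_alveolar", "d": "V_alveolar", "s": "V_alveolar", "z": "V_alveolar",
    "k": "V_velar", "g": "V_velar",
    "aa": "V_open", "ae": "V_open",
    "iy": "V_spread", "ih": "V_spread",
    "uw": "V_rounded", "ow": "V_rounded",
}

def group_phoneme_models(phoneme_models):
    """Group trained phoneme model parameters under their target viseme class,
    ready to be merged into one viseme acoustic model per class."""
    viseme_models = {}
    for phoneme, params in phoneme_models.items():
        viseme = PHONEME_TO_VISEME.get(phoneme)
        if viseme is None:
            continue  # phoneme outside the illustrative table
        viseme_models.setdefault(viseme, []).append(params)
    return viseme_models
```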

Proceedings ArticleDOI
01 Jan 2006
TL;DR: In this paper, a stereo capture system is used to reconstruct 3D models of a speaker producing sentences from the TIMIT corpus; this data is mapped into a space that maintains the relationships between samples and their temporal derivatives.
Abstract: In this paper we describe a parameterisation of lip movements which maintains the dynamic structure inherent in the task of producing speech sounds. A stereo capture system is used to reconstruct 3D models of a speaker producing sentences from the TIMIT corpus. This data is mapped into a space which maintains the relationships between samples and their temporal derivatives. By incorporating dynamic information within the parameterisation of lip movements we can model the cyclical structure, as well as the causal nature of speech movements as described by an underlying visual speech manifold. It is believed that such a structure will be appropriate to various areas of speech modeling, in particular the synthesis of speech lip movements.

2 citations
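One simple way to keep samples and their temporal derivatives together, as the abstract describes, is to augment each frame of lip parameters with its time derivative. A NumPy sketch under that assumption; the function name and the use of first-order differences are assumptions, not the paper's actual parameterisation:

```python
import numpy as np

def augment_with_deltas(frames):
    """Append first-order temporal derivatives to each frame of lip parameters.

    frames: (T, D) array of per-frame lip-shape parameters.
    Returns a (T, 2*D) array of [parameters, deltas].
    """
    deltas = np.gradient(frames, axis=0)  # central differences along time
    return np.concatenate([frames, deltas], axis=1)

# Example: 100 frames of a 10-dimensional lip-shape parameterisation
trajectory = np.random.randn(100, 10)
augmented = augment_with_deltas(trajectory)  # shape (100, 20)
```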

Proceedings Article
01 Jan 2010
TL;DR: A novel model synthesis method is proposed for band-limited speech recognition; it detects the speech bandwidth automatically and, when the bandwidth changes, synthesizes a new acoustic model using only a full-bandwidth model.
Abstract: A recognizer trained with full-bandwidth speech performs badly when recognizing band-limited speech because of environment mismatch. In this paper, we propose a novel model synthesis method for band-limited speech recognition. It detects the speech bandwidth automatically and synthesizes a new acoustic model using only a full-bandwidth model when the bandwidth changes. Experiments conducted on the TIMIT/NTIMIT databases show that the proposed method achieves substantial improvement over the baseline speech recognizer.

2 citations
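The abstract does not detail how the bandwidth is detected; a common and simple heuristic is to compare the spectral energy above a cutoff frequency (for example 4 kHz for telephone-style speech) with the total energy. A rough Python sketch under that assumption; the cutoff and threshold values are illustrative:

```python
import numpy as np

def estimate_bandwidth(signal, sample_rate, cutoff_hz=4000.0, ratio_threshold=0.01):
    """Rough bandwidth check: compare the spectral energy above cutoff_hz
    with the total energy and label the signal accordingly."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    high_energy = spectrum[freqs >= cutoff_hz].sum()
    total_energy = spectrum.sum() + 1e-12
    return "band-limited" if high_energy / total_energy < ratio_threshold else "full-bandwidth"
```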

Posted Content
TL;DR: In this paper, the authors propose a method that exploits external text data (for viseme-to-character mapping) by dividing video-to-character conversion into two stages, namely converting video to visemes and then converting visemes to characters, using separate models.
Abstract: Lip-reading is the task of recognizing speech from lip movements. This is difficult because the lip movements for some words are very similar when they are pronounced. Visemes are used to describe lip movements during a conversation. This paper shows how to use external text data (for viseme-to-character mapping) by dividing video-to-character conversion into two stages, namely converting video to visemes and then converting visemes to characters, using separate models. Our proposed method improves word error rate by 4% compared to the standard sequence-to-sequence lip-reading model on the BBC-Oxford Lip Reading Sentences 2 (LRS2) dataset.

2 citations
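The value of external text here is that plain text can be turned into (viseme sequence, character sequence) pairs for training the second stage. A hypothetical Python sketch; the CHAR_TO_VISEME table is purely illustrative, and a real pipeline would go through a pronunciation lexicon and a phoneme-to-viseme table rather than mapping single letters:

```python
# Hypothetical helper that turns external text into (viseme sequence, character
# sequence) training pairs for the second (viseme -> character) stage.
CHAR_TO_VISEME = {
    "a": "V1", "e": "V1", "i": "V2", "o": "V3", "u": "V3",
    "p": "V4", "b": "V4", "m": "V4",
    "f": "V5", "v": "V5",
    "t": "V6", "d": "V6", "s": "V6", "z": "V6",
}

def text_to_training_pair(text):
    """Map a line of external text to a viseme sequence paired with its characters."""
    chars = [c for c in text.lower() if c in CHAR_TO_VISEME]
    visemes = [CHAR_TO_VISEME[c] for c in chars]
    return visemes, chars

# Example: building second-stage training data from plain text
visemes, chars = text_to_training_pair("about a dozen possibilities")
```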


Network Information
Related Topics (5)

Topic                        Papers    Citations    Related
Vocabulary                   44.6K     941.5K       78%
Feature vector               48.8K     954.4K       76%
Feature extraction           111.8K    2.1M         75%
Feature (computer vision)    128.2K    1.7M         74%
Unsupervised learning        22.7K     1M           73%
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    7
2022    12
2021    13
2020    39
2019    19
2018    22