Topic

Viseme

About: Viseme is a research topic. Over its lifetime, 865 publications have been published within this topic, receiving 17,889 citations.


Papers
Proceedings Article
01 Jan 2003
TL;DR: This paper extends prior work in multi-stream modeling by introducing cross-stream observation dependencies and a new discriminative criterion for selecting such dependencies. Experimental results combining short-term PLP features with long-term TRAP features show gains for a multi-stream model with partial state asynchrony over a baseline HMM.
Abstract: This paper extends prior work in multi-stream modeling by introducing cross-stream observation dependencies and a new discriminative criterion for selecting such dependencies. Experimental results combining short-term PLP features with long-term TRAP features show gains associated with a multi-stream model with partial state asynchrony over a baseline HMM. Frame-based analyses show significant discriminant information in the added cross-stream dependencies, but so far there are only small gains in recognition accuracy.
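
As a rough, hypothetical sketch of the multi-stream idea (not the paper's actual implementation), the Python below scores one frame against one HMM state by combining a PLP stream and a TRAP stream log-linearly, with a linear cross-stream term conditioning the TRAP observation on the PLP frame. The state parameters (mu_plp, cov_plp, mu_trap, cov_trap, A) and the stream weights are invented placeholders.

```python
import numpy as np
from scipy.stats import multivariate_normal

def multistream_log_likelihood(o_plp, o_trap, state, w=(0.6, 0.4)):
    """Score one frame against one HMM state using two feature streams.

    Toy stand-in for multi-stream modelling: each stream has its own
    Gaussian observation model, and a cross-stream dependency is added
    by conditioning the TRAP stream's mean on the PLP observation.
    The state parameters and stream weights are invented placeholders.
    """
    # Per-stream Gaussian for the PLP observation.
    ll_plp = multivariate_normal.logpdf(o_plp, state["mu_plp"], state["cov_plp"])
    # Cross-stream dependency: TRAP mean shifted by a linear map of the PLP frame.
    mu_trap = state["mu_trap"] + state["A"] @ np.asarray(o_plp)
    ll_trap = multivariate_normal.logpdf(o_trap, mu_trap, state["cov_trap"])
    # Log-linear combination with per-stream weights (partial state
    # asynchrony, handled at the model level in the paper, is not shown here).
    return w[0] * ll_plp + w[1] * ll_trap
```

In a full recognizer this score would take the place of the single-stream emission probability inside standard HMM decoding.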

9 citations

Journal ArticleDOI
TL;DR: A structured approach to creating speaker-dependent visemes with a fixed number of visemes within each set, based upon clustering phonemes, which significantly improves on previous lipreading results with RMAV speakers.
Abstract: Lipreading is understanding speech from observed lip movements. An observed series of lip motions is an ordered sequence of visual lip gestures. These gestures are commonly known as 'visemes', though they are not yet formally defined. In this article, we describe a structured approach which allows us to create speaker-dependent visemes with a fixed number of visemes within each set. We create sets of visemes for sizes two to 45. Each set of visemes is based upon clustering phonemes, thus each set has a unique phoneme-to-viseme mapping. We first present an experiment using these maps and the Resource Management Audio-Visual (RMAV) dataset which shows the effect of changing the viseme map size in speaker-dependent machine lipreading and demonstrates that word recognition with phoneme classifiers is possible. Furthermore, we show that there are intermediate units between visemes and phonemes which are better still. Second, we present a novel two-pass training scheme for phoneme classifiers. This approach uses the new intermediary visual units from our first experiment as classifiers in the first pass; we then use the phoneme-to-viseme maps to retrain these into phoneme classifiers. This method significantly improves on previous lipreading results with RMAV speakers.
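
The clustering step can be sketched briefly. Assuming visual confusability between phonemes is summarized in a phoneme confusion matrix (an assumption; the paper's speaker-dependent procedure differs in detail), hierarchical clustering cut at a fixed number of clusters yields one phoneme-to-viseme map per set size:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def phoneme_to_viseme_map(confusion, phonemes, n_visemes):
    """Cluster phonemes into exactly `n_visemes` viseme classes.

    `confusion` is a phoneme confusion matrix (higher = more visually
    confusable); it stands in for whatever speaker-dependent similarity
    the real procedure uses. Returns a dict: phoneme -> viseme label.
    """
    sim = np.asarray(confusion, dtype=float)
    sim = sim / sim.max()
    # Confusable phonemes should end up close together, so invert similarity.
    dist = 1.0 - 0.5 * (sim + sim.T)   # symmetrize, then turn into a distance
    np.fill_diagonal(dist, 0.0)
    # Agglomerative clustering, cut so at most n_visemes clusters remain.
    Z = linkage(squareform(dist, checks=False), method="average")
    labels = fcluster(Z, t=n_visemes, criterion="maxclust")
    return {p: f"V{lab:02d}" for p, lab in zip(phonemes, labels)}
```

Sweeping n_visemes from 2 to 45 over the same matrix would generate the family of phoneme-to-viseme maps the experiment compares.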

9 citations

Proceedings ArticleDOI
19 Apr 2015
TL;DR: This paper introduces a method for automatic redubbing of video that exploits the many-to-many mapping of phoneme sequences to lip movements modelled as dynamic visemes, and explores the natural ambiguity in visual speech.
Abstract: This paper introduces a method for automatic redubbing of video that exploits the many-to-many mapping of phoneme sequences to lip movements modelled as dynamic visemes [1]. For a given utterance, the corresponding dynamic viseme sequence is sampled to construct a graph of possible phoneme sequences that synchronize with the video. When composed with a pronunciation dictionary and language model, this produces a vast number of word sequences that are in sync with the original video, literally putting plausible words into the mouth of the speaker. We demonstrate that traditional, many-to-one, static visemes lack flexibility for this application as they produce significantly fewer word sequences. This work explores the natural ambiguity in visual speech, offering insights for automatic speech recognition and highlighting the importance of language modeling.
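
As a toy sketch of the graph idea, with a hypothetical dv_to_phones mapping (each dynamic viseme maps to several candidate phoneme subsequences, giving the many-to-many structure) and a lexicon of phoneme tuples, the enumeration and dictionary filtering might look like this; ranking with a language model, which the paper also uses, is omitted:

```python
from itertools import product

def candidate_phoneme_sequences(viseme_seq, dv_to_phones, lexicon, limit=1000):
    """Enumerate phoneme strings that synchronize with a dynamic-viseme
    sequence, keeping only those that segment into dictionary words.

    `dv_to_phones` and `lexicon` are toy placeholders, not the paper's models.
    """
    results = []
    # Cartesian product over per-viseme alternatives enumerates the graph paths.
    for choice in product(*(dv_to_phones[v] for v in viseme_seq)):
        phones = tuple(p for segment in choice for p in segment)
        if segments_into_words(phones, lexicon):
            results.append(phones)
            if len(results) >= limit:
                break
    return results

def segments_into_words(phones, lexicon, memo=None):
    """True if the phoneme tuple can be split into a sequence of lexicon words."""
    if memo is None:
        memo = {}
    if not phones:
        return True
    if phones in memo:
        return memo[phones]
    ok = any(phones[:n] in lexicon and segments_into_words(phones[n:], lexicon, memo)
             for n in range(1, len(phones) + 1))
    memo[phones] = ok
    return ok
```

Because each dynamic viseme admits several phoneme realizations, even short viseme sequences yield many word sequences; with many-to-one static visemes the per-viseme alternatives collapse, which is why the paper finds far fewer candidates.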

9 citations

Journal ArticleDOI
10 Jan 2008
TL;DR: This work describes an approach to pose-based interpolation that deals with coarticulation using a constraint-based technique and demonstrates it using a Mexican-Spanish talking head, which can vary its speed of talking and produce coarticulation effects.
Abstract: A common approach to produce visual speech is to interpolate the parameters describing a sequence of mouth shapes, known as visemes, where a viseme corresponds to a phoneme in an utterance. The interpolation process must consider the issue of context-dependent shape, or coarticulation, in order to produce realistic-looking speech. We describe an approach to such pose-based interpolation that deals with coarticulation using a constraint-based technique. This is demonstrated using a Mexican-Spanish talking head, which can vary its speed of talking and produce coarticulation effects.
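
A minimal sketch of pose-based viseme blending, using Gaussian dominance windows as a stand-in for the paper's constraint-based coarticulation (the window width and mouth-shape parameterization are assumptions, not the authors' method):

```python
import numpy as np

def interpolate_visemes(keyframes, times, t, width=0.08):
    """Blend viseme parameter vectors at time `t` (seconds).

    `keyframes` is an (n, d) array of mouth-shape parameters and `times`
    their timestamps. Each viseme contributes through a Gaussian dominance
    window; `width` (an assumed value) controls how far a viseme's
    influence spreads into its neighbours.
    """
    times = np.asarray(times, dtype=float)
    # Dominance of each viseme decays with temporal distance from t.
    w = np.exp(-0.5 * ((t - times) / width) ** 2)
    w = w / (w.sum() + 1e-12)  # normalize; epsilon guards against underflow
    # Overlapping windows blend neighbouring visemes: a simple coarticulation.
    return w @ np.asarray(keyframes, dtype=float)
```

Narrowing `width` sharpens articulation; widening it smears neighbouring visemes into each other, the context-dependent effect that the constraint-based method controls in a more principled way, and scaling `times` changes the speed of talking.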

9 citations


Network Information
Related Topics (5)
Vocabulary
44.6K papers, 941.5K citations
78% related
Feature vector
48.8K papers, 954.4K citations
76% related
Feature extraction
111.8K papers, 2.1M citations
75% related
Feature (computer vision)
128.2K papers, 1.7M citations
74% related
Unsupervised learning
22.7K papers, 1M citations
73% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2023    7
2022    12
2021    13
2020    39
2019    19
2018    22