Topic

Viseme

About: Viseme is a research topic. Over its lifetime, 865 publications have been published within this topic, receiving 17,889 citations.


Papers
01 Jan 2004
TL;DR: AVOZES is the first publicly available audio-video speech data corpus for Australian English. It contains recordings from 20 speakers, and its sequences provide both systematic coverage of the phonemes and visemes of Australian English and some application-driven utterances.
Abstract: The AVOZES data corpus has recently been made publicly available for other interested researchers. It is the first publicly available audio-video speech data corpus for Australian English. It contains recordings from 20 speakers and the sequences provide both a systematic coverage of the phonemes and visemes of Australian English as well as some application-driven utterances. AVOZES is also the first audio-video speech data corpus with stereo-video recordings, which enable a more accurate measurement of geometric facial features.

10 citations

Journal ArticleDOI
TL;DR: A detailed objective evaluation shows that a dynamic viseme-phoneme speech unit, combined with a many-to-many encoder-decoder architecture, models visual co-articulation effectively and significantly outperforms conventional phoneme-driven speech animation systems.

10 citations

Proceedings ArticleDOI
29 Mar 2018
TL;DR: This paper presents a detailed study of the machine learning approach for the real-time visual recognition of spoken words and nine different classifiers have been implemented and tested, reporting their confusion matrices among different groups of words.
Abstract: Lipreading is the process of interpreting spoken words by observing lip movement. It plays a vital role in human communication and speech understanding, especially for hearing-impaired individuals. Automated lipreading approaches have recently been used in such applications as biometric identification, silent dictation, forensic analysis of surveillance camera capture, and communication with autonomous vehicles. However, lipreading is a difficult process that poses several challenges to human- and machine-based approaches alike. This is due to the large number of phonemes in human language being visually represented by a smaller number of lip movements (visemes). Consequently, the same viseme may represent several phonemes, which confuses any lipreader. In this paper, we present a detailed study of the machine learning approach for the real-time visual recognition of spoken words. Our focus on real-time performance is motivated by the recent trend of using lipreading in autonomous vehicles. Nine different classifiers were implemented and tested, and their confusion matrices among different groups of words are reported. The three best-performing classifiers were GradientBoosting, Support Vector Machine (SVM), and logistic regression, with accuracies of 64.7%, 63.5%, and 59.4%, respectively.
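The classifier comparison described in this abstract can be sketched with scikit-learn. This is an illustrative sketch only: the synthetic dataset, feature dimensions, and five word classes below are placeholders standing in for the paper's lip-movement features, so the accuracies will not match the reported 64.7%/63.5%/59.4%.

```python
# Hedged sketch: synthetic data stands in for real viseme features.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic "lip-movement feature" dataset: 500 samples, 20 features,
# 5 word classes (all sizes are illustrative assumptions).
X, y = make_classification(n_samples=500, n_features=20, n_informative=10,
                           n_classes=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The three best-performing classifier families named in the abstract.
classifiers = {
    "GradientBoosting": GradientBoostingClassifier(random_state=0),
    "SVM": SVC(),
    "LogisticRegression": LogisticRegression(max_iter=1000),
}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    pred = clf.predict(X_test)
    print(name, round(accuracy_score(y_test, pred), 3))
    # Rows are true classes, columns predicted classes, as in the
    # per-word-group confusion matrices the paper reports.
    print(confusion_matrix(y_test, pred))
```

The same loop extends naturally to the other six classifiers the paper tested; only the `classifiers` dict changes.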

10 citations

Proceedings ArticleDOI
26 Sep 2010
TL;DR: An automated method predicts the word accuracy of a speech recognition system for non-native speech, in the context of speaking proficiency scoring; fluency features showed promising performance by themselves and improved overall performance in tandem with other, more traditional features.
Abstract: We have developed an automated method that predicts the word accuracy of a speech recognition system for non-native speech, in the context of speaking proficiency scoring. A model was trained using features based on speech recognizer scores, function word distributions, prosody, background noise, and speaking fluency. Since the method was implemented for non-native speech, fluency features, which have been used for non-native speakers’ proficiency scoring, were implemented along with several feature groups used from past research. The fluency features showed promising performance by themselves, and improved the overall performance in tandem with other more traditional features. A model using stepwise regression achieved a correlation with word accuracy rates of 0.76, compared to a baseline of 0.63 using only confidence scores. A binary classifier for placing utterances in high-or low-word accuracy bins achieved an accuracy of 84%, compared to a majority class baseline of 64%.
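The two evaluation setups in this abstract, regression against word accuracy and binary high/low-accuracy binning, can be sketched as follows. Everything here is a hedged illustration: the three feature columns (a confidence-style score and two fluency-style measures) and the simulated accuracies are invented stand-ins for the paper's recognizer, prosody, and fluency features, so the reported 0.76 correlation and 84% binning accuracy will not be reproduced.

```python
# Hedged sketch: simulated features and word accuracies, not the paper's data.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 200
# Hypothetical feature columns: confidence score, speech rate, pause ratio.
features = rng.normal(size=(n, 3))
# Simulated word accuracy, partly explained by the features plus noise.
word_acc = 0.6 + 0.1 * features[:, 0] + 0.05 * features[:, 1] \
    + rng.normal(0.0, 0.05, n)
word_acc = np.clip(word_acc, 0.0, 1.0)

# Regression setup: correlate predicted with actual word accuracy
# (the paper used stepwise regression; plain least squares here).
model = LinearRegression().fit(features, word_acc)
pred = model.predict(features)
r = np.corrcoef(pred, word_acc)[0, 1]
print("correlation:", round(r, 2))

# Binary setup: place utterances in high- or low-accuracy bins
# (median split here; the paper's bin threshold is not given).
high = word_acc >= np.median(word_acc)
bin_acc = np.mean((pred >= np.median(pred)) == high)
print("binning accuracy:", round(bin_acc, 2))
```

A held-out split and the full feature groups (recognizer scores, function-word distributions, prosody, noise, fluency) would be needed to approximate the paper's actual numbers.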

10 citations


Network Information
Related Topics (5)
Vocabulary — 44.6K papers, 941.5K citations — 78% related
Feature vector — 48.8K papers, 954.4K citations — 76% related
Feature extraction — 111.8K papers, 2.1M citations — 75% related
Feature (computer vision) — 128.2K papers, 1.7M citations — 74% related
Unsupervised learning — 22.7K papers, 1M citations — 73% related
Performance Metrics
No. of papers in the topic in previous years:

Year  Papers
2023  7
2022  12
2021  13
2020  39
2019  19
2018  22