Topic

Viseme

About: Viseme is a research topic. Over its lifetime, 865 publications have been published within this topic, receiving 17,889 citations.

Papers published on a yearly basis

[Chart: number of papers published per year, 1968 to 2023]

Papers

Measuring and Compensating for the Effects of Speech Rate in Large Vocabulary Continuous Speech Recognition

Matthew A. Siegler

01 Jan 1995

TL;DR: The phone duration percentile, a comparison of measured versus expected phone duration, is shown to be robust with respect to lexical content and consistent with previous findings about the statistics of long-term and short-term speech rate.

Abstract: This report describes a series of experiments that measure speech rate and that attempt to improve speech recognition accuracy for rapidly spoken speech. Descriptions of several measures of speech rate are presented, with their advantages and disadvantages. Speech recognition results obtained using several compensation methods are compared to identify methods by which compensation for the effects of fast speech may yield the greatest improvement in recognition accuracy. Very simple measures of speech rate such as the word rate or phone rate are found to be unsuitable for detection of both long-term and short-term speech rate since they are sensitive to the lexical content of speech. In contrast, the phone duration percentile, a comparison of measured versus expected phone duration, is shown to be robust with respect to lexical content and consistent with previous findings about the statistics of long-term and short-term speech rate. Using this metric, speakers with a speech rate in the top 30% are found to produce a 50 to 150% increase in word error rate. The compensation techniques explored contain modifications to five components of the recognition system: the models of the acoustical characteristics of speech sounds, the models of the HMM state-transition probabilities, the pronunciations of words in the dictionary, the weight with which acoustic and linguistic evidence are combined, and the base phone set. Optimizing the language weight reduced the word error rate of fast speech by 10.3% relative to baseline performance. Adapting the state-transition probabilities to fast speech reduced the word error rate for fast speech by 2.6%. Using one of the modified pronunciation dictionaries reduced the word error rate of fast speech by 2.6%. The other techniques yielded little or no reduction in the word error rate.
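
The abstract describes the phone duration percentile only as a comparison of measured versus expected phone duration; it does not give the exact formula. The sketch below is one plausible reading, not the report's method: each measured duration is ranked against a per-phone reference sample, and the ranks are averaged over an utterance. The function names, the `REFERENCE` table, and all durations are hypothetical.

```python
from bisect import bisect_right

def duration_percentile(measured, reference_durations):
    """Percentage of reference durations that are <= the measured duration."""
    ordered = sorted(reference_durations)
    if not ordered:
        raise ValueError("reference_durations must not be empty")
    return 100.0 * bisect_right(ordered, measured) / len(ordered)

def utterance_rate_score(phones, reference):
    """Mean percentile over (phone, duration) pairs; a low score suggests fast speech."""
    scores = [duration_percentile(d, reference[p]) for p, d in phones]
    return sum(scores) / len(scores)

# Hypothetical reference durations in milliseconds (illustrative only).
REFERENCE = {
    "AE": [60, 80, 90, 100, 120, 140, 160, 180, 200, 220],
    "T": [20, 30, 35, 40, 45, 50, 60, 70, 80, 90],
}

fast = [("AE", 70), ("T", 25)]   # short durations for both phones
slow = [("AE", 190), ("T", 75)]  # long durations for both phones

print(utterance_rate_score(fast, REFERENCE))
print(utterance_rate_score(slow, REFERENCE))
```

Because each duration is compared only with other instances of the same phone, the score does not shift merely because an utterance happens to contain inherently short or long phones, which is the kind of lexical sensitivity the abstract attributes to word rate and phone rate.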

15 citations

Journal Article

Speech analysis by computer.

K. Nakata

01 Jun 1962-Journal of the Radio Research Laboratory

15 citations

Journal Article

About neural-network algorithms application in viseme classification problem with face video in audiovisual speech recognition systems

Andrey V. Savchenko¹, Ya. I. Khokhlova¹

University High School¹

01 Jan 2014-Optical Memory and Neural Networks

TL;DR: A neural network recognition algorithm, based on the phonetic words decoding method and requiring isolated-syllable pronunciation of voice commands, is developed for recognizing phonemes from a speaker's facial expressions in voice-activated control systems.

Abstract: The paper considers phoneme recognition from the facial expressions of a speaker in voice-activated control systems. We have developed a neural network recognition algorithm using the phonetic words decoding method and the requirement for isolated syllable pronunciation of voice commands. The paper presents the experimental results of viseme (facial and lip position corresponding to a particular phoneme) classification of Russian vowels. We show the dependence of the classification accuracy on the classifier used (multilayer feed-forward network, support vector machine, k-nearest neighbor method), the image features (histogram of oriented gradients, eigenvectors, SURF local descriptors) and the type of camera (built-in or Kinect). The best accuracy of speaker-dependent recognition is shown to be 85% for a built-in camera and 96% for Kinect depth maps when the classification is performed with the histogram of oriented gradients and the support vector machine.
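
Of the classifiers the abstract compares, the k-nearest neighbor method is the simplest to illustrate. The following is a minimal sketch of k-NN voting over feature vectors, not the authors' pipeline: the two-dimensional features (imagined here as mouth opening and lip rounding), the vowel labels, and the training points are all hypothetical stand-ins for real image descriptors such as histograms of oriented gradients.

```python
import math
from collections import Counter

def knn_classify(query, examples, k=3):
    """Return the majority label among the k examples closest to the query."""
    nearest = sorted(examples, key=lambda ex: math.dist(query, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical (mouth_opening, lip_rounding) features with vowel labels.
TRAIN = [
    ((0.90, 0.20), "a"), ((0.80, 0.30), "a"), ((0.85, 0.25), "a"),
    ((0.20, 0.90), "u"), ((0.30, 0.80), "u"), ((0.25, 0.85), "u"),
]

print(knn_classify((0.82, 0.28), TRAIN))
print(knn_classify((0.22, 0.88), TRAIN))
```

With real data, the feature vectors would be far higher-dimensional, and the choice of `k` and of the distance function would need to be validated per speaker, since the abstract reports speaker-dependent accuracy.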

15 citations

Journal Article

Speech perception seen through the ear

Christopher J. Darwin¹, John F. Culling¹

University of Sussex¹

01 Dec 1990-Speech Communication

TL;DR: Evidence is presented that listeners deploy both low-level grouping mechanisms and knowledge specific to speech when separating speech from other sounds.

14 citations

Journal Article

Application of Concepts From Cross-Recurrence Analysis in Speech Production: An Overview and Comparison With Other Nonlinear Methods

Leonardo Lancia¹, Susanne Fuchs, Mark Tiede²

Max Planck Society¹, Haskins Laboratories²

16 Oct 2013-Journal of Speech Language and Hearing Research

TL;DR: The aim of this article was to introduce an important tool, cross-recurrence analysis, to speech production applications by showing how it can be adapted to evaluate the similarity of multi-step comparisons.

Abstract: Purpose The aim of this article was to introduce an important tool, cross-recurrence analysis, to speech production applications by showing how it can be adapted to evaluate the similarity of multi...

14 citations


Performance Metrics

Papers: 884
Citations: 19,235

No. of papers in the topic in previous years
Year	Papers
2023	7
2022	12
2021	13
2020	39
2019	19
2018	22
