Speech coding

About: Speech coding is a research topic. Over its lifetime, 14,245 publications have been published within this topic, receiving 271,964 citations.

Papers published on a yearly basis (chart; x-axis years 1969 to 2023)

Papers

Journal Article•DOI•

Detection of Glottal Closure Instants From Speech Signals: A Quantitative Review

Thomas Drugman¹, Mark R. P. Thomas², Jon Gudnason³, Patrick A. Naylor², Thierry Dutoit¹

University of Mons¹, Imperial College London², Reykjavík University³

01 Mar 2012-IEEE Transactions on Audio, Speech, and Language Processing

TL;DR: In this paper, five state-of-the-art GCI detection algorithms are compared using six different databases with contemporaneous electroglottographic recordings as ground truth, and containing many hours of speech by multiple speakers.

Abstract: The pseudo-periodicity of voiced speech can be exploited in several speech processing applications. This requires, however, that the precise locations of the glottal closure instants (GCIs) are available. The focus of this paper is the evaluation of automatic methods for the detection of GCIs directly from the speech waveform. Five state-of-the-art GCI detection algorithms are compared using six different databases with contemporaneous electroglottographic recordings as ground truth, and containing many hours of speech by multiple speakers. The five techniques compared are the Hilbert Envelope-based detection (HE), the Zero Frequency Resonator-based method (ZFR), the Dynamic Programming Phase Slope Algorithm (DYPSA), the Speech Event Detection using the Residual Excitation And a Mean-based Signal (SEDREAMS) and the Yet Another GCI Algorithm (YAGA). The efficacy of these methods is first evaluated on clean speech, both in terms of reliability and accuracy. Their robustness to additive noise and to reverberation is also assessed. A further contribution of the paper is the evaluation of their performance on a concrete application of speech processing: the causal-anticausal decomposition of speech. It is shown that for clean speech, SEDREAMS and YAGA are the best performing techniques, both in terms of identification rate and accuracy. ZFR and SEDREAMS also show a superior robustness to additive noise and reverberation.

241 citations

Journal Article•DOI•

Sparse Representations in Audio and Music: From Coding to Source Separation

Mark D. Plumbley¹, Thomas Blumensath², Laurent Daudet, Rémi Gribonval³, Michael Davies⁴

Queen Mary University of London¹, University of Southampton², French Institute for Research in Computer Science and Automation³, University of Edinburgh⁴

01 Jun 2010

TL;DR: An overview of a number of current and emerging applications of sparse representations in areas from audio coding, audio enhancement and music transcription to blind source separation solutions that can solve the cocktail party problem.

Abstract: Sparse representations have proved a powerful tool in the analysis and processing of audio signals and already lie at the heart of popular coding standards such as MP3 and Dolby AAC. In this paper we give an overview of a number of current and emerging applications of sparse representations in areas from audio coding, audio enhancement and music transcription to blind source separation solutions that can solve the "cocktail party problem." In each case we will show how the prior assumption that the audio signals are approximately sparse in some time-frequency representation allows us to address the associated signal processing task.

239 citations

Patent•

Speech user interface for portable personal devices

Lauren L'Esperance, Alan Schell, Johan Smolders, Erin Hemenway, Piet Verhoeve, Eric Niblack, Mark Goslin

26 Feb 2001

TL;DR: A speech manager interface allows the speech recognition process and the text-to-speech process to be accessed by other application processes in handheld electronic devices such as a personal digital assistant (PDA).

Abstract: A handheld electronic device such as a personal digital assistant (PDA) has multiple application processes. A speech recognition process takes input speech from a user and produces a recognition output representative of the input speech. A text-to-speech process takes output text and produces a representative speech output. A speech manager interface allows the speech recognition process and the text-to-speech process to be accessed by other application processes.

239 citations

Journal Article•DOI•

Digital representations of speech signals

R. Schafer¹, Lawrence R. Rabiner

Georgia Institute of Technology¹

01 Apr 1975

TL;DR: This paper presents several digital signal processing methods for representing speech, including simple waveform coding methods; time domain techniques; frequency domain representations; nonlinear or homomorphic methods; and finally linear predictive coding techniques.

Abstract: This paper presents several digital signal processing methods for representing speech. Included among the representations are simple waveform coding methods; time domain techniques; frequency domain representations; nonlinear or homomorphic methods; and finally linear predictive coding techniques. The advantages and disadvantages of each of these representations for various speech processing applications are discussed.

238 citations

Journal Article•DOI•

Design and description of CS-ACELP: a toll quality 8 kb/s speech coder

R. Salami¹, Claude Laflamme, J.-P. Adoul, A. Kataoka, S. Hayashi, Takehiro Moriya, C. Lamblin, D. Massaloux, S. Proust, P. Kroon, Y. Shoham

Université de Sherbrooke¹

01 Mar 1998-IEEE Transactions on Speech and Audio Processing

TL;DR: The coder structure is described in detail, the reasons behind certain design choices are discussed, and a summary of the subjective test results based on a real-time implementation of the 16-bit fixed-point version is presented.

Abstract: This paper describes the 8 kb/s speech coding algorithm G.729, which has been standardized by ITU-T. The algorithm is based on a conjugate-structure algebraic CELP (CS-ACELP) coding technique and uses 10 ms speech frames. The codec delivers toll-quality speech (equivalent to 32 kb/s ADPCM) for most operating conditions. This paper describes the coder structure in detail and discusses the reasons behind certain design choices. A 16-b fixed-point version has been developed as part of Recommendation G.729, and a summary of the subjective test results based on a real-time implementation of this version is presented.

236 citations

Performance

Metrics

Papers: 14,368

Citations: 279,843

No. of papers in the topic in previous years
Year	Papers
2023	38
2022	84
2021	70
2020	62
2019	77
2018	108
