Speech coding

About: Speech coding is a research topic. Over its lifetime, 14,245 publications have been published within this topic, receiving 271,964 citations.

Papers published on a yearly basis

[Chart omitted: number of papers per year, 1969 to 2023]

Papers

Patent

Method and apparatus for using visual images to mix sound

David A. Gibson

18 Apr 1995-Journal of the Acoustical Society of America

TL;DR: Each audio signal is digitized and then transformed into a predefined visual image displayed in a 3D space; selected audio characteristics, such as frequency, amplitude, time and spatial placement, are correlated to selected visual characteristics of that image.

Abstract: A method and apparatus for mixing audio signals. Each audio signal is digitized and then transformed into a predefined visual image, which is displayed in a three-dimensional space. Selected audio characteristics of the audio signal, such as frequency, amplitude, time and spatial placement, are correlated to selected visual characteristics of the visual image, such as size, location, texture, density and color. Dynamic changes or adjustment to any one of these parameters causes a corresponding change in the correlated parameter.

218 citations

Proceedings Article

Tracking speech-presence uncertainty to improve speech enhancement in non-stationary noise environments

David Malah¹, Richard Vandervoort Cox, A.J. Accardi

Bell Labs¹

15 Mar 1999

TL;DR: Distinct a priori speech-absence probabilities q are estimated for different frequency bins and tracked in time; the estimated q's control both the gain and the update of the estimated noise spectrum in a modified MMSE log-spectral amplitude estimator. In subjective tests, this scored higher than the IS-127 standard enhancement algorithm when pre-processing noisy speech for a coding application.

Abstract: Speech enhancement algorithms which are based on estimating the short-time spectral amplitude of the clean speech have better performance when a soft-decision gain modification, depending on the a priori probability of speech absence, is used. In reported works a fixed probability, q, is assumed. Since speech is non-stationary and may not be present in every frequency bin when voiced, we propose a method for estimating distinct values of q for different bins which are tracked in time. The estimation is based on a decision-theoretic approach for setting a threshold in each bin followed by short-time averaging. The estimated q's are used to control both the gain and the update of the estimated noise spectrum during speech presence in a modified MMSE log-spectral amplitude estimator. Subjective tests resulted in higher scores than for the IS-127 standard enhancement algorithm, when pre-processing noisy speech for a coding application.

217 citations
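
The per-bin tracking described in the abstract above (a threshold test in each bin followed by short-time averaging) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `track_speech_absence`, the fixed `threshold`, the `smoothing` constant and the initial value `q_init` are all assumptions chosen for illustration, and the paper's decision-theoretic derivation of the threshold is not reproduced here.

```python
def track_speech_absence(post_snr_frames, threshold=0.8, smoothing=0.95, q_init=0.5):
    """Track a per-bin speech-absence probability q over time.

    post_snr_frames: list of frames; each frame is a list of per-bin
    a posteriori SNR values (noisy power / estimated noise power).
    threshold, smoothing, q_init: illustrative values, not from the paper.
    Returns one list of per-bin q values for every input frame.
    """
    if not post_snr_frames:
        return []
    num_bins = len(post_snr_frames[0])
    q = [q_init] * num_bins
    history = []
    for frame in post_snr_frames:
        for k, snr in enumerate(frame):
            # Hard decision per bin: 1.0 means "speech judged absent".
            absent = 1.0 if snr < threshold else 0.0
            # Short-time (recursive) averaging of the hard decisions.
            q[k] = smoothing * q[k] + (1.0 - smoothing) * absent
        history.append(list(q))
    return history


# Bin 0 always carries strong speech, bin 1 only noise (hypothetical data).
frames = [[10.0, 0.1]] * 20
q_final = track_speech_absence(frames, smoothing=0.5)[-1]
```

In this toy run `q_final[0]` decays toward 0 and `q_final[1]` rises toward 1, which is the behavior a soft-decision gain would then exploit.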

Journal Article

Rate-distortion speech coding with a minimum discrimination information distortion measure

Robert M. Gray, A. Gray, G. Rebolledo¹, John E. Shore²

Stanford University¹, United States Naval Research Laboratory²

01 Nov 1981-IEEE Transactions on Information Theory

TL;DR: An information theory approach to the theory and practice of linear predictive coded speech compression systems is developed and it is shown that a traditional LPC system can be viewed as a minimum distortion or nearest-neighbor system where the distortion measure is a minimum discrimination information between a speech process model and an observed frame of actual speech.

Abstract: An information theory approach to the theory and practice of linear predictive coded (LPC) speech compression systems is developed. It is shown that a traditional LPC system can be viewed as a minimum distortion or nearest-neighbor system where the distortion measure is a minimum discrimination information between a speech process model and an observed frame of actual speech. This distortion measure is used in an algorithm for computer-aided design of block source codes subject to a fidelity criterion to obtain a 750-bits/s speech compression system that resembles an LPC system but has a much lower rate, a larger memory requirement, and requires no on-line LPC analysis. Quantitative and informal subjective comparisons are made among our system and LPC systems.

217 citations
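
The nearest-neighbor view of coding in the abstract above can be sketched as a codebook search: each observed frame is replaced by the index of the stored model with the smallest distortion. The sketch below is a rough illustration under stated assumptions, not the paper's system. It swaps in the Itakura-Saito distortion between sampled power spectra as a stand-in for the minimum discrimination information measure, and the function names and the tiny two-entry codebook are hypothetical.

```python
import math


def itakura_saito(observed, model):
    """Mean Itakura-Saito distortion between two sampled power spectra.

    Both arguments are equal-length lists of strictly positive values.
    The result is 0 when the spectra match and positive otherwise.
    """
    total = 0.0
    for p, q in zip(observed, model):
        ratio = p / q
        total += ratio - math.log(ratio) - 1.0
    return total / len(observed)


def nearest_codeword(observed, codebook):
    """Return the index of the codebook spectrum closest to the observed frame."""
    return min(range(len(codebook)),
               key=lambda i: itakura_saito(observed, codebook[i]))


# Hypothetical two-entry codebook: a flat spectrum and a tilted one.
codebook = [[1.0, 1.0, 1.0, 1.0], [4.0, 2.0, 1.0, 0.5]]
index = nearest_codeword([3.8, 2.1, 0.9, 0.55], codebook)
```

Only `index` would be transmitted per frame, so the bit rate is set by the codebook size and the frame rate rather than by an on-line LPC analysis.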

Journal Article

Lossy source coding

Toby Berger¹, Jerry D. Gibson¹

Cornell University¹

01 Oct 1998-IEEE Transactions on Information Theory

TL;DR: This work chronicles the development of rate-distortion theory and provides an overview of its influence on the practice of lossy source coding.

Abstract: Lossy coding of speech, high-quality audio, still images, and video is commonplace today. However, in 1948, few lossy compression systems were in service. Shannon introduced and developed the theory of source coding with a fidelity criterion, also called rate-distortion theory. For the first 25 years of its existence, rate-distortion theory had relatively little impact on the methods and systems actually used to compress real sources. Today, however, rate-distortion theoretic concepts are an important component of many lossy compression techniques and standards. We chronicle the development of rate-distortion theory and provide an overview of its influence on the practice of lossy source coding.

213 citations

Patent

Systems and methods of performing speech recognition with barge-in for use in a bluetooth system

Younan Lu

19 Nov 2007

TL;DR: A speech recognition method with barge-in that starts a synthesis of recorded speech, receives a user speech input signal providing information regarding a user choice, detects an initial portion of that signal, selectively alters the synthesis of recorded speech, and recognizes the user choice.

Abstract: Embodiments of the present invention improve methods of performing speech recognition with barge-in. In one embodiment, the present invention includes a speech recognition method comprising starting a synthesis of recorded speech, receiving a user speech input signal providing information regarding a user choice, detecting an initial portion of the user speech input signal, selectively altering the synthesis of recorded speech, and recognizing the user choice.

213 citations
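
The sequence of steps in the abstract above (start the prompt, detect the initial portion of user speech, alter the prompt, recognize the choice) can be sketched as a simple loop. Everything here is an assumption made for illustration: the function name `run_prompt_with_barge_in`, the energy-threshold detector, the choice of stopping the prompt as the "alteration", and the `recognize` callback are hypothetical and are not taken from the patent.

```python
def run_prompt_with_barge_in(prompt_chunks, input_energy, recognize,
                             energy_threshold=0.5):
    """Play prompt chunks until user speech begins, then stop and recognize.

    prompt_chunks: list of prompt segments to "play" in order.
    input_energy: per-chunk microphone energies aligned with the prompt.
    recognize: callable applied to the energies captured from speech onset.
    Returns (chunks actually played, recognizer result).
    """
    played = []
    for i, chunk in enumerate(prompt_chunks):
        energy = input_energy[i] if i < len(input_energy) else 0.0
        if energy >= energy_threshold:
            # Initial portion of user speech detected: alter the prompt
            # (here simply by stopping it) and hand the input to the recognizer.
            return played, recognize(input_energy[i:])
        played.append(chunk)
    # No barge-in: the whole prompt played; recognize whatever follows it.
    return played, recognize(input_energy[len(prompt_chunks):])


# Toy recognizer that just reports how many energy samples it received.
played, result = run_prompt_with_barge_in(
    ["a", "b", "c", "d"], [0.0, 0.0, 0.9, 0.8, 0.1], recognize=len)
```

A real system could fade or pause the prompt instead of stopping it; stopping is only the simplest alteration to show.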

Performance

Metrics

Papers: 14,368
Citations: 279,843

No. of papers in the topic in previous years
Year	Papers
2023	38
2022	84
2021	70
2020	62
2019	77
2018	108