Speech coding

About: Speech coding is a research topic. Over its lifetime, 14,245 publications have been published within this topic, receiving 271,964 citations.

Papers published on a yearly basis (chart; x-axis: year, 1969 to 2023)

Papers

Patent

Method for generating personalized speech from text

Donald T. Tang¹, Ligin Shen¹, Qin Shi¹, Wei Zhang¹

IBM¹

05 Apr 2002

TL;DR: A method for generating personalized speech from text analyzes the input text to obtain standard speech parameters from a standard text-to-speech database, then maps them to personalized speech parameters for synthesis.

Abstract: A method for generating personalized speech from text includes the steps of analyzing the input text to get standard parameters of the speech to be synthesized from a standard text-to-speech database; mapping the standard speech parameters to the personalized speech parameters via a personalization model obtained in a training process; and synthesizing speech of the input text based on the personalized speech parameters. The method can be used to simulate the speech of the target person so as to make the speech produced by a TTS system more attractive and personalized.

127 citations

Proceedings Article

Embedded coding of speech: A vector quantization approach

A. Haoui¹, David G. Messerschmitt

University of California, Berkeley¹

01 Apr 1985

TL;DR: Design methods for vector quantizers with the embedded coding property are presented and their performance simulated for the medium-band of 32-8 Kbits/sec.

Abstract: Embedded speech coders are characterized by the property that their output quality degrades gracefully as their bit rate is decreased. Design methods for vector quantizers with the embedded coding property are presented and their performance simulated for the medium-band of 32-8 Kbits/sec. Listening tests indicate that these coders can provide good quality speech at 32 and 24 Kbits/sec and intelligible speech down to 8 Kbits/sec.

127 citations
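
The embedded coding property described in the abstract above can be illustrated with a multistage (residual) vector quantizer, in which each stage codes the error left by the previous one and the decoder uses however many stage indices it receives. This is only a minimal sketch of one common way to obtain graceful degradation; it is not claimed to be the design method of the paper, and the `grid` codebooks, their sizes, and the test vector are hypothetical.

```python
import numpy as np

def nearest(codebook, x):
    """Index of the codeword closest to x in squared Euclidean distance."""
    return int(np.argmin(np.sum((codebook - x) ** 2, axis=1)))

def encode(x, codebooks):
    """Quantize x stage by stage; each stage codes the previous residual."""
    indices, residual = [], np.asarray(x, dtype=float)
    for cb in codebooks:
        i = nearest(cb, residual)
        indices.append(i)
        residual = residual - cb[i]
    return indices

def decode(indices, codebooks):
    """Reconstruct from however many stage indices were received."""
    out = np.zeros(codebooks[0].shape[1])
    for i, cb in zip(indices, codebooks):
        out = out + cb[i]
    return out

def grid(step):
    """Hypothetical 16-entry (4-bit) 2-D codebook on a uniform grid."""
    axis = np.array([-1.5, -0.5, 0.5, 1.5]) * step
    return np.array([[a, b] for a in axis for b in axis])

# Coarse stage first, then progressively finer refinement stages.
codebooks = [grid(1.0), grid(0.25), grid(0.0625)]
x = np.array([0.8, -1.1])
indices = encode(x, codebooks)

# Dropping trailing indices lowers the bit rate (12, 8, or 4 bits here).
errors = [float(np.linalg.norm(x - decode(indices[:k], codebooks)))
          for k in (1, 2, 3)]
```

In this toy case the reconstruction error shrinks as more stage indices are kept, so a channel that discards the trailing indices still yields a usable, coarser reconstruction.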

Journal Article

Steganography in Inactive Frames of VoIP Streams Encoded by Source Codec

Yongfeng Huang¹, Shanyu Tang², Jian Yuan¹

Tsinghua University¹, London Metropolitan University²

01 Jun 2011, IEEE Transactions on Information Forensics and Security

TL;DR: It is revealed that, contrary to existing thought, the inactive frames of VoIP streams are more suitable for data embedding than the active frames of the streams; that is, steganography in the inactive audio frames attains a larger data embedding capacity than that in the active audio frames under the same imperceptibility.

Abstract: This paper describes a novel high-capacity steganography algorithm for embedding data in the inactive frames of low bit rate audio streams encoded by the G.723.1 source codec, which is used extensively in Voice over Internet Protocol (VoIP). This study reveals that, contrary to existing thought, the inactive frames of VoIP streams are more suitable for data embedding than the active frames of the streams; that is, steganography in the inactive audio frames attains a larger data embedding capacity than that in the active audio frames under the same imperceptibility. By analyzing the concealment of steganography in the inactive frames of low bit rate audio streams encoded by the G.723.1 codec at 6.3 kb/s, the authors propose a new algorithm for steganography in different speech parameters of the inactive frames. Performance evaluation shows that embedding data in various speech parameters led to different levels of concealment. An improved voice activity detection algorithm is suggested for detecting inactive audio frames, taking packet loss into account. Experimental results show that the proposed steganography algorithm not only achieved perfect imperceptibility but also gained a high data embedding rate of up to 101 bits/frame, indicating that the data embedding capacity of the proposed algorithm is much larger than those of previously suggested algorithms.

127 citations
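
The core idea in the abstract above, hiding data only in frames that a voice activity detector marks as inactive, can be sketched as follows. This is a simplified illustration under stated assumptions: frames are treated as opaque byte strings rather than real G.723.1 parameter fields, the message is written into the least significant bit of each byte, and the activity flags are assumed to be supplied by some external detector that sender and receiver agree on. None of this reproduces the paper's actual parameter selection or its improved detector.

```python
def embed(frames, active_flags, bits):
    """Write message bits into the LSB of each byte of inactive frames only."""
    bits = list(bits)
    out = []
    for frame, active in zip(frames, active_flags):
        data = bytearray(frame)
        if not active:
            for k in range(len(data)):
                if not bits:
                    break
                # Clear the lowest bit, then set it to the next message bit.
                data[k] = (data[k] & 0xFE) | bits.pop(0)
        out.append(bytes(data))
    if bits:
        raise ValueError("message does not fit in the inactive frames")
    return out

def extract(frames, active_flags, n_bits):
    """Read back n_bits from the LSBs of inactive frames, in order."""
    bits = []
    for frame, active in zip(frames, active_flags):
        if not active:
            for byte in frame:
                if len(bits) == n_bits:
                    return bits
                bits.append(byte & 1)
    return bits

# Hypothetical three-frame stream; only the first frame carries speech.
frames = [bytes([10, 20, 30]), bytes([1, 2, 3]), bytes([7, 7, 7])]
active_flags = [True, False, False]
message = [1, 0, 1, 1]

stego = embed(frames, active_flags, message)
recovered = extract(stego, active_flags, len(message))
```

Active frames pass through unchanged, so any distortion is confined to frames that carry no speech. A real system would also need both ends to agree on the activity decision even when packets are lost, which is the problem the improved detector in the paper addresses.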

Proceedings Article

CELP Coding for high-quality speech at 8 kbit/s

M. Copperi¹, D. Sereno

CSELT¹

01 Apr 1986

TL;DR: A new speech coding technique at low bit-rate is presented, which splits the incoming speech signal into two frequency bands in order to gain the benefits of the piecewise LP (Linear Prediction) approximation.

Abstract: A new speech coding technique at low bit-rate is presented in this paper. The coder is based upon a novel speech production model, independently developed by the authors [1,2] and by Atal and Schroeder [3,4], called CELP (Codebook Excited Linear Prediction). Differences exist between the two approaches, both in the strategy chosen to construct codebooks, and in the method to generate the innovation sequence. In this scheme, we split the incoming speech signal into two frequency bands in order to gain the benefits of the piecewise LP (Linear Prediction) approximation. Then, each residual signal is coded in blocks of 5-ms duration through an adaptive vector quantizer incorporating a noise shaping filter. Our results show that good quality speech can be obtained at 8 kbit/s.

127 citations
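
The codebook-excited scheme named in the abstract above rests on an analysis-by-synthesis search: each candidate excitation vector is passed through the LP synthesis filter, scaled by its optimal gain, and compared with the target block. The sketch below shows only that generic search. It omits the two-band split, the adaptive codebook, and the noise shaping filter described in the abstract, and the predictor coefficient, codebook size, random codebook, and 8 kHz sampling rate are all assumptions chosen for illustration.

```python
import numpy as np

def synthesize(excitation, lpc):
    """All-pole LP synthesis: y[n] = e[n] - sum_k lpc[k] * y[n-1-k]."""
    y = np.zeros(len(excitation))
    for n in range(len(excitation)):
        acc = excitation[n]
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc -= a * y[n - 1 - k]
        y[n] = acc
    return y

def search(target, codebook, lpc):
    """Return (index, gain) of the codevector whose output best matches target."""
    best = (None, 0.0, np.inf)
    for i, c in enumerate(codebook):
        y = synthesize(c, lpc)
        energy = float(y @ y)
        if energy == 0.0:
            continue
        gain = float(target @ y) / energy          # least-squares optimal gain
        err = float(np.sum((target - gain * y) ** 2))
        if err < best[2]:
            best = (i, gain, err)
    return best[0], best[1]

rng = np.random.default_rng(0)
lpc = np.array([-0.9])                    # hypothetical A(z) = 1 - 0.9 z^-1
codebook = rng.standard_normal((64, 40))  # 40 samples = 5 ms at an assumed 8 kHz
target = 2.0 * synthesize(codebook[17], lpc)
index, gain = search(target, codebook, lpc)
```

Because the target here is built from codevector 17 scaled by 2, the search should recover index 17 with a gain of about 2.0. With real speech the match is never exact, and a perceptual weighting of the error would normally decide which candidate wins.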

Proceedings Article

Low bit rate high quality audio coding with combined harmonic and wavelet representations

K.N. Hamdy¹, Murtaza Ali¹, A.H. Tewfik

University of Minnesota¹

07 May 1996

TL;DR: A novel high quality audio coding method using adaptive signal representation, based on sinusoidal and wavelet analysis of signals, which separates out tones, transients, and broadband noise.

Abstract: We describe a novel high quality audio coding method using adaptive signal representation, based on sinusoidal and wavelet analysis of signals. First, we perform a harmonic analysis of the signal to remove strong periodic structures or tones from the signal. Then we carry out a wavelet analysis, which is useful in tracking the transients of the signal. These transients are then removed from the wavelet coefficients. The remaining coefficients have a broadband noise-like structure. Since this method separates out tones (sinusoids), transients, and broadband noise, we may use tonal, noise, and temporal masking information to individually encode the tones and the wavelet coefficients. Our experiments suggest that this method yields a nominal bit rate of 1 bit/sample for high quality audio compression.

126 citations

Performance

Metrics

Papers: 14,368
Citations: 279,843

No. of papers in the topic in previous years
Year	Papers
2023	38
2022	84
2021	70
2020	62
2019	77
2018	108
