Home
/
Topics
/
Audio signal processing

Topic

Audio signal processing

About: Audio signal processing is a research topic. Over the lifetime, 21463 publications have been published within this topic receiving 319597 citations. The topic is also known as: audio processing & Acoustic signal processing.

...read moreread less

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2004
2003
2002
2001
2000
1999
1998
1997
1996
1995
1994
1993
1992
1991
1990
1989
1988
1987
1986
1985
1984
1983
1982
1981
1980
1979
1978
1977
1976
1975
1974
1973
1972
1971
1970
1969

1 / 2

Papers

PDF

Open Access

More filters

Patent•

Universal IP-based and scalable architectures across conversational applications using web services for speech and audio processing resources

[...]

Stephane H. Maes¹, David M. Lubensky¹, Andrzej Sakrajda¹•Institutions (1)

IBM¹

25 Jun 2002

TL;DR: In this paper, the authors present systems and methods for building distributed conversational applications using a Web services-based model where speech engines (e.g., speech recognition) and audio I/O systems are programmable services that can be asynchronously programmed by an application using a standard, extensible SERCP (speech engine remote control protocol), to provide scalable and flexible IP-based architectures that enable deployment of the same application or application development environment across a wide range of voice processing platforms and networks/gateways.

...read moreread less

Abstract: Systems and methods for conversational computing and, in particular, to systems and methods for building distributed conversational applications using a Web services-based model wherein speech engines (e.g., speech recognition) and audio I/O systems are programmable services that can be asynchronously programmed by an application using a standard, extensible SERCP (speech engine remote control protocol), to thereby provide scalable and flexible IP-based architectures that enable deployment of the same application or application development environment across a wide range of voice processing platforms and networks/gateways (e.g., PSTN (public switched telephone network), Wireless, Internet, and VoIP (voice over IP)). Systems and methods are further provided for dynamically allocating, assigning, configuring and controlling speech resources such as speech engines, speech pre/post processing systems, audio subsystems, and exchanges between speech engines using SERCP in a web service-based framework.

...read moreread less

619 citations

Book•

Digital processing of signals

[...]

Bernard Gold, Charles M. Rader

01 Jan 1969

612 citations

Journal Article•DOI•

Content analysis for audio classification and segmentation

[...]

Lie Lu¹, Hong-Jiang Zhang¹, Hao Jiang¹•Institutions (1)

Microsoft¹

10 Dec 2002-IEEE Transactions on Speech and Audio Processing

TL;DR: A robust approach that is capable of classifying and segmenting an audio stream into speech, music, environment sound, and silence is proposed, and an unsupervised speaker segmentation algorithm using a novel scheme based on quasi-GMM and LSP correlation analysis is developed.

...read moreread less

Abstract: We present our study of audio content analysis for classification and segmentation, in which an audio stream is segmented according to audio type or speaker identity. We propose a robust approach that is capable of classifying and segmenting an audio stream into speech, music, environment sound, and silence. Audio classification is processed in two steps, which makes it suitable for different applications. The first step of the classification is speech and nonspeech discrimination. In this step, a novel algorithm based on K-nearest-neighbor (KNN) and linear spectral pairs-vector quantization (LSP-VQ) is developed. The second step further divides nonspeech class into music, environment sounds, and silence with a rule-based classification scheme. A set of new features such as the noise frame ratio and band periodicity are introduced and discussed in detail. We also develop an unsupervised speaker segmentation algorithm using a novel scheme based on quasi-GMM and LSP correlation analysis. Without a priori knowledge, this algorithm can support the open-set speaker, online speaker modeling and real time segmentation. Experimental results indicate that the proposed algorithms can produce very satisfactory results.

...read moreread less

559 citations

Patent•

Apparatus and methods for including codes in audio signals and decoding

[...]

James M. Jensen, Lynch Wendell D, Perelshteyn Michael M, Robert B Graybill, Hassan Sayed, Sabin Wayne - Show less +2 more

27 Mar 1995

TL;DR: A code frequency component in the encoded audio signal is detected based on an expected code amplitude or on a noise amplitude within a range of audio frequencies including the frequency of the code component as discussed by the authors.

...read moreread less

Abstract: Apparatus and methods for including a code (68) having at least one code frequency component in an audio signal (60) are provided. The abilities of various frequency components in the audio signal to mask the code frequency component to human hearing are evaluated (64), and based on these evaluations an amplitude (76) is assigned to the code frequency component. Methods and apparatus for detecting a code in an encoded audio signal are also provided. A code frequency component in the encoded audio signal is detected based on an expected code amplitude or on a noise amplitude within a range of audio frequencies including the frequency of the code component.

...read moreread less

554 citations

Journal Article•DOI•

Multimedia content analysis-using both audio and visual clues

[...]

Yao Wang, Zhu Liu¹, Jincheng Huang²•Institutions (2)

New York University¹, Tsinghua University²

01 Nov 2000-IEEE Signal Processing Magazine

TL;DR: This work describes audio and visual features that can effectively characterize scene content, present selected algorithms for segmentation and classification, and review some testbed systems for video archiving and retrieval.

...read moreread less

Abstract: Multimedia content analysis refers to the computerized understanding of the semantic meanings of a multimedia document, such as a video sequence with an accompanying audio track. With a multimedia document, its semantics are embedded in multiple forms that are usually complimentary of each other, Therefore, it is necessary to analyze all types of data: image frames, sound tracks, texts that can be extracted from image frames, and spoken words that can be deciphered from the audio track. This usually involves segmenting the document into semantically meaningful units, classifying each unit into a predefined scene type, and indexing and summarizing the document for efficient retrieval and browsing. We review advances in using audio and visual information jointly for accomplishing the above tasks. We describe audio and visual features that can effectively characterize scene content, present selected algorithms for segmentation and classification, and review some testbed systems for video archiving and retrieval. We also describe audio and visual descriptors and description schemes that are being considered by the MPEG-7 standard for multimedia content description.

...read moreread less

552 citations

…
1
2
3
4
5
6
7
…
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse

Network Information

Performance

Metrics

21,541

Papers

328,867

Citations

No. of papers in the topic in previous years
Year	Papers
2023	19
2022	63
2021	217
2020	525
2019	659
2018	597

Audio signal processing

Papers published on a yearly basis

Papers

Trending Questions (10)

Network Information

Related Topics (5)

Performance

Metrics