Topic

Audio signal processing

About: Audio signal processing is a research topic. Over its lifetime, 21,463 publications have been published within this topic, receiving 319,597 citations. The topic is also known as audio processing and acoustic signal processing.


Papers
Proceedings Article
18 Nov 2011
TL;DR: This work proposes an approach to detecting and modeling acoustic events that directly describes temporal context, using convolutive non-negative matrix factorization (NMF), and discovers a set of spectro-temporal patch bases that best describe the data.
Abstract: Automatic detection of different types of acoustic events is an interesting problem in soundtrack processing. Typical approaches to the problem use short-term spectral features to describe the audio signal, with additional modeling on top to take temporal context into account. We propose an approach to detecting and modeling acoustic events that directly describes temporal context, using convolutive non-negative matrix factorization (NMF). NMF is useful for finding parts-based decompositions of data; here it is used to discover a set of spectro-temporal patch bases that best describe the data, with the patches corresponding to event-like structures. We derive features from the activations of these patch bases, and perform event detection on a database consisting of 16 classes of meeting-room acoustic events. We compare our approach with a baseline using standard short-term mel frequency cepstral coefficient (MFCC) features. We demonstrate that the event-based system is more robust in the presence of added noise than the MFCC-based system, and that a combination of the two systems performs even better than either individually.

134 citations
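
To make the factorization idea in the abstract above concrete, here is a minimal sketch of plain (non-convolutive) NMF with multiplicative updates. This is a simplification: the paper uses convolutive NMF, whose bases span several frames, whereas this sketch factors each frame independently. The function name `nmf`, the toy matrix `V`, and all parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def nmf(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (freq x frames) as W @ H.

    Uses the standard multiplicative updates for the squared-error
    objective. W holds spectral bases, H holds their activations.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy "spectrogram": an exactly rank-3 non-negative matrix (hypothetical data).
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 50))

W, H = nmf(V, rank=3)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

In a detection setting along the lines the abstract describes, the rows of `H` (activations over time) would be the starting point for features; how the paper derives its features from them is not specified here.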

Journal Article
TL;DR: A novel approach to applying text-based information retrieval techniques to music collections that represents tracks with a joint vocabulary consisting of both conventional words, drawn from social tags, and audio muswords, representing characteristics of automatically-identified regions of interest within the signal.
Abstract: In this paper we describe a novel approach to applying text-based information retrieval techniques to music collections. We represent tracks with a joint vocabulary consisting of both conventional words, drawn from social tags, and audio muswords, representing characteristics of automatically-identified regions of interest within the signal. We build vector space and latent aspect models indexing words and muswords for a collection of tracks, and show experimentally that retrieval with these models is extremely well-behaved. We find in particular that retrieval performance remains good for tracks by artists unseen by our models in training, and even if tags for their tracks are extremely sparse.

134 citations
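
The joint-vocabulary idea in the abstract above can be sketched with a small vector space model in which tag words and audio "muswords" are indexed side by side. Everything below is a hypothetical illustration: the track names, the tags, the `mw_NN` musword identifiers, and the particular TF-IDF weighting are assumptions, and the paper's latent aspect models are not covered.

```python
import math
from collections import Counter

# Hypothetical tracks: social-tag words and audio muswords share one vocabulary.
tracks = {
    "track_a": ["rock", "guitar", "mw_12", "mw_40"],
    "track_b": ["jazz", "piano", "mw_07", "mw_12"],
    "track_c": ["rock", "mw_40", "mw_41"],
}

def build_index(docs):
    """Return per-track TF-IDF vectors and the IDF table."""
    n_docs = len(docs)
    doc_freq = Counter(t for terms in docs.values() for t in set(terms))
    idf = {t: math.log(n_docs / df) + 1.0 for t, df in doc_freq.items()}
    vectors = {
        name: {t: count * idf[t] for t, count in Counter(terms).items()}
        for name, terms in docs.items()
    }
    return vectors, idf

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def search(query_terms, vectors, idf):
    """Rank track names by similarity to the query, best match first."""
    query = {t: c * idf.get(t, 0.0) for t, c in Counter(query_terms).items()}
    return sorted(vectors, key=lambda name: cosine(query, vectors[name]), reverse=True)

vectors, idf = build_index(tracks)
# A query may mix a tag word with a musword.
ranking = search(["rock", "mw_40"], vectors, idf)
```

Because muswords come from the audio itself, a track with few or no tags still has terms to match against, which is consistent with the abstract's remark about sparse tags.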

Patent
30 Jan 2008
TL;DR: A signal processing apparatus in which two decimation processing sections convert a digital signal from a first form to a second form and then to a third form; the third-form signal is processed and interpolated back to the second form, then combined with a separately processed second-form signal.
Abstract: Disclosed herein is a signal processing apparatus including: a first decimation processing section for generating, based on a digital signal in a first form, a digital signal in a second form; a second decimation processing section for generating, based on the digital signal in the second form, a digital signal in a third form; a first signal processing section for processing the digital signal in the third form; an interpolation processing section for converting a digital signal in the third form outputted from the first signal processing section into a digital signal in the second form; a second signal processing section for processing the digital signal in the second form outputted from the first decimation processing section; and a combining section for combining the digital signals in the second form outputted from the interpolation processing section and the second signal processing section.

134 citations
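
The signal flow in the abstract above can be sketched as a two-path multirate chain. This is only one reading of the patent: it assumes that each "form" is a sample rate, that each decimation halves the rate, and that each "signal processing section" is a plain gain. The function names, the averaging filter, and the gain values are all hypothetical.

```python
import numpy as np

def decimate2(x):
    """Halve the sample rate: average each pair of samples (a crude low-pass)."""
    x = np.asarray(x, dtype=float)
    x = x[: len(x) - len(x) % 2]  # drop a trailing odd sample
    return 0.5 * (x[0::2] + x[1::2])

def interpolate2(x):
    """Double the sample rate by repeating each sample (zero-order hold)."""
    return np.repeat(np.asarray(x, dtype=float), 2)

def two_path_process(first, low_gain=0.5, high_gain=0.5):
    second = decimate2(first)                      # first decimation section
    third = decimate2(second)                      # second decimation section
    processed_third = low_gain * third             # first signal processing section
    back_to_second = interpolate2(processed_third) # interpolation section
    # Second signal processing section, trimmed to match the interpolated length.
    processed_second = high_gain * second[: len(back_to_second)]
    return back_to_second + processed_second       # combining section

combined = two_path_process(np.ones(16))
```

For a constant input the two paths each contribute half of the signal, so `combined` is a constant sequence at the second-form rate.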

Patent
06 Jul 2009
TL;DR: In this paper, an apparatus for generating at least one audio output signal representing a superposition of at least two different audio objects comprises a processor for processing an audio input signal to provide an object representation of the input signal, where this object representation can be generated by a parametrically guided approximation of original objects using an object downmix signal.
Abstract: An apparatus for generating at least one audio output signal representing a superposition of at least two different audio objects comprises a processor for processing an audio input signal to provide an object representation of the audio input signal, where this object representation can be generated by a parametrically guided approximation of original objects using an object downmix signal. An object manipulator individually manipulates objects using audio object based metadata referring to the individual audio objects to obtain manipulated audio objects. The manipulated audio objects are mixed using an object mixer for finally obtaining an audio output signal having one or several channel signals depending on a specific rendering setup.

134 citations

Patent
01 Aug 1983
TL;DR: In this article, a digital audio satellite transmission system involving analog-to-digital conversion of audio signals, processing of such signals and transmission to and reception from a satellite transponder link is presented.
Abstract: A digital audio satellite transmission system involving analog-to-digital conversion of audio signals, processing of such signals and transmission to and reception from a satellite transponder link. The digitized audio signal is digitally compressed to reduce the data rate. Several such signals are then time division multiplexed together. Differential encoding, bit interleaving and forward error correction coding are utilized to maintain high signal quality, together with binary phase shift key carrier modulation. Matched filter reception of the BPSK signal transmitted over a conventional satellite transponder link is provided together with BPSK demodulation, TDM demultiplexing, differential decoding, bit deinterleaving and threshold FEC decoding. The compressed digital signal is then expanded and converted to an analog audio signal.

134 citations
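
Three of the steps named in the abstract above (differential encoding, bit interleaving, and BPSK mapping) can be sketched in a few lines. This is a baseband toy, not the patented system: compression, TDM, FEC, and matched filtering are omitted, and the 3 x 4 block interleaver, the bit pattern, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def diff_encode(bits):
    """Each output bit is the running XOR of all input bits so far."""
    return np.cumsum(bits) % 2

def diff_decode(enc):
    """Recover each bit as the XOR of adjacent encoded bits (initial state 0)."""
    prev = np.concatenate(([0], enc[:-1]))
    return enc ^ prev

def interleave(bits, rows, cols):
    """Block interleaver: write row by row, read column by column."""
    return bits.reshape(rows, cols).T.reshape(-1)

def deinterleave(bits, rows, cols):
    """Inverse of interleave for the same rows and cols."""
    return bits.reshape(cols, rows).T.reshape(-1)

def bpsk_modulate(bits):
    """Map bit 0 to +1.0 and bit 1 to -1.0."""
    return 1.0 - 2.0 * bits

def bpsk_demodulate(symbols):
    """Hard decision: negative symbols are read as bit 1."""
    return (symbols < 0).astype(int)

def receive(symbols, rows=3, cols=4):
    return diff_decode(deinterleave(bpsk_demodulate(symbols), rows, cols))

bits = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0])
tx = bpsk_modulate(interleave(diff_encode(bits), 3, 4))

rx = receive(tx)            # clean channel
rx_inverted = receive(-tx)  # carrier phase flipped by 180 degrees
```

The inverted case shows why differential encoding pairs well with BPSK: a 180-degree phase ambiguity flips every received bit, yet after differential decoding only the first bit is wrong, because every later bit depends only on whether adjacent encoded bits differ.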


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 81% related
Feature (computer vision): 128.2K papers, 1.7M citations, 79% related
Robustness (computer science): 94.7K papers, 1.6M citations, 78% related
Noise: 110.4K papers, 1.3M citations, 77% related
Image segmentation: 79.6K papers, 1.8M citations, 77% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2023    19
2022    63
2021    217
2020    525
2019    659
2018    597