Topic

Audio signal processing

About: Audio signal processing is a research topic. Over its lifetime, 21463 publications have been published within this topic, receiving 319597 citations. The topic is also known as audio processing and acoustic signal processing.


Papers
Proceedings ArticleDOI
22 May 2011
TL;DR: A new technique for monaural source separation in musical mixtures, which uses the knowledge of the musical score to initialize an algorithm which computes a parametric decomposition of the spectrogram based on non-negative matrix factorization (NMF).
Abstract: In this paper we present a new technique for monaural source separation in musical mixtures, which uses the knowledge of the musical score. This information is used to initialize an algorithm which computes a parametric decomposition of the spectrogram based on non-negative matrix factorization (NMF). This algorithm provides time-frequency masks which are used to separate the sources with Wiener filtering.

100 citations
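
The pipeline in the abstract above (score-based initialization, NMF, then Wiener-style masks) can be illustrated with a small sketch. It is a simplification under stated assumptions, not the authors' method: it uses plain NMF with multiplicative updates instead of the parametric decomposition the paper describes, it applies the masks to a toy magnitude spectrogram instead of a complex STFT, and every name and number below (`nmf`, `separate`, `w_init`, `h_init`, the 3x3 data) is hypothetical. The "score" enters only as the zero pattern of the initial factors, which multiplicative updates preserve.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(v, w, h, n_iter=100, eps=1e-12):
    """Multiplicative updates for the Euclidean cost.

    Entries of w and h that start at zero stay at zero, which is how the
    score information constrains the decomposition in this sketch.
    """
    for _ in range(n_iter):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[x * n / (d + eps) for x, n, d in zip(h_row, n_row, d_row)]
             for h_row, n_row, d_row in zip(h, num, den)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[x * n / (d + eps) for x, n, d in zip(w_row, n_row, d_row)]
             for w_row, n_row, d_row in zip(w, num, den)]
    return w, h


def separate(v, w, h, eps=1e-12):
    """Apply one soft mask per component: its share of the modeled spectrogram."""
    model = matmul(w, h)
    sources = []
    for k in range(len(h)):
        sources.append([
            [w[f][k] * h[k][t] / (model[f][t] + eps) * v[f][t]
             for t in range(len(v[0]))]
            for f in range(len(v))
        ])
    return sources


# Toy magnitude spectrogram (3 frequency bins x 3 frames) mixing two notes.
v = [[1.0, 2.0, 0.0],
     [0.5, 1.5, 1.5],
     [0.0, 1.0, 3.0]]

# Hypothetical score information: which bins each note may occupy (columns
# of w_init) and in which frames each note is played (rows of h_init).
w_init = [[1.0, 0.0],
          [1.0, 1.0],
          [0.0, 1.0]]
h_init = [[1.0, 1.0, 0.0],
          [0.0, 1.0, 1.0]]

w_fit, h_fit = nmf(v, w_init, h_init)
sources = separate(v, w_fit, h_fit)
```

Because the masks of all components sum to one wherever the model is nonzero, the separated spectrograms in `sources` add back up to the mixture `v`.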

Proceedings ArticleDOI
14 Nov 2013
TL;DR: It is an open problem for signal processing and machine learning to reliably identify bird sounds in real-world audio data collected in an acoustic monitoring scenario.
Abstract: Birds have been widely used as biological indicators for ecological research. They respond quickly to environmental changes and can be used to make inferences about other organisms (e.g., insects they feed on). Traditional methods for collecting data about birds involve costly human effort. A promising alternative is acoustic monitoring. There are many advantages to recording audio of birds compared to human surveys, including increased temporal and spatial resolution and extent, applicability in remote sites, reduced observer bias, and potentially lower cost. However, it is an open problem for signal processing and machine learning to reliably identify bird sounds in real-world audio data collected in an acoustic monitoring scenario. Some of the major challenges include multiple simultaneously vocalizing birds, other sources of non-bird sound (e.g., buzzing insects), and background noise like wind, rain, and motor vehicles.

100 citations

Proceedings ArticleDOI
12 May 2008
TL;DR: A novel method based on matching pursuit for extracting features from environmental sounds; the resulting representation is flexible, yet intuitive and physically interpretable, and can supplement another well-known audio feature, MFCC, to yield higher recognition accuracy for environmental sounds.
Abstract: Defining suitable features for environmental sounds is an important problem in an automatic acoustic scene recognition system. As with most pattern recognition problems, extracting the right feature set is the key to effective performance. A variety of features have been proposed for audio recognition, but the vast majority of the past work utilizes features that are well-known for structured data, such as speech and music, and assumes this association will transfer naturally to unstructured sounds. In this paper, we propose a novel method based on matching pursuit (MP) to extract features from environmental sounds. The proposed MP-based method utilizes a dictionary from which to select features, resulting in a representation that is flexible, yet intuitive and physically interpretable. We show that these features are less sensitive to noise and are capable of effectively representing sounds that originate from different sources and different frequency ranges. The MP-based feature can be used to supplement another well-known audio feature, i.e., MFCC, to yield higher recognition accuracy for environmental sounds.

100 citations
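
For readers unfamiliar with matching pursuit, the greedy selection step mentioned in the abstract above can be sketched as follows. This is a generic illustration, not the paper's feature extractor: the abstract only says that features are selected from "a dictionary", so the cosine dictionary, the function names, and the toy signal here are all assumptions chosen to keep the example short.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit(vec):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(dot(vec, vec))
    return [x / norm for x in vec]


def matching_pursuit(signal, dictionary, n_atoms):
    """Greedily pick the atom most correlated with the residual, n_atoms times.

    Returns a list of (atom_index, coefficient) pairs and the final residual.
    Atoms are assumed to have unit norm.
    """
    residual = list(signal)
    selected = []
    for _ in range(n_atoms):
        correlations = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda i: abs(correlations[i]))
        coeff = correlations[best]
        selected.append((best, coeff))
        # Remove the chosen atom's contribution from the residual.
        residual = [r - coeff * a for r, a in zip(residual, dictionary[best])]
    return selected, residual


# Hypothetical dictionary: 32 unit-norm cosine atoms of increasing frequency.
n = 32
dictionary = [
    unit([math.cos(math.pi * (t + 0.5) * k / n) for t in range(n)])
    for k in range(n)
]

# Toy signal built from two atoms, so the decomposition is known in advance.
signal = [3.0 * a - 1.5 * b for a, b in zip(dictionary[2], dictionary[7])]

selected, residual = matching_pursuit(signal, dictionary, n_atoms=2)
```

The selected atom indices and coefficients are the kind of quantities from which MP-based features could be derived; how the paper turns them into a feature vector is not stated in the abstract.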

Patent
26 Feb 2002
TL;DR: An audio signal is divided into auditory events, each of which tends to be perceived as separate and distinct, by calculating the spectral content of successive time blocks of the audio signal and marking an event boundary where the difference in spectral content between successive blocks exceeds a threshold.
Abstract: In one aspect, the invention divides an audio signal into auditory events, each of which tends to be perceived as separate and distinct, by calculating the spectral content of successive time blocks of the audio signal, calculating the difference in spectral content between successive time blocks of the audio signal, and identifying an auditory event boundary as the boundary between successive time blocks when the difference in the spectral content between such successive time blocks exceeds a threshold. In another aspect, the invention generates a reduced-information representation of an audio signal by dividing an audio signal into auditory events, each of which tends to be perceived as separate and distinct, and formatting and storing information relating to the auditory events. Optionally, the invention may also assign a characteristic to one or more of the auditory events. Auditory events may be determined according to the first aspect of the invention or by another method.

100 citations
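
The first aspect described in the patent abstract above (block spectra, block-to-block spectral difference, threshold) can be sketched in a few lines. This is only an illustration under assumptions: the block size, the threshold, the normalization of each spectrum, and the use of an absolute-difference sum are choices made here and are not taken from the patent.

```python
import cmath
import math


def block_spectrum(block):
    """Magnitude spectrum of one block via a direct DFT (fine for short blocks)."""
    n = len(block)
    return [
        abs(sum(block[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def normalize(spectrum):
    """Scale a spectrum to sum to one so that loudness changes alone do not
    trigger a boundary (an assumption of this sketch)."""
    total = sum(spectrum)
    return [s / total for s in spectrum] if total > 0 else spectrum


def event_boundaries(signal, block_size=64, threshold=0.5):
    """Return sample indices where the spectral content changes by more than
    the threshold between successive blocks."""
    starts = range(0, len(signal) - block_size + 1, block_size)
    spectra = [normalize(block_spectrum(signal[s:s + block_size])) for s in starts]
    boundaries = []
    for i in range(1, len(spectra)):
        difference = sum(abs(a - b) for a, b in zip(spectra[i], spectra[i - 1]))
        if difference > threshold:
            boundaries.append(i * block_size)
    return boundaries


def tone(bin_index, n_samples, block_size=64):
    """Sine wave whose frequency falls exactly on one DFT bin of a block."""
    return [math.sin(2 * math.pi * bin_index * t / block_size) for t in range(n_samples)]


# Toy signal: four blocks of one tone followed by four blocks of another.
signal = tone(4, 256) + tone(12, 256)
boundaries = event_boundaries(signal)
```

On this toy signal the only boundary found is at sample 256, where the tone changes. Real audio would need windowing, overlapping blocks, and a tuned threshold.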

PatentDOI
TL;DR: A desired acoustic signal is extracted from a noisy environment by a processor (30) that receives signals from two sensors at different locations and processes their discrete Fourier transform representations to generate a signal representative of the desired signal.
Abstract: A desired acoustic signal is extracted from a noisy environment by generating a signal representative of the desired signal with processor (30). Processor (30) receives aural signals from two sensors (22, 24) each at a different location. The two inputs to processor (30) are converted from analog to digital format and then submitted to a discrete Fourier transform process to generate discrete spectral signal representations. The spectral signals are delayed to provide a number of intermediate signals, each corresponding to a different spatial location relative to the two sensors. Locations of the noise source and the desired source, and the spectral content of the desired signal are determined from the intermediate signal corresponding to the noise source locations. Inverse transformation of the selected intermediate signal followed by digital to analog conversion provides an output signal representative of the desired signal with output device (90). Techniques to localize multiple acoustic sources are also disclosed. Further, a technique to enhance noise reduction from multiple sources based on two-sensor reception is described.

100 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 81% related
Feature (computer vision): 128.2K papers, 1.7M citations, 79% related
Robustness (computer science): 94.7K papers, 1.6M citations, 78% related
Noise: 110.4K papers, 1.3M citations, 77% related
Image segmentation: 79.6K papers, 1.8M citations, 77% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  19
2022  63
2021  217
2020  525
2019  659
2018  597