scispace - formally typeset
Topic: Audio signal processing

About: Audio signal processing is a research topic. Over its lifetime, 21,463 publications have been published within this topic, receiving 319,597 citations. The topic is also known as: audio processing and acoustic signal processing.


Papers
Journal ArticleDOI
TL;DR: A new signal model is proposed in which the leading vocal part is explicitly represented by a specific source/filter model; the approach reaches state-of-the-art performance on all test sets.
Abstract: Extracting the main melody from a polyphonic music recording seems natural even to untrained human listeners. To a certain extent it is related to the concept of source separation, with the human ability of focusing on a specific source in order to extract relevant information. In this paper, we propose a new approach for the estimation and extraction of the main melody (and in particular the leading vocal part) from polyphonic audio signals. To that aim, we propose a new signal model where the leading vocal part is explicitly represented by a specific source/filter model. The proposed representation is investigated in the framework of two statistical models: a Gaussian Scaled Mixture Model (GSMM) and an extended Instantaneous Mixture Model (IMM). For both models, the estimation of the different parameters is done within a maximum-likelihood framework adapted from single-channel source separation techniques. The desired sequence of fundamental frequencies is then inferred from the estimated parameters. The results obtained in a recent evaluation campaign (MIREX08) show that the proposed approaches are very promising and reach state-of-the-art performances on all test sets.
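The f0-inference step at the heart of melody extraction can be illustrated with a much simpler stand-in than the paper's GSMM/IMM estimators: a frame-wise autocorrelation pitch tracker. This is a minimal sketch, not the authors' method; `frame_f0` and its parameters are illustrative names.

```python
import numpy as np

def frame_f0(frame, fs, fmin=80.0, fmax=800.0):
    """Estimate the fundamental frequency of one frame via autocorrelation.

    A crude stand-in for the f0-inference stage; the paper's actual
    estimators (GSMM / IMM) are statistical models, far more elaborate.
    """
    frame = frame - frame.mean()
    # keep non-negative lags of the full autocorrelation
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    # search only lags corresponding to the plausible f0 range
    lo, hi = int(fs / fmax), int(fs / fmin)
    lag = lo + int(np.argmax(ac[lo:hi + 1]))
    return fs / lag

# Usage: a pure 220 Hz tone should come back close to 220 Hz.
fs = 16000
t = np.arange(2048) / fs
tone = np.sin(2 * np.pi * 220.0 * t)
f0 = frame_f0(tone, fs)
```

On polyphonic material this naive tracker fails quickly, which is exactly the gap the source/filter signal model above is designed to close.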

191 citations

Journal ArticleDOI
TL;DR: A method for estimating RT without prior knowledge of sound sources or room geometry is presented, and results obtained for simulated and real room data are in good agreement with the real RT values.
Abstract: The reverberation time (RT) is an important parameter for characterizing the quality of an auditory space. Sounds in reverberant environments are subject to coloration. This affects speech intelligibility and sound localization. Many state-of-the-art audio signal processing algorithms, for example in hearing-aids and telephony, are expected to have the ability to characterize the listening environment, and turn on an appropriate processing strategy accordingly. Thus, a method for characterization of room RT based on passively received microphone signals represents an important enabling technology. Current RT estimators, such as Schroeder’s method, depend on a controlled sound source, and thus cannot produce an online, blind RT estimate. Here, a method for estimating RT without prior knowledge of sound sources or room geometry is presented. The diffusive tail of reverberation was modeled as an exponentially damped Gaussian white noise process. The time-constant of the decay, which provided a measure of the RT, was estimated using a maximum-likelihood procedure. The estimates were obtained continuously, and an order-statistics filter was used to extract the most likely RT from the accumulated estimates. The procedure was illustrated for connected speech. Results obtained for simulated and real room data are in good agreement with the real RT values.
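The core estimation step described above can be sketched compactly: model the decay tail as exponentially damped Gaussian white noise and find the decay rate by a grid search over the profile likelihood. This is a minimal sketch under those assumptions; the paper's continuous online estimation and order-statistics filtering are omitted, and all names are illustrative.

```python
import numpy as np

def estimate_rt60(x, fs, grid=None):
    """ML-style RT60 estimate from a decay tail modeled as x[n] = sigma * a^n * g[n].

    For a candidate per-sample decay a, the ML variance is
    sigma2(a) = mean(x[n]^2 * a^(-2n)); substituting it back gives a
    profile log-likelihood that is maximized by grid search. RT60 is
    the time for the envelope to fall 60 dB: RT60 = -3 / (fs * log10(a)).
    """
    n = np.arange(len(x))
    if grid is None:
        grid = np.linspace(0.997, 0.9999, 400)  # plausible per-sample decays
    best_a, best_ll = grid[0], -np.inf
    for a in grid:
        sigma2 = np.mean(x ** 2 * a ** (-2.0 * n))
        # profile log-likelihood up to constants
        ll = -0.5 * len(x) * np.log(sigma2) - np.log(a) * n.sum()
        if ll > best_ll:
            best_a, best_ll = a, ll
    return -3.0 / (fs * np.log10(best_a))

# Usage: synthesize a tail with a known RT60 and recover it.
fs, rt_true = 8000, 0.5
a_true = 10 ** (-3.0 / (fs * rt_true))
n = np.arange(int(0.4 * fs))
tail = a_true ** n * np.random.default_rng(0).standard_normal(len(n))
rt_est = estimate_rt60(tail, fs)
```

Unlike Schroeder's method, nothing here requires a controlled excitation: only the passively observed decay tail is used, which is what makes the blind, online variant in the paper possible.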

190 citations

Patent
10 Apr 2008
TL;DR: A body-worn communications device for communicating with a head-worn listening device is presented; the device is adapted to receive a multitude of audio signals and to transmit at least one audio signal, selected among them, to the listening device.
Abstract: The invention relates to a body worn communications device for communicating with a head-worn listening device, the communications device being adapted for receiving a multitude of audio signals and for transmitting at least one audio signal selected among the multitude of audio signals to the listening device, the communications device comprising a number of functional push-buttons for influencing the selection and properties of said audio signals. The invention further relates to a system, a method, and use. The object of the present invention is to provide a simple user interface between an audio selection device and a head-worn listening device, such as a hearing aid. The problem is solved in that the communications device comprises a user interface comprising a number of functional push-buttons for influencing the state of the user interface, such as the selection (and de-selection) of an audio signal, events and properties related to the audio signal, and wherein the state of the user interface is indicated at the same button where the state can be influenced. Among the advantages for a user are: Clear visual feedback by using simple button light indications; operation and indication are tied together in the buttons; the combination of audio and visual indications. The invention may e.g. be used for the hearing aids, ear phones, head sets, etc.

190 citations

Journal ArticleDOI
TL;DR: This work uses a novel way of personalizing the head related transfer functions (HRTFs) from a database, based on anatomical measurements, to create virtual auditory spaces by rendering cues that arise from anatomical scattering, environmental scattering, and dynamical effects.
Abstract: High-quality virtual audio scene rendering is required for emerging virtual and augmented reality applications, perceptual user interfaces, and sonification of data. We describe algorithms for creation of virtual auditory spaces by rendering cues that arise from anatomical scattering, environmental scattering, and dynamical effects. We use a novel way of personalizing the head related transfer functions (HRTFs) from a database, based on anatomical measurements. Details of algorithms for HRTF interpolation, room impulse response creation, HRTF selection from a database, and audio scene presentation are presented. Our system runs in real time on an office PC without specialized DSP hardware.
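Once left- and right-ear impulse responses have been selected and interpolated, the rendering stage described above reduces to per-ear convolution. A minimal sketch follows; the HRIRs here are toy stand-ins (a pure interaural delay plus a level difference), not measured responses from the personalized database.

```python
import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Render a mono source binaurally by convolving with per-ear HRIRs.

    Only the last step of the pipeline described above: HRTF
    personalization, interpolation, and room response creation are omitted.
    """
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right])

# Usage with toy HRIRs crudely mimicking a source on the listener's left:
mono = np.random.default_rng(1).standard_normal(512)
hl = np.zeros(32); hl[0] = 1.0   # near ear: no delay, full level
hr = np.zeros(32); hr[8] = 0.6   # far ear: delayed 8 samples, attenuated
out = render_binaural(mono, hl, hr)
```

In a real-time system these convolutions are done block-wise in the frequency domain, which is why the full renderer can run on an office PC without DSP hardware.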

188 citations

Patent
29 Apr 2002
TL;DR: A sound processing apparatus for creating virtual sound sources in three-dimensional space is described; it comprises an aural exciter module, an automated panning module, a distance control module, a delay module, an occlusion and air absorption module, a Doppler module for pitch shifting, a location processor module, and an output.
Abstract: A sound processing apparatus for creating virtual sound sources in a three dimensional space includes a number of modules. These include an aural exciter module; an automated panning module; a distance control module; a delay module; an occlusion and air absorption module; a Doppler module for pitch shifting; a location processor module; and an output.
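A few of the listed modules reduce to simple signal operations: distance control to an inverse-distance gain, the delay module to a propagation delay (speed of sound c ≈ 343 m/s), and the Doppler module to resampling. The sketch below illustrates that idea only; module names and constants are illustrative, not taken from the patent.

```python
import numpy as np

C = 343.0  # speed of sound in air, m/s

def distance_module(sig, r, r_ref=1.0):
    """Distance control: inverse-distance gain relative to a reference radius."""
    return sig * (r_ref / max(r, r_ref))

def delay_module(sig, r, fs):
    """Delay: prepend the propagation delay for distance r, in whole samples."""
    d = int(round(fs * r / C))
    return np.concatenate([np.zeros(d), sig])

def doppler_module(sig, ratio):
    """Doppler pitch shift by naive resampling; ratio > 1 raises pitch."""
    idx = np.arange(0, len(sig) - 1, ratio)
    return np.interp(idx, np.arange(len(sig)), sig)

# Usage: place a 1 s tone at 4 m with a slight approach-like pitch shift.
fs = 8000
sig = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)
out = doppler_module(delay_module(distance_module(sig, 4.0), 4.0, fs), 1.02)
```

The remaining modules (aural exciter, occlusion/air absorption, location processor) are filtering and control stages that would chain into the same pipeline.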

188 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations (81% related)
Feature (computer vision): 128.2K papers, 1.7M citations (79% related)
Robustness (computer science): 94.7K papers, 1.6M citations (78% related)
Noise: 110.4K papers, 1.3M citations (77% related)
Image segmentation: 79.6K papers, 1.8M citations (77% related)
Performance Metrics
No. of papers in the topic in previous years:
Year  Papers
2023  19
2022  63
2021  217
2020  525
2019  659
2018  597