Topic

Audio signal processing

About: Audio signal processing is a research topic. Over its lifetime, 21,463 publications have been published within this topic, receiving 319,597 citations. The topic is also known as audio processing and acoustic signal processing.

Papers

Journal Article

Bat detective—Deep learning tools for bat acoustic signal detection

Oisin Mac Aodha¹, Rory Gibb¹, Kate E. Barlow², Ella Browning¹,³, Michael Firman¹, Robin Freeman³, Briana Harder⁴, Libby Kinsey¹, Gary R. Mead, Stuart E. Newson⁵, Ivan Pandourski⁶, Stuart Parsons⁷, Jon Russ⁸, Abigel Szodoray-Paradi, Farkas Szodoray-Paradi, Elena Tilova, Mark Girolami⁹, Gabriel J. Brostow¹, Kate E. Jones¹,³

University College London¹, Bat Conservation Trust², Zoological Society of London³, University of Washington⁴, British Trust for Ornithology⁵, Bulgarian Academy of Sciences⁶, Queensland University of Technology⁷, University of Warwick⁸, Imperial College London⁹

08 Mar 2018-PLOS Computational Biology

TL;DR: In this article, a convolutional neural network based open-source pipeline was developed for detecting ultrasonic, full-spectrum, search-phase calls produced by echolocating bats.

Abstract: Passive acoustic sensing has emerged as a powerful tool for quantifying anthropogenic impacts on biodiversity, especially for echolocating bat species. To better assess bat population trends there is a critical need for accurate, reliable, and open source tools that allow the detection and classification of bat calls in large collections of audio recordings. The majority of existing tools are commercial or have focused on the species classification task, neglecting the important problem of first localizing echolocation calls in audio which is particularly problematic in noisy recordings. We developed a convolutional neural network based open-source pipeline for detecting ultrasonic, full-spectrum, search-phase calls produced by echolocating bats. Our deep learning algorithms were trained on full-spectrum ultrasonic audio collected along road-transects across Europe and labelled by citizen scientists from www.batdetective.org. When compared to other existing algorithms and commercial systems, we show significantly higher detection performance of search-phase echolocation calls with our test sets. As an example application, we ran our detection pipeline on bat monitoring data collected over five years from Jersey (UK), and compared results to a widely-used commercial system. Our detection pipeline can be used for the automatic detection and monitoring of bat populations, and further facilitates their use as indicator species on a large scale. Our proposed pipeline makes only a small number of bat specific design decisions, and with appropriate training data it could be applied to detecting other species in audio. A crucial novelty of our work is showing that with careful, non-trivial, design and implementation considerations, state-of-the-art deep learning methods can be used for accurate and efficient monitoring in audio.
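To illustrate the localization step mentioned in this abstract, the sketch below turns a sequence of per-frame detection probabilities into call times by thresholding and suppressing non-maximal neighbours. The CNN itself is not shown. The function name `pick_detections`, the probabilities, the frame step, the threshold, and the suppression window are all made-up values for illustration, and this post-processing is a common choice rather than a description of the authors' exact pipeline.

```python
def pick_detections(probs, frame_step_s, threshold=0.5, min_gap_frames=3):
    """Return (time_s, prob) for local maxima at or above the threshold.

    A frame is kept only if no frame within min_gap_frames on either side
    has a higher probability (ties go to the earlier frame).
    """
    detections = []
    for i, p in enumerate(probs):
        if p < threshold:
            continue
        lo = max(0, i - min_gap_frames)
        hi = min(len(probs), i + min_gap_frames + 1)
        # Strictly lower before, lower-or-equal after: resolves ties once.
        earlier_ok = all(probs[j] < p for j in range(lo, i))
        later_ok = all(probs[j] <= p for j in range(i + 1, hi))
        if earlier_ok and later_ok:
            detections.append((i * frame_step_s, p))
    return detections


# Hypothetical detector output, one probability per 10 ms frame.
probs = [0.1, 0.2, 0.9, 0.6, 0.1, 0.0, 0.1, 0.7, 0.8, 0.3]
calls = pick_detections(probs, frame_step_s=0.01)
```

With these inputs, only the frames at indices 2 and 8 survive, since each of their neighbours within three frames has a lower probability.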

149 citations

Patent

Digital audio signal coding using a CELP coder and a transform coder

Gilad Cohen¹, Yossef Cohen¹, Doron Hoffman¹, Hagai Krupnik¹, Aharon Satt¹

IBM¹

04 Mar 1998

TL;DR: A method for adaptively switching between a transform audio coder and a CELP coder, which makes use of the superior performance of CELP coders for speech signal coding while retaining the benefits of a transform coder for other audio signals.

Abstract: Apparatus is described for digitally encoding an input audio signal for storage or transmission. A distinguishing parameter is measured from the input signal. It is determined from the measured distinguishing parameter whether the input signal contains an audio signal of a first type or a second type. First and second coders are provided for digitally encoding the input signal using first and second coding methods respectively, and a switching arrangement directs, at any particular time, the generation of an output signal by encoding the input signal using either the first or second coder according to whether the input signal contains an audio signal of the first type or the second type at that time. A method for adaptively switching between a transform audio coder and a CELP coder is presented. In a preferred embodiment, the method makes use of the superior performance of CELP coders for speech signal coding, while enjoying the benefits of a transform coder for other audio signals. The combined coder is designed to handle both speech and music and achieve an improved quality.
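The switching arrangement described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the abstract does not say which distinguishing parameter is measured, so the zero-crossing rate, the threshold value, the rule that a low rate means the first signal type, and the two stub coders (`celp_encode_stub`, `transform_encode_stub`) are all hypothetical stand-ins.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


# Hypothetical stand-ins: real coders would return a bitstream.
def celp_encode_stub(frame):
    return ("celp", len(frame))


def transform_encode_stub(frame):
    return ("transform", len(frame))


def encode_switched(signal, frame_len=160, zcr_threshold=0.1):
    """Encode frame by frame, choosing a coder per frame.

    Assumption: frames with a low zero-crossing rate are treated as the
    first type and sent to the CELP stub; everything else goes to the
    transform stub. The threshold is arbitrary.
    """
    out = []
    for start in range(0, len(signal), frame_len):
        frame = signal[start:start + frame_len]
        if zero_crossing_rate(frame) < zcr_threshold:
            out.append(celp_encode_stub(frame))
        else:
            out.append(transform_encode_stub(frame))
    return out


# Usage: a 100 Hz tone followed by a 3 kHz tone, sampled at 8 kHz.
fs = 8000
low = [math.sin(2 * math.pi * 100 * n / fs) for n in range(160)]
high = [math.sin(2 * math.pi * 3000 * n / fs) for n in range(160)]
decisions = [name for name, _ in encode_switched(low + high)]
```

The point of the sketch is the structure (measure, classify, dispatch per frame); a practical classifier would likely combine several features and smooth its decisions over time to avoid rapid toggling between coders.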

148 citations

Patent

Methods and apparatus for altering audio output signals

Michael M. Lee¹

Apple Inc.¹

02 Apr 2008

TL;DR: A system for altering an audio output so that, on playback, a recorded audio track sounds as if a different person had recorded it.

Abstract: Methods, systems and computer readable media for altering an audio output are provided. In some embodiments, the system may change the original frequency content of an audio data file to a second frequency content so that a recorded audio track will sound as if a different person had recorded it when it is played back. In other embodiments, the system may receive an audio data file and a voice signature, and it may apply the voice signature to the audio data file to alter the audio output of the audio data file. In that instance, the audio data file may be a textual representation of a recorded audio data file.

148 citations

Journal Article

The long road to automation: Neurocognitive development of letter-speech sound processing

Dries Froyen¹, Milene Bonte, Nienke van Atteveldt, Leo Blomert

Maastricht University¹

01 Mar 2009-Journal of Cognitive Neuroscience

TL;DR: A transition from mere association in beginner readers to more automatic, but still not “adult-like,” integration in advanced readers is indicated and evidence for an extended development of letter–speech sound integration is provided.

Abstract: In transparent alphabetic languages, the expected standard for complete acquisition of letter-speech sound associations is within one year of reading instruction. The neural mechanisms underlying the acquisition of letter-speech sound associations have, however, hardly been investigated. The present article describes an ERP study with beginner and advanced readers in which the influence of letters on speech sound processing is investigated by comparing the MMN to speech sounds presented in isolation with the MMN to speech sounds accompanied by letters. Furthermore, SOA between letter and speech sound presentation was manipulated in order to investigate the development of the temporal window of integration for letter-speech sound processing. Beginner readers, despite one year of reading instruction, showed no early letter-speech sound integration, that is, no influence of the letter on the evocation of the MMN to the speech sound. Only later in the difference wave, at 650 msec, was an influence of the letter on speech sound processing revealed. Advanced readers, with 4 years of reading instruction, showed early and automatic letter-speech sound processing as revealed by an enhancement of the MMN amplitude, however, at a different temporal window of integration in comparison with experienced adult readers. The present results indicate a transition from mere association in beginner readers to more automatic, but still not "adult-like," integration in advanced readers. In contrast to general assumptions, the present study provides evidence for an extended development of letter-speech sound integration.

147 citations

Journal Article

A frequency domain method for blind source separation of convolutive audio mixtures

K. Rahbar¹, James P. Reilly¹

McMaster University¹

15 Aug 2005-IEEE Transactions on Speech and Audio Processing

TL;DR: A new frequency domain approach to blind source separation (BSS) of audio signals mixed in a reverberant environment using a joint diagonalization procedure on the cross power spectral density matrices to identify the mixing system at each frequency bin up to a scale and permutation ambiguity.

Abstract: In this paper, we propose a new frequency domain approach to blind source separation (BSS) of audio signals mixed in a reverberant environment. We propose a joint diagonalization procedure on the cross power spectral density matrices of the signals at the output of the mixing system to identify the mixing system at each frequency bin up to a scale and permutation ambiguity. The frequency domain joint diagonalization is performed using a new and quickly converging algorithm which uses an alternating least-squares (ALS) optimization method. The inverse of the mixing system is then used to separate the sources. An efficient dyadic algorithm to resolve the frequency dependent permutation ambiguities that exploits the inherent nonstationarity of the sources is presented. The effect of the unknown scaling ambiguities is partially resolved using an initialization procedure for the ALS algorithm. The performance of the proposed algorithm is demonstrated by experiments conducted in real reverberant rooms. Performance comparisons are made with previous methods.
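As a rough illustration of the quantity this abstract operates on, the sketch below estimates cross power spectral density matrices for a multichannel recording at each frequency bin. It covers only this first step: the ALS joint diagonalization and the permutation resolution are not shown. The function names, the naive DFT, the absence of windowing and overlap, and the split of the recording into epochs (motivated by the abstract's mention of nonstationarity) are assumptions made for this sketch, not details taken from the paper.

```python
import cmath
import math


def dft(frame):
    """Naive O(N^2) discrete Fourier transform of one frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def cpsd_matrices(channels, frame_len, frames_per_epoch):
    """Estimate P[m][k][i][j], the cross power spectral density between
    channels i and j at frequency bin k during epoch m.

    Each epoch averages X_i(k) * conj(X_j(k)) over its frames. No window
    or overlap is used, to keep the sketch short.
    """
    n_ch = len(channels)
    n_frames = len(channels[0]) // frame_len
    # spectra[f][i][k]: bin k of channel i in frame f
    spectra = [
        [dft(ch[f * frame_len:(f + 1) * frame_len]) for ch in channels]
        for f in range(n_frames)
    ]
    epochs = []
    for m in range(n_frames // frames_per_epoch):
        block = spectra[m * frames_per_epoch:(m + 1) * frames_per_epoch]
        epochs.append([
            [[sum(x[i][k] * x[j][k].conjugate() for x in block) / len(block)
              for j in range(n_ch)]
             for i in range(n_ch)]
            for k in range(frame_len)
        ])
    return epochs


# Usage: two synthetic channels, 64 samples, 8-sample frames, 4 frames/epoch.
x1 = [math.sin(0.3 * n) for n in range(64)]
x2 = [0.5 * math.cos(0.3 * n) + 0.1 * math.sin(1.1 * n) for n in range(64)]
P = cpsd_matrices([x1, x2], frame_len=8, frames_per_epoch=4)
```

Each `P[m][k]` is a Hermitian 2 x 2 matrix. A joint diagonalization step would then look, per bin `k`, for a single mixing matrix that approximately diagonalizes `P[m][k]` for every epoch `m`.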

147 citations

Metrics

Papers: 21,541
Citations: 328,867

No. of papers in the topic in previous years
Year	Papers
2023	19
2022	63
2021	217
2020	525
2019	659
2018	597
