Topic

Audio signal processing

About: Audio signal processing is a research topic. Over its lifetime, 21,463 publications have been published within this topic, receiving 319,597 citations. The topic is also known as: audio processing and acoustic signal processing.


Papers
Patent
04 Sep 2001
TL;DR: In this article, an audio converter device and a method for using the same are presented, in which digital audio data is decompressed, converted into analog electrical data, and then transferred to an audio playback device.
Abstract: An audio converter device and a method for using the same are provided. In one embodiment, the audio converter device receives the digital audio data from a first device via a local area network. The audio converter device decompresses the digital audio data and converts the digital audio data into analog electrical data. The audio converter device transfers the analog electrical data to an audio playback device.

141 citations
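The pipeline in this abstract (receive compressed digital audio over the local network, decompress, hand the samples to a digital-to-analog stage) can be sketched compactly. The Python below is a hedged illustration, not the patented implementation: the UDP transport, the port number, zlib as the codec, and the dac_write callback are all stand-ins, since the abstract fixes neither a protocol nor a compression format.

```python
import socket
import zlib

PORT = 5004  # assumed port; the patent does not specify a transport

def run_converter(dac_write):
    """Receive compressed audio over the LAN, decompress, pass PCM onward.

    dac_write is a placeholder callable standing in for the digital-to-
    analog stage; zlib stands in for the unspecified audio codec.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("", PORT))
        while True:
            packet, _addr = sock.recvfrom(65536)
            pcm = zlib.decompress(packet)  # decompress the digital audio data
            dac_write(pcm)                 # analog conversion happens downstream
```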

Journal Article (DOI)
TL;DR: The proposed method is based on detecting phase discontinuity of the power grid signal, referred to as electric network frequency (ENF), which is sometimes embedded in audio signals when the recording is carried out with the equipment connected to an electrical outlet or when certain microphones are in an ENF magnetic field.
Abstract: This paper addresses a forensic tool used to assess audio authenticity. The proposed method is based on detecting phase discontinuity of the power grid signal; this signal, referred to as electric network frequency (ENF), is sometimes embedded in audio signals when the recording is carried out with the equipment connected to an electrical outlet or when certain microphones are in an ENF magnetic field. After down-sampling and band-filtering the audio around the nominal value of the ENF, the result can be considered a single tone such that a high-precision Fourier analysis can be used to estimate its phase. The estimated phase provides a visual aid to locating editing points (signalled by abrupt phase changes) and inferring the type of audio editing (insertion or removal of audio segments). From the estimated values, a feature is used to quantify the discontinuity of the ENF phase, allowing an automatic decision concerning the authenticity of the audio evidence. The theoretical background is presented along with practical implementation issues related to the proposed technique, whose performance is evaluated on digitally edited audio signals.

141 citations
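The down-sample, band-pass, single-tone phase-estimation chain described above lends itself to a short sketch. The following Python is an illustration of the general technique, not the authors' code: the function name, the 400 Hz working rate, the 2 Hz band, and the one-second frame are assumed parameters.

```python
import numpy as np
from math import gcd
from scipy.signal import butter, sosfiltfilt, resample_poly

def enf_phase_jumps(x, fs, nominal=50.0, frame_s=1.0, target_fs=400):
    """Per-frame ENF phase estimate plus a discontinuity measure.

    A minimal sketch of the approach described above; frame length,
    filter bandwidth and the working sample rate are illustrative
    choices, not the paper's exact parameters.
    """
    # Down-sample so the ENF component dominates the band of interest.
    g = gcd(int(fs), target_fs)
    x = resample_poly(x, target_fs // g, int(fs) // g)
    fs = target_fs
    # Narrow band-pass around the nominal ENF value (50 or 60 Hz).
    sos = butter(4, [nominal - 1.0, nominal + 1.0], btype="bandpass",
                 fs=fs, output="sos")
    x = sosfiltfilt(sos, x)
    # Treat each frame as a single tone and estimate its phase with a
    # DFT evaluated exactly at the nominal frequency.
    n = int(frame_s * fs)
    probe = np.exp(-2j * np.pi * nominal * np.arange(n) / fs)
    phases = np.unwrap([np.angle(np.dot(x[i:i + n], probe))
                        for i in range(0, len(x) - n + 1, n)])
    # Abrupt changes in the phase increment flag candidate edit points.
    jumps = np.abs(np.diff(phases, 2))
    return phases, jumps
```

Large values in the returned jumps array mark frames where the ENF phase breaks, i.e. candidate insertion or removal points in the recording.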

Journal Article (DOI)
TL;DR: An acoustic chord transcription system that uses symbolic data to train hidden Markov models gives best-of-class frame-level recognition results; the work also demonstrates the robustness of the tonal centroid feature, which outperforms the conventional chroma feature.
Abstract: We describe an acoustic chord transcription system that uses symbolic data to train hidden Markov models and gives best-of-class frame-level recognition results. We avoid the extremely laborious task of human annotation of chord names and boundaries, which must be done to provide machine learning models with ground truth, by performing automatic harmony analysis on symbolic music files. In parallel, we synthesize audio from the same symbolic files and extract acoustic feature vectors which are in perfect alignment with the labels. We, therefore, generate a large set of labeled training data with a minimal amount of human labor. This allows for richer models. Thus, we build 24 key-dependent HMMs, one for each key, using the key information derived from symbolic data. Each key model defines a unique state-transition characteristic and helps avoid confusions seen in the observation vector. Given acoustic input, we identify a musical key by choosing a key model with the maximum likelihood, and we obtain the chord sequence from the optimal state path of the corresponding key model, both of which are returned by a Viterbi decoder. This not only increases the chord recognition accuracy, but also gives key information. Experimental results show the models trained on synthesized data perform very well on real recordings, even though the labels automatically generated from symbolic data are not 100% accurate. We also demonstrate the robustness of the tonal centroid feature, which outperforms the conventional chroma feature.

140 citations
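The decoding step the abstract describes (run all 24 key-dependent HMMs, keep the most likely one, and read the chord sequence off its Viterbi state path) can be sketched generically. This is not the authors' code: the model layout, names, and the use of the Viterbi score as the likelihood are illustrative, though the last matches the abstract's statement that both quantities come from a Viterbi decoder.

```python
import numpy as np

def viterbi(log_trans, log_init, log_obs):
    """Plain Viterbi decoder; returns the best state path and its log-score."""
    T, S = log_obs.shape
    dp = np.empty((T, S))
    back = np.zeros((T, S), dtype=int)
    dp[0] = log_init + log_obs[0]
    for t in range(1, T):
        scores = dp[t - 1][:, None] + log_trans       # prev state x next state
        back[t] = np.argmax(scores, axis=0)
        dp[t] = scores[back[t], np.arange(S)] + log_obs[t]
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(dp[-1]))
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path, float(dp[-1].max())

def decode_with_key_models(key_models, log_obs):
    """Run every key-dependent model and keep the most likely one.

    key_models: list of 24 (log_trans, log_init) pairs (illustrative layout);
    log_obs: per-frame log-likelihoods of each hidden chord state, assumed
    here to be shared across keys for simplicity.
    """
    decoded = [viterbi(tr, init, log_obs) for tr, init in key_models]
    best_key = max(range(len(decoded)), key=lambda k: decoded[k][1])
    chord_path, _ = decoded[best_key]
    return best_key, chord_path
```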

Posted Content
TL;DR: Madmom is an open-source audio processing and music information retrieval (MIR) library written in Python; it features a concise, NumPy-compatible, object-oriented design with simple calling conventions and sensible default values for all parameters, which facilitates fast prototyping of MIR applications.
Abstract: In this paper, we present madmom, an open-source audio processing and music information retrieval (MIR) library written in Python. madmom features a concise, NumPy-compatible, object-oriented design with simple calling conventions and sensible default values for all parameters, which facilitates fast prototyping of MIR applications. Prototypes can be seamlessly converted into callable processing pipelines through madmom's concept of Processors, callable objects that run transparently on multiple cores. Processors can also be serialised, saved, and re-run to allow results to be easily reproduced anywhere. Apart from low-level audio processing, madmom puts emphasis on musically meaningful high-level features. Many of these incorporate machine learning techniques, and madmom provides a module that implements some methods commonly used in MIR, such as hidden Markov models and neural networks. Additionally, madmom comes with several state-of-the-art MIR algorithms for onset detection, beat, downbeat and meter tracking, tempo estimation, and piano transcription. These can easily be incorporated into bigger MIR systems or run as stand-alone programs.

140 citations
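Since madmom is a real, pip-installable library, the Processor idiom the abstract describes can be shown directly. The snippet below follows the beat-tracking example from the madmom documentation; the audio file name is a placeholder.

```python
from madmom.features.beats import RNNBeatProcessor, DBNBeatTrackingProcessor

# First stage: an RNN Processor turns the audio into a beat-activation function.
activations = RNNBeatProcessor()('some_audio.wav')

# Second stage: a DBN Processor decodes beat times from the activations.
# Both stages are callable objects, so they chain naturally into pipelines.
beats = DBNBeatTrackingProcessor(fps=100)(activations)
print(beats)  # beat positions in seconds
```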

Patent (DOI)
Daniel N. Ozick
TL;DR: In this article, a song-matching system is proposed that provides real-time, dynamic recognition of a song being sung and an audio accompaniment signal in synchronism with it.
Abstract: A song-matching system provides real-time, dynamic recognition of a song being sung and an audio accompaniment signal in synchronism with it. The system includes a song database holding a repertoire of songs, each stored as a relative pitch template; an audio processing module that converts the song being sung into a digital signal; an analyzing module that determines, from the digital signal, a definition pattern representing the sequence of pitch intervals captured by the audio processing module; a matching module that compares this definition pattern with the relative pitch template of each song in the database to recognize one song as the song being sung, and that then causes the database to download the unmatched portion of the recognized song's relative pitch template as a digital accompaniment signal; and a synthesizer module that converts the digital accompaniment signal into the audio accompaniment signal, which is transmitted in synchronism with the song being sung.

140 citations
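The matching step (represent both query and templates as relative pitch intervals, find the closest template, and hand its unmatched remainder to the synthesizer) can be sketched as follows. This is a hedged illustration, not the patented method: the database layout, the absolute-difference mismatch score, and all names are assumptions.

```python
import numpy as np

def pitch_intervals(midi_pitches):
    """Relative-pitch representation: successive differences in semitones."""
    return np.diff(np.asarray(midi_pitches, dtype=float))

def match_and_accompany(sung_pitches, database):
    """Match a sung fragment against relative-pitch templates.

    database maps song title -> full interval template (a NumPy array).
    Returns the best-matching title and the unmatched remainder of its
    template, which the system would hand to the synthesizer stage.
    """
    pattern = pitch_intervals(sung_pitches)

    def mismatch(template):
        if len(template) < len(pattern):
            return np.inf                       # fragment longer than the song
        return float(np.abs(template[:len(pattern)] - pattern).sum())

    title = min(database, key=lambda k: mismatch(database[k]))
    remainder = database[title][len(pattern):]  # accompaniment still to play
    return title, remainder
```

Because the representation is relative (intervals rather than absolute pitches), the match is insensitive to the key the user happens to sing in, which is the point of storing songs as relative pitch templates.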


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations (81% related)
Feature (computer vision): 128.2K papers, 1.7M citations (79% related)
Robustness (computer science): 94.7K papers, 1.6M citations (78% related)
Noise: 110.4K papers, 1.3M citations (77% related)
Image segmentation: 79.6K papers, 1.8M citations (77% related)
Performance Metrics
No. of papers in the topic in previous years:

Year  Papers
2023      19
2022      63
2021     217
2020     525
2019     659
2018     597