Open Access Proceedings Article

Chroma Toolbox: MATLAB Implementations for Extracting Variants of Chroma-based Audio Features

TLDR
This paper presents a chroma toolbox containing MATLAB implementations for extracting various types of recently proposed pitch-based and chroma-based audio features, and discusses two example applications showing that the final music analysis result may crucially depend on the initial feature design step.
Abstract
Chroma-based audio features, which closely correlate to the aspect of harmony, are a well-established tool in processing and analyzing music data. There are many ways of computing and enhancing chroma features, which results in a large number of chroma variants with different properties. In this paper, we present a chroma toolbox [13], which contains MATLAB implementations for extracting various types of recently proposed pitch-based and chroma-based audio features. By providing the MATLAB implementations on a well-documented website under a GNU-GPL license, we aim to foster research in music information retrieval. As another goal, we want to raise awareness that there is no single chroma variant that works best in all applications. To this end, we discuss two example applications showing that the final music analysis result may crucially depend on the initial feature design step.



Citations
Journal Article

Feature learning and deep architectures: new directions for music informatics

TL;DR: This work critically reviews the standard approach to music signal analysis and identifies three specific deficiencies in current methods: hand-crafted feature design is sub-optimal and unsustainable, the power of shallow architectures is fundamentally limited, and short-time analysis cannot encode musically meaningful structure.
Proceedings Article

A software framework for musical data augmentation

TL;DR: This work develops a general software framework for augmenting annotated musical datasets, which will allow practitioners to easily expand training sets with musically motivated perturbations of both audio and annotations.
Journal Article

Environment Sound Classification Using a Two-Stream CNN Based on Decision-Level Fusion.

TL;DR: The proposed TSCNN-DS model achieves a classification accuracy of 97.2% on the UrbanSound8K dataset, which is reported as the highest accuracy among the existing models compared.
Proceedings Article

Music Genre Recognition Using Deep Neural Networks and Transfer Learning.

TL;DR: This work proposes a novel approach for music genre recognition using an ensemble of convolutional long short-term memory neural networks (CNN LSTM) and a transfer learning model, and reports that the model achieves new state-of-the-art results.
Proceedings Article

A haptic texture database for tool-mediated texture recognition and classification

TL;DR: A haptic texture database is introduced that allows a systematic analysis of feature candidates. Six well-established features from audio and speech recognition are tested and compared, together with a Gaussian Mixture Model-based classifier, on recorded free-hand signals.
References
Book

Psychoacoustics: Facts and Models

TL;DR: This description of the processing of sound by the human hearing system presents the quantitative relationship between sound stimuli and auditory perceptions in terms of hearing sensations, and implements these relationships in model form.
Book

Information Retrieval for Music and Motion

TL;DR: Analysis and Retrieval Techniques for Music Data, SyncPlayer: An Advanced Audio Player, and Relational Features and Adaptive Segmentation.
Journal Article

Circularity in Judgments of Relative Pitch

TL;DR: In this article, a special set of computer-generated complex tones is shown to lead to a complete breakdown of transitivity in judgments of relative pitch, and the results demonstrate the operation of a "proximity principle" for the continuum of frequency and suggest that perceived pitch cannot be adequately represented by a purely rectilinear scale.
Proceedings Article

Realtime Chord Recognition of Musical Sound : a System Using Common Lisp Music

Proceedings Article

Identifying `Cover Songs' with Chroma Features and Dynamic Programming Beat Tracking

TL;DR: A system that attempts to identify cover-song relationships between music audio recordings. It achieved the best performance in an independent international evaluation, with a mean reciprocal ranking of 0.49 for true cover versions among top-10 returns.