
Showing papers by "J. Stephen Downie" published in 2008


Journal ArticleDOI
TL;DR: This paper examines the background, structure, challenges, and contributions of MIREX and finds that there are groups of systems that perform equally well within various MIR tasks.
Abstract: The Music Information Retrieval Evaluation eXchange (MIREX) is the community-based framework for the formal evaluation of Music Information Retrieval (MIR) systems and algorithms. By looking at the background, structure, challenges, and contributions of MIREX this paper provides some insights into the world of MIR research. Because MIREX tasks are defined by the community they reflect the interests, techniques, and research paradigms of the community as a whole. Both MIREX and MIR have a strong bias toward audio-based approaches as most MIR researchers have strengths in signal processing. Spectral-based approaches to MIR tasks have led to advancements in the MIR field but they now appear to be reaching their limits of effectiveness. This limitation is called the “glass ceiling” problem and the MIREX results data support its existence. The post-hoc analyses of MIREX results data indicate that there are groups of systems that perform equally well within various MIR tasks. There are many challenges facing MIREX and MIR research most of which have their root causes in the intellectual property issues surrounding music. The current inability of researchers to test their approaches against the MIREX test collections outside the annual MIREX cycle is hindering the rapid development of improved MIR systems.

310 citations


Proceedings Article
01 Jan 2008
TL;DR: This paper describes important issues in setting up the AMC task, including dataset construction and ground-truth labeling, and analyzes human assessments of the audio dataset as well as system performance from various angles.
Abstract: Recent music information retrieval (MIR) research pays increasing attention to music classification based on moods expressed by music pieces. The first Audio Mood Classification (AMC) evaluation task was held in the 2007 running of the Music Information Retrieval Evaluation eXchange (MIREX). This paper describes important issues in setting up the task, including dataset construction and ground-truth labeling, and analyzes human assessments on the audio dataset, as well as system performances from various angles. Interesting findings include system performance differences with regard to mood clusters and the levels of agreement amongst human judgments regarding mood labeling. Based on these analyses, we summarize experiences learned from the first community scale evaluation of the AMC task and propose recommendations for future AMC and similar evaluation tasks.

166 citations


Proceedings Article
01 Dec 2008
TL;DR: Analyses of the 2006 and 2007 results of the Music Information Retrieval Evaluation eXchange (MIREX) Audio Cover Song Identification (ACS) tasks indicate that significant improvements in this domain were made over the course of 2006-2007.
Abstract: This paper presents analyses of the 2006 and 2007 results of the Music Information Retrieval Evaluation eXchange (MIREX) Audio Cover Song Identification (ACS) tasks. The Music Information Retrieval Evaluation eXchange (MIREX) is a community-based endeavor to scientifically evaluate music information retrieval (MIR) algorithms and techniques. The ACS task was created to motivate MIR researchers to expand their notions of similarity beyond acoustic similarity to include the important idea that musical works retain their identity notwithstanding variations in style, genre, orchestration, rhythm or melodic ornamentation, etc. A series of statistical analyses were performed that indicate significant improvements in this domain have been made over the course of 2006-2007. Post-hoc analyses reveal distinct differences between individual systems and the effects of certain classes of queries on performance. This paper discusses some of the techniques that show promise in this research domain.

26 citations


Journal ArticleDOI
TL;DR: The experimental results indicate that compared with the traditional DP and the other three competitive schemes, E2LSH-SDP exhibits the best tradeoff in terms of the response time, retrieval accuracy and computation cost.
Abstract: This paper investigates suitable indexing techniques to enable efficient content-based audio retrieval in large acoustic databases. To make an index-based retrieval mechanism applicable to audio content, we investigate the design of Locality Sensitive Hashing (LSH) and the partial sequence comparison. We propose a fast and efficient audio retrieval framework of query-by-content and develop an audio retrieval system. Based on this framework, four different audio retrieval schemes, LSH-Dynamic Programming (DP), LSH-Sparse DP (SDP), Exact Euclidean LSH (E2LSH)-DP, and E2LSH-SDP, are introduced and evaluated in order to better understand the performance of audio retrieval algorithms. The experimental results indicate that compared with the traditional DP and the other three competitive schemes, E2LSH-SDP exhibits the best tradeoff in terms of the response time, retrieval accuracy and computation cost.

7 citations
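
For readers unfamiliar with the E2LSH indexing named in the abstract above, the following is a minimal sketch of the generic p-stable hashing idea, h(v) = floor((a·v + b) / w), not the paper's implementation. The class name `E2LSHTable`, the parameters `k`, `w`, and `seed`, and the use of plain fixed-length feature vectors are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


class E2LSHTable:
    """One hash table keyed by k concatenated p-stable (Gaussian) hashes.

    Hypothetical illustration: names and defaults are not from the paper.
    """

    def __init__(self, dim, k=4, w=4.0, seed=0):
        rng = random.Random(seed)
        self.w = w  # bucket width along each random projection
        # k random Gaussian projection vectors and k offsets in [0, w)
        self.a = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]
        self.b = [rng.uniform(0.0, w) for _ in range(k)]
        self.buckets = defaultdict(list)

    def key(self, v):
        """Bucket key: one floor((a.v + b) / w) value per hash function."""
        return tuple(
            math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / self.w)
            for a, b in zip(self.a, self.b)
        )

    def add(self, item_id, v):
        self.buckets[self.key(v)].append(item_id)

    def candidates(self, q):
        """Items sharing the query's bucket; these would then be re-ranked,
        for example by a sequence comparison such as DP."""
        return list(self.buckets.get(self.key(q), []))
```

Nearby vectors tend to share a bucket and distant ones tend not to, so only the returned candidates need the more expensive sequence comparison. A practical index would use several such tables to raise recall.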


Proceedings ArticleDOI
30 Oct 2008
TL;DR: This paper investigates the following task: the original track of a song is embedded in a dataset and, given a batch of multi-variant audio tracks of this song as input, the retrieval system returns a list ordered by similarity and indicates the position of the relevant audio track.
Abstract: Multi-variant music tracks are those audio tracks of a particular song which are sung and recorded by different people (i.e., cover songs). As music social clubs grow on the Internet, more and more people like to upload music recordings onto such music social sites to share their own home-produced albums and participate in Internet singing contests. Therefore it is very important to explore a computer-assisted evaluation tool to detect these audio-based multi-variant tracks. In this paper we investigate the following task: the original track of a song is embedded in a dataset and, given a batch of multi-variant audio tracks of this song as input, our retrieval system returns a list ordered by similarity and indicates the position of the relevant audio track. To help process multi-variant audio tracks, we suggest a semantic indexing framework and propose the Federated Features (FF) scheme to generate the semantic summarization of audio feature sequences. The conjunction of federated features with three typical similarity searching schemes, K-Nearest Neighbor (KNN), Locality Sensitive Hashing (LSH), and Exact Euclidean LSH (E2LSH), is evaluated. From these findings, a computer-assisted evaluation tool for searching multi-variant audio tracks was developed to search over large musical audio datasets.

4 citations
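
The retrieval task described in the abstract above (rank the dataset by similarity to a query and report where the relevant track lands) can be sketched with an exact nearest-neighbor baseline. This is an illustrative sketch only: it assumes each track has already been summarized as a fixed-length vector, which stands in for the Federated Features that the listing does not specify, and the function names `rank_tracks` and `position_of` are hypothetical.

```python
import math


def rank_tracks(query, dataset):
    """Return track ids sorted by increasing Euclidean distance to the query.

    `dataset` maps a track id to its (assumed) fixed-length summary vector.
    """
    return [
        track_id
        for track_id, _ in sorted(
            dataset.items(), key=lambda kv: math.dist(query, kv[1])
        )
    ]


def position_of(track_id, ranking):
    """1-based position of the relevant track in the ranked list."""
    return ranking.index(track_id) + 1


# Toy data: a cover-version query close to the stored original track.
dataset = {
    "original": [1.0, 0.0],
    "other": [5.0, 5.0],
    "noise": [-3.0, 2.0],
}
ranking = rank_tracks([0.9, 0.1], dataset)
print(ranking, position_of("original", ranking))
```

An LSH or E2LSH index would replace the exhaustive sort with a lookup over a small candidate set, trading some accuracy for speed.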


Proceedings ArticleDOI
26 Oct 2008
TL;DR: This paper presents a content-based audio COver Song IdeNtification (COSIN) system that detects and groups cover songs and incorporates a set of tools to evaluate retrieval performance so researchers can explore different retrieval schemes and parameters.
Abstract: We develop a content-based audio COver Song IdeNtification (COSIN) system to detect/group cover songs. The COSIN takes music audio content as input and performs similarity searching to locate variants of the input (i.e., cover versions). Identified cover songs are returned in rank order according to their similarity to the input. The COSIN also incorporates a set of tools to evaluate retrieval performance so researchers can explore different retrieval schemes and parameters (e.g., recall, precision). The COSIN utilizes a suite of techniques to detect cover songs including: Pitch + Dynamic Programming (DP), Chroma + DP, and Semantic Feature Summarization (SFS) + Hash-Based Approximate Matching (HBAM). The demonstration system shows that COSIN is a promising music content retrieval tool. In recent experiments running music retrieval schemes on the COSIN platform, SFS + LSH variants demonstrate a well-balanced tradeoff between efficiency (search speed) and performance (search accuracy).

3 citations
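
The "feature + DP" schemes named in the abstract above compare two feature sequences by dynamic programming. As a hedged illustration, the sketch below uses classic dynamic time warping (DTW), a standard DP alignment that may differ from the DP variant used in COSIN. The frame vectors stand in for pitch or chroma features, and the name `dtw_distance` is hypothetical.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Classic DTW cost between two sequences of equal-dimension frames.

    cost[i][j] is the cheapest alignment of the first i frames of seq_a
    with the first j frames of seq_b.
    """
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])  # local frame distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of seq_a repeated
                cost[i][j - 1],      # frame of seq_b repeated
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]
```

Because the alignment may stretch either sequence, a version played at a different tempo can still score as close to the original. The table costs time proportional to the product of the two lengths, which is the cost that hash-based candidate filtering aims to avoid.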


Proceedings ArticleDOI
16 Jun 2008
TL;DR: This paper introduces the CARDINAL (Computer Assisted Recognition and Discovery in Natural Acoustic Landscapes) interface system for use in Bioacoustic Digital Libraries (BADL).
Abstract: This paper introduces the CARDINAL (Computer Assisted Recognition and Discovery in Natural Acoustic Landscapes) interface system for use in Bioacoustic Digital Libraries (BADL).

3 citations


Proceedings ArticleDOI
16 Jun 2008
TL;DR: This paper outlines a dynamic classification explorer for music digital library users and researchers that provides multiple simultaneous classification visualizations and synchronized audio.
Abstract: This paper outlines a dynamic classification explorer for music digital library users and researchers. The system provides multiple simultaneous classification visualizations and synchronized audio.

3 citations