Open Access Proceedings Article

An Auditory Model Based Transcriber of Singing Sequences

TLDR
A new system for the automatic transcription of singing sequences into a sequence of pitch and duration pairs is presented, and it is shown that the accuracy of the newly proposed transcription system is not very sensitive to the choice of the free parameters, at least as long as they remain in the vicinity of the values one could forecast on the basis of their meaning.
Abstract
In this paper, a new system for the automatic transcription of singing sequences into a sequence of pitch and duration pairs is presented. Although such a system may have a wider range of applications, it was mainly developed to become the acoustic module of a query-by-humming (QBH) system for retrieving pieces of music from a digitized musical library. The first part of the paper is devoted to the systematic evaluation of a variety of state-of-the-art transcription systems. The main result of this evaluation is that there is clearly a need for more accurate systems. The segmentation in particular was found to be too error prone ( % segmentation errors). In the second part of the paper, a new auditory model based transcription system is proposed and evaluated. The results of that evaluation are very promising: segmentation errors vary between 0 and 7 %, depending on the amount of lyrics used by the singer. The paper ends with the description of an experimental study conducted to demonstrate that the accuracy of the newly proposed transcription system is not very sensitive to the choice of the free parameters, at least as long as they remain in the vicinity of the values one could forecast on the basis of their meaning.
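The core output format described above, a sequence of (pitch, duration) pairs, can be illustrated with a minimal sketch. This is not the paper's auditory-model method; it simply groups a frame-level F0 track (a hypothetical input) into notes by merging consecutive voiced frames whose pitch stays within a semitone tolerance:

```python
import math

def track_to_notes(f0_track, hop_s=0.010, tol_semitones=0.5):
    """Group a frame-level F0 track (Hz, 0.0 = unvoiced) into
    (pitch_hz, duration_s) pairs. Illustrative only."""
    notes = []
    cur_pitch, cur_frames = None, 0
    for f0 in f0_track:
        # Extend the current note if the frame is voiced and close in pitch.
        if f0 > 0 and cur_pitch is not None and \
           abs(12 * math.log2(f0 / cur_pitch)) <= tol_semitones:
            cur_frames += 1
        else:
            # Close the previous note, if any, and start a new one.
            if cur_pitch is not None and cur_frames > 0:
                notes.append((round(cur_pitch, 1),
                              round(cur_frames * hop_s, 3)))
            cur_pitch, cur_frames = (f0, 1) if f0 > 0 else (None, 0)
    if cur_pitch is not None and cur_frames > 0:
        notes.append((round(cur_pitch, 1), round(cur_frames * hop_s, 3)))
    return notes
```

For example, ten frames at 220 Hz, two unvoiced frames, and five frames at 440 Hz yield two notes. Real segmentation is much harder than this, which is exactly the error source the paper targets.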



Citations
Book

MPEG-7 Audio and Beyond: Audio Content Indexing and Retrieval

TL;DR: A comparison of MPEG-7 Audio Spectrum Projection and MFCC features for distinguishing between speech, music, and environmental sound shows that the former is superior to the latter in terms of sound classification.
Journal ArticleDOI

Automatic Music Transcription as We Know it Today

TL;DR: The aim of this overview is to describe methods for the automatic transcription of Western polyphonic music as transforming an acoustic musical signal into a MIDI-like symbolic representation, with main emphasis on estimating the multiple fundamental frequencies of several concurrent sounds.

Signal Processing Methods for the Automatic Transcription of Music

TL;DR: Signal processing methods for the automatic transcription of music are developed in this thesis and the main part of the thesis is dedicated to multiple fundamental frequency (F0) estimation, that is, estimation of the F0s of several concurrent musical sounds.
Journal ArticleDOI

Prediction of Musical Affect Using a Combination of Acoustic Structural Cues

TL;DR: The results indicate that musical affect attribution can partly be predicted using a combination of acoustical structural cues, and that manually determined structural cues worked better than acoustically derived ones.
Journal ArticleDOI

Name that tune: a pilot study in finding a melody from a sung query

TL;DR: The approach to the construction of a target database of themes, the encoding and transcription of user queries, and the results of preliminary experimentation with a set of sung queries show that while neither approach is clearly superior, string matching has a slight advantage.
References
Journal ArticleDOI

Dynamic programming algorithm optimization for spoken word recognition

TL;DR: This paper reports on an optimum dynamic programming (DP) based time-normalization algorithm for spoken word recognition, in which the warping function slope is restricted so as to improve discrimination between words in different categories.
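The DP-based time normalization referred to here is dynamic time warping. A minimal unconstrained sketch of the recursion is shown below; the cited paper studies slope-restricted and symmetric/asymmetric variants of this basic form, which are not reproduced here:

```python
def dtw_distance(a, b):
    """Unconstrained DTW distance between two sequences of numbers.
    Illustrative sketch; no slope restriction or adjustment window."""
    n, m = len(a), len(b)
    INF = float("inf")
    # D[i][j] = cost of best warping path aligning a[:i] with b[:j].
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend from a match, an insertion, or a deletion.
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]
```

Restricting which of the three predecessors are allowed (and how often each may repeat) is what the slope constraint adds: it keeps the warping path from collapsing onto a single frame of either sequence.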
Journal ArticleDOI

Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences

TL;DR: In this article, several parametric representations of the acoustic signal were compared with regard to word recognition performance in a syllable-oriented continuous speech recognition system, and the emphasis was on the ability to retain phonetically significant acoustic information in the face of syntactic and duration variations.
Proceedings ArticleDOI

Query by humming: musical information retrieval in an audio database

TL;DR: A system for querying an audio database by humming is described, along with a scheme for representing the melodic information in a song as relative pitch changes, and the performance results of the system, indicating its effectiveness, are presented.
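The relative-pitch representation can be sketched in a few lines. Assuming the common three-symbol coding (U = up, D = down, S = same; the exact alphabet and tolerance are choices of the cited system, not reproduced from it), each note is compared to its predecessor:

```python
def contour(pitches, tol=0.0):
    """Encode a pitch sequence (e.g. MIDI note numbers) as a string of
    relative pitch changes: U (up), D (down), S (same within tol)."""
    out = []
    for prev, cur in zip(pitches, pitches[1:]):
        if cur - prev > tol:
            out.append("U")
        elif prev - cur > tol:
            out.append("D")
        else:
            out.append("S")
    return "".join(out)
```

Because the contour discards absolute pitch, a query hummed in any key matches the same string, which is what makes this coding attractive for retrieval despite its low specificity.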
Journal ArticleDOI

A comparative performance study of several pitch detection algorithms

TL;DR: A comparative performance study of seven pitch detection algorithms was conducted on a database of eight utterances spoken by three males, three females, and one child, to assess their relative performance as a function of recording condition and the pitch range of the various speakers.