Open Access Proceedings Article
An Auditory Model Based Transcriber of Singing Sequences
Lieven Clarisse, Jean-Pierre Martens, Micheline Lesaffre, Bernard De Baets, Hans De Meyer, Marc Leman
- pp 16-20
TL;DR: A new system for the automatic transcription of singing sequences into a sequence of pitch and duration pairs is presented, and it is shown that the accuracy of the newly proposed transcription system is not very sensitive to the choice of the free parameters, at least as long as they remain in the vicinity of the values one could forecast on the basis of their meaning.
Abstract:
In this paper, a new system for the automatic transcription of singing sequences into a sequence of pitch and duration pairs is presented. Although such a system may have a wider range of applications, it was mainly developed to become the acoustic module of a query-by-humming (QBH) system for retrieving pieces of music from a digitized musical library. The first part of the paper is devoted to the systematic evaluation of a variety of state-of-the-art transcription systems. The main result of this evaluation is that there is clearly a need for more accurate systems. The segmentation in particular proved too error-prone ( % segmentation errors). In the second part of the paper, a new auditory model based transcription system is proposed and evaluated. The results of that evaluation are very promising: segmentation errors vary between 0 and 7 %, depending on the amount of lyrics used by the singer. The paper ends with the description of an experimental study conducted to demonstrate that the accuracy of the newly proposed transcription system is not very sensitive to the choice of the free parameters, at least as long as they remain in the vicinity of the values one could forecast on the basis of their meaning.
Citations
Book
MPEG-7 Audio and Beyond: Audio Content Indexing and Retrieval
TL;DR: A comparison of MPEG-7 Audio Spectrum Projection vs. MFCC Features and Results for Distinguishing Between Speech, Music and Environmental Sound shows that the former is superior to the latter in terms of sound classification.
Journal Article
Automatic Music Transcription as We Know it Today
TL;DR: The aim of this overview is to describe methods for the automatic transcription of Western polyphonic music as transforming an acoustic musical signal into a MIDI-like symbolic representation, with main emphasis on estimating the multiple fundamental frequencies of several concurrent sounds.
Signal Processing Methods for the Automatic Transcription of Music
TL;DR: Signal processing methods for the automatic transcription of music are developed in this thesis, the main part of which is dedicated to multiple fundamental frequency (F0) estimation, that is, estimation of the F0s of several concurrent musical sounds.
Journal Article
Prediction of Musical Affect Using a Combination of Acoustic Structural Cues
TL;DR: The results indicate that musical affect attribution can partly be predicted using a combination of acoustical structural cues, and that manual structural cues worked better than acoustical structural cues.
Journal Article
Name that tune: a pilot study in finding a melody from a sung query
TL;DR: The approach to the construction of a target database of themes and to the encoding and transcription of user queries is described; the results of preliminary experimentation with a set of sung queries show that, while no approach is clearly superior to the others, string matching has a slight advantage.
References
Journal Article
Dynamic programming algorithm optimization for spoken word recognition
TL;DR: This paper reports on an optimum dynamic programming (DP) based time-normalization algorithm for spoken word recognition, in which the warping function slope is restricted so as to improve discrimination between words in different categories.
Journal Article
Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
S. Davis, Paul Mermelstein
TL;DR: In this article, several parametric representations of the acoustic signal were compared with regard to word recognition performance in a syllable-oriented continuous speech recognition system, and the emphasis was on the ability to retain phonetically significant acoustic information in the face of syntactic and duration variations.
Proceedings Article
Query by humming: musical information retrieval in an audio database
TL;DR: A system for querying an audio database by humming is described, along with a scheme for representing the melodic information in a song as relative pitch changes, and performance results indicating the system's effectiveness are presented.
Journal Article
A comparative performance study of several pitch detection algorithms
TL;DR: A comparative performance study of seven pitch detection algorithms was conducted, using eight utterances spoken by three males, three females, and one child, to assess their relative performance as a function of recording condition and of the pitch range of the various speakers.