Automatic music transcription: challenges and future directions
Frequently Asked Questions (13)
Q2. What future work is proposed in "Automatic music transcription: challenges and future directions"?
For further insights into current research challenges in MIR, see [118]. Another promising direction for further research is the combination of multiple processing principles: for example, different algorithms with complementary properties that estimate a particular feature, or algorithms that extract various types of musical information, such as the key, metrical structure, and instrument identities, and feed it into a model that provides context for the note detection process. However, the authors believe that AMT research has reached the point where certain practical end-user applications can be built, especially where transcribed notes are used as a basis for extracting higher-level information, and they expect to see many more of these appearing in the near future as the state of AMT research advances. For example, the authors discussed how a genre- or instrument-specific transcription system can utilise high-level models that are more precise and powerful than their more general counterparts.
Q3. What is the weighted score function for the pitch candidate set?
The weighted score function for the pitch candidate set combines four features: harmonicity, mean bandwidth, spectral centroid, and "synchronicity" (synchrony).
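The answer names the four features but not how they are combined. A minimal sketch of such a weighted scoring is given below; the feature values, the weights, and the linear combination are illustrative assumptions, not the paper's actual formulation:

```python
def candidate_score(features, weights):
    """Score one pitch candidate as a weighted sum of its features."""
    return sum(weights[name] * features[name] for name in weights)

# Hypothetical feature values for one pitch candidate, each normalised to [0, 1].
features = {
    "harmonicity": 0.9,        # how well the partials fit a harmonic series
    "mean_bandwidth": 0.2,     # narrower partials give a lower value
    "spectral_centroid": 0.5,  # brightness of the candidate's spectrum
    "synchronicity": 0.8,      # how synchronously the partials evolve
}
# Illustrative weights; bandwidth is penalised, the rest rewarded.
weights = {"harmonicity": 0.4, "mean_bandwidth": -0.1,
           "spectral_centroid": 0.2, "synchronicity": 0.3}

score = candidate_score(features, weights)
```

Candidates would then be ranked by `score` and the best-scoring pitches retained.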
Q4. What is the first step towards understanding the underlying periodicities and accents in the music?
Onset detection (finding the beginnings of notes or events) is the first step towards understanding the underlying periodicities and accents in the music, which ultimately define the rhythm.
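A common baseline for onset detection is spectral flux: measure how much each frequency bin's magnitude grows from frame to frame and pick peaks. The sketch below illustrates this idea; the frame size, hop, threshold, and peak-picking rule are illustrative choices, not the paper's method:

```python
import numpy as np

def spectral_flux_onsets(x, frame=1024, hop=512, threshold=10.0):
    """Onset detection via half-wave-rectified spectral flux with naive
    peak picking; returns the frame indices of detected onsets."""
    starts = range(0, len(x) - frame, hop)
    mags = [np.abs(np.fft.rfft(x[s:s + frame] * np.hanning(frame)))
            for s in starts]
    # Flux at frame k: total magnitude growth across bins since frame k-1.
    flux = [np.sum(np.maximum(mags[k] - mags[k - 1], 0.0))
            for k in range(1, len(mags))]
    # A frame is an onset if its flux is a local maximum above the threshold.
    return [k + 1 for k in range(1, len(flux) - 1)
            if flux[k] > threshold and flux[k] >= flux[k - 1]
            and flux[k] > flux[k + 1]]

# Toy signal: silence, then a 440 Hz tone entering at sample 8192 (frame 16).
sr = 22050
t = np.arange(sr) / sr
x = np.where(np.arange(sr) >= 8192, np.sin(2 * np.pi * 440 * t), 0.0)
onsets = spectral_flux_onsets(x)
```

The detected onset frames would then feed periodicity analysis (tempo and beat tracking) downstream.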
Q5. What is the common application area for scoring?
One application area where a score is available is automatic instrument tutoring [14, 36, 124], where a system evaluates the performance of a student based on a reference score and provides feedback.
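Such a tutoring system needs, at minimum, to align the student's detected notes against the reference score and report hits, misses, and extras. A minimal sketch, assuming notes are (MIDI pitch, onset time) pairs and using a simple onset tolerance (both representation and tolerance are assumptions, not the cited systems' design):

```python
def compare_to_reference(reference, played, onset_tol=0.1):
    """Match played notes to a reference score; report correct, missed,
    and extra notes. Notes are (midi_pitch, onset_seconds) pairs."""
    unmatched = list(played)
    correct, missed = [], []
    for pitch, onset in reference:
        hit = next((n for n in unmatched
                    if n[0] == pitch and abs(n[1] - onset) <= onset_tol), None)
        if hit is not None:
            unmatched.remove(hit)
            correct.append((pitch, onset))
        else:
            missed.append((pitch, onset))
    return correct, missed, unmatched  # unmatched = extra notes played

reference = [(60, 0.0), (62, 0.5), (64, 1.0)]
played = [(60, 0.02), (62, 0.7), (64, 1.05), (65, 1.5)]
correct, missed, extra = compare_to_reference(reference, played)
```

Feedback to the student would then summarise `missed` (notes to practise) and `extra` (mistakes), possibly with timing statistics over `correct`.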
Q6. What disciplines are needed to enable progress in these directions?
To enable progress in these directions, expertise from a range of disciplines will be needed, such as musicology, acoustics, audio engineering, cognitive science and computing.
Q7. What are the main issues in order to produce output in the form of sheet music?
In order to produce output in the form of sheet music, additional issues need to be addressed, such as typesetting, estimation of dynamics, fingering, expressive notation and articulation.
Q8. What could be used to describe the local sequential dependencies of notes and chords?
Musicological models could be employed to describe these local sequential dependencies [115], as well as longer-term relationships such as structural repetition and key.
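The simplest musicological model of local sequential dependencies is a first-order Markov (bigram) model over note transitions. The sketch below trains one from a toy corpus; real systems such as [115] are considerably richer, and the corpus here is invented for illustration:

```python
from collections import Counter, defaultdict

def train_bigram(note_sequences):
    """Estimate P(next_note | current_note) from sequences of MIDI pitches:
    a minimal first-order Markov 'musicological model'."""
    counts = defaultdict(Counter)
    for seq in note_sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    # Normalise each row of counts into a transition probability distribution.
    return {a: {b: c / sum(nxt.values()) for b, c in nxt.items()}
            for a, nxt in counts.items()}

# Toy corpus of MIDI pitch sequences.
model = train_bigram([[60, 62, 64, 62, 60], [60, 62, 60]])
```

During transcription such a model can rescore ambiguous pitch candidates: a note that is musically likely given its predecessor gets a boost over an acoustically similar but unlikely one.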
Q9. What could be the reason for the improved performance of the algorithm?
A possible explanation for the algorithm's improved performance is its more sophisticated note tracking stage, which is based on perceptual studies, whereas standard note tracking systems simply filter the note activations.
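For contrast, the "simple filtering" baseline the answer refers to can be sketched as thresholding a pitch's activation envelope and keeping sufficiently long runs; the threshold and minimum duration below are illustrative values:

```python
import numpy as np

def track_notes(activations, threshold=0.5, min_frames=3):
    """Baseline note tracking: threshold one pitch's activation envelope and
    keep runs of at least `min_frames` active frames as (start, end) notes."""
    active = activations >= threshold
    notes, start = [], None
    for i, a in enumerate(active):
        if a and start is None:
            start = i                      # a note begins
        elif not a and start is not None:
            if i - start >= min_frames:    # discard spurious short blips
                notes.append((start, i))
            start = None
    if start is not None and len(active) - start >= min_frames:
        notes.append((start, len(active)))  # note still active at the end
    return notes

acts = np.array([0.1, 0.6, 0.7, 0.8, 0.2, 0.9, 0.1, 0.6, 0.7, 0.7, 0.6])
notes = track_notes(acts)
```

A perceptually motivated tracker would replace the fixed threshold and duration rule with criteria derived from how listeners group and segregate tones.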
Q10. What is the way to train cluster-specific transcription parameters?
One possible use is to cluster the data (for example, according to automatically detected genres) and then train cluster-specific transcription parameters.
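A minimal sketch of that idea, assuming tracks already carry an automatically detected genre label and using a single per-cluster parameter (a mean onset threshold) as an illustrative stand-in for a full parameter set:

```python
from collections import defaultdict
from statistics import mean

def cluster_specific_params(tracks):
    """Group tracks by detected genre and fit one parameter per cluster;
    here the 'parameter' is just the mean of per-track tuned thresholds."""
    clusters = defaultdict(list)
    for genre, best_threshold in tracks:
        clusters[genre].append(best_threshold)
    return {genre: mean(vals) for genre, vals in clusters.items()}

# Hypothetical (genre, per-track tuned threshold) pairs.
params = cluster_specific_params([
    ("jazz", 0.30), ("jazz", 0.40), ("rock", 0.60), ("rock", 0.70),
])
```

At transcription time a new recording would first be assigned to a cluster, then decoded with that cluster's parameters rather than a one-size-fits-all setting.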
Q11. What is the main challenge of combining algorithms?
Although addressing each subtask separately is necessary, given the complexity of each one, the challenge remains to combine the outputs of the algorithms, or better, the algorithms themselves, to perform joint estimation of all parameters and thereby avoid the cascading of errors that occurs when algorithms are combined sequentially.
Q12. What other subtasks are used to estimate the features of the music?
The other subtasks involve the estimation of features relating to rhythm, melody, harmony and instrumentation, which carry information that, if integrated, could improve transcription performance.
Q13. What are some of the approaches that could be used to treat notes as time-frequency objects?
Other approaches that treat notes as time-frequency objects and exploit dynamic time warping or HMMs integrated at a low level could offer a fresh perspective on research in the field.
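Dynamic time warping, one of the techniques mentioned, aligns two sequences that evolve at different rates. The classic recurrence is sketched below; using scalar features and absolute-difference cost is an illustrative simplification of comparing time-frequency objects:

```python
def dtw_distance(a, b, cost=lambda x, y: abs(x - y)):
    """Classic dynamic time warping distance between two feature sequences,
    as might align a note template with an observed time-frequency object."""
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = cost(a[i - 1], b[j - 1]) + min(
                D[i - 1][j],      # stretch a (insertion)
                D[i][j - 1],      # stretch b (deletion)
                D[i - 1][j - 1],  # step both (match)
            )
    return D[n][m]

# The duplicated frame in the first sequence is absorbed by the warping.
d = dtw_distance([1, 2, 3, 3, 4], [1, 2, 3, 4])
```

Applied to notes, each sequence element would be a spectral frame (or a whole note object), with `cost` replaced by a spectral distance.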