Metrics for Polyphonic Sound Event Detection
TLDR
This paper presents and discusses various metrics proposed for evaluation of polyphonic sound event detection systems used in realistic situations where there are typically multiple sound sources active simultaneously.
Abstract:
This paper presents and discusses various metrics proposed for evaluation of polyphonic sound event detection systems used in realistic situations where there are typically multiple sound sources active simultaneously. The system output in this case contains overlapping events, marked as multiple sounds detected as being active at the same time. The polyphonic system output requires a suitable procedure for evaluation against a reference. Metrics from neighboring fields such as speech recognition and speaker diarization can be used, but they need to be partially redefined to deal with the overlapping events. We present a review of the most common metrics in the field and the way they are adapted and interpreted in the polyphonic case. We discuss segment-based and event-based definitions of each metric and explain the consequences of instance-based and class-based averaging using a case study. In parallel, we provide a toolbox containing implementations of presented metrics.
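To make the difference between instance-based and class-based averaging concrete, here is a minimal sketch of a segment-based F-score computed both ways. It is not taken from the paper's toolbox: the function names, the binary segment-by-class activity matrices, and the choice to score a class with no activity as 0.0 are all assumptions made for illustration.

```python
def counts(reference, output):
    """Per-class (TP, FP, FN) counts.

    `reference` and `output` are lists of segments; each segment is a list of
    0/1 activity flags, one per event class (an assumed representation).
    """
    n_classes = len(reference[0])
    tp = [0] * n_classes
    fp = [0] * n_classes
    fn = [0] * n_classes
    for ref_seg, out_seg in zip(reference, output):
        for c in range(n_classes):
            if ref_seg[c] and out_seg[c]:
                tp[c] += 1          # active in both reference and output
            elif out_seg[c]:
                fp[c] += 1          # detected but not in the reference
            elif ref_seg[c]:
                fn[c] += 1          # in the reference but missed
    return tp, fp, fn


def f_score(tp, fp, fn):
    """F-score from counts; returns 0.0 when undefined (an assumption)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def instance_based_f(reference, output):
    """Pool counts over all classes, then compute one F-score."""
    tp, fp, fn = counts(reference, output)
    return f_score(sum(tp), sum(fp), sum(fn))


def class_based_f(reference, output):
    """Compute an F-score per class, then average over classes."""
    tp, fp, fn = counts(reference, output)
    per_class = [f_score(*c) for c in zip(tp, fp, fn)]
    return sum(per_class) / len(per_class)


# Toy case: class 0 is frequent and always correct, class 1 is rare and always wrong.
reference = [[1, 0], [1, 1], [1, 0], [1, 0]]
output = [[1, 0], [1, 0], [1, 0], [1, 1]]
print(instance_based_f(reference, output))  # 0.8
print(class_based_f(reference, output))     # 0.5
```

In this toy case the instance-based score is dominated by the frequent class, while the class-based score gives the rare, poorly detected class equal weight, which is the kind of consequence the averaging choice can have.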
Citations
Proceedings Article
TUT database for acoustic scene classification and sound event detection
TL;DR: The recording and annotation procedure, the database content, a recommended cross-validation setup, and the performance of a supervised acoustic scene classification system and an event detection baseline system using mel frequency cepstral coefficients and Gaussian mixture models are presented.
Journal Article
Deep Learning for Audio Signal Processing
TL;DR: Speech, music, and environmental sound processing are considered side-by-side, in order to point out similarities and differences between the domains, highlighting general methods, problems, key references, and potential for cross-fertilization between areas.
Journal Article
Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection
TL;DR: In this paper, a convolutional recurrent neural network (CRNN) is proposed for the polyphonic sound event detection task and compared with CNN, RNN, and other established methods; a considerable improvement is observed on four different datasets consisting of everyday sound events.
DCASE 2017 challenge setup: tasks, datasets and baseline system
Annamaria Mesaros, Toni Heittola, Aleksandr Diment, Benjamin Elizalde, Ankit Shah, Emmanuel Vincent, Bhiksha Raj, Tuomas Virtanen
TL;DR: This paper presents the setup of these tasks: task definition, dataset, experimental setup, and baseline system results on the development dataset.
Journal Article
Sound Event Localization and Detection of Overlapping Sources Using Convolutional Recurrent Neural Networks
TL;DR: The proposed convolutional recurrent neural network for joint sound event localization and detection (SELD) of multiple overlapping sound events in three-dimensional (3-D) space is generic and applicable to any array structure, and is robust to unseen DOA values, reverberation, and low SNR scenarios.
References
Journal Article
Machine learning in automated text categorization
TL;DR: This survey discusses the main approaches to text categorization that fall within the machine learning paradigm and discusses in detail issues pertaining to three different problems, namely, document representation, classifier construction, and classifier evaluation.
Journal Article
Environmental Sound Recognition With Time–Frequency Audio Features
TL;DR: An empirical feature analysis for audio environment characterization is performed, and a matching pursuit algorithm is proposed to obtain effective time-frequency features that yield higher recognition accuracy for environmental sounds.
Journal Article
Detection and Classification of Acoustic Scenes and Events
TL;DR: The state of the art in automatically classifying audio scenes, and automatically detecting and classifying audio events, is reported on.
Journal Article
Apples-to-apples in cross-validation studies: pitfalls in classifier performance measurement
George Forman, Martin B. Scholz
TL;DR: It is shown by experiment that all but one of these computation methods lead to biased measurements, especially under high class imbalance, which is of particular interest to those designing machine learning software libraries and researchers focused on high class imbalance.
Proceedings Article
Events Detection for an Audio-Based Surveillance System
TL;DR: The automatic shot detection system presented is based on a novelty detection approach which offers a solution to detect abnormality (abnormal audio events) in continuous audio recordings of public places and takes advantage of potential similarity between the acoustic signatures of the different types of weapons by building a hierarchical classification system.