Open Access, Journal Article

Metrics for Polyphonic Sound Event Detection

Annamaria Mesaros, +2 more
25 May 2016, Vol. 6, Iss. 6, pp. 162
Abstract
This paper presents and discusses various metrics proposed for evaluation of polyphonic sound event detection systems used in realistic situations where there are typically multiple sound sources active simultaneously. The system output in this case contains overlapping events, marked as multiple sounds detected as being active at the same time. The polyphonic system output requires a suitable procedure for evaluation against a reference. Metrics from neighboring fields such as speech recognition and speaker diarization can be used, but they need to be partially redefined to deal with the overlapping events. We present a review of the most common metrics in the field and the way they are adapted and interpreted in the polyphonic case. We discuss segment-based and event-based definitions of each metric and explain the consequences of instance-based and class-based averaging using a case study. In parallel, we provide a toolbox containing implementations of the presented metrics.
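The difference between instance-based and class-based averaging mentioned in the abstract can be illustrated with a minimal sketch. It is not the toolbox that accompanies the paper; the function names, the two-class data, and the representation of each segment as a set of active labels are all assumptions made for illustration. The sketch computes a segment-based F-score in two ways: pooling counts over all classes (instance-based) and averaging per-class scores (class-based).

```python
from typing import Dict, List, Set


def segment_counts(reference: List[Set[str]],
                   estimated: List[Set[str]],
                   classes: List[str]) -> Dict[str, Dict[str, int]]:
    """Count per-class TP, FP, FN over aligned fixed-length segments.

    Each list element is the set of event labels active in one segment
    (a hypothetical representation chosen for this sketch).
    """
    counts = {c: {"tp": 0, "fp": 0, "fn": 0} for c in classes}
    for ref, est in zip(reference, estimated):
        for c in classes:
            if c in ref and c in est:
                counts[c]["tp"] += 1
            elif c in est:
                counts[c]["fp"] += 1
            elif c in ref:
                counts[c]["fn"] += 1
    return counts


def f1(tp: int, fp: int, fn: int) -> float:
    """F-score from counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def instance_based_f1(counts: Dict[str, Dict[str, int]]) -> float:
    """Pool the counts of all classes, then compute one F-score."""
    tp = sum(c["tp"] for c in counts.values())
    fp = sum(c["fp"] for c in counts.values())
    fn = sum(c["fn"] for c in counts.values())
    return f1(tp, fp, fn)


def class_based_f1(counts: Dict[str, Dict[str, int]]) -> float:
    """Compute one F-score per class, then take the unweighted mean."""
    return sum(f1(**c) for c in counts.values()) / len(counts)


# Made-up example: "speech" is active in every segment and always detected,
# while "car" overlaps with speech in one segment and is missed.
reference = [{"speech"}, {"speech", "car"}, {"speech"}, {"speech"}]
estimated = [{"speech"}, {"speech"}, {"speech"}, {"speech"}]

counts = segment_counts(reference, estimated, ["speech", "car"])
print(f"instance-based F1: {instance_based_f1(counts):.3f}")  # 0.889
print(f"class-based F1:    {class_based_f1(counts):.3f}")     # 0.500
```

In this toy case the frequent class dominates the pooled counts, so the instance-based score stays high even though one class is never detected, whereas the class-based score gives both classes equal weight.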


Citations
Proceedings Article

TUT database for acoustic scene classification and sound event detection

TL;DR: The recording and annotation procedure, the database content, a recommended cross-validation setup, and the performance of a supervised acoustic scene classification system and an event detection baseline system using mel-frequency cepstral coefficients and Gaussian mixture models are presented.
Journal Article

Deep Learning for Audio Signal Processing

TL;DR: Speech, music, and environmental sound processing are considered side by side in order to point out similarities and differences between the domains, highlighting general methods, problems, key references, and potential for cross-fertilization between areas.
Journal Article

Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection

TL;DR: This paper proposes a convolutional recurrent neural network (CRNN) for the polyphonic sound event detection task, compares it with CNN, RNN, and other established methods, and reports a considerable improvement on four different datasets consisting of everyday sound events.

DCASE 2017 challenge setup: tasks, datasets and baseline system

TL;DR: This paper presents the setup of these tasks: task definition, dataset, experimental setup, and baseline system results on the development dataset.
Journal Article

Sound Event Localization and Detection of Overlapping Sources Using Convolutional Recurrent Neural Networks

TL;DR: The proposed convolutional recurrent neural network for joint sound event localization and detection (SELD) of multiple overlapping sound events in three-dimensional (3-D) space is generic and applicable to any array structure, and is robust to unseen DOA values, reverberation, and low-SNR scenarios.