Joint time-frequency scattering for audio classification

doi:10.1109/MLSP.2015.7324385

Open Access, Proceedings Article

Joakim Andén, +2 more

- pp 1-6

TLDR

The joint time-frequency scattering transform is shown to characterize complex time-frequency phenomena such as time-varying filters and frequency-modulated excitations, and it achieves state-of-the-art phone segment classification results on the TIMIT dataset.

Abstract:

We introduce the joint time-frequency scattering transform, a time shift invariant descriptor of time-frequency structure for audio classification. It is obtained by applying a two-dimensional wavelet transform in time and log-frequency to a time-frequency wavelet scalogram. We show that this descriptor successfully characterizes complex time-frequency phenomena such as time-varying filters and frequency modulated excitations. State-of-the-art results are achieved for signal reconstruction and phone segment classification on the TIMIT dataset.
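
The construction described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian (Gabor-like) filters, the parameter values, and the function names `gabor_bank_1d`, `scalogram`, and `joint_scattering` are all hypothetical choices made for this example, and global averaging over time stands in for the low-pass averaging a full scattering transform would use.

```python
import numpy as np

def gabor_bank_1d(n, centers, q):
    """Gaussian band-pass filters in the Fourier domain (assumed filter shape).
    `centers` are centre frequencies in cycles/sample; bandwidth is centre / q."""
    freqs = np.fft.fftfreq(n)[None, :]            # (1, n)
    centers = np.asarray(centers)[:, None]        # (n_filters, 1)
    return np.exp(-0.5 * ((freqs - centers) / (centers / q)) ** 2)

def scalogram(x, n_octaves=6, q=8):
    """Modulus of a wavelet transform: rows are log-spaced frequencies, columns are time."""
    centers = 0.35 * 2.0 ** (-np.arange(n_octaves * q) / q)  # q filters per octave
    bank = gabor_bank_1d(len(x), centers, q)
    return np.abs(np.fft.ifft(np.fft.fft(x)[None, :] * bank, axis=1))

def joint_scattering(x, alphas=(1 / 64, 1 / 16), betas=(1 / 8, 1 / 4),
                     n_octaves=6, q=8):
    """Filter the scalogram with separable 2D band-pass filters in
    (log-frequency, time), take the modulus, then average over time."""
    X = scalogram(x, n_octaves, q)                # (n_lambda, n_t)
    n_lambda, n_t = X.shape
    X_hat = np.fft.fft2(X)
    f_time = np.fft.fftfreq(n_t)[None, :]         # temporal modulation axis
    f_logf = np.fft.fftfreq(n_lambda)[:, None]    # log-frequency modulation axis
    feats = []
    for alpha in alphas:                          # temporal modulation rates
        g_t = np.exp(-0.5 * ((f_time - alpha) / (alpha / 3)) ** 2)
        for beta in betas:                        # log-frequency modulation rates
            for sign in (+1, -1):                 # two orientations in the 2D plane
                g_l = np.exp(-0.5 * ((f_logf - sign * beta) / (beta / 3)) ** 2)
                Y = np.abs(np.fft.ifft2(X_hat * (g_l * g_t)))
                feats.append(Y.mean(axis=1))      # global time average (simplification)
    return np.stack(feats)                        # (n_paths, n_lambda)

# Usage on a synthetic chirp (a frequency-modulated excitation).
t = np.arange(4096)
x = np.cos(2 * np.pi * (0.05 * t + 0.00002 * t ** 2))
S = joint_scattering(x)
print(S.shape)  # (8, 48): 8 (alpha, beta, sign) paths x 48 log-frequency bins
```

Because every convolution here is circular and the final step averages over the whole time axis, the output is unchanged by a circular time shift of `x`. A practical implementation would instead average over a finite window, which gives invariance only up to that window's duration.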

Citations

Journal Article

Understanding deep convolutional networks.

Stéphane Mallat

- 13 Apr 2016 -

Philosophical Transactions of the Royal ...

TL;DR: Deep convolutional networks provide state-of-the-art classification and regression results on many high-dimensional problems, and a mathematical framework is introduced to analyse their properties.

Journal Article

Per-Channel Energy Normalization: Why and How

Vincent Lostanlen, +6 more

- 01 Jan 2019 -

IEEE Signal Processing Letters

TL;DR: This letter investigates the adequacy of PCEN for spectrogram-based pattern recognition in far-field noisy recordings, from both theoretical and practical standpoints, and describes the asymptotic regimes in PCEN: temporal integration, gain control, and dynamic range compression.

Journal Article

Robust sound event detection in bioacoustic sensor networks.

Vincent Lostanlen, +5 more

- 24 Oct 2019 -

PLOS ONE

TL;DR: In this paper, the authors proposed a method for detecting avian flight calls from a ten-hour recording of nocturnal bird migration, recorded by a network of six autonomous recording units (ARUs) in the presence of heterogeneous background noise.

Proceedings Article

Extended playing techniques: the next milestone in musical instrument recognition

Vincent Lostanlen, +2 more

TL;DR: This work identifies and discusses three necessary conditions for significantly outperforming the traditional mel-frequency cepstral coefficient (MFCC) baseline: the addition of second-order scattering coefficients to account for amplitude modulation, the incorporation of long-range temporal dependencies, and metric learning using large-margin nearest neighbors (LMNN) to reduce intra-class variability.

Proceedings Article

Exponential decay of scattering coefficients

Irène Waldspurger

TL;DR: In this article, it was shown that the norm of the scattering coefficients at a given layer only depends on the values of the signal outside a frequency band whose size is exponential in the depth of the layer.

References

Journal Article

LIBSVM: A library for support vector machines

Chih-Chung Chang, +1 more

- 06 May 2011 -

ACM Transactions on Intelligent Systems ...

TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.

Journal Article

Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences

S. Davis, +1 more

- 01 Aug 1980 -

IEEE Transactions on Acoustics, Speech, ...

TL;DR: In this article, several parametric representations of the acoustic signal were compared with regard to word recognition performance in a syllable-oriented continuous speech recognition system, and the emphasis was on the ability to retain phonetically significant acoustic information in the face of syntactic and duration variations.

Proceedings Article

Convolutional networks and applications in vision

Yann LeCun, +2 more

TL;DR: New unsupervised learning algorithms, and new non-linear stages that allow ConvNets to be trained with very few labeled samples are described, including one for visual object recognition and vision navigation for off-road mobile robots.

Proceedings Article

Unsupervised feature learning for audio classification using convolutional deep belief networks

Honglak Lee, +3 more

TL;DR: In this paper, the authors apply convolutional deep belief networks to audio data and empirically evaluate them on various audio classification tasks and show that the learned features correspond to phones/phonemes.

Journal Article

Group Invariant Scattering

Stéphane Mallat

- 01 Oct 2012 -

Communications on Pure and Applied Mathe...

TL;DR: This paper constructs translation-invariant operators on $L^2(\mathbb{R}^d)$, which are Lipschitz-continuous to the action of diffeomorphisms. Scattering operators are extended to $L^2(G)$, where $G$ is a compact Lie group, and are invariant under the action of $G$.

Related Papers (5)

Joint Time-Frequency Scattering for Audio Classification

Group Invariant Scattering

Understanding deep convolutional networks.

Deep Scattering Spectrum

Wavelet Transforms and Time-Frequency Signal Analysis