Pattern analysis based acoustic signal processing: a survey of the state-of-art

doi:10.1007/S10772-020-09681-3

Journal Article

Jyotismita Chaki

03 Feb 2020

International Journal of Speech Technolo...

Vol. 24, Iss. 4, pp. 913-955

TL;DR: This survey aims to summarize the widely used methods and give guidelines for applying them, and to identify the challenges and future research directions of acoustic signal processing.

Abstract:

Audio signal processing is among the most challenging areas of audio analysis. Audio signal classification (ASC) consists of generating appropriate features from a sound and using these features to identify the class the sound most likely belongs to. Depending on the application's classification domain, the feature extraction and classification/clustering algorithms used may differ considerably. The paper surveys the state of the art to convey the general research scope of ASC, including different types of audio; audio representations such as acoustic and spectrogram; audio feature extraction techniques such as physical, perceptual, static, and dynamic; audio pattern matching approaches such as pattern matching, acoustic-phonetic, and artificial intelligence; and classification and clustering techniques. The aim of the paper is to summarize the widely used methods and give guidelines for applying them, and to identify the challenges and future research directions of acoustic signal processing.
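The two-stage ASC pipeline described in the abstract (extract features from a sound, then assign the most likely class) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the survey: the synthetic sine tones, the class labels "low" and "high", the choice of zero-crossing rate and RMS energy as features, and the nearest-centroid classifier, which stands in for whichever classifier an application would actually use.

```python
import math

def sine(freq_hz, sample_rate=8000, duration_s=0.1, amplitude=1.0):
    """Synthetic test tone (hypothetical stand-in for a recorded sound)."""
    n = int(sample_rate * duration_s)
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n)]

def zero_crossing_rate(signal):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(signal, signal[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(signal) - 1)

def rms_energy(signal):
    """Root-mean-square amplitude of the signal."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))

def extract_features(signal):
    """Stage 1: map a raw signal to a small feature vector."""
    return (zero_crossing_rate(signal), rms_energy(signal))

def train_centroids(labelled_signals):
    """Average the feature vectors of each class into one centroid."""
    sums, counts = {}, {}
    for label, signal in labelled_signals:
        feats = extract_features(signal)
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, f in enumerate(feats):
            acc[i] += f
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc)
            for label, acc in sums.items()}

def classify(signal, centroids):
    """Stage 2: pick the class whose centroid is nearest in feature space."""
    feats = extract_features(signal)
    return min(centroids, key=lambda label: math.dist(feats, centroids[label]))

# Assumed toy training set: two low-pitched and two high-pitched tones.
training = [
    ("low", sine(100)), ("low", sine(150)),
    ("high", sine(1500)), ("high", sine(2100)),
]
centroids = train_centroids(training)

print(classify(sine(120), centroids))
print(classify(sine(1800), centroids))
```

In this toy setup the zero-crossing rate does almost all of the work, because every tone has the same amplitude. A real application would choose features and a classifier suited to its own domain, which is the point the abstract makes about diversity across applications.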

Citations

Posted Content

An Ensemble of Convolutional Neural Networks for Audio Classification.

Loris Nanni, +3 more

15 Jul 2020

arXiv: Audio and Speech Processing

TL;DR: This work creates an off-the-shelf ensemble that can be trained on different datasets and reaches performance competitive with the state of the art in audio classification.

Journal Article

Ensemble of handcrafted and deep features for urban sound classification

Jederson S. Luz, +3 more

01 Apr 2021

Applied Acoustics

TL;DR: A CNN model with a small parameter space is proposed to extract deep features, which are combined with handcrafted features extracted from audio signals; the combination outperforms most state-of-the-art CNN models for urban sound classification.

Journal Article

Music genre classification based on auditory image, spectral and acoustic features

Xin Cai, +1 more

10 Jan 2022

Multimedia Systems

Proceedings Article

A Supervised Approach for Corrective Maintenance Using Spectral Features from Industrial Sounds

Luana Gantert, +3 more

TL;DR: In this paper, different spectral features are extracted from industrial sounds and used as input to supervised learning algorithms that classify operation as normal or abnormal; the results are promising in terms of F1-score.

Journal Article

Gender Classification Based On Audio Features

نداء فليح حسن, +1 more

References

Posted Content

WaveNet: A Generative Model for Raw Audio

Aaron van den Oord, +8 more

12 Sep 2016

arXiv: Sound

TL;DR: This paper proposes WaveNet, a deep neural network for generating audio waveforms that is fully probabilistic and autoregressive, with the predictive distribution for each audio sample conditioned on all previous ones.

WaveNet: A Generative Model for Raw Audio

Aaron van den Oord, +8 more

TL;DR: WaveNet, a deep neural network for generating raw audio waveforms, is introduced; it is shown that it can be efficiently trained on data with tens of thousands of samples per second of audio, and can be employed as a discriminative model, returning promising results for phoneme recognition.

Proceedings Article

Audio Set: An ontology and human-labeled dataset for audio events

Jort F. Gemmeke, +7 more

TL;DR: This paper describes the creation of Audio Set, a large-scale dataset of manually annotated audio events that aims to bridge the gap in data availability between image and audio research and to substantially stimulate the development of high-performance audio event recognizers.

Proceedings Article

librosa: Audio and Music Signal Analysis in Python

Brian McFee, +6 more

TL;DR: A brief overview of the librosa library's functionality is provided, along with explanations of the design goals, software development practices, and notational conventions.

Proceedings Article

CNN architectures for large-scale audio classification

Shawn Hershey, +12 more

TL;DR: In this paper, the authors used various CNN architectures to classify the soundtracks of a dataset of 70M training videos (5.24 million hours) with 30,871 video-level labels.

Related Papers (5)

Audio signal classification

A general audio classifier based on human perception motivated model

Towards robust features for classifying audio in the CueVideo system

An Overview on Perceptually Motivated Audio Indexing and Classification

A robust audio classification and segmentation method