Journal ArticleDOI
Pattern analysis based acoustic signal processing: a survey of the state-of-art
TLDR
The aim of this state-of-the-art paper is to produce a summary and guidelines for using the broadly used methods, and to identify the challenges as well as future research directions of acoustic signal processing.
Abstract:
Audio signal processing is among the most challenging fields of audio analysis in the current era. Audio signal classification (ASC) consists of generating appropriate features from a sound and using these features to identify the class to which the sound most likely belongs. Depending on the application's classification domain, the feature extraction and classification/clustering algorithms used may be quite diverse. This paper surveys the state of the art to convey the general research scope of ASC, including different types of audio; audio representations such as acoustic and spectrogram; audio feature extraction techniques such as physical, perceptual, static, and dynamic; audio pattern matching approaches such as pattern matching, acoustic-phonetic, and artificial intelligence; and classification and clustering techniques. The aim of this state-of-the-art paper is to produce a summary and guidelines for using the broadly used methods, and to identify the challenges as well as future research directions of acoustic signal processing.
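The two-stage idea in the abstract (extract features from a sound, then use them to pick the most likely class) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the surveyed paper: the two features (zero-crossing rate and a spectral centroid from a naive DFT), the synthetic sine-tone "recordings", the "low"/"high" labels, and the nearest-centroid classifier were chosen only because they fit in a few lines of standard-library Python.

```python
import math

def zero_crossing_rate(signal):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(signal) - 1)

def spectral_centroid(signal, sample_rate):
    """Magnitude-weighted mean frequency in Hz, using a naive O(n^2) DFT."""
    n = len(signal)
    weighted, total = 0.0, 0.0
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(signal))
        im = -sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(signal))
        mag = math.hypot(re, im)
        weighted += (k * sample_rate / n) * mag
        total += mag
    return weighted / total if total else 0.0

def extract_features(signal, sample_rate):
    """Feature vector: (zero-crossing rate, centroid scaled by the Nyquist frequency)."""
    nyquist = sample_rate / 2
    return (zero_crossing_rate(signal), spectral_centroid(signal, sample_rate) / nyquist)

def tone(freq, sample_rate=8000, n=256, phase=0.0):
    """Hypothetical stand-in for a recording: a pure sine tone."""
    return [math.sin(2 * math.pi * freq * t / sample_rate + phase) for t in range(n)]

def train_centroids(labeled):
    """Average the feature vectors of each class; labeled is [(features, label), ...]."""
    sums = {}
    for feats, label in labeled:
        acc, count = sums.get(label, ([0.0] * len(feats), 0))
        sums[label] = ([a + f for a, f in zip(acc, feats)], count + 1)
    return {label: [a / count for a in acc] for label, (acc, count) in sums.items()}

def classify(feats, centroids):
    """Return the label whose class centroid is closest in Euclidean distance."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(feats, centroids[label]))
    return min(centroids, key=sq_dist)

# Toy training set: two low tones and two high tones (illustrative labels).
SAMPLE_RATE = 8000
training = [
    (extract_features(tone(200), SAMPLE_RATE), "low"),
    (extract_features(tone(300), SAMPLE_RATE), "low"),
    (extract_features(tone(2000), SAMPLE_RATE), "high"),
    (extract_features(tone(2500), SAMPLE_RATE), "high"),
]
centroids = train_centroids(training)
print(classify(extract_features(tone(250, phase=0.3), SAMPLE_RATE), centroids))
```

In practice the features would more likely come from a library such as librosa (listed in the references below) and the classifier from one of the families the survey covers; the structure of the pipeline stays the same.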
Citations
Posted Content
An Ensemble of Convolutional Neural Networks for Audio Classification.
TL;DR: This work has managed to create an off-the-shelf ensemble that can be trained on different datasets and reach performances competitive with the state of the art in audio classification.
Journal ArticleDOI
Ensemble of handcrafted and deep features for urban sound classification
TL;DR: A small parameter space CNN model to extract deep features that are combined with handcrafted features extracted from audio signals is proposed, outperforming most of the state-of-the-art CNN models for urban sound classification.
Journal ArticleDOI
Music genre classification based on auditory image, spectral and acoustic features
Xin Cai,Hongjuan Zhang +1 more
Proceedings ArticleDOI
A Supervised Approach for Corrective Maintenance Using Spectral Features from Industrial Sounds
TL;DR: In this paper, different spectral features are extracted from industrial sounds and used as input to supervised learning algorithms that classify operations as normal or abnormal, with promising results in terms of F1-score.
References
Posted Content
WaveNet: A Generative Model for Raw Audio
Aaron van den Oord,Sander Dieleman,Heiga Zen,Karen Simonyan,Oriol Vinyals,Alex Graves,Nal Kalchbrenner,Andrew W. Senior,Koray Kavukcuoglu +8 more
TL;DR: This paper proposed WaveNet, a deep neural network for generating audio waveforms, which is fully probabilistic and autoregressive, with the predictive distribution for each audio sample conditioned on all previous ones.
WaveNet: A Generative Model for Raw Audio
Aaron van den Oord,Sander Dieleman,Heiga Zen,Karen Simonyan,Oriol Vinyals,Alex Graves,Nal Kalchbrenner,Andrew W. Senior,Koray Kavukcuoglu +8 more
TL;DR: WaveNet, a deep neural network for generating raw audio waveforms, is introduced; it is shown that it can be efficiently trained on data with tens of thousands of samples per second of audio, and can be employed as a discriminative model, returning promising results for phoneme recognition.
Proceedings ArticleDOI
Audio Set: An ontology and human-labeled dataset for audio events
Jort F. Gemmeke,Daniel P. W. Ellis,Dylan Freedman,Aren Jansen,Wade Lawrence,R. Channing Moore,Manoj Plakal,Marvin Ritter +7 more
TL;DR: The creation of Audio Set is described, a large-scale dataset of manually-annotated audio events that endeavors to bridge the gap in data availability between image and audio research and substantially stimulate the development of high-performance audio event recognizers.
Proceedings ArticleDOI
librosa: Audio and Music Signal Analysis in Python
Brian McFee,Colin Raffel,Dawen Liang,Daniel P. W. Ellis,Matt McVicar,Eric Battenberg,Oriol Nieto +6 more
TL;DR: A brief overview of the librosa library's functionality is provided, along with explanations of the design goals, software development practices, and notational conventions.
Proceedings ArticleDOI
CNN architectures for large-scale audio classification
Shawn Hershey,Sourish Chaudhuri,Daniel P. W. Ellis,Jort F. Gemmeke,Aren Jansen,R. Channing Moore,Manoj Plakal,Devin Platt,Rif A. Saurous,Bryan Seybold,Malcolm Slaney,Ron Weiss,Kevin W. Wilson +12 more
TL;DR: In this paper, the authors used various CNN architectures to classify the soundtracks of a dataset of 70M training videos (5.24 million hours) with 30,871 video-level labels.