Challenges with Audio Classification using Image Based Approaches for Health Measurement Applications

doi:10.1109/MEMEA49120.2020.9137254

Proceedings Article

- pp 1-5

TLDR

Some of the issues associated with utilizing audio spectrograms to retrain the AlexNet image classifier for the purpose of remote patient monitoring are described and the spatial invariance assumption of the classifier is investigated.

Abstract:

Image classification has been highly successful in recent years, mainly due to the vast array of databases available. The comparative lack of audio databases presents a problem when creating a deep neural network classifier aimed at the measurement and monitoring of health-related sounds. Such sounds (e.g., coughs) can indicate worsening health conditions, particularly in the remote monitoring of older adults. Applying pre-existing deep neural network image classifiers to audio classification has been presented as a potential solution. This paper describes some of the issues associated with using audio spectrograms to retrain the AlexNet image classifier for remote patient monitoring. The spatial invariance assumption of the classifier is further investigated through two classification tasks based on spectrograms computed from notes on a classical piano at four different noise levels: (1) octave classification and (2) note classification. As expected, with clean data the AlexNet classifier performs better on octave classification (98%) than on note classification (83%). When evaluated on audio with noise, the performance of the note classifier decreases more than that of the octave classifier.
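The two labeling tasks and the noise conditions can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `key_to_labels` and `add_noise`, the 88-key numbering (A0 = key 1), equal-temperament tuning with A4 = 440 Hz, and white Gaussian noise set by a signal-to-noise ratio are all assumptions made for illustration, since the abstract does not specify how notes were labeled or how noise was added. The sketch shows that each key carries both an octave label and a note label, and that a one-octave shift doubles the fundamental frequency, so the same note name sits at a different vertical position on a spectrogram.

```python
import math
import random

# Pitch-class names in MIDI order (MIDI note 0 is a C).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def key_to_labels(key):
    """Map a piano key number (assumed 1..88, A0 = 1) to (note, octave, Hz).

    The note name is the label for the note task, the octave number is the
    label for the octave task. Tuning is assumed equal-tempered, A4 = 440 Hz.
    """
    midi = key + 20                    # A0 corresponds to MIDI note 21
    note = NOTE_NAMES[midi % 12]       # note-classification label
    octave = midi // 12 - 1            # octave-classification label (MIDI 60 = C4)
    freq = 440.0 * 2 ** ((midi - 69) / 12)
    return note, octave, freq


def add_noise(signal, snr_db, rng):
    """Return the signal plus white Gaussian noise at the requested SNR (dB).

    This is an assumed noise model; the paper's four noise levels are not
    given in the abstract.
    """
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]
```

For example, `key_to_labels(49)` gives the labels for A4 and `key_to_labels(61)` those for A5: the note label is identical, the octave label differs, and the frequency doubles. A classifier that treats a pattern as the same object regardless of its position in the image has little to distinguish these two cases, which is the kind of effect the note-versus-octave comparison probes.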

Citations

Impact of face coverings on cough measurement characterization

Ensembling Residual Networks for Multi-Label Sound Event Recognition with Weak Labeling

Cough Classification Using Audio Spectrogram Transformer

Evaluation of Respiratory Sounds Using Image-Based Approaches for Health Measurement Applications
