Proceedings Article

Challenges with Audio Classification using Image Based Approaches for Health Measurement Applications

TLDR
Some of the issues associated with utilizing audio spectrograms to retrain the AlexNet image classifier for the purpose of remote patient monitoring are described and the spatial invariance assumption of the classifier is investigated.
Abstract
Image classification has had huge success in recent years, mainly due to the vast array of image databases available. The comparative lack of audio databases is a problem when building a deep neural network classifier for the measurement and monitoring of health-related sounds. Such sounds (e.g., cough) can indicate worsening health conditions, particularly in the remote monitoring of older adults. Applying pre-existing deep neural network image classifiers to audio classification has been presented as a potential solution. This paper describes some of the issues associated with using audio spectrograms to retrain the AlexNet image classifier for remote patient monitoring. The spatial invariance assumption of the classifier is further investigated through two classification tasks based on spectrograms computed from notes played on a classical piano at four different noise levels: (1) octave classification and (2) note classification. As expected, the AlexNet classifier performs better on clean data when classifying octaves (98%) than when classifying notes (83%). When evaluated on audio with noise, the performance of the note classifier decreases more than that of the octave classifier.
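The two classification tasks and the noise conditions described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the labels were assigned, what kind of noise was used, or what the four noise levels were, so everything below is an assumption made for illustration: the standard 88-key numbering (key 1 = A0, key 49 = A4 = 440 Hz), white Gaussian noise, and the hypothetical helper names `key_to_labels`, `tone`, and `add_noise`.

```python
import math
import random

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def key_to_labels(key):
    """Map an 88-key piano key number (1 = A0, 88 = C8) to (note, octave, Hz).

    The note name is the label for the note task and the octave number is
    the label for the octave task (assumed labelling, not from the paper).
    """
    if not 1 <= key <= 88:
        raise ValueError("key must be in 1..88")
    midi = key + 20                       # key 49 (A4) is MIDI note 69
    note = NOTE_NAMES[midi % 12]
    octave = midi // 12 - 1               # scientific pitch notation
    freq = 440.0 * 2.0 ** ((key - 49) / 12)
    return note, octave, freq


def tone(freq, sample_rate=8000, seconds=0.5):
    """Synthesize a pure sine tone as a stand-in for a recorded piano note."""
    n = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)]


def add_noise(signal, snr_db, rng):
    """Add white Gaussian noise scaled to approximately the requested SNR (dB)."""
    power = sum(x * x for x in signal) / len(signal)
    noise_std = math.sqrt(power / 10 ** (snr_db / 10))
    return [x + rng.gauss(0.0, noise_std) for x in signal]


if __name__ == "__main__":
    rng = random.Random(0)
    for key in (40, 41, 52):              # C4, C#4, C5
        note, octave, freq = key_to_labels(key)
        noisy = add_noise(tone(freq), snr_db=10, rng=rng)  # 10 dB is a made-up level
        print(f"key {key}: note={note} octave={octave} f0={freq:.2f} Hz, {len(noisy)} samples")
```

Under these assumptions, adjacent notes differ in fundamental frequency by a factor of 2^(1/12) (about 6%), while the same note in adjacent octaves differs by a factor of 2. Note labels therefore depend on much finer vertical position in a spectrogram than octave labels do, which is one plausible reading of why the note task is the more sensitive of the two.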


Citations
Proceedings Article

Impact of face coverings on cough measurement characterization

TL;DR: In this paper, a modeling approach was used to characterize the effects of both coughing while wearing a mask and coughing into a bent elbow, and these two models were then applied to an existing dataset for evaluating the influence of the face coverings on selected data features that have been used for differentiating wet and dry cough types.
Proceedings Article

Ensembling Residual Networks for Multi-Label Sound Event Recognition with Weak Labeling

TL;DR: In this article, a semi-supervised learning approach is proposed that learns from both labeled and unlabeled audio clips, with the baseline used to generate pseudo-labels for the unlabeled data.
Proceedings Article

Cough Classification Using Audio Spectrogram Transformer

TL;DR: This study suggests the feasibility of using smart sensing and deep learning to gain insights into the health of older adults through real-time analysis of cough sounds.
Journal Article

Evaluation of Respiratory Sounds Using Image-Based Approaches for Health Measurement Applications

TL;DR: In this article, the authors investigated the use of image-based transfer learning applied to five audio visualizations to evaluate three classification tasks (C1: wet vs. dry vs. whooping cough vs. restricted breathing).
Proceedings Article

Cough Classification Using Audio Spectrogram Transformer

TL;DR: In this paper, the authors proposed a two-pronged approach: the first prong uses unsupervised learning to compute the intrinsic dimensions of the data and map the raw data for visualization, and the second uses that insight to train machine learning models through transfer learning on Vision Transformer models.