Proceedings Article
Challenges with Audio Classification using Image Based Approaches for Health Measurement Applications
Madison Cohen-McFarlane, Rafik Goubran, Bruce Wallace +2 more
- pp 1-5
TLDR
Some of the issues associated with utilizing audio spectrograms to retrain the AlexNet image classifier for the purpose of remote patient monitoring are described, and the spatial invariance assumption of the classifier is investigated.
Abstract:
Image classification has had huge success in recent years, mainly due to the vast array of databases available. The lack of audio databases presents a problem when it comes to creating a deep neural network classifier aimed at the measurement and monitoring of health-related sounds. Such sounds (e.g. cough) can be indicative of worsening health conditions, particularly in the remote monitoring of older adults. The application of pre-existing deep neural network image classifiers to audio classification has been presented as a potential solution. This paper describes some of the issues associated with utilizing audio spectrograms to retrain the AlexNet image classifier for the purpose of remote patient monitoring. The spatial invariance assumption of the classifier is further investigated by creating two different classification tasks based on spectrograms computed from notes on a classical piano at four different noise levels: (1) octave classification and (2) note classification. As expected, the AlexNet classifier with clean data performs better on octave classification (98%) than on note classification (83%). When evaluated on audio with noise, the performance of the note classifier decreases more than that of the octave classifier.
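The two classification tasks in the abstract can be illustrated with a minimal sketch of how octave and note labels might be derived for piano keys, and how noise at a chosen level might be mixed in. Everything below is an assumption made for illustration, not the authors' procedure: the 88-key numbering with key 49 = A4 = 440 Hz, equal temperament, scientific pitch notation for octaves, white Gaussian noise, and the helper names `key_to_frequency`, `key_to_labels`, and `add_noise`. The abstract does not state the actual noise levels or labeling scheme.

```python
import math
import random

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def key_to_frequency(key):
    """Fundamental frequency in Hz of piano key 1..88.

    Assumes equal temperament with key 49 = A4 = 440 Hz.
    """
    if not 1 <= key <= 88:
        raise ValueError("key must be in 1..88")
    return 440.0 * 2.0 ** ((key - 49) / 12)


def key_to_labels(key):
    """Return (note name, octave number) for a piano key.

    The note name is the label for a note classification task and the
    octave number is the label for an octave classification task.
    """
    midi = key + 20  # piano key 1 (A0) corresponds to MIDI note 21
    return NOTE_NAMES[midi % 12], midi // 12 - 1


def add_noise(signal, snr_db, rng):
    """Add white Gaussian noise scaled to a target signal-to-noise ratio in dB."""
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_std = math.sqrt(signal_power / 10 ** (snr_db / 10))
    return [x + rng.gauss(0.0, noise_std) for x in signal]


if __name__ == "__main__":
    # One second of a pure tone at the fundamental of key 49 (hypothetical 8 kHz rate).
    rate = 8000
    freq = key_to_frequency(49)
    tone = [math.sin(2 * math.pi * freq * n / rate) for n in range(rate)]
    noisy = add_noise(tone, snr_db=10, rng=random.Random(0))
    print(key_to_labels(49), round(freq, 1), len(noisy))
```

Under these assumptions, keys 37, 49, and 61 all carry the note label A but the octave labels 3, 4, and 5, and their fundamentals double with each octave. In a spectrogram, frequency maps to vertical position, so the same note name sits at a different height in every octave, which is one way to see why the position of a pattern in the image matters for these tasks.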
Citations
Proceedings Article
Impact of face coverings on cough measurement characterization
Madison Cohen-McFarlane, Pengcheng Xi, Bruce Wallace, Julio J. Valdés, Rafik Goubran, Frank Knoefel
TL;DR: In this paper, a modeling approach was used to characterize the effects of both coughing while wearing a mask and coughing into a bent elbow, and these two models were then applied to an existing dataset for evaluating the influence of the face coverings on selected data features that have been used for differentiating wet and dry cough types.
Proceedings Article
Ensembling Residual Networks for Multi-Label Sound Event Recognition with Weak Labeling
TL;DR: In this article, a semi-supervised learning approach is proposed that learns from both labeled and unlabeled audio clips, with the baseline used to generate pseudo-labels for the unlabeled data.
Proceedings Article
Cough Classification Using Audio Spectrogram Transformer
Karim Habashy, Julio J. Valdés, Madison Cohen-McFarlane, Pengcheng Xi, Bruce Thomson Wallace, Rafik Goubran, Frank Knoefel
TL;DR: This study suggests the feasibility of using smart sensing and deep learning to gain insights into the health of older adults through real-time analysis of cough sounds.
Journal Article
Evaluation of Respiratory Sounds Using Image-Based Approaches for Health Measurement Applications
TL;DR: In this article, the authors investigated the use of image-based transfer learning applied to five audio visualizations to evaluate three classification tasks (C1: wet vs. dry vs. whooping cough vs. restricted breathing).
Proceedings Article
Cough Classification Using Audio Spectrogram Transformer
TL;DR: In this paper, the authors proposed a two-pronged approach: the first prong leverages unsupervised learning to compute the intrinsic dimensions of the data and to map raw data for visualization, and the second uses the resulting insight to train machine learning models through transfer learning on Vision Transformer models.
References
Proceedings Article
ImageNet Classification with Deep Convolutional Neural Networks
TL;DR: A deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax achieved state-of-the-art performance on ImageNet classification.
Book
Neural networks for pattern recognition
TL;DR: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition, and is designed as a text, with over 100 exercises, to benefit anyone involved in the fields of neural computation and pattern recognition.
Book Chapter
Neural Networks for Pattern Recognition
Suresh Kothari, Heekuck Oh
TL;DR: The chapter discusses two important directions of research to improve learning algorithms: the dynamic node generation, which is used by the cascade correlation algorithm; and designing learning algorithms where the choice of parameters is not an issue.
Proceedings Article
Audio Set: An ontology and human-labeled dataset for audio events
Jort F. Gemmeke, Daniel P. W. Ellis, Dylan Freedman, Aren Jansen, Wade Lawrence, R. Channing Moore, Manoj Plakal, Marvin Ritter
TL;DR: The creation of Audio Set is described, a large-scale dataset of manually-annotated audio events that endeavors to bridge the gap in data availability between image and audio research and substantially stimulate the development of high-performance audio event recognizers.
Journal Article
Classifying environmental sounds using image recognition networks
TL;DR: This paper considers the classification accuracy for different image representations (Spectrogram, MFCC, and CRP) of environmental sounds, and evaluates the accuracy for environmental sounds in three publicly available datasets, using two well-known convolutional deep neural networks for image recognition.