Detection of asphyxia in infants using deep learning convolutional neural network (CNN) trained on mel frequency cepstrum coefficient (MFCC) features extracted from cry sounds

doi:10.4314/JFAS.V9I3S.59

Journal Article

Azlee Zabidi, Ihsan Mohd Yassin, Hasliza Hassan¹, Nadiah Ismail, M.M.A.M. Hamzah, Zairi Ismael Rizman², Husna Zainol Abidin

University of Selangor¹, Universiti Teknologi MARA²

24 Jan 2018-Journal of Fundamental and Applied Sciences (African Journals Online (AJOL))-Vol. 9, pp 768-778

TL;DR: The paper demonstrates that Mel Frequency Cepstrum Coefficient (MFCC) features generated from infant cry audio signals can be used as input features for a Convolutional Neural Network (CNN).

Abstract: Deep Learning Neural Network (DLNN) is a new branch of machine learning capable of more complex feature representation than traditional 4th-generation neural networks. Although it is mainly suited to image features (since it was inspired by the object recognition method of the mammalian visual system), other types of data can be used with a DLNN if their features can be translated into an image. In this paper, we demonstrate that Mel Frequency Cepstrum Coefficient (MFCC) features generated from the audio signal of an infant cry can be used as input features for a Convolutional Neural Network (CNN). The results show that the CNN can classify normal and pathological (asphyxiated) cries with 94.3% accuracy on the training set and 92.8% accuracy on the testing set.
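To make the idea of turning cry audio into an image-like CNN input concrete, here is a minimal NumPy sketch of a textbook MFCC computation. It is not the authors' pipeline: the sample rate, frame length, hop size, filter count, number of coefficients, and the synthetic test tone are all assumptions chosen for illustration, since the abstract does not state them.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters with centres spaced evenly on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def dct_ii(x, n_out):
    # Unnormalised DCT-II along the last axis, keeping the first n_out terms.
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n)[None, :] + 1) / (2 * n))
    return x @ basis.T

def mfcc(signal, sample_rate=8000, frame_len=256, hop=128, n_filters=26, n_coeffs=13):
    # All default values here are illustrative assumptions, not from the paper.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n=frame_len, axis=1)) ** 2 / frame_len
    energies = power @ mel_filterbank(n_filters, frame_len, sample_rate).T
    return dct_ii(np.log(energies + 1e-10), n_coeffs)  # (n_frames, n_coeffs)

# Synthetic stand-in for a one-second cry recording (hypothetical data).
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
cry = np.sin(2 * np.pi * 450 * t) + 0.1 * rng.standard_normal(t.size)

features = mfcc(cry)
# Treat the coefficient-by-frame matrix as a single-channel image.
image = features.T[None, :, :]  # (channels, n_coeffs, n_frames)
```

The resulting `image` array has the layout a 2D convolutional layer typically expects, which is the sense in which MFCC features can be "translated into an image".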

Citations

Journal Article

Predicting single freestanding transmission tower time history response during complex wind input through a convolutional neural network based surrogate model

Jiayue Xue¹, Zhongming Xiang¹, Ge Ou¹

University of Utah¹

15 Apr 2021-Engineering Structures

TL;DR: A machine learning approach based on a convolutional neural network is proposed to predict the time history response of a transmission tower under complex wind input; the effectiveness of the CNN surrogate model is validated through fragility model development, and its robustness is investigated.

23 citations

Proceedings Article

A CNN-based Packet Classification of eMBB, mMTC and URLLC Applications for 5G

Whai-En Chen¹, Xiang-Yuan Fan¹, Li-Xian Chen¹

National Ilan University¹

01 Aug 2019

TL;DR: This paper captures the packets from the Narrow Band-Internet of Things (NB-IoT) transmission, Unmanned Aerial Vehicle (UAV) control, 4K video and Facebook access for emulating mMTC, URLLC, eMBB and Internet traffic in 5G.

Abstract: 5G supports more new services, including enhanced Mobile Broadband (eMBB), Ultra-reliable and Low Latency Communications (URLLC) and massive Machine Type Communications (mMTC). The Quality of Service (QoS) requirements of these 5G service types are different. In this paper, we capture the packets from the Narrow Band-Internet of Things (NB-IoT) transmission, Unmanned Aerial Vehicle (UAV) control, 4K video and Facebook access for emulating mMTC, URLLC, eMBB and Internet traffic in 5G. With the captured packets, we investigate using the machine learning technology to classify the packets based on the payload information. Specifically, the Convolutional Neural Network (CNN) model is performed to classify the application packets into suitable groups. In addition, this paper studies the effects of various parameters such as the kernel number, kernel size, pooling window size, the dropout rate and the payload length to find the optimal values for high accuracy and low latency.

16 citations

Cites methods from "Detection of asphyxia in infants us..."

...) based on the detection model in [13]....

Proceedings Article

Deep Learning for Asphyxiated Infant Cry Classification Based on Acoustic Features and Weighted Prosodic Features

Chunyan Ji¹, Xueli Xiao¹, Sunitha Basodi¹, Yi Pan¹

Georgia State University¹

14 Jul 2019

TL;DR: This paper proposes a novel method through generating weighted prosodic features combined with acoustic features to form a merged feature matrix to classify asphyxiated baby crying effectively and has the benefits of keeping the robustness and resolution of the classification model simultaneously.

Abstract: Asphyxia is a respiratory injury that leads to a serious damage for infants. Early detection of asphyxia using Artificially Intelligent technology helps in reducing infant mortality rate when compared to traditional medical diagnosis, which is time consuming. In this paper, we propose a novel method through generating weighted prosodic features combined with acoustic features to form a merged feature matrix to classify asphyxiated baby crying effectively. The weights of the prosodic features are trained at the frame level with labeled data and can be optimized using deep learning approach with neural networks. The novel merged feature matrix is established with both acoustic and weighted prosodic features. The matrix has good ability to capture the diversity of variations within infant cries, especially for asphyxiated samples. Our method has the benefits of keeping the robustness and resolution of the classification model simultaneously. The effectiveness of this approach is evaluated on Baby Chillanto Database. Our method yields a significant reduction of 3.11%, 3.23%, and 1.43% absolute classification error rate compared with the results using single acoustic features, single prosodic features, and both acoustic and prosodic features, respectively. The testing accuracy in our method reaches 96.74%, which outperforms all other related studies on asphyxiated baby crying classification.

13 citations

Cites background from "Detection of asphyxia in infants us..."

...8% accuracy in a specific testing data [7]....

Journal Article

PigTalk: An AI-Based IoT Platform for Piglet Crushing Mitigation

Whai-En Chen, Yi-Bing Lin¹, Li-Xian Chen

National Chiao Tung University¹

01 Jun 2021-IEEE Transactions on Industrial Informatics

TL;DR: PigTalk is a new approach that automatically mitigates piglet crushing, which could not be achieved in the past; the study indicates that PigTalk can save piglets within 0.05 s with a 99.93% success rate.

Abstract: On pig farms, many piglets die because they are crushed when sows roll from side to side or lie down. On average, 1.2 piglets are crushed by sows every day. To resolve the piglet mortality issue, this article proposes PigTalk, an artificial intelligence (AI) based Internet of Things (IoT) platform for detecting and mitigating piglet crushing. Through real-time analysis of the voice data collected in a farrowing house, PigTalk detects if any piglet screaming occurs, and automatically activates sow-alert actuators for emergency handling of the crushing event. We propose an audio clip transform approach to pre-process the raw voice data, and utilizes min-max scaling in machine learning (ML) to detect piglet screams. In our first contribution, the above data preprocessing method together with subtle parameter setups of the machine learning model improve the piglet scream detection accuracy up to 99.4%, which is better than the previous solutions (up to 92.8%). In our second contribution, we show how to design two cyber IoT devices, i.e., DataBank for data pre-processing and ML_device for real-time AI to automatically trigger actuators such as floor vibration and water drop to force a sow to stand up. We conduct analytic analysis and simulation to investigate how the detection delay affects the critical time period to save crushed piglets. Our study indicates that PigTalk can save piglets within 0.05 s with 99.93% of the successful rate. Such results are validated in a commercial farrowing house. PigTalk is a new approach that automatically mitigates piglet crushing, which could not be achieved in the past.
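The min-max scaling step mentioned in the abstract can be sketched as follows. This is only the generic form of the technique; how PigTalk applies it to its audio clips (per clip, per feature, or over a whole dataset) is not stated here, so the per-array scaling and the `eps` guard below are assumptions.

```python
import numpy as np

def min_max_scale(x, eps=1e-12):
    # Map values linearly onto [0, 1]; eps guards against a constant input.
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    return (x - lo) / max(hi - lo, eps)

# Hypothetical feature values from one audio clip.
clip = np.array([-0.5, 0.0, 0.25, 1.5])
scaled = min_max_scale(clip)
```

Scaling inputs to a fixed range like this keeps features with large raw magnitudes from dominating training.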

10 citations

Cites methods or result from "Detection of asphyxia in infants us..."

...Similar study was conducted to detect infant cry [11] and the prediction accuracy can be up to 92....
...Based on the detection model in [11], we develop the CNN model in the SAs of DataBank and ML_device with the steps illustrated in Fig....
...1) Similar to the studies in [10] and [11], PigTalk uses CNN to interpret vocalizations from farrowing cages in real time to detect piglet screaming....

Journal Article

Classification model and analysis on students’ performance

H. Nawang¹, Mokhairi Makhtar¹, S.N.W. Shamsudin¹

Universiti Sultan Zainal Abidin¹

01 Feb 2018-Journal of Fundamental and Applied Sciences

TL;DR: This paper proposes a classification model for classifying students' performance in Sijil Pelajaran Malaysia, in order to help teachers plan suitable teaching activities based on their students' performance.

Abstract: The purpose of this paper is to propose a classification model for classifying students’ performance in Sijil Pelajaran Malaysia in order to help teachers plan suitable teaching activities for their students based on the students’ performance. Five classifier algorithms have been used during the process, which are Naive Bayes, Random Tree, Multi Class Classifier, Conjunctive Rule and Nearest Neighbour. Data was collected from Maktab Rendah Sains MARA Kuala Berang, Terengganu, Malaysia starting May 2011 until December 2014. The students’ performance was evaluated based on the category of students according to their SPM results. Parameters that contribute to students’ performance such as stream, state, gender and hometown are also investigated along with the examination data. This research shows that first semester results can be used to identify students’ performance. Keywords: educational data mining; classification model; feature selection

9 citations

Cites methods from "Detection of asphyxia in infants us..."

...outperforms the prediction decision tree and neural network [30-33] methods....

References

Journal Article

A fast learning algorithm for deep belief nets

Geoffrey E. Hinton¹, Simon Osindero¹, Yee Whye Teh²

University of Toronto¹, National University of Singapore²

01 Jul 2006-Neural Computation

TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.

Abstract: We show how to use "complementary priors" to eliminate the explaining-away effects that make inference difficult in densely connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modeled by long ravines in the free-energy landscape of the top-level associative memory, and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind.

15,055 citations

"Detection of asphyxia in infants us..." refers background in this paper

...DLNN was proposed by [4] as an improvement to the conventional fourth-generation neural networks....

Journal Article

The vanishing gradient problem during learning recurrent neural nets and problem solutions

Sepp Hochreiter¹

Technische Universität München¹

01 Apr 1998-International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems

TL;DR: The decaying error flow is theoretically analyzed, methods trying to overcome vanishing gradients are briefly discussed, and experiments comparing conventional algorithms and alternative methods are presented.

Abstract: Recurrent nets are in principle capable to store past inputs to produce the currently desired output. Because of this property recurrent nets are used in time series prediction and process control. Practical applications involve temporal dependencies spanning many time steps, e.g. between relevant inputs and desired outputs. In this case, however, gradient based learning methods take too much time. The extremely increased learning time arises because the error vanishes as it gets propagated back. In this article the decaying error flow is theoretically analyzed. Then methods trying to overcome vanishing gradients are briefly discussed. Finally, experiments comparing conventional algorithms and alternative methods are presented. With advanced methods long time lag problems can be solved in reasonable time.

2,203 citations

"Detection of asphyxia in infants us..." refers background in this paper

...Therefore, ReLU was proposed as an alternative to tangent-sigmoid activation function to avoid this issue [18]....

Posted Content

Large-Margin Softmax Loss for Convolutional Neural Networks

Weiyang Liu¹, Yandong Wen², Zhiding Yu³, Meng Yang⁴

Peking University¹, South China University of Technology², Carnegie Mellon University³, Shenzhen University⁴

07 Dec 2016-arXiv: Machine Learning

TL;DR: In this article, a generalized large-margin softmax (L-Softmax) loss is proposed to encourage intra-class compactness and inter-class separability between learned features.

Abstract: Cross-entropy loss together with softmax is arguably one of the most commonly used supervision components in convolutional neural networks (CNNs). Despite its simplicity, popularity and excellent performance, the component does not explicitly encourage discriminative learning of features. In this paper, we propose a generalized large-margin softmax (L-Softmax) loss which explicitly encourages intra-class compactness and inter-class separability between learned features. Moreover, L-Softmax not only can adjust the desired margin but also can avoid overfitting. We also show that the L-Softmax loss can be optimized by typical stochastic gradient descent. Extensive experiments on four benchmark datasets demonstrate that the deeply-learned features with L-Softmax loss become more discriminative, hence significantly boosting the performance on a variety of visual classification and verification tasks.
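For orientation, the baseline that this abstract starts from, softmax followed by cross-entropy, can be written in a few lines of NumPy. This sketch is the standard loss only, not the L-Softmax variant the paper proposes; the two-class logits and labels are made-up values chosen to mirror a normal-versus-asphyxiated setup.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability before exponentiating.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, labels):
    # Mean negative log-probability assigned to the true class.
    p = softmax(logits)
    return -np.log(p[np.arange(len(labels)), labels]).mean()

# Hypothetical CNN outputs for two samples and two classes.
logits = np.array([[2.0, 0.0], [0.0, 0.0]])
labels = np.array([0, 1])
loss = cross_entropy(logits, labels)
```

Nothing in this loss pushes the classes apart by an explicit margin, which is the gap the L-Softmax formulation is designed to address.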

680 citations

Proceedings Article

Object Detection from Video Tubelets with Convolutional Neural Networks

Kai Kang¹, Wanli Ouyang¹, Hongsheng Li¹, Xiaogang Wang¹

The Chinese University of Hong Kong¹

01 Jun 2016

TL;DR: This paper introduces a complete framework for the object detection from video (VID) task based on still-image object detection and general object tracking, and proposes a temporal convolution network that incorporates temporal information to regularize the detection results.

Abstract: Deep Convolution Neural Networks (CNNs) have shown impressive performance in various vision tasks such as image classification, object detection and semantic segmentation. For object detection, particularly in still images, the performance has been significantly increased last year thanks to powerful deep networks (e.g. GoogleNet) and detection frameworks (e.g. Regions with CNN features (RCNN)). The lately introduced ImageNet [6] task on object detection from video (VID) brings the object detection task into the video domain, in which objects' locations at each frame are required to be annotated with bounding boxes. In this work, we introduce a complete framework for the VID task based on still-image object detection and general object tracking. Their relations and contributions in the VID task are thoroughly studied and evaluated. In addition, a temporal convolution network is proposed to incorporate temporal information to regularize the detection results and shows its effectiveness for the task. Code is available at https://github.com/myfavouritekk/vdetlib.

338 citations

Journal Article

Handwritten alphanumeric character recognition by the neocognitron

Kunihiko Fukushima¹, Nobuaki Wake¹

Osaka University¹

01 May 1991-IEEE Transactions on Neural Networks

TL;DR: A pattern recognition system which works with the mechanism of the neocognitron, a neural network model for deformation-invariant visual pattern recognition, is discussed, which has been trained to recognize 35 handwritten alphanumeric characters.

Abstract: A pattern recognition system which works with the mechanism of the neocognitron, a neural network model for deformation-invariant visual pattern recognition, is discussed. The neocognitron was developed by Fukushima (1980). The system has been trained to recognize 35 handwritten alphanumeric characters. The ability to recognize deformed characters correctly depends strongly on the choice of the training pattern set. Some techniques for selecting training patterns useful for deformation-invariant recognition of a large number of characters are suggested.

249 citations

"Detection of asphyxia in infants us..." refers background or methods in this paper

...The work in this paper focuses on CNN as it was reported as an excellent tool in the deep learning framework [5]....
...One of its famous usage examples is handwriting recognition [5]....