Posted Content

CardioXNet: A Novel Lightweight CRNN Framework for Classifying Cardiovascular Diseases from Phonocardiogram Recordings.

TL;DR: CardioXNet is introduced: a novel lightweight CRNN architecture for automatic detection of five classes of cardiac auscultation, namely normal, aortic stenosis, mitral stenosis, mitral regurgitation and mitral valve prolapse, from raw PCG signals, with outstanding performance on all evaluation metrics.
Abstract: The alarmingly high mortality rate and increasing global prevalence of cardiovascular diseases (CVDs) signify the crucial need for early detection schemes. Phonocardiogram (PCG) signals have historically been applied in this domain owing to their simplicity and cost-effectiveness. However, the insufficiency of expert physicians and human subjectivity limit the applicability of this technique, especially in low-resource settings. To resolve this issue, in this paper we introduce CardioXNet, a novel lightweight CRNN architecture for automatic detection of five classes of cardiac auscultation, namely normal, aortic stenosis, mitral stenosis, mitral regurgitation and mitral valve prolapse, from raw PCG signals. The process is automated through two learning phases, namely representation learning and sequence residual learning. The first phase focuses mainly on automated feature extraction and is implemented in a modular way with three parallel CNN pathways, i.e., a frequency feature extractor (FFE), a pattern extractor (PE) and an adaptive feature extractor (AFE). The 1D-CNN-based FFE and PE learn the coarse and fine-grained features from the PCG, respectively, while the AFE explores salient features over variable receptive fields using 2D-CNN-based squeeze-expansion. Thus, in the representation learning phase, the network extracts efficient time-invariant features and converges rapidly. In the sequence residual learning phase, owing to the bidirectional LSTMs and the skip connection, the network can proficiently extract temporal features. The obtained results demonstrate that the proposed end-to-end architecture yields outstanding performance on all evaluation metrics compared with previous state-of-the-art methods, with up to 99.6% accuracy, 99.6% precision, 99.6% recall and 99.4% F1-score on average, while remaining computationally comparable.
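The abstract specifies the two-phase design but not layer sizes or code. The following is a minimal PyTorch sketch of that design; the module names FFE, PE and AFE follow the abstract, while every kernel size, channel count and hidden dimension is an illustrative assumption, not the authors' implementation (the AFE's 2D squeeze-expansion is approximated here in 1D):

```python
import torch
import torch.nn as nn

class CardioXNetSketch(nn.Module):
    """Hedged sketch of the CRNN described in the abstract: three parallel
    CNN feature extractors (representation learning) followed by a
    bidirectional LSTM with a skip connection (sequence residual learning)."""

    def __init__(self, n_classes=5):
        super().__init__()
        # Frequency feature extractor (FFE): wide kernel, coarse features.
        self.ffe = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=64, stride=8, padding=32),
            nn.BatchNorm1d(16), nn.ReLU(), nn.MaxPool1d(4))
        # Pattern extractor (PE): narrow kernel, fine-grained features.
        self.pe = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=8, stride=8, padding=4),
            nn.BatchNorm1d(16), nn.ReLU(), nn.MaxPool1d(4))
        # Adaptive feature extractor (AFE): squeeze-expansion branch
        # (approximated in 1D; the paper describes a 2D-CNN variant).
        self.afe = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=16, stride=8, padding=8),
            nn.Conv1d(16, 4, kernel_size=1), nn.ReLU(),   # squeeze
            nn.Conv1d(4, 16, kernel_size=1), nn.ReLU(),   # expand
            nn.MaxPool1d(4))
        # Sequence residual learning: BiLSTM plus a skip (residual) path.
        self.bilstm = nn.LSTM(48, 64, batch_first=True, bidirectional=True)
        self.skip = nn.Linear(48, 128)    # matches the BiLSTM output width
        self.head = nn.Linear(128, n_classes)

    def forward(self, x):                  # x: (batch, 1, n_samples), raw PCG
        f = torch.cat([self.ffe(x), self.pe(x), self.afe(x)], dim=1)
        f = f.permute(0, 2, 1)             # (batch, time, 48)
        h, _ = self.bilstm(f)              # (batch, time, 128)
        h = h + self.skip(f)               # skip connection around the LSTM
        return self.head(h.mean(dim=1))    # (batch, n_classes)
```

For example, a batch of two raw recordings of 8,000 samples, `torch.randn(2, 1, 8000)`, passes through all three branches (whose strides and paddings are chosen so the sequence lengths match for concatenation) and yields a `(2, 5)` logit tensor.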
Citations
Journal ArticleDOI
TL;DR: In this article, the authors classified PCG signals into five classes, namely extra systole, extra heart sound, artifacts, normal heartbeat and murmur, achieving an average accuracy of 94%.
Abstract: During each cardiac cycle, the heart's vibrations create sounds and murmurs. When these sound and murmur waves are represented graphically, the result is called a phonocardiogram (PCG). A digital stethoscope is used to record the audio signals generated by these vibrations. Recordings made with a digital stethoscope can be used to extract information such as tone, quality, intensity, frequency and heart rate. This information varies with the condition of the heart from person to person and can be used to assess the status of the heart at an early stage in a non-invasive manner. In this research work, the authors used deep learning models to classify PCG signals into five classes, namely extra systole, extra heart sound, artifacts, normal heartbeat and murmur. First, spectrograms in the form of images are extracted from the PCG sound and fed into a regularized convolutional neural network. In a simulation environment designed in Python, the proposed model showed an average accuracy of 94% when classifying PCG sounds into the five classes.

1 citation
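The cited article's exact pipeline is not reproduced here, but its spectrogram-plus-CNN approach is straightforward to sketch. A minimal version, assuming torchaudio for the spectrogram and an illustrative small network in which batch normalization and dropout stand in for the "regularized" CNN:

```python
import torch
import torch.nn as nn
import torchaudio

# Raw PCG waveform -> magnitude spectrogram "image" (log-compressed below).
to_spec = torchaudio.transforms.Spectrogram(n_fft=256, hop_length=64)

class RegularizedCNN(nn.Module):
    """Small CNN with batch-norm and dropout regularization for 5-class PCG
    spectrogram classification. Illustrative only, not the cited architecture."""
    def __init__(self, n_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1))
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Dropout(0.5), nn.Linear(32, n_classes))

    def forward(self, x):                    # x: (batch, 1, freq, time)
        return self.classifier(self.features(x))

wave = torch.randn(1, 4000)                      # dummy 1 s recording at 4 kHz
image = torch.log1p(to_spec(wave)).unsqueeze(0)  # (1, 1, 129, 63)
logits = RegularizedCNN()(image)                 # (1, 5)
```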

References
Journal ArticleDOI
TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
Abstract: Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient-based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, LSTM can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units. Multiplicative gate units learn to open and close access to the constant error flow. LSTM is local in space and time; its computational complexity per time step and weight is O(1). Our experiments with artificial data involve local, distributed, real-valued, and noisy pattern representations. In comparisons with real-time recurrent learning, backpropagation through time, recurrent cascade-correlation, Elman nets, and neural sequence chunking, LSTM leads to many more successful runs, and learns much faster. LSTM also solves complex, artificial long-time-lag tasks that have never been solved by previous recurrent network algorithms.

72,897 citations
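The gating mechanism summarized above is compact enough to state directly. A sketch of one cell step in the now-standard formulation; note that the 1997 paper predates the forget gate (added by Gers et al. in 2000), so this is the modern variant rather than the original:

```python
import torch

def lstm_step(x, h, c, W, U, b):
    """One LSTM step. W: (4*H, D), U: (4*H, H), b: (4*H,) for input x: (D,),
    hidden state h: (H,) and cell state c: (H,). The cell state is the
    'constant error carousel': it changes only by elementwise gating and
    addition, which lets gradients flow across long time lags."""
    gates = W @ x + U @ h + b
    i, f, g, o = gates.chunk(4)             # input, forget, candidate, output
    i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
    c = f * c + i * torch.tanh(g)           # additive cell update
    h = o * torch.tanh(c)                   # gated exposure of the cell
    return h, c
```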


"CardioXNet: A Novel Lightweight CRN..." refers background in this paper

  • ...Here, sigmoid and tanh are the activation functions which map the non-linearity of the features [45]....

    [...]
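For reference, the two activation functions named in the excerpt above are

```latex
\sigma(x) = \frac{1}{1 + e^{-x}} \in (0, 1),
\qquad
\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \in (-1, 1).
```

In an LSTM, the sigmoid bounds each gate between fully closed (0) and fully open (1), while tanh squashes the candidate and cell values into a symmetric range.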

Proceedings Article
Sergey Ioffe1, Christian Szegedy1
06 Jul 2015
TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Abstract: Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch. Batch Normalization allows us to use much higher learning rates and be less careful about initialization, and in some cases eliminates the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin. Using an ensemble of batch-normalized networks, we improve upon the best published result on ImageNet classification: reaching 4.82% top-5 test error, exceeding the accuracy of human raters.

30,843 citations
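The BN transform described in this abstract reduces to a few lines. A sketch for fully-connected activations in training mode only; at inference, the paper substitutes population estimates of the mean and variance for the mini-batch statistics:

```python
import torch

def batch_norm_train(x, gamma, beta, eps=1e-5):
    """Batch Normalization over a mini-batch of activations x: (batch, features).
    Each feature is normalized across the batch, then rescaled by a learned
    scale (gamma) and shift (beta), as in the paper's BN transform."""
    mean = x.mean(dim=0)                        # per-feature mini-batch mean
    var = x.var(dim=0, unbiased=False)          # per-feature mini-batch variance
    x_hat = (x - mean) / torch.sqrt(var + eps)  # normalize
    return gamma * x_hat + beta                 # scale and shift
```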

Proceedings ArticleDOI
Mark Sandler1, Andrew Howard1, Menglong Zhu1, Andrey Zhmoginov1, Liang-Chieh Chen1 
18 Jun 2018
TL;DR: MobileNetV2, as described in this paper, is based on an inverted residual structure where the shortcut connections are between the thin bottleneck layers, and the intermediate expansion layer uses lightweight depthwise convolutions to filter features as a source of non-linearity.
Abstract: In this paper we describe a new mobile architecture, MobileNetV2, that improves the state-of-the-art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes. We also describe efficient ways of applying these mobile models to object detection in a novel framework we call SSDLite. Additionally, we demonstrate how to build mobile semantic segmentation models through a reduced form of DeepLabv3 which we call Mobile DeepLabv3. MobileNetV2 is based on an inverted residual structure where the shortcut connections are between the thin bottleneck layers. The intermediate expansion layer uses lightweight depthwise convolutions to filter features as a source of non-linearity. Additionally, we find that it is important to remove non-linearities in the narrow layers in order to maintain representational power. We demonstrate that this improves performance and provide an intuition that led to this design. Finally, our approach allows decoupling of the input/output domains from the expressiveness of the transformation, which provides a convenient framework for further analysis. We measure our performance on ImageNet [1] classification, COCO object detection [2], and VOC image segmentation [3]. We evaluate the trade-offs between accuracy and number of operations measured by multiply-adds (MAdd), as well as actual latency and the number of parameters.

9,381 citations


"CardioXNet: A Novel Lightweight CRN..." refers background in this paper

  • ...A handful of cross-domain studies have investigated the issue and introduced a few advanced strategies such as lightweight networks [29], [30], weight quantization [31] and low precision computation techniques [32]....

    [...]
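The inverted residual block described in the MobileNetV2 abstract above can be sketched directly; the channel count and expansion factor below are illustrative defaults, not prescriptions:

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    """MobileNetV2-style inverted residual block: 1x1 expansion, a lightweight
    depthwise 3x3 in the expanded space, a linear (no-activation) 1x1
    projection back to the thin bottleneck, and a shortcut between the
    thin bottlenecks when the spatial size is preserved."""
    def __init__(self, channels=24, expansion=6, stride=1):
        super().__init__()
        hidden = channels * expansion
        self.use_shortcut = stride == 1
        self.block = nn.Sequential(
            nn.Conv2d(channels, hidden, 1, bias=False),        # 1x1 expand
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, hidden, 3, stride=stride,
                      padding=1, groups=hidden, bias=False),   # depthwise 3x3
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, channels, 1, bias=False),        # linear project
            nn.BatchNorm2d(channels))                          # no ReLU6 here

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_shortcut else out
```

The projection back to the bottleneck deliberately has no activation, matching the abstract's observation that removing non-linearities in the narrow layers is important for maintaining representational power.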