Open accessJournal ArticleDOI: 10.1109/ACCESS.2021.3063129

CardioXNet: A Novel Lightweight Deep Learning Framework for Cardiovascular Disease Classification Using Heart Sound Recordings

02 Mar 2021-IEEE Access (IEEE)-Vol. 9, pp 36955-36967
Abstract: The alarmingly high mortality rate and increasing global prevalence of cardiovascular diseases (CVDs) signify the crucial need for early detection schemes. Phonocardiogram (PCG) signals have been historically applied in this domain owing to their simplicity and cost-effectiveness. In this article, we propose CardioXNet, a novel lightweight end-to-end CRNN architecture for automatic detection of five classes of cardiac auscultation, namely normal, aortic stenosis, mitral stenosis, mitral regurgitation and mitral valve prolapse, using raw PCG signals. The process is automated through two learning phases, namely representation learning and sequence residual learning. Three parallel CNN pathways are implemented in the representation learning phase to learn the coarse and fine-grained features from the PCG and to explore the salient features from variable receptive fields involving 2D-CNN based squeeze-expansion. Thus, in the representation learning phase, the network extracts efficient time-invariant features and converges rapidly. In the sequence residual learning phase, because of the bidirectional LSTMs and the skip connection, the network can proficiently extract temporal features without performing any feature extraction on the signal. The obtained results demonstrate that the proposed end-to-end architecture yields outstanding performance in all the evaluation metrics compared to the previous state-of-the-art methods, with up to 99.60% accuracy, 99.56% precision, 99.52% recall and 99.68% F1-score on average, while being computationally comparable. This model outperforms any previous works using the same database by a considerable margin. Moreover, the proposed model was tested on the PhysioNet/CinC 2016 challenge dataset, achieving an accuracy of 86.57%. Finally, the model was evaluated on a merged dataset of the Github PCG dataset and the PhysioNet dataset, achieving an accuracy of 88.09%.
The high accuracy metrics on both primary and secondary datasets, combined with a significantly low number of parameters and an end-to-end prediction approach, make the proposed network especially suitable for point-of-care CVD screening in low-resource setups using memory-constrained mobile devices.
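
The accuracy, precision, recall and F1-score quoted in the abstract are standard multi-class metrics. As a minimal sketch, assuming macro averaging over the five classes (the abstract does not state the averaging scheme) and using an invented confusion matrix rather than the paper's results, they could be computed like this:

```python
# Class names come from the abstract; the counts further down are hypothetical.
CLASSES = [
    "normal",
    "aortic stenosis",
    "mitral stenosis",
    "mitral regurgitation",
    "mitral valve prolapse",
]


def macro_metrics(confusion):
    """Return accuracy and macro-averaged precision, recall and F1.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical counts for illustration only; NOT results from the paper.
toy_confusion = [
    [10, 0, 0, 0, 0],
    [0, 9, 1, 0, 0],
    [0, 1, 9, 0, 0],
    [0, 0, 0, 10, 0],
    [0, 0, 0, 0, 10],
]
metrics = macro_metrics(toy_confusion)
print(metrics)
```

Macro averaging weights every class equally, which is a reasonable default when the classes are balanced; a weighted or micro average would give different numbers on imbalanced data.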

Topics: Feature learning (55%), Feature extraction (51%), Deep learning (51%)
Citations

8 results found


Proceedings ArticleDOI: 10.1109/MAJICC53071.2021.9526250
15 Jul 2021-
Abstract: Cardiovascular diseases (CVDs) have been one of the top two causes of death globally, accounting for 633,842 fatalities. An intelligent system capable of detecting these disorders is needed urgently. Phonocardiogram (PCG) signals are useful in the earlier detection of CVDs as they help determine the actual nature and condition of the heart. Cardiac auscultation is the most used procedure for examining, classifying, and analyzing the cardiac sounds in a PCG. We formulated an algorithm for classifying various types of cardiovascular diseases using PCG auscultations. The dataset repository (Normal & Extrahls) is made up of personally acquired PCGs from different clinical facilities. Empirical Mode Decomposition (EMD) helps denoise and pre-process these raw signals. To extract the area of interest, soft threshold-based signal segmentation is applied. Then, four impulsive domain features are extracted from each class's pre-processed signal and fed to six separate machine learning-based ensemble classifiers to evaluate optimum accuracy. The proposed methodology obtained a cumulative accuracy of 98.8%, specificity of 97.56%, and sensitivity of 99.99%. This system will assist Pakistani doctors in detecting and classifying heart disease without using any invasive technology.

Topics: Phonocardiogram (55%)

1 Citations


Open accessJournal ArticleDOI: 10.3390/JPM11060515
Tahir Mahmood1, Muhammad Owais1, Kyoung Jun Noh1, Hyo Sik Yoon1  +4 moreInstitutions (1)
Abstract: Accurate nuclear segmentation in histopathology images plays a key role in digital pathology. It is considered a prerequisite for the determination of cell phenotype, nuclear morphometrics, cell classification, and the grading and prognosis of cancer. However, it is a very challenging task because of the different types of nuclei, large intraclass variations, and diverse cell morphologies. Consequently, the manual inspection of such images under high-resolution microscopes is tedious and time-consuming. Alternatively, artificial intelligence (AI)-based automated techniques, which are fast and robust, and require less human effort, can be used. Recently, several AI-based nuclear segmentation techniques have been proposed. They have shown a significant performance improvement for this task, but there is room for further improvement. Thus, we propose an AI-based nuclear segmentation technique in which we adopt a new nuclear segmentation network empowered by residual skip connections to address this issue. Experiments were performed on two publicly available datasets: (1) The Cancer Genome Atlas (TCGA), and (2) Triple-Negative Breast Cancer (TNBC). The results show that our proposed technique achieves an aggregated Jaccard index (AJI) of 0.6794, Dice coefficient of 0.8084, and F1-measure of 0.8547 on TCGA dataset, and an AJI of 0.7332, Dice coefficient of 0.8441, precision of 0.8352, recall of 0.8306, and F1-measure of 0.8329 on the TNBC dataset. These values are higher than those of the state-of-the-art methods.
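
The Dice coefficient and the Jaccard index reported above are both overlap measures between a predicted mask and a ground-truth mask. The sketch below computes them for a single pair of masks represented as sets of pixel indices; the masks are hypothetical, and the aggregated Jaccard index (AJI), which additionally matches individual nuclei between prediction and ground truth, is not reproduced here.

```python
def jaccard(pred, truth):
    """Intersection over union of two pixel sets."""
    pred, truth = set(pred), set(truth)
    union = pred | truth
    return len(pred & truth) / len(union) if union else 1.0


def dice(pred, truth):
    """Twice the intersection divided by the sum of the two set sizes."""
    pred, truth = set(pred), set(truth)
    denom = len(pred) + len(truth)
    return 2 * len(pred & truth) / denom if denom else 1.0


# Hypothetical flattened pixel indices, for illustration only.
pred_mask = {1, 2, 3, 4}
true_mask = {3, 4, 5, 6}

j = jaccard(pred_mask, true_mask)
d = dice(pred_mask, true_mask)

# For a single pair of masks the two measures are tied by Dice = 2J / (1 + J).
assert abs(d - 2 * j / (1 + j)) < 1e-12
print(j, d)
```

Because of that identity, Dice is never lower than Jaccard for the same pair of masks, which is worth remembering when comparing numbers across papers that report different measures.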

1 Citations


Open accessJournal ArticleDOI: 10.1109/ACCESS.2021.3103316
09 Aug 2021-IEEE Access
Abstract: A phonocardiogram (PCG) signal represents the murmurs and sounds produced by vibrations during a cardiac cycle. The acoustic wave generated by each beat of the cardiac cycle propagates through the chest wall. It can be easily recorded by a low-cost, small handheld digital device called a stethoscope. It provides information such as heart rate, intensity, tone, quality, frequency, and location of the various components of cardiac sound. Due to these characteristics, phonocardiogram signals can be used to detect heart status at an early stage in a non-invasive manner. In previous studies, the Convolutional Neural Network (ConvNet) is the most studied architecture, fed by features such as Mel Frequency Cepstral (MFC), Chroma Energy Normalized Statistics (CENS), and Constant-Q Transform (CQT). This work proposes a ConvNet model trained on the Hybrid Constant-Q Transform (HCQT) for heart sound beat classification. CQT, Variable-Q Transform (VQT), and HCQT are extracted from each phonocardiogram signal as acoustic features and, together with the dominant MFCC features, fed into five-layer regularized ConvNets. After analyzing the literature in the same domain, it can be stated that this is the first time HCQT has been utilized for PCG signals. The findings of the experiments demonstrate that HCQT is more effective than standard CQT and other variants. Also, the accuracy of the proposed system on the validation datasets is 96% in multi-class classification, significantly outperforming other models. The source code is available in the GitHub repository https://github.com/shamiktiwari/PCG-signal-Classification-using-Hybrid-Constant-Q-Transform to support the research community.
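
What distinguishes a constant-Q transform from a short-time Fourier transform is that its bins are geometrically spaced, so the ratio of center frequency to bandwidth (the Q factor) is the same for every bin. The sketch below illustrates only that bin layout for the standard CQT; the minimum frequency, bin count and bins per octave are assumed values, not settings taken from the paper, and the hybrid variant (HCQT) is not reproduced.

```python
def cqt_center_frequencies(f_min, n_bins, bins_per_octave):
    """Center frequency of bin k is f_min * 2 ** (k / bins_per_octave)."""
    return [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


def q_factor(bins_per_octave):
    """Ratio of a bin's center frequency to the spacing to the next bin."""
    return 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)


# Assumed, illustrative settings: 4 octaves starting at 20 Hz, 12 bins each.
freqs = cqt_center_frequencies(f_min=20.0, n_bins=48, bins_per_octave=12)
q = q_factor(12)

# Every bin has the same ratio f_k / (f_{k+1} - f_k), hence "constant Q".
ratios = [freqs[k] / (freqs[k + 1] - freqs[k]) for k in range(len(freqs) - 1)]
print(round(q, 3), round(min(ratios), 3), round(max(ratios), 3))
```

The geometric layout gives fine frequency resolution at low frequencies, where most heart-sound energy lies, at the cost of coarser time resolution there.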

Topics: Phonocardiogram (67%), Mel-frequency cepstrum (52%)

1 Citations


Proceedings ArticleDOI: 10.1109/ICESC51422.2021.9532766
Ann Nita Netto1, Lizy Abraham1Institutions (1)
04 Aug 2021-
Abstract: Cardiovascular disease (CVD) is one of the prime reasons for death in India and across the globe. Rural areas of India suffer from a shortage of cardiologists and medical facilities. Hence, there is a need for the development of an efficient, automated heart disease detection system that can analyse the phonocardiogram to detect the disease. The paper proposes deep learning architectures for anomaly detection from heart sounds. The work classifies the unsegmented phonocardiograms into five classes: four cardiovascular diseases and normal (N). The detected pathological conditions are mitral valve prolapse (MVP), mitral stenosis (MS), mitral regurgitation (MR) and aortic stenosis (AS). Features are extracted using Mel Frequency Cepstral Coefficients (MFCCs), and learning and classification are performed using deep learning methods such as Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM) and a combination of 1DCNN and LSTM. A total of 1960 phonocardiogram (PCG) segments are used to develop the models, with 392 segments in each class. We achieved accuracies of 99.1%, 98.2% and 99.4% for CNN, LSTM and 1DCNN-LSTM, respectively.
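
MFCC extraction relies on warping the frequency axis to the mel scale before the cepstral step. The sketch below shows only that warping, using one common mel formula (other conventions exist); the frequency range and filter count are assumed values that the abstract does not specify, and the rest of the MFCC pipeline (framing, spectrum, log energies, DCT) is left out.

```python
import math


def hz_to_mel(f_hz):
    """One widely used mel formula; not necessarily the one used in the paper."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_low, f_high, n_filters):
    """Edge frequencies (Hz) for n_filters triangular filters.

    The edges are equally spaced on the mel axis, so n_filters + 2 points
    are needed: each filter spans three consecutive edges.
    """
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + k * step) for k in range(n_filters + 2)]


# Assumed range and filter count, for illustration only.
edges = mel_band_edges(f_low=20.0, f_high=2000.0, n_filters=20)
print([round(e, 1) for e in edges])
```

Equal steps in mel become progressively wider steps in hertz, so the filterbank is denser at the low frequencies.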

Topics: Phonocardiogram (60%), Heart sounds (51%), Mitral valve prolapse (50%)

Journal ArticleDOI: 10.1007/S10772-021-09890-4
Abstract: During each cardiac cycle, vibrations of the heart create sounds and murmurs. The graphical representation of these sound and murmur waves is called a phonocardiogram (PCG). A digital stethoscope is used to record the audio signals generated by heart vibrations. Audio recorded through a digital stethoscope can be used to obtain information such as tone, quality, intensity, frequency, and heart rate. Depending on the heart condition, this information differs from person to person and can be used to predict the status of the heart at an early stage in a non-invasive manner. In this research work, the authors use deep learning models to classify PCG signals into five classes, namely extra systole, extra heart sound, artifacts, normal heartbeat and murmur. Spectrograms are first extracted from the PCG sound as images and fed into a Regularized Convolutional Neural Network. In a simulation environment designed in Python, the proposed model achieved an average accuracy of 94% when classifying PCG sounds into the five classes.

Topics: Phonocardiogram (62%), Heartbeat (53%), Stethoscope (52%)

References

54 results found


Journal ArticleDOI: 10.1162/NECO.1997.9.8.1735
Sepp Hochreiter1, Jürgen Schmidhuber2Institutions (2)
01 Nov 1997-Neural Computation
Abstract: Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, LSTM can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units. Multiplicative gate units learn to open and close access to the constant error flow. LSTM is local in space and time; its computational complexity per time step and weight is O(1). Our experiments with artificial data involve local, distributed, real-valued, and noisy pattern representations. In comparisons with real-time recurrent learning, back propagation through time, recurrent cascade correlation, Elman nets, and neural sequence chunking, LSTM leads to many more successful runs, and learns much faster. LSTM also solves complex, artificial long-time-lag tasks that have never been solved by previous recurrent network algorithms.
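
The "constant error carousel" is a cell state with a self-connection of fixed weight 1.0, guarded by multiplicative input and output gates. The following is a deliberately simplified scalar sketch of that idea, not the paper's full formulation: the weights are hand-picked hypothetical values, the gates see only the current input (recurrent connections into the gates are omitted), and there is no forget gate, which later LSTM variants add.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical (weight, bias) pairs: the input gate opens only when x is near 1.
W = {"in_gate": (20.0, -10.0), "out_gate": (0.0, 2.0), "cell_in": (2.0, 0.0)}


def memory_cell_step(x, c_prev, w=W):
    """One time step of a scalar memory cell; returns (output, new cell state)."""
    i = sigmoid(w["in_gate"][0] * x + w["in_gate"][1])    # input gate
    o = sigmoid(w["out_gate"][0] * x + w["out_gate"][1])  # output gate
    g = math.tanh(w["cell_in"][0] * x + w["cell_in"][1])  # candidate input
    c = c_prev + i * g    # carousel: self-connection with fixed weight 1.0
    h = o * math.tanh(c)  # gated read-out of the stored value
    return h, c


def run(inputs):
    """Feed a sequence through the cell and record (output, state) per step."""
    c, trace = 0.0, []
    for x in inputs:
        h, c = memory_cell_step(x, c)
        trace.append((h, c))
    return trace


# Write once at t = 0, then feed 1000 zero inputs.
trace = run([1.0] + [0.0] * 1000)
print(trace[0][1], trace[-1][1])
```

With a zero input the candidate value is tanh(0) = 0, so nothing is added to the state and the value written at the first step is still there 1000 steps later. A plain recurrent unit with a squashing nonlinearity and a recurrent weight below 1 would have let it decay.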

49,735 Citations


Open accessProceedings Article
Sergey Ioffe1, Christian Szegedy1Institutions (1)
06 Jul 2015-
Abstract: Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch. Batch Normalization allows us to use much higher learning rates and be less careful about initialization, and in some cases eliminates the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin. Using an ensemble of batch-normalized networks, we improve upon the best published result on ImageNet classification: reaching 4.82% top-5 test error, exceeding the accuracy of human raters.
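
The normalization step described in this abstract can be sketched in a few lines. This is a minimal training-mode forward pass in plain Python, assuming a mini-batch given as a list of feature vectors; the toy batch, the `eps` value and the function name are illustrative choices, and the inference-time use of population statistics and the backward pass are left out.

```python
import math


def batch_norm_train(batch, gamma, beta, eps=1e-5):
    """Normalize each feature over the mini-batch, then scale and shift.

    batch: list of samples, each a list of feature values.
    gamma, beta: learned per-feature scale and shift.
    """
    m = len(batch)
    n_features = len(batch[0])
    out = [[0.0] * n_features for _ in range(m)]
    for j in range(n_features):
        column = [row[j] for row in batch]
        mean = sum(column) / m
        var = sum((v - mean) ** 2 for v in column) / m  # biased batch variance
        inv_std = 1.0 / math.sqrt(var + eps)
        for i in range(m):
            out[i][j] = gamma[j] * (column[i] - mean) * inv_std + beta[j]
    return out


# Hypothetical mini-batch with two features on very different scales.
toy_batch = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
normalized = batch_norm_train(toy_batch, gamma=[1.0, 1.0], beta=[0.0, 0.0])
print(normalized)
```

With `gamma` set to 1 and `beta` to 0, both output columns end up with approximately zero mean and unit variance regardless of their original scale; the learned `gamma` and `beta` let the network undo the normalization where that helps.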

23,723 Citations


Open accessPosted Content
Sergey Ioffe1, Christian Szegedy1Institutions (1)
11 Feb 2015-arXiv: Learning
Abstract: Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch. Batch Normalization allows us to use much higher learning rates and be less careful about initialization. It also acts as a regularizer, in some cases eliminating the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin. Using an ensemble of batch-normalized networks, we improve upon the best published result on ImageNet classification: reaching 4.9% top-5 validation error (and 4.8% test error), exceeding the accuracy of human raters.

17,151 Citations


Open accessProceedings ArticleDOI: 10.1109/CVPR.2018.00474
Mark Sandler1, Andrew Howard1, Menglong Zhu1, Andrey Zhmoginov1  +1 moreInstitutions (1)
18 Jun 2018-
Abstract: In this paper we describe a new mobile architecture, MobileNetV2, that improves the state of the art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes. We also describe efficient ways of applying these mobile models to object detection in a novel framework we call SSDLite. Additionally, we demonstrate how to build mobile semantic segmentation models through a reduced form of DeepLabv3 which we call Mobile DeepLabv3. MobileNetV2 is based on an inverted residual structure where the shortcut connections are between the thin bottleneck layers. The intermediate expansion layer uses lightweight depthwise convolutions to filter features as a source of non-linearity. Additionally, we find that it is important to remove non-linearities in the narrow layers in order to maintain representational power. We demonstrate that this improves performance and provide an intuition that led to this design. Finally, our approach allows decoupling of the input/output domains from the expressiveness of the transformation, which provides a convenient framework for further analysis. We measure our performance on ImageNet [1] classification, COCO object detection [2], VOC image segmentation [3]. We evaluate the trade-offs between accuracy, and number of operations measured by multiply-adds (MAdd), as well as actual latency, and the number of parameters.
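
To see why the depthwise step in the expansion layer is called lightweight, it helps to count weights. The sketch below assumes the usual three-stage layout of an inverted residual block (1x1 expansion, 3x3 depthwise, 1x1 linear projection) and ignores biases and batch-normalization parameters; the channel width of 24 and expansion factor of 6 are illustrative inputs, and the function names are hypothetical.

```python
def inverted_residual_weights(c_in, c_out, expansion=6, kernel=3):
    """Convolution weight counts for one inverted residual block."""
    hidden = c_in * expansion
    expand = c_in * hidden               # 1x1 conv: thin -> wide
    depthwise = kernel * kernel * hidden  # one k x k filter per channel
    project = hidden * c_out             # 1x1 linear conv: wide -> thin
    return {
        "expand_1x1": expand,
        "depthwise": depthwise,
        "project_1x1": project,
        "total": expand + depthwise + project,
    }


def standard_conv_weights(c_in, c_out, kernel=3):
    """Weight count of an ordinary k x k convolution."""
    return kernel * kernel * c_in * c_out


def uses_shortcut(c_in, c_out, stride):
    """The shortcut joins the two thin ends, so their shapes must match."""
    return stride == 1 and c_in == c_out


block = inverted_residual_weights(c_in=24, c_out=24, expansion=6)
wide_full_conv = standard_conv_weights(c_in=144, c_out=144)
print(block, wide_full_conv, uses_shortcut(24, 24, stride=1))
```

In this example the depthwise filter over the 144 expanded channels needs 1,296 weights, whereas a full 3x3 convolution at the same width would need 186,624. Most of the block's cost therefore sits in the two 1x1 convolutions.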

Topics: Mobile architecture (54%), Object detection (53%), Image segmentation (52%)

5,263 Citations


Open accessJournal ArticleDOI: 10.1109/78.650093
Abstract: In the first part of this paper, a regular recurrent neural network (RNN) is extended to a bidirectional recurrent neural network (BRNN). The BRNN can be trained without the limitation of using input information just up to a preset future frame. This is accomplished by training it simultaneously in positive and negative time direction. Structure and training procedure of the proposed network are explained. In regression and classification experiments on artificial data, the proposed structure gives better results than other approaches. For real data, classification experiments for phonemes from the TIMIT database show the same tendency. In the second part of this paper, it is shown how the proposed bidirectional structure can be easily modified to allow efficient estimation of the conditional posterior probability of complete symbol sequences without making any explicit assumption about the shape of the distribution. For this part, experiments on real data are reported.
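
Processing a sequence in both time directions can be illustrated with two independent recurrent passes whose states are paired per time step. This is a toy scalar sketch with hypothetical fixed weights rather than the trained network of the paper, and it covers only the forward computation, not the training procedure.

```python
import math


def rnn_pass(xs, w_in, w_rec):
    """Simple tanh recurrence over xs; returns the hidden state at each step."""
    h, states = 0.0, []
    for x in xs:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states


def bidirectional_states(xs, fwd=(0.5, 0.3), bwd=(0.4, -0.2)):
    """Pair a forward-time state with a backward-time state at every step.

    fwd and bwd are hypothetical (input weight, recurrent weight) pairs.
    """
    forward = rnn_pass(xs, *fwd)
    # Run the second recurrence on the reversed sequence, then re-align it.
    backward = rnn_pass(xs[::-1], *bwd)[::-1]
    return list(zip(forward, backward))


a = bidirectional_states([1.0, 0.0, 0.0])
b = bidirectional_states([1.0, 0.0, 2.0])  # only the last input differs

# At t = 0 the forward state has seen only x[0]; the backward state has
# seen the whole sequence, so it alone reacts to the changed last input.
print(a[0], b[0])
```

The pair at each step therefore summarizes both the past and the future of that position, which is what allows the BRNN to be trained without fixing a preset future frame.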

5,216 Citations


Performance
Metrics
No. of citations received by the Paper in previous years
Year    Citations
2021    8