Author

Saumya Borwankar

Bio: Saumya Borwankar is an academic researcher from Nirma University of Science and Technology. The author has contributed to research on deep learning and convolutional neural networks. The author has an h-index of 1 and has co-authored 8 publications receiving 2 citations.

Papers
Journal ArticleDOI
TL;DR: In this article, a novel approach is proposed to pre-process the data and pass it through a newly proposed CNN architecture, which helps to make an accurate diagnosis of lung sounds.
Abstract: Every respiratory-related checkup includes audio samples collected from the individual using tools such as a sonograph or stethoscope. This audio is analyzed to identify pathology, which requires time and effort. The research work proposed in this paper aims to ease the task with deep learning, diagnosing lung-related pathologies with a Convolutional Neural Network (CNN) applied to transformed features from the audio samples. The International Conference on Biomedical and Health Informatics (ICBHI) corpus dataset was used for lung sounds. Here a novel approach is proposed to pre-process the data and pass it through a newly proposed CNN architecture. The combination of the pre-processing steps MFCC, Melspectrogram, and Chroma CENS with the CNN improves the performance of the proposed system, which helps to make an accurate diagnosis of lung sounds. The comparative analysis shows that the proposed approach outperforms previous state-of-the-art research approaches. It also shows that a wheeze or a crackle does not need to be present in the lung sound to classify respiratory pathologies.

7 citations

Proceedings ArticleDOI
02 Jul 2020
TL;DR: This work has automated the process of diagnosis of glaucoma using deep learning approaches and compared the results with previous approaches, which shows that this method has a better accuracy score.
Abstract: Glaucoma is one of the leading causes of vision loss and in many cases is irreversible [1]. It is a condition that damages the optic nerve, and it goes unnoticed in the early stages because the symptoms are not prominent. Recent approaches have been made to automate the detection of glaucoma based on available datasets. The World Health Organization also regards eye defects as critical, as a result of its global evaluation of health challenges. Surveys suggest that it could become one of the primary concerns in 2020, affecting around 75-80 million people. We have automated the process of diagnosis of glaucoma using deep learning approaches. Image processing has gained a lot of attention and can be used for this problem to form a computer-aided diagnosis for diseases. Finally, we have compared our results with previous approaches, which shows that our method has a better accuracy score.

3 citations

Book ChapterDOI
01 Jan 2021
TL;DR: In this paper, a convolutional neural network (CNN) was used for speech emotion classification, reaching an accuracy of around 97% on three publicly available datasets: Surrey Audio-Visual Expressed Emotion (SAVEE), Toronto emotional speech set (TESS), and Berlin Database of Emotional Speech (Emo-DB).
Abstract: In our day-to-day life, speech is the primary medium of communication between humans. All the interpersonal communication that takes place is emotional. There is often a need to predict the emotion of the intended speech to understand the emotional and psychological state of the person. Machines can now automate this task with the help of machine learning, so the task of speech emotion detection has seen many developments. In this paper, we have looked at a different feature for the classification of speech emotion and have analyzed the results on three publicly available datasets, namely, Surrey Audio-Visual Expressed Emotion (SAVEE), Toronto emotional speech set (TESS), and Berlin Database of Emotional Speech (Emo-DB), with the help of convolutional neural networks. The accuracy of the proposed model reaches around 97%, which is better than previous approaches. This robust Speech Emotion Recognition (SER) system can help people in many sectors, such as healthcare and accident prevention, to name a few.

2 citations

Posted Content
TL;DR: The core fundamentals and a brief overview of the research of R. G. Gallager on Low-Density Parity-Check (LDPC) codes are presented, along with various parameters related to LDPC codes such as encoding and decoding, code rate, the parity-check matrix, and the Tanner graph.
Abstract: This paper presents the core fundamentals and a brief overview of the research of R. G. Gallager [1] on Low-Density Parity-Check (LDPC) codes and various parameters related to LDPC codes, such as encoding and decoding, code rate, the parity-check matrix, and the Tanner graph. We also discuss advantages and applications as well as the usage of LDPC codes in 5G technology. We have simulated encoding and decoding of LDPC codes in MATLAB and obtained results in the form of a BER vs SNR graph. This report was submitted as an assignment at Nirma University.
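The terms listed in this abstract (parity-check matrix, code rate, Tanner graph) can be illustrated with a minimal Python sketch. It is not taken from the report, which used MATLAB. The matrix `H` and the helper names are hypothetical, and the small (7,4) Hamming matrix is only a stand-in: it is not low-density, whereas real LDPC matrices are much larger and sparser.

```python
# Illustrative parity-check matrix (assumption: a (7,4) Hamming code,
# chosen only because it is small; it is NOT a real LDPC matrix).
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]


def syndrome(H, word):
    """Return H * word^T over GF(2); all zeros means every parity check passes."""
    return [sum(h * w for h, w in zip(row, word)) % 2 for row in H]


def code_rate(H):
    """Code rate k/n, assuming the rows of H are linearly independent."""
    m, n = len(H), len(H[0])
    return (n - m) / n


def tanner_edges(H):
    """Tanner graph edges (check node i, variable node j): one per 1 in H."""
    return [(i, j) for i, row in enumerate(H) for j, bit in enumerate(row) if bit]


codeword = [1, 0, 1, 1, 0, 1, 0]   # satisfies all three parity checks
corrupted = codeword.copy()
corrupted[2] ^= 1                  # flip one bit to simulate a channel error

print(syndrome(H, codeword))       # all-zero syndrome
print(syndrome(H, corrupted))      # non-zero: equals column 2 of H
print(code_rate(H), len(tanner_edges(H)))
```

A non-zero syndrome is what an iterative LDPC decoder tries to drive to zero by passing messages along the Tanner graph edges; that decoding step is not shown here.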

2 citations

Book ChapterDOI
15 May 2020
TL;DR: In this paper, an accurate and robust automatic speaker verification (ASV) system has been proposed to authenticate users at a text independent level by converting audio files to spectrograms which are pre-processed and then are classified using Convolutional Neural Networks (CNN).
Abstract: In this paper we have designed and implemented an accurate and robust automatic speaker verification (ASV) system. There is a constant need to improve the performance of ASV because of its many advantages over other biometric means. In this work we have experimented with the VoxCeleb dataset, which has more than 1200 individuals and around 300–500 utterances each. We have proposed a new approach to authenticate users at a text-independent level. First, the audio files are converted to spectrograms, which are pre-processed and then classified using Convolutional Neural Networks (CNN), and a model is created at a text-independent level. The classification of the individual speakers is made on the basis of the spectrograms. Finally, the performance evaluation is done using the training and validation accuracy plots.

1 citation


Cited by
Posted ContentDOI
TL;DR: A new methodology called Depth Optimized Machine Learning Strategy (DOMLS) is introduced to identify Glaucoma at earlier stages; it adopts a new optimization logic called Modified K-Means Optimization Logic (MkMOL) to provide the best accuracy, and the proposed approach achieves an accuracy of more than 96.2%.
Abstract: Glaucoma is a major threat to vision: it affects the optic nerve and can lead to permanent blindness. Its major causes include high pressure in the eyes, family history, and irregular sleeping habits, among others. The disease heavily damages the optic nerve, and the affected person can become permanently blind within a few months. Normally the eye fluid, called aqueous humor, leaves the eye via the mesh perspective channel; Glaucoma blocks that channel and traps the fluid inside, causing severe problems such as infection and random blindness in the initial stages. The World Health Organization reports that nearly 80 million people around the globe are affected by some form of Glaucoma. The major problem with this disease is that it is incurable; its progression can be slowed and held at the same level for a long period, but only if it is identified at an early stage. Glaucoma causes structural damage to the eyeball that is difficult to assess during regular diagnosis. In medical terms, the Cup to Disc Ratio (CDR) of Glaucoma patients is suddenly minimized, severely damaging the eye. Glaucoma is generally identified with an Optical Coherence Tomography (OCT) test, which captures the uncovered (back) portion of the eyeball and is an efficient way to visualize different portions of the eye with the optic nerve clearly visible. OCT images are mainly used to identify diseases like Glaucoma with robust accuracy.
In this paper, a new methodology called Depth Optimized Machine Learning Strategy (DOMLS) is introduced to identify Glaucoma at earlier stages. It adopts a new optimization logic called Modified K-Means Optimization Logic (MkMOL) to provide the best accuracy, and the proposed approach achieves an accuracy of more than 96.2% with an error rate as low as 0.002%. This paper focuses on identifying early-stage Glaucoma from OCT images and provides an efficient solution for people affected by the disease. The exact position is located using Region of Interest (ROI) based optical region selection, which makes it easy to locate the Optical Cup (OC) and Optical Disc (OD). The proposed DOMLS algorithm demonstrates its accuracy in estimating Glaucoma, and the results section presents the practical evidence clearly.

18 citations

Journal ArticleDOI
TL;DR: This work has proposed enhanced artificial neural network approaches for the accurate prediction of lung diseases, using the 120 subjective datasets from public landmarks with and without lung diseases to provide enhanced classification accuracy.
Abstract: Lung disease has long been, and remains, one of the most harmful diseases. Early detection is one of the most crucial ways to prevent a person from developing these types of diseases. Many researchers are working on techniques for predicting the diseases accurately. Because machine learning algorithms could not match the prediction accuracy of deep learning techniques, this work has proposed enhanced artificial neural network approaches for the accurate prediction of lung diseases. Here, the discrete Fourier transform and the Burg auto-regression techniques are used for extracting features from the computed tomography (CT) scan images, and feature reduction takes place by using principal component analysis (PCA). This proposed work has used the 120 subjective datasets from public landmarks with and without lung diseases. The given dataset is used to train an enhanced artificial neural network (ANN). Pre-processing is handled with a Gaussian filter; thus, our proposed approach provides enhanced classification accuracy. Finally, our proposed method is compared with the existing machine learning approach based on its accuracy.

9 citations

Journal ArticleDOI
TL;DR: In this paper, the authors introduce an open-access database of synchronously recorded electroencephalogram signals (EEG), voice signals, and video feed from 51 volunteers, 25 female, 26 male, captured for, but not limited to, biometric purposes.
Abstract: Multimodal biometric schemes arise as an interesting solution to the multidimensional reinforcement problem for biometric security systems. Along with the performance dimension, these systems should also comply with required levels for other conditions such as permanence, collectability, and circumvention, among others. In response to the demand for a multimodal and synchronous dataset, we introduce in this paper an open-access database of synchronously recorded electroencephalogram signals (EEG), voice signals, and video feed from 51 volunteers, 25 female, 26 male, captured for, but not limited to, biometric purposes. A total of 140 samples were collected from each user when pronouncing single digits in Spanish, giving a total of 7140 instances. EEG signals were captured using a 14-channel Emotiv™ Epoc headset. The resulting set becomes a valuable resource when working on unimodal biometric systems, but significantly more for the evaluation of multimodal variants. Furthermore, the usefulness of the collected signals extends to being exploited by projects in brain-computer interfaces and face recognition to name just a few. As an initial report on data separability of the related samples, five user recognition experiments are presented: a face recognition identifier with an accuracy of 99%, a speaker identification system with accuracy of 94.2%, a bimodal face-speech verification case with Equal Error Rate around 2.64, an EEG identification example, and a bimodal user identification exercise based on EEG and voice modalities with a registered accuracy of 97.6%.

7 citations

Journal ArticleDOI
TL;DR: In this article, an unsupervised learning approach was used to detect and diagnose faults in a cellular network using two sets of data collected by the network: performance support system data and drive test data.
Abstract: Handling faults in a running cellular network can impair the performance and dissatisfy the end users. It is important to design an automatic self-healing procedure that not only detects the active faults but also diagnoses them automatically. Although fault detection has been well studied in the literature, fewer studies have targeted the more complicated task of diagnosis. Our presented method aims to tackle fault detection and diagnosis using two sets of data collected by the network: performance support system data and drive test data. Although performance support system data is collected automatically by the network, drive test data are collected manually in three call scenarios: short, long and idle. The short call can identify faults in a call setup, the long call is designed to identify handover failures and call interruption, and, finally, the idle mode is designed to understand the characteristics of the standard signal in the network. We have applied unsupervised learning, along with various classification algorithms, to performance support system data. Congestion and failures in TCH assignments are a few examples of the faults detected and diagnosed with our method. In addition, we present a framework to identify the need for handovers. The Silhouette coefficient is used to evaluate the quality of the unsupervised learning approach. We achieved an accuracy of 96.86% with the dynamic neural network method.
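The abstract names the Silhouette coefficient as its clustering quality measure. This minimal Python sketch shows how that coefficient is usually computed. It is not the authors' code: the function name `silhouette_score`, the Euclidean distance, and the toy points are assumptions for illustration, and the real inputs would be network performance measurements.

```python
import math


def silhouette_score(points, labels):
    """Mean of s = (b - a) / max(a, b) over all points.

    a: mean distance from a point to the other members of its own cluster.
    b: smallest mean distance from the point to the members of another cluster.
    Assumes at least two clusters and no cluster made only of identical points.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # The point's distance to itself is 0, so divide by len(own) - 1.
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical 2-D points forming two well-separated groups.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(points, [0, 0, 1, 1])  # labels match the groups
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across the groups
print(good, bad)
```

Values close to 1 indicate compact, well-separated clusters, and negative values indicate points that sit closer to another cluster than to their own.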

4 citations

Journal ArticleDOI
TL;DR: A comprehensive review of prior deep-learning-based architecture lung sound analysis can be found in this article, which discusses different trends in pathology/lung sound, the common features for classifying lung sounds, several considered datasets, classification methods, signal processing techniques, and some statistical information based on previous study findings.
Abstract: Lung auscultation has long been used as a valuable medical tool to assess respiratory health and has received a lot of attention in recent years, notably following the coronavirus epidemic. Lung auscultation is used to assess a patient's respiratory function. Modern technological progress has guided the growth of computer-based respiratory sound analysis, a valuable tool for detecting lung abnormalities and diseases. Several recent studies have reviewed this important area, but none is specific to lung sound analysis with deep-learning architectures, and the information they provide is not sufficient for a good understanding of these techniques. This paper gives a complete review of prior deep-learning-based architecture lung sound analysis. Deep-learning-based respiratory sound analysis articles were found in different databases, including PLOS, ACM Digital Library, Elsevier, PubMed, MDPI, Springer, and IEEE. More than 160 publications were extracted and submitted for assessment. This paper discusses different trends in pathology/lung sound, the common features for classifying lung sounds, several considered datasets, classification methods, signal processing techniques, and some statistical information based on previous study findings. Finally, the assessment concludes with a discussion of potential future improvements and recommendations.

1 citation