Proceedings Article

Vocal fold disorder detection based on continuous speech by using MFCC and GMM

TLDR
Mel-frequency cepstral coefficients (MFCC) are used with a Gaussian mixture model (GMM) to build an automatic detection system capable of differentiating normal and pathological voices.
Abstract
Detection of vocal fold disorders from a sustained vowel has been well investigated by the research community in recent years. Detecting a voice disorder from a sustained vowel is a comparatively easier task than detecting it from continuous speech: the speech signal remains stationary for a sustained vowel, but it varies over time in continuous speech. For this reason, voice disorder detection using continuous speech is challenging and demands more investigation. Moreover, detection with continuous speech is more realistic, because people use continuous speech in daily conversation, whereas sustained vowels are not used in everyday talk. An accurate voice assessment can provide unique and complementary information for diagnosis and can be used in the treatment plan. In this paper, five vocal fold disorders (cyst, polyp, nodules, paralysis, and sulcus) are detected using continuous speech. Mel-frequency cepstral coefficients (MFCC) are used with a Gaussian mixture model (GMM) to build an automatic detection system capable of differentiating normal and pathological voices. The detection rate of the developed system with continuous speech is 91.66%.
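The MFCC-plus-GMM approach described in the abstract can be illustrated with a minimal sketch. The abstract does not give the frame size, the number of coefficients, the number of mixtures, or the decision rule, so every parameter below (25 ms frames, 12 coefficients, 4 diagonal-covariance mixtures, one GMM per class compared by average log-likelihood) is an assumption. The helper names `mfcc`, `fit_gmm`, `avg_loglik`, `toy_voice`, and `detect` are hypothetical, and the signals are synthetic stand-ins, not recordings of normal or disordered voices.

```python
import numpy as np


def mfcc(signal, sr=16000, frame_len=400, hop=160, n_fft=512, n_mels=26, n_ceps=12):
    """Return an (n_frames, n_ceps) array of MFCCs for a 1-D signal."""
    # Slice the signal into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Triangular filters whose edges are evenly spaced on the mel scale.
    mel_max = 2595.0 * np.log10(1.0 + (sr / 2) / 700.0)
    edges_hz = 700.0 * (10.0 ** (np.linspace(0.0, mel_max, n_mels + 2) / 2595.0) - 1.0)
    bins = np.floor((n_fft + 1) * edges_hz / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(n_mels):
        lo, mid, hi = bins[m], bins[m + 1], bins[m + 2]
        fbank[m, lo:mid] = (np.arange(lo, mid) - lo) / max(mid - lo, 1)
        fbank[m, mid:hi] = (hi - np.arange(mid, hi)) / max(hi - mid, 1)
    log_mel = np.log(power @ fbank.T + 1e-10)
    # DCT-II of the log filterbank energies; coefficient 0 is dropped.
    k = np.arange(n_mels)
    dct = np.cos(np.pi * np.arange(1, n_ceps + 1)[:, None] * (2 * k + 1) / (2 * n_mels))
    return log_mel @ dct.T


def log_gauss(X, means, variances):
    """Log density of each diagonal-covariance component, shape (N, K)."""
    diff = X[:, None, :] - means[None, :, :]
    return -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                   + np.sum(np.log(2 * np.pi * variances), axis=1))


def fit_gmm(X, n_comp=4, n_iter=50, seed=0):
    """Fit a diagonal-covariance GMM with plain EM (assumed settings)."""
    rng = np.random.default_rng(seed)
    means = X[rng.choice(len(X), n_comp, replace=False)]
    variances = np.tile(X.var(axis=0) + 1e-3, (n_comp, 1))
    weights = np.full(n_comp, 1.0 / n_comp)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each frame.
        log_p = log_gauss(X, means, variances) + np.log(weights)
        resp = np.exp(log_p - log_p.max(axis=1, keepdims=True))
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and floored variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        variances = (resp.T @ X ** 2) / nk[:, None] - means ** 2 + 1e-3
    return weights, means, variances


def avg_loglik(X, gmm):
    """Average per-frame log-likelihood of the features under a GMM."""
    weights, means, variances = gmm
    log_p = log_gauss(X, means, variances) + np.log(weights)
    peak = log_p.max(axis=1)
    return float(np.mean(peak + np.log(np.exp(log_p - peak[:, None]).sum(axis=1))))


def toy_voice(noise, seed, sr=16000, dur=1.0):
    """Synthetic stand-in: 120 Hz harmonic tone plus white noise (not real speech)."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(sr * dur)) / sr
    tone = sum(np.sin(2 * np.pi * 120 * h * t) / h for h in range(1, 6))
    return tone + noise * rng.standard_normal(t.size)


# One GMM per class, trained on MFCC frames pooled over that class's signals.
normal_gmm = fit_gmm(np.vstack([mfcc(toy_voice(0.05, s)) for s in range(3)]))
patho_gmm = fit_gmm(np.vstack([mfcc(toy_voice(0.8, s)) for s in range(3, 6)]))


def detect(signal):
    """Label a signal by whichever class model gives the higher likelihood."""
    feats = mfcc(signal)
    if avg_loglik(feats, patho_gmm) > avg_loglik(feats, normal_gmm):
        return "pathological"
    return "normal"


print(detect(toy_voice(0.05, 100)), detect(toy_voice(0.8, 101)))
```

Scoring with one model per class and comparing likelihoods is a common way to use GMMs for two-class detection, but it is only one possible reading of the abstract; the paper itself should be consulted for the actual configuration and data.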


Citations
Journal Article

Development of the Arabic Voice Pathology Database and Its Evaluation by Using Speech Features and Machine Learning Algorithms

TL;DR: An Arabic voice pathology database (AVPD) is designed and developed in this study by recording three vowels, running speech, and isolated words. The shortcomings of different voice disorder databases were identified so that they could be avoided in the AVPD.
Journal Article

A Survey on Machine Learning Approaches for Automatic Detection of Voice Disorders

TL;DR: This paper surveys research on the automatic detection of voice disorders and on how such systems identify the different types of voice disorders.
Journal Article

An intelligent healthcare system for detection and classification to discriminate vocal fold disorders

TL;DR: The results show that the proposed intelligent healthcare system is accurate and reliable in vocal fold disorder assessment, can be deployed successfully for remote diagnosis, and performs better than existing disorder assessment systems.
Journal Article

An Automatic Health Monitoring System for Patients Suffering From Voice Complications in Smart Cities

TL;DR: An automatic voice disorder detection system to monitor residents of all age groups and professional backgrounds is implemented, and it is found that lower frequencies from 1 to 1562 Hz contribute significantly to the detection of voice disorders.
Journal Article

Automatic Voice Pathology Detection With Running Speech by Using Estimation of Auditory Spectrum and Cepstral Coefficients Based on the All-Pole Model

TL;DR: The developed system can effectively be used in voice pathology detection and classification systems, and the proposed features can visually differentiate between normal and pathological samples.
References
Book

Fundamentals of speech recognition

TL;DR: This book presents a meta-modelling framework for speech recognition that automates the labor-intensive, time-consuming, and therefore expensive process of manually modeling speech.

Gaussian Mixture Models

TL;DR: Gaussian Mixture Model parameters are estimated from training data using the iterative Expectation-Maximization (EM) algorithm or Maximum A Posteriori (MAP) estimation from a well-trained prior model.
Book

Support Vector Machines for Pattern Classification

TL;DR: This book presents architectures for multiclass classification and function approximation problems, as well as evaluation criteria for classifiers and regressors, and discusses kernel methods for improving the generalization ability of neural networks and fuzzy systems.
Journal Article

Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification

TL;DR: The cepstrum was found to be the most effective, providing an identification accuracy of 70% for speech 50 msec in duration, which increased to more than 98% for a duration of 0.5 sec.