Open Access Journal Article (DOI)

Dysarthric speech classification from coded telephone speech using glottal features

N. P. Narendra, +1 more
- 01 Jul 2019 - 
- Vol. 110, pp 47-55
TLDR
The results showed that the glottal features in combination with the openSMILE-based acoustic features resulted in improved classification accuracies, which validates the complementary nature of glottal features.
About
This article is published in Speech Communication. It was published on 2019-07-01 and is currently open access. It has received 31 citations to date. The article focuses on the topic: Dysarthria.


Citations
Journal ArticleDOI

Ensemble Learning of Hybrid Acoustic Features for Speech Emotion Recognition

TL;DR: The authors propose the agglutination of prosodic and spectral features from a group of carefully selected features to realize hybrid acoustic features for improving the task of emotion recognition.
Journal ArticleDOI

Glottal Source Information for Pathological Voice Detection

TL;DR: The evaluation of both approaches demonstrate that automatic detection of pathological voice from healthy speech benefits from using glottal source information.
Journal ArticleDOI

The Detection of Parkinson's Disease From Speech Using Voice Source Information

TL;DR: This article studies the use of voice source information in the detection of Parkinson's disease from speech using two classifier architectures: a traditional pipeline approach and an end-to-end approach.
Journal ArticleDOI

Repeatability of Commonly Used Speech and Language Features for Clinical Applications.

TL;DR: The results demonstrate that the repeatability of speech features extracted using open-source toolkits is low, and researchers should exercise caution when developing digital health models with open-source speech features.
Journal ArticleDOI

Detection of Speech Impairments Using Cepstrum, Auditory Spectrogram and Wavelet Time Scattering Domain Features

TL;DR: Adopts a Bidirectional Long Short-Term Memory neural network and a Wavelet Scattering Transform with a Support Vector Machine classifier for detecting speech impairments in patients at an early stage of central nervous system disorders (CNSD).
References
Book

Applied nonparametric statistics

TL;DR: A reference textbook on applied nonparametric statistical methods.
Book

Motor Speech Disorders: Substrates, Differential Diagnosis, and Management

TL;DR: In this book, the authors define, understand, and categorize motor speech disorders, covering the neurologic bases of motor speech and its pathologies, the examination of motor speech disorders, and the differential diagnosis and management of the individual disorders.
Proceedings ArticleDOI

Recent developments in openSMILE, the munich open-source multimedia feature extractor

TL;DR: openSMILE 2.0 unifies feature extraction paradigms from speech, music, and general sound events with basic video features for multi-modal processing, allowing for time synchronization of parameters, on-line incremental processing as well as off-line and batch processing, and the extraction of statistical functionals (feature summaries).
Proceedings Article

The INTERSPEECH 2009 Emotion Challenge

TL;DR: The challenge, the corpus, the features, and benchmark results of two popular approaches towards emotion recognition from speech, and the FAU Aibo Emotion Corpus are introduced.
Frequently Asked Questions (14)
Q1. What are the contributions mentioned in the paper "Nonavinakere prabhakera, narendra; alku, paavo dysarthric speech classification from coded telephone speech using glottal features" ?

This paper proposes a new dysarthric speech classification method from coded telephone speech using glottal features. The proposed dysarthric speech classification method can potentially be employed in telemonitoring application for identifying the presence of dysarthria from coded telephone speech. 

Possible future works are as follows. Apart from the AMR-NB and AMR-WB codecs, the proposed method can be evaluated using recent codecs, for example, Enhanced Voice Services ( EVS ) codec [ 58 ]. The proposed method can be extended for the speech-based telemonitoring of different neuro-motor disorders such as Parkinson ’ s disease, Alzheimer ’ s disease, and ALS. Apart from neuro-motor disorders, the proposed method can be utilized for different paralinguistic tasks such as the recognition of emotion, and speaker states and traits under the coded condition. 

80 sentence-level utterances from each speaker are used (except for two dysarthric speakers, for whom only 23 and 28 utterances are used due to the limited availability of recordings) to develop the dysarthric speech classification systems. 

The existing dysarthric speech classification systems extract high-dimensional acoustic features to capture the wide variabilities of sources and patterns in pathological speech. 

Two sets of glottal parameters are extracted from glottal flow waveforms, which are estimated using two GIF methods (QCP and DNN-GIF); hence, a total of four types of glottal parameters are extracted. 

In order to estimate the glottal flow from coded speech, two GIF methods are utilized: QCP and the recently proposed DNN-GIF method. 
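The core of glottal inverse filtering (GIF) is removing an estimate of the vocal-tract filter from the speech signal to recover the glottal flow. The paper's QCP and DNN-GIF methods are considerably more refined; the sketch below only illustrates the basic idea with plain autocorrelation-method LPC inverse filtering on a synthetic voiced-like frame. All signal parameters here (sampling rate, F0, formant frequency, LPC order) are illustrative, not taken from the paper.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC via the Yule-Walker normal equations."""
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a, *_ = np.linalg.lstsq(R, r[1:order + 1], rcond=None)
    return np.concatenate(([1.0], -a))  # A(z) = 1 - sum_k a_k z^-k

def inverse_filter(frame, order=12):
    """Toy glottal-source estimate: pass the frame through A(z).

    Stand-in for QCP / DNN-GIF; a real GIF method also compensates for
    lip radiation and weights the closed phase, both omitted here.
    """
    a = lpc_coefficients(frame, order)
    return np.convolve(frame, a)[:len(frame)]  # FIR filtering with A(z)

# Synthetic voiced-like frame: impulse train through one vocal-tract resonance
fs, f0, n_samp = 8000, 100, 400
n = np.arange(n_samp)
excitation = (n % (fs // f0) == 0).astype(float)
pole = 0.95 * np.exp(2j * np.pi * 700 / fs)          # resonance near 700 Hz
den = np.real(np.poly([pole, np.conj(pole)]))         # 2nd-order IIR denominator
frame = np.zeros(n_samp)
for i in range(n_samp):                               # direct IIR recursion
    frame[i] = (excitation[i]
                - den[1] * (frame[i - 1] if i >= 1 else 0.0)
                - den[2] * (frame[i - 2] if i >= 2 else 0.0))

residual = inverse_filter(frame)  # approximates the source signal
```

With the resonance removed, the residual is dominated by the periodic excitation pulses, which is the raw material from which glottal parameters would then be computed.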

Dysarthric speech data was recorded using an eight-microphone array, sampled at 16 kHz, with the microphones spaced at intervals of 1.5 inches. 

Two sets of acoustic features, named openSMILE-1 and openSMILE-2, are also extracted from every coded speech utterance using openSMILE (described in Section 2.4), which is a widely used toolkit in paralinguistic speech processing tasks. 

In this work, two sets of acoustic features, which are extracted from coded telephone speech using the openSMILE toolkit [35] are used as reference features. 
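openSMILE builds its utterance-level feature vectors by applying statistical functionals (feature summaries) to frame-level low-level descriptors, so every utterance maps to a fixed-length vector regardless of duration. The minimal NumPy sketch below shows that pattern with a handful of functionals; the descriptor count and the random input are placeholders, not the paper's openSMILE-1/openSMILE-2 configurations.

```python
import numpy as np

def functionals(llds):
    """Summarize frame-level low-level descriptors (LLDs) into one
    fixed-length utterance vector via statistical functionals.

    llds: (num_frames, num_descriptors) array, e.g. per-frame MFCCs,
    F0, and energy. Returns a vector of num_descriptors * 5 values.
    """
    stats = [
        llds.mean(axis=0),
        llds.std(axis=0),
        llds.min(axis=0),
        llds.max(axis=0),
        np.percentile(llds, 50, axis=0),  # median
    ]
    return np.concatenate(stats)

# 100 frames of 4 hypothetical descriptors -> a 4 * 5 = 20-dim vector
rng = np.random.default_rng(0)
vec = functionals(rng.standard_normal((100, 4)))
```

Real openSMILE feature sets apply many more functionals (regression slopes, percentile ranges, etc.) to many more descriptors, which is how the high-dimensional reference features mentioned above arise.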

The proposed method utilizes SVM to predict dysarthric/healthy labels by using the acoustic and glottal features extracted from coded speech. 

The optimal values of the kernel parameter γ and the penalty parameter C are chosen by grid search, with C and γ varying from 10⁻³ to 10³ in multiples of 10. 
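The SVM training described above can be sketched with scikit-learn's RBF-kernel SVC and a grid search over C and γ in the stated range. The features and labels below are random placeholders standing in for the paper's glottal/openSMILE features and dysarthric/healthy labels, and the use of scikit-learn is an assumption; the paper does not name its SVM implementation.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

# Placeholder data: 120 utterances, 10 features, binary labels
rng = np.random.default_rng(0)
X = rng.standard_normal((120, 10))
y = (X[:, 0] + 0.3 * rng.standard_normal(120) > 0).astype(int)

# C and gamma varied from 10^-3 to 10^3 in multiples of 10, as stated above
grid = {"C": [10.0 ** k for k in range(-3, 4)],
        "gamma": [10.0 ** k for k in range(-3, 4)]}
search = GridSearchCV(SVC(kernel="rbf"), grid, cv=5)
search.fit(X, y)

best_C = search.best_params_["C"]
best_gamma = search.best_params_["gamma"]
```

Cross-validated grid search like this picks one (C, γ) pair per feature set, after which the SVM is retrained on the full training data with those values.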

The glottal flow waveform is obtained from coded telephone speech (coded with two standardized speech codecs - AMR-NB and AMR-WB) using QCP and the recently proposed DNN-GIF method. 

From the table, it can be observed that, with more than 80% classification accuracy (except for openSMILE-1 on NB-coded speech from TORGO, at 77.71%), the two sets of openSMILE-based features achieve better classification accuracy than the glottal parameters after feature selection for both NB- and WB-coded speech.