Open Access, Journal Article

Optimising Speaker-Dependent Feature Extraction Parameters to Improve Automatic Speech Recognition Performance for People with Dysarthria

Marco Marini, +2 more
27 Sep 2021 - Vol. 21, Iss. 19, p. 6460
TLDR
This paper proposes fine-tuning the size and shift of the spectral analysis window used to compute the initial short-time Fourier transform, in order to improve the performance of a speaker-dependent ASR system.
Abstract
Within the field of Automatic Speech Recognition (ASR), handling impaired speech is a major challenge because standard approaches are ineffective in the presence of dysarthria. The first aim of our work is to confirm the effectiveness of a new speech analysis technique for speakers with dysarthria. This approach fine-tunes the size and shift parameters of the spectral analysis window used to compute the initial short-time Fourier transform, in order to improve the performance of a speaker-dependent ASR system. The second aim is to determine whether there is a correlation between a speaker’s voice features and the optimal window and shift parameters that minimise the error of an ASR system for that specific speaker. For our experiments, we used both impaired and unimpaired Italian speech. Specifically, we used 30 speakers with dysarthria from the IDEA database and 10 professional speakers from the CLIPS database. Both databases are freely available. The results confirm that, if a standard ASR system performs poorly for a speaker with dysarthria, it can be improved by using the new speech analysis. Conversely, the new approach is ineffective for unimpaired and mildly impaired speech. Furthermore, there is a correlation between some of a speaker’s voice features and that speaker’s optimal parameters.
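The window-tuning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`frame_signal`, `stft_magnitudes`, `best_window_and_shift`), the Hamming window, and the exhaustive grid search are all assumptions made for illustration, and `error_for` is a hypothetical callback standing in for training and evaluating a speaker-dependent ASR system with a given window size and shift.

```python
import cmath
import math
from typing import Callable, Iterable, List, Tuple


def frame_signal(samples: List[float], sample_rate: int,
                 window_ms: float, shift_ms: float) -> List[List[float]]:
    """Cut the signal into Hamming-windowed frames.

    window_ms is the analysis window size and shift_ms the hop between
    consecutive frames; these are the two parameters being tuned.
    """
    size = int(round(sample_rate * window_ms / 1000.0))
    hop = int(round(sample_rate * shift_ms / 1000.0))
    if size < 2 or hop < 1:
        raise ValueError("window must span >= 2 samples and shift >= 1 sample")
    hamming = [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (size - 1))
               for n in range(size)]
    # Trailing samples that do not fill a whole window are dropped.
    return [[s * w for s, w in zip(samples[start:start + size], hamming)]
            for start in range(0, len(samples) - size + 1, hop)]


def magnitude_spectrum(frame: List[float]) -> List[float]:
    """Magnitude of a plain DFT of one frame (bins 0 .. N/2)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def stft_magnitudes(samples: List[float], sample_rate: int,
                    window_ms: float, shift_ms: float) -> List[List[float]]:
    """Short-time Fourier transform magnitudes, one spectrum per frame."""
    return [magnitude_spectrum(f)
            for f in frame_signal(samples, sample_rate, window_ms, shift_ms)]


def best_window_and_shift(window_grid_ms: Iterable[float],
                          shift_grid_ms: Iterable[float],
                          error_for: Callable[[float, float], float]
                          ) -> Tuple[float, float]:
    """Return the (window, shift) pair with the lowest error.

    error_for is a hypothetical stand-in for an ASR error measurement
    obtained for one speaker with the given parameters.
    """
    shifts = list(shift_grid_ms)
    candidates = [(w, s) for w in window_grid_ms for s in shifts if s <= w]
    return min(candidates, key=lambda ws: error_for(ws[0], ws[1]))
```

In use, `error_for` would wrap whatever recogniser and error metric are available for the speaker in question. The grid values passed to `best_window_and_shift` are likewise up to the caller; the abstract does not state the ranges explored in the paper.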


Citations
Journal Article

Improved Feature Parameter Extraction from Speech Signals Using Machine Learning Algorithm

TL;DR: This study proposes a machine learning-based approach that performs feature parameter extraction from speech signals to improve the performance of speech recognition applications in real-time smart city environments and achieves seamless classification performance compared to other conventional speech recognition algorithms.
Proceedings Article

Research on the Influence of Different Feature Parameters on Speech Recognition Rate

TL;DR: This paper introduces three feature parameters common in speech recognition, namely LPCC, MFCC, and RAS_MFCC, and applies the WFBA method to MFCC.
Journal Article

Automatic Assessment of Aphasic Speech Sensed by Audio Sensors for Classification into Aphasia Severity Levels to Recommend Speech Therapies

TL;DR: The study shows that aphasia level classification can be completed with accuracy, precision, recall, and F1-score values of 0.99 using MFCC for 20 s audio samples using the deep neural network approach in order to recommend corresponding speech therapy for the identified level.