Optimising Speaker-Dependent Feature Extraction Parameters to Improve Automatic Speech Recognition Performance for People with Dysarthria

doi:10.3390/S21196460

Open Access Journal Article

Marco Marini, +2 more

Sensors, Vol. 21, Iss. 19, 6460 (27 Sep 2021)

TLDR

This paper proposes fine-tuning the size and shift of the spectral analysis window used to compute the initial short-time Fourier transform, in order to improve the performance of a speaker-dependent ASR system.

Abstract:

In Automatic Speech Recognition (ASR), handling impaired speech is a major challenge because standard approaches are ineffective in the presence of dysarthria. The first aim of our work is to confirm the effectiveness of a new speech analysis technique for speakers with dysarthria. This approach fine-tunes the size and shift parameters of the spectral analysis window used to compute the initial short-time Fourier transform, in order to improve the performance of a speaker-dependent ASR system. The second aim is to determine whether there is a correlation between a speaker’s voice features and the optimal window and shift parameters that minimise the error of an ASR system for that specific speaker. For our experiments, we used both impaired and unimpaired Italian speech. Specifically, we used 30 speakers with dysarthria from the IDEA database and 10 professional speakers from the CLIPS database. Both databases are freely available. The results confirm that, if a standard ASR system performs poorly for a speaker with dysarthria, it can be improved by using the new speech analysis. Conversely, the new approach is ineffective for unimpaired and mildly impaired speech. Furthermore, there is a correlation between some of a speaker’s voice features and that speaker’s optimal parameters.
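The idea of tuning the analysis window per speaker can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `stft_magnitude` and `grid_search`, the Hamming window, the candidate values, and the `error_fn` callback (a stand-in for training and scoring a speaker-dependent ASR system) are all assumptions made for illustration. The sketch computes a magnitude short-time Fourier transform for a given window size and shift, then searches a small grid for the pair with the lowest error.

```python
import numpy as np


def stft_magnitude(signal, sample_rate, window_ms, shift_ms):
    """Magnitude STFT with a configurable window size and shift (both in ms)."""
    win_len = int(round(sample_rate * window_ms / 1000.0))
    hop_len = int(round(sample_rate * shift_ms / 1000.0))
    if win_len <= 0 or hop_len <= 0:
        raise ValueError("window and shift must be at least one sample long")
    signal = np.asarray(signal, dtype=float)
    if len(signal) < win_len:
        # Zero-pad very short signals so that at least one frame exists.
        signal = np.pad(signal, (0, win_len - len(signal)))
    n_frames = 1 + (len(signal) - win_len) // hop_len
    window = np.hamming(win_len)  # assumed window shape, not taken from the paper
    frames = np.stack([
        signal[i * hop_len: i * hop_len + win_len] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1))


def grid_search(signal, sample_rate, window_sizes_ms, shifts_ms, error_fn):
    """Return (error, window_ms, shift_ms) for the lowest-error pair.

    error_fn is a hypothetical placeholder: in a real experiment it would
    train and evaluate a speaker-dependent recogniser on the features.
    """
    best = None
    for window_ms in window_sizes_ms:
        for shift_ms in shifts_ms:
            if shift_ms > window_ms:
                continue  # skip shifts that would leave gaps between frames
            spectrogram = stft_magnitude(signal, sample_rate, window_ms, shift_ms)
            error = error_fn(spectrogram)
            if best is None or error < best[0]:
                best = (error, window_ms, shift_ms)
    return best
```

In practice `error_fn` would be by far the most expensive step, since each candidate pair implies re-extracting features and re-evaluating the recogniser for that speaker. A 25 ms window with a 10 ms shift is the conventional default in many ASR front ends, so a grid would typically be centred on those values.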
