Journal Article
A comparison between native and non-native speech for automatic speech recognition
Seongjin Park, John Culnan +1 more
TL;DR: Preliminary results suggest that non-native speakers of English fail to produce flaps and reduced vowels, insert or delete segments, engage in more self-correction, and place pauses in different locations from native speakers.
Abstract:
This study investigates differences in sentence and story production between native and non-native speakers of English for use with a system of Automatic Speech Recognition (ASR). Previous studies have shown that production errors by non-native speakers of English include misproduced segments (Flege, 1995), longer pause duration (Anderson-Hsieh and Venkatagiri, 1994), abnormal pause location within clauses (Kang, 2010), and non-reduction of function words (Jang, 2009). The present study uses phonemically balanced sentences from TIMIT (Garofolo et al., 1993) and a story to provide an additional comparison of the differences in production by native and non-native speakers of English. Consistent with previous research, preliminary results suggest that non-native speakers of English fail to produce flaps and reduced vowels, insert or delete segments, engage in more self-correction, and place pauses in different locations from native speakers. Non-native English speakers furthermore produce different patterns of intonation from native speakers and produce errors indicative of transfer from their L1 phonology, such as coda deletion and vowel epenthesis. Native speaker productions also contained errors, the majority of which were content-related. These results indicate that difficulties posed by English ASR systems in recognizing non-native speech are due largely to the heterogeneity of non-native production.
Citations
Proceedings Article
Overview of the Interspeech TLT2020 Shared Task on ASR for Non-Native Children's Speech.
TL;DR: This paper describes the corpus of non-native children’s speech that was used for the ASR challenge, analyzes the results, and discusses some points that should be considered for subsequent challenges in this domain.
Proceedings Article
Self-supervised end-to-end ASR for low resource L2 Swedish
TL;DR: This work experiments with several monolingual and cross-lingual self-supervised acoustic models to develop an end-to-end ASR system for L2 Swedish, and indicates that these systems are competitive in performance with a traditional ASR pipeline.
Journal Article
Automatic Speech Recognition and Pronunciation Error Detection of Dutch Non-native Speech: cumulating speech resources in a pluricentric language
TL;DR: In this article, the authors investigate ways of addressing these problems through conventional and transfer-learning Deep Neural Network (DNN)-based Automatic Speech Recognition (ASR) and ASR-based pronunciation error detection (PED) by cumulating Netherlandic Dutch and Flemish Dutch speech resources.
Journal Article
Audio Augmentation for Non-Native Children’s Speech Recognition through Discriminative Learning
Kodali Radha, Mohan Bansal +1 more
TL;DR: Harnessing the collaborative power of speed perturbation-based data augmentation on the original children’s speech corpora yields effective performance and reveals that feature-space MMI models with steadily increasing speed perturbation factors outperform traditional ASR baseline models.
Reconnaissance automatique de la parole : génération des prononciations non natives pour l'enrichissement du lexique
TL;DR: This paper presents a lexicon adaptation method intended to improve automatic speech recognition systems (SRAP, the French acronym) for non-native speakers.
Related Papers (5)
Non-native speaker pause patterns closely correspond to those of native speakers at different speech rates.
Acoustic features of English sentences produced by native and non-native speakers
Yu-Fu Chen, Chang Liu, Su-Hyun Jin +2 more