Open Access · Proceedings Article · DOI

Sequence-Based Multi-Lingual Low Resource Speech Recognition

TLDR
The authors show that end-to-end multi-lingual training of sequence models is effective for context-independent models trained with Connectionist Temporal Classification (CTC) loss, and that the trained model can be adapted cross-lingually to an unseen language using just 25% of the target data.
Abstract
Techniques for multi-lingual and cross-lingual speech recognition can help in low resource scenarios, to bootstrap systems and enable analysis of new languages and domains. End-to-end approaches, in particular sequence-based techniques, are attractive because of their simplicity and elegance. While it is possible to integrate traditional multi-lingual bottleneck feature extractors as front-ends, we show that end-to-end multi-lingual training of sequence models is effective on context-independent models trained using Connectionist Temporal Classification (CTC) loss. We show that our model improves performance on Babel languages by over 6% absolute in terms of word/phoneme error rate when compared to mono-lingual systems built in the same setting for these languages. We also show that the trained model can be adapted cross-lingually to an unseen language using just 25% of the target data. We show that training on multiple languages is important for very low resource cross-lingual target scenarios, but not for multi-lingual testing scenarios; here, it appears beneficial to include large, well-prepared datasets.
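As a rough illustration of the CTC-based setup the abstract describes, here is a minimal sketch in PyTorch of a bidirectional-LSTM acoustic model with a shared softmax over a pooled multilingual phone set, trained with CTC loss. The network sizes, phone inventory, and dummy batch are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

# Assumed sizes for illustration: a pooled multilingual phone inventory of
# 100 units (plus the CTC blank at index 0) over 40-dim acoustic features.
NUM_PHONES = 101
FEAT_DIM = 40

class MultilingualCTCModel(nn.Module):
    """Minimal BLSTM acoustic model with a shared softmax over the
    pooled multilingual phone set, trained with CTC loss."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.LSTM(FEAT_DIM, 256, num_layers=3,
                               bidirectional=True, batch_first=True)
        self.output = nn.Linear(512, NUM_PHONES)

    def forward(self, feats):
        hidden, _ = self.encoder(feats)
        # nn.CTCLoss expects log-probabilities of shape (T, N, C).
        return self.output(hidden).log_softmax(-1).transpose(0, 1)

model = MultilingualCTCModel()
ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True)

# Dummy batch: 4 utterances, 200 frames each, phone targets of length <= 50.
feats = torch.randn(4, 200, FEAT_DIM)
input_lengths = torch.full((4,), 200, dtype=torch.long)
targets = torch.randint(1, NUM_PHONES, (4, 50), dtype=torch.long)
target_lengths = torch.randint(10, 50, (4,), dtype=torch.long)

log_probs = model(feats)
loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
loss.backward()
```

Because all training languages share one output layer over the pooled phone set, multi-lingual training here is just mini-batches drawn from several languages; cross-lingual adaptation then fine-tunes the same parameters on the target-language subset.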


Citations
Proceedings Article · DOI

Unsupervised Pretraining Transfers Well Across Languages

TL;DR: This paper pretrains ASR systems on unlabeled data using contrastive predictive coding (CPC) and investigates whether the resulting unsupervised pretraining transfers well across languages.
Proceedings Article · DOI

Meta Learning for End-To-End Low-Resource Speech Recognition

TL;DR: This paper applies meta learning to low-resource automatic speech recognition (ASR): ASR in each language is formulated as a separate task, and initialization parameters are meta-learned from many pretraining languages via the model-agnostic meta-learning (MAML) algorithm to achieve fast adaptation to an unseen target language.
Proceedings Article · DOI

Hierarchical Multitask Learning With CTC

TL;DR: This paper shows how Hierarchical Multitask Learning can encourage the formation of useful intermediate representations by performing Connectionist Temporal Classification at different levels of the network with targets of different granularity.
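For concreteness, a minimal sketch of this hierarchical idea, assuming PyTorch: an auxiliary CTC head with fine-grained (phoneme) targets on an intermediate layer, and a second CTC head with coarser (character) targets on the top layer. The vocabulary sizes, depths, and mixing weight are assumptions for illustration, not the cited paper's settings.

```python
import torch
import torch.nn as nn

# Illustrative vocabulary sizes (assumptions, not from the paper):
NUM_PHONEMES = 46   # fine-grained targets for the lower CTC head
NUM_CHARS = 30      # coarser targets for the top CTC head
FEAT_DIM = 40

class HierarchicalCTC(nn.Module):
    """Hierarchical multitask learning: CTC over phoneme targets on an
    intermediate layer, CTC over character targets on the top layer."""
    def __init__(self):
        super().__init__()
        self.lower = nn.LSTM(FEAT_DIM, 256, num_layers=2,
                             bidirectional=True, batch_first=True)
        self.upper = nn.LSTM(512, 256, num_layers=2,
                             bidirectional=True, batch_first=True)
        self.phoneme_head = nn.Linear(512, NUM_PHONEMES)
        self.char_head = nn.Linear(512, NUM_CHARS)

    def forward(self, feats):
        low, _ = self.lower(feats)
        high, _ = self.upper(low)
        # Both heads emit (T, N, C) log-probabilities for nn.CTCLoss.
        return (self.phoneme_head(low).log_softmax(-1).transpose(0, 1),
                self.char_head(high).log_softmax(-1).transpose(0, 1))

ctc = nn.CTCLoss(blank=0, zero_infinity=True)
# The auxiliary phoneme loss is mixed in with a tunable weight (0.3 assumed):
#   total = ctc(char_logp, char_tgt, in_lens, char_lens)
#         + 0.3 * ctc(phone_logp, phone_tgt, in_lens, phone_lens)
```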
Proceedings Article · DOI

Universal Phone Recognition with a Multilingual Allophone System

TL;DR: This paper proposes a joint model of language-independent phone and language-dependent phoneme distributions, improving low-resource phoneme error rates in multilingual ASR experiments over 11 languages, including Inuktitut and Tusom.