Open Access Proceedings Article

Learning Methods in Multilingual Speech Recognition

TLDR
Two learning methods, semi-automatic unit selection and a global phonetic decision tree, are introduced to make effective use of acoustic data from multiple source languages in multilingual acoustic modeling.
Abstract
One key issue in developing learning methods for multilingual acoustic modeling in large vocabulary automatic speech recognition (ASR) applications is to maximize the benefit of pooling acoustic training data from multiple source languages while minimizing the negative effects of data impurity arising from language “mismatch”. In this paper, we introduce two learning methods, semi-automatic unit selection and a global phonetic decision tree, to address this issue via effective utilization of acoustic data from multiple languages. The semi-automatic unit selection aims to combine the merits of both data-driven and knowledge-driven approaches to identifying the basic units in multilingual acoustic modeling. The global decision-tree method allows clustering of cross-center phones and cross-center states in the HMMs, offering the potential to discover a better sharing structure beneath the mixed acoustic dynamics and context mismatch caused by the use of multiple languages’ acoustic data. Our preliminary experimental results show that both of these learning methods improve the performance of multilingual speech recognition.
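The global decision-tree method summarized above grows a phonetic decision tree by greedily splitting pooled HMM states on yes/no phonetic questions, keeping a split only when it increases the training-data likelihood. The following is a minimal sketch of that likelihood-gain criterion, assuming single-Gaussian, one-dimensional sufficient statistics per state; the function names, data layout, and question set are illustrative, not from the paper:

```python
import math

def gaussian_ll(n, s, ss):
    """Max log-likelihood of n points with sum s and sum of squares ss
    under a single 1-D Gaussian (variance floored for stability)."""
    if n == 0:
        return 0.0
    mean = s / n
    var = max(ss / n - mean * mean, 1e-4)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)

def split_gain(states, question):
    """Likelihood gain from splitting pooled states by a yes/no phonetic
    question on the left context. Each state is a tuple
    (left_context, count, sum, sum_of_squares)."""
    groups = {True: [0, 0.0, 0.0], False: [0, 0.0, 0.0]}
    total = [0, 0.0, 0.0]
    for (left, n, s, ss) in states:
        for acc in (groups[left in question], total):
            acc[0] += n
            acc[1] += s
            acc[2] += ss
    before = gaussian_ll(*total)
    after = gaussian_ll(*groups[True]) + gaussian_ll(*groups[False])
    return after - before
```

In a full system the tree is grown by repeatedly applying the best-gain question at each node and stopping once the gain falls below a threshold; the resulting leaves define the tied states shared across languages.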


Citations
Journal ArticleDOI

A Real-Time End-to-End Multilingual Speech Recognition Architecture

TL;DR: This work presents an end-to-end multilingual ASR architecture, developed and deployed at Google, that allows users to select arbitrary combinations of spoken languages; it leverages recent advances in language identification and a novel method of real-time language selection to achieve recognition accuracy and latency nearly identical to a monolingual system.
Posted Content

Multi-Dialect Speech Recognition With A Single Sequence-To-Sequence Model

TL;DR: In this article, the authors explore the possibility of training a single model to serve different English dialects, which simplifies the process of training multi-dialect systems without the need for separate acoustic, pronunciation and language models for each dialect.
Proceedings ArticleDOI

An integrated framework for transcribing Mandarin-English code-mixed lectures with improved acoustic and language modeling

TL;DR: This work presents an integrated framework for transcribing Mandarin-English code-mixed lectures with improved acoustic and language modeling, and proposes a state mapping approach that merges English states with similar Mandarin states to address the problem of very limited English data.
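The state mapping idea summarized here, merging data-sparse English states with acoustically similar Mandarin states, can be sketched as a nearest-neighbor search under a divergence between state distributions. A minimal illustration assuming one-dimensional Gaussian state models and a symmetrized KL divergence; all state names and numbers below are hypothetical:

```python
import math

def kl_gauss(m1, v1, m2, v2):
    """KL divergence KL(N(m1, v1) || N(m2, v2)) between 1-D Gaussians."""
    return 0.5 * (math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)

def map_states(en_states, zh_states):
    """For each English state, pick the Mandarin state with the smallest
    symmetric KL divergence. States are {name: (mean, variance)}."""
    mapping = {}
    for en_name, (m1, v1) in en_states.items():
        best, best_d = None, float("inf")
        for zh_name, (m2, v2) in zh_states.items():
            d = kl_gauss(m1, v1, m2, v2) + kl_gauss(m2, v2, m1, v1)
            if d < best_d:
                best, best_d = zh_name, d
        mapping[en_name] = best
    return mapping
```

For example, an English state with mean 1.0 maps to a Mandarin state with mean 1.1 rather than one with mean 5.0; the mapped pairs can then share (or pool) their Mandarin training data.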
References
Proceedings ArticleDOI

Tree-based state tying for high accuracy acoustic modelling

TL;DR: This paper describes a method of creating a tied-state continuous speech recognition system using a phonetic decision tree, which is shown to lead to similar recognition performance to that obtained using an earlier data-driven approach but to have the additional advantage of providing a mapping for unseen triphones.
Journal ArticleDOI

Language-independent and language-adaptive acoustic modeling for speech recognition

TL;DR: Different methods for multilingual acoustic model combination and a polyphone decision tree specialization procedure are introduced for estimating acoustic models for a new target language using speech data from varied source languages, but only limited data from the target language.
Proceedings ArticleDOI

Towards language independent acoustic modeling

TL;DR: This work has developed both knowledge-based and automatic methods to map phonetic units from the source languages to the target language, and employed HMM adaptation techniques and discriminative model combination to combine acoustic models from the individual source languages for recognition of speech in the target language.