Open Access Proceedings Article
Learning Methods in Multilingual Speech Recognition
TLDR
Two learning methods, semi-automatic unit selection and a global phonetic decision tree, are introduced to make effective use of acoustic data from multiple source languages.
Abstract:
One key issue in developing learning methods for multilingual acoustic modeling in large vocabulary automatic speech recognition (ASR) applications is to maximize the benefit of boosting the acoustic training data from multiple source languages while minimizing the negative effects of data impurity arising from language “mismatch”. In this paper, we introduce two learning methods, semi-automatic unit selection and a global phonetic decision tree, to address this issue via effective utilization of acoustic data from multiple languages. Semi-automatic unit selection aims to combine the merits of both data-driven and knowledge-driven approaches to identifying the basic units in multilingual acoustic modeling. The global decision-tree method allows clustering of cross-center phones and cross-center states in the HMMs, offering the potential to discover a better sharing structure beneath the mixed acoustic dynamics and context mismatch caused by the use of multiple languages’ acoustic data. Our preliminary experimental results show that both of these learning methods improve the performance of multilingual speech recognition.
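The global decision-tree idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes 1-D features, a single Gaussian per cluster, and a greedy likelihood-gain criterion, and the names `Stats`, `grow`, the question sets, and the data are all hypothetical. The one point it tries to show is that a single tree grown over all center phones, with questions allowed on the center phone as well as on the context, can tie states whose center phones differ.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Stats:
    """Sufficient statistics of 1-D frames for one triphone state (toy setup)."""
    left: str
    center: str
    right: str
    n: float   # frame count
    s: float   # sum of feature values
    ss: float  # sum of squared feature values

def make(left, center, right, values):
    """Build Stats from a list of raw 1-D feature values."""
    return Stats(left, center, right, len(values),
                 sum(values), sum(v * v for v in values))

def log_likelihood(items):
    """Log-likelihood of the pooled frames under one ML-fitted Gaussian."""
    n = sum(i.n for i in items)
    if n == 0:
        return 0.0
    mean = sum(i.s for i in items) / n
    var = max(sum(i.ss for i in items) / n - mean * mean, 1e-4)  # variance floor
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)

def best_split(items, questions):
    """Return (gain, name, yes, no) for the best question, or None."""
    base = log_likelihood(items)
    best = None
    for name, (position, phone_set) in questions.items():
        yes = [i for i in items if getattr(i, position) in phone_set]
        no = [i for i in items if getattr(i, position) not in phone_set]
        if not yes or not no:
            continue  # question does not divide this node
        gain = log_likelihood(yes) + log_likelihood(no) - base
        if best is None or gain > best[0]:
            best = (gain, name, yes, no)
    return best

def grow(items, questions, min_gain):
    """Greedy top-down clustering; each returned leaf is one tied cluster."""
    split = best_split(items, questions)
    if split is None or split[0] < min_gain:
        return [items]
    _, _, yes, no = split
    return grow(yes, questions, min_gain) + grow(no, questions, min_gain)

# Hypothetical data: the "p" and "b" states are acoustically close,
# and the "a" states are far from both.
data = [
    make("sil", "p", "a", [1.0, 1.1, 0.9]),
    make("sil", "b", "a", [1.05, 0.95, 1.0]),
    make("p", "a", "sil", [5.0, 5.1, 4.9]),
    make("b", "a", "sil", [5.05, 4.95, 5.0]),
]

# Questions may target the center phone, not only the left/right context.
questions = {
    "C-Vowel": ("center", {"a"}),
    "L-Silence": ("left", {"sil"}),
    "C-Voiced": ("center", {"b", "a"}),
}

# One root over all center phones, instead of one tree per center phone.
leaves = grow(data, questions, min_gain=1.0)
for leaf in leaves:
    print([f"{s.left}-{s.center}+{s.right}" for s in leaf])
```

In this toy run the "p" and "b" states end up in the same leaf, which a conventional per-center-phone tree could not produce. With real multilingual data the question sets, statistics, and stopping threshold would all need to be chosen far more carefully.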
Citations
Journal Article
Handbook of the International Phonetic Association: A Guide to the Use of the International Phonetic Alphabet
Journal Article
The subspace Gaussian mixture model-A structured model for speech recognition
Daniel Povey, Lukas Burget, Mohit Agarwal, Pinar Akyazi, Feng Kai, Arnab Ghoshal, Ondřej Glembek, Nagendra Kumar Goel, Martin Karafiat, Ariya Rastrow, Richard Rose, Petr Schwarz, Samuel Thomas
TL;DR: A new approach to speech recognition, in which all Hidden Markov Model states share the same Gaussian Mixture Model (GMM) structure with the same number of Gaussians in each state, appears to give better results than a conventional model.
Journal Article
A Real-Time End-to-End Multilingual Speech Recognition Architecture
Javier Gonzalez-Dominguez, David Eustis, Ignacio Lopez-Moreno, Andrew W. Senior, Francoise Beaufays, Pedro J. Moreno
TL;DR: This work presents an end-to-end multi-language ASR architecture, developed and deployed at Google, that allows users to select arbitrary combinations of spoken languages and leverages recent advances in language identification and a novel method of real-time language selection to achieve recognition accuracy and latency characteristics close to those of a monolingual system.
Posted Content
Multi-Dialect Speech Recognition With A Single Sequence-To-Sequence Model
Bo Li, Tara N. Sainath, Khe Chai Sim, Michiel Bacchiani, Eugene Weinstein, Patrick Nguyen, Zhifeng Chen, Yonghui Wu, Kanishka Rao
TL;DR: In this article, the authors explore the possibility of training a single model to serve different English dialects, which simplifies the process of training multi-dialect systems without the need for separate acoustic, pronunciation and language models for each dialect.
Proceedings Article
An integrated framework for transcribing Mandarin-English code-mixed lectures with improved acoustic and language modeling
TL;DR: This work presents an integrated framework for transcribing Mandarin-English code-mixed lectures with improved acoustic and language modeling, and proposes a state mapping approach that merges English states with similar Mandarin states to address the very limited amount of English data.
References
Journal Article
Handbook of the International Phonetic Association: A Guide to the Use of the International Phonetic Alphabet
Proceedings Article
Tree-based state tying for high accuracy acoustic modelling
TL;DR: This paper describes a method of creating a tied-state continuous speech recognition system using a phonetic decision tree, which is shown to lead to similar recognition performance to that obtained using an earlier data-driven approach but to have the additional advantage of providing a mapping for unseen triphones.
Journal Article
Language-independent and language-adaptive acoustic modeling for speech recognition
TL;DR: Different methods for multilingual acoustic model combination and a polyphone decision tree specialization procedure are introduced for estimating acoustic models for a new target language using speech data from varied source languages, but only limited data from the target language.
Proceedings Article
Towards language independent acoustic modeling
William Byrne, Peter Beyerlein, Juan M. Huerta, Sanjeev Khudanpur, Bhaskara Marthi, J. Morgan, Nino Peterek, Joseph Picone, D. Vergyri, T. Wang
TL;DR: This work has developed both knowledge-based and automatic methods to map phonetic units from the source languages to the target language and employed HMM adaptation techniques and discriminative model combination to combine acoustic models from the individual source languages for recognition of speech in the target language.
Related Papers (5)
Language-independent and language-adaptive acoustic modeling for speech recognition
n-gram and decision tree based language identification for written words
J. Hakkinen, Jilei Tian