Open Access Proceedings Article

Learning Methods in Multilingual Speech Recognition

TLDR
Two learning methods, semi-automatic unit selection and a global phonetic decision tree, are introduced to make effective use of acoustic data from multiple source languages in multilingual acoustic modeling.
Abstract
One key issue in developing learning methods for multilingual acoustic modeling in large vocabulary automatic speech recognition (ASR) applications is to maximize the benefit of pooling acoustic training data from multiple source languages while minimizing the negative effects of data impurity arising from language “mismatch”. In this paper, we introduce two learning methods, semi-automatic unit selection and a global phonetic decision tree, to address this issue via effective utilization of acoustic data from multiple languages. The semi-automatic unit selection aims to combine the merits of both data-driven and knowledge-driven approaches to identifying the basic units in multilingual acoustic modeling. The global decision-tree method allows clustering of cross-center phones and cross-center states in the HMMs, offering the potential to discover a better sharing structure beneath the mixed acoustic dynamics and context mismatch caused by the use of multiple languages’ acoustic data. Our preliminary experimental results show that both of these learning methods improve the performance of multilingual speech recognition.
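The global decision-tree method summarized above grows a phonetic decision tree by greedily splitting pooled HMM states on yes/no phonetic questions, keeping a split only when it increases the training-data likelihood. The following is a minimal sketch of that likelihood-gain criterion, assuming single-Gaussian, one-dimensional sufficient statistics per state; the function names, data layout, and question set are illustrative, not from the paper:

```python
import math

def gaussian_ll(n, s, ss):
    """Max log-likelihood of n points with sum s and sum of squares ss
    under a single 1-D Gaussian (variance floored for stability)."""
    if n == 0:
        return 0.0
    mean = s / n
    var = max(ss / n - mean * mean, 1e-4)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)

def split_gain(states, question):
    """Likelihood gain from splitting pooled states by a yes/no phonetic
    question on the left context. Each state is a tuple
    (left_context, count, sum, sum_of_squares)."""
    groups = {True: [0, 0.0, 0.0], False: [0, 0.0, 0.0]}
    total = [0, 0.0, 0.0]
    for (left, n, s, ss) in states:
        for acc in (groups[left in question], total):
            acc[0] += n
            acc[1] += s
            acc[2] += ss
    before = gaussian_ll(*total)
    after = gaussian_ll(*groups[True]) + gaussian_ll(*groups[False])
    return after - before
```

In a full system the tree is grown by repeatedly applying the best-gain question at each node and stopping once the gain falls below a threshold; the resulting leaves define the tied states shared across languages.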


Citations
Journal ArticleDOI

A Real-Time End-to-End Multilingual Speech Recognition Architecture

TL;DR: This work presents an end-to-end multilingual ASR architecture, developed and deployed at Google, that allows users to select arbitrary combinations of spoken languages; it leverages recent advances in language identification and a novel method of real-time language selection to achieve recognition accuracy and latency nearly identical to a monolingual system.
Posted Content

Multi-Dialect Speech Recognition With A Single Sequence-To-Sequence Model

TL;DR: In this article, the authors explore the possibility of training a single model to serve different English dialects, which simplifies the process of training multi-dialect systems without the need for separate acoustic, pronunciation and language models for each dialect.
Proceedings ArticleDOI

An integrated framework for transcribing Mandarin-English code-mixed lectures with improved acoustic and language modeling

TL;DR: This work presents an integrated framework for transcribing Mandarin-English code-mixed lectures with improved acoustic and language modeling, and proposes a state mapping approach that merges English states with similar Mandarin states to address the problem of very limited English data.
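The state mapping idea summarized here, merging data-sparse English states with acoustically similar Mandarin states, can be sketched as a nearest-neighbor search under a divergence between state distributions. A minimal illustration assuming one-dimensional Gaussian state models and a symmetrized KL divergence; all state names and numbers below are hypothetical:

```python
import math

def kl_gauss(m1, v1, m2, v2):
    """KL divergence KL(N(m1, v1) || N(m2, v2)) between 1-D Gaussians."""
    return 0.5 * (math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)

def map_states(en_states, zh_states):
    """For each English state, pick the Mandarin state with the smallest
    symmetric KL divergence. States are {name: (mean, variance)}."""
    mapping = {}
    for en_name, (m1, v1) in en_states.items():
        best, best_d = None, float("inf")
        for zh_name, (m2, v2) in zh_states.items():
            d = kl_gauss(m1, v1, m2, v2) + kl_gauss(m2, v2, m1, v1)
            if d < best_d:
                best, best_d = zh_name, d
        mapping[en_name] = best
    return mapping
```

For example, an English state with mean 1.0 maps to a Mandarin state with mean 1.1 rather than one with mean 5.0; the mapped pairs can then share (or pool) their Mandarin training data.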
References
Proceedings ArticleDOI

Tree-based state tying for high accuracy acoustic modelling

TL;DR: This paper describes a method of creating a tied-state continuous speech recognition system using a phonetic decision tree, which is shown to lead to similar recognition performance to that obtained using an earlier data-driven approach but to have the additional advantage of providing a mapping for unseen triphones.
Journal ArticleDOI

Language-independent and language-adaptive acoustic modeling for speech recognition

TL;DR: Different methods for multilingual acoustic model combination and a polyphone decision tree specialization procedure are introduced for estimating acoustic models for a new target language using speech data from varied source languages, but only limited data from the target language.
Proceedings ArticleDOI

Towards language independent acoustic modeling

TL;DR: This work has developed both knowledge-based and automatic methods to map phonetic units from the source languages to the target language, and employed HMM adaptation techniques and discriminative model combination to combine acoustic models from the individual source languages for recognition of speech in the target language.