Proceedings ArticleDOI
Subspace based for Indian languages
Aanchan Mohan, Srinivasan Umesh, Richard Rose
- pp 35-39
TLDR
In this paper, continuous density hidden Markov model (CDHMM) and subspace Gaussian mixture model (SGMM) based techniques are used to train acoustic models in four languages: Assamese, Bengali, Hindi and Marathi.
Abstract:
This paper is concerned with the efficient configuration of automatic speech recognition (ASR) systems for use by under-served speaker populations. A task domain involving Indian farmers accessing information on agricultural commodities through a spoken dialog system in multiple languages is presented. To facilitate the development of an ASR system for this domain, a speech corpus was collected in rural areas from speakers of four languages over wireless cellular channels. This paper investigates the problem of ASR acoustic modelling for this task domain. Continuous density hidden Markov model (CDHMM) and subspace Gaussian mixture model (SGMM) [1] based techniques are used to train acoustic models in four languages: Assamese, Bengali, Hindi and Marathi. Issues relating to limited linguistic resources and their impact on ASR word accuracy for these languages are addressed.
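To make the SGMM idea mentioned in the abstract concrete, the following is a minimal sketch of how a state's Gaussian means and mixture weights are derived from globally shared parameters and a low-dimensional state vector, following the basic formulation in the SGMM reference [1]. It is not the system built in the paper: the function name, the toy dimensions and all numeric values are assumptions chosen for illustration, and shared covariances, sub-states and speaker subspaces are omitted.

```python
import math


def sgmm_state_params(M, w, v):
    """Derive one state's means and mixture weights from shared SGMM parameters.

    M: list of I projection matrices (each D x S, nested lists), shared by all states
    w: list of I weight-projection vectors (each of length S), shared by all states
    v: state-specific vector of length S (the only per-state parameter here)
    """
    S = len(v)
    # Mean of Gaussian i in this state: mu_i = M_i v
    means = [[sum(row[k] * v[k] for k in range(S)) for row in M_i] for M_i in M]
    # Mixture weights: softmax over Gaussians of the scores w_i . v
    scores = [sum(w_i[k] * v[k] for k in range(S)) for w_i in w]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return means, weights


# Toy example (illustrative values only): I = 2 Gaussians, D = 2, S = 2.
M = [[[1, 0], [0, 1]], [[2, 0], [0, 2]]]
w = [[0, 0], [0, 0]]
v = [1.0, -1.0]
means, weights = sgmm_state_params(M, w, v)
```

Because only `v` is specific to a state, most parameters live in the shared `M` and `w`, which is why this family of models is attractive when per-language training data is limited.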
Citations
Journal ArticleDOI
Acoustic modelling for speech recognition in Indian languages in an agricultural commodities task domain
TL;DR: A cross-corpus acoustic normalization procedure is used which is a variant of speaker adaptive training (SAT) (Mohan et al., 2012a) and provides the best speech recognition performance for both languages.
Journal ArticleDOI
ASRoIL: a comprehensive survey for automatic speech recognition of Indian languages
TL;DR: This systematic survey summarizes the best available research on automatic speech recognition of Indian languages by synthesizing the results of several studies and analyzing the opportunities, challenges, techniques, methods and evidence they report.
Posted Content
Automatic Speech Recognition and Topic Identification for Almost-Zero-Resource Languages
Matthew Wiesner, Chunxi Liu, Lucas Ondel, Craig Harman, Vimal Manohar, Jan Trmal, Zhongqiang Huang, Najim Dehak, Sanjeev Khudanpur
TL;DR: Kaldi-based systems for the DARPA LORELEI program are presented, which employ a universal phone modeling approach to ASR, and recipes for very rapid adaptation of this universal ASR system are described, which significantly outperform results obtained by many competing approaches on the NIST LoReHLT 2017 Evaluation datasets.
Proceedings ArticleDOI
Cross-lingual acoustic modeling for Indian languages based on Subspace Gaussian Mixture Models
TL;DR: It is observed that the word accuracy of the cross-lingual acoustic model for Bengali was approximately 2.5% above that of its CDHMM model and equivalent to that of its monolingual SGMM model.
Proceedings ArticleDOI
Improved acoustic modeling of low-resource languages using shared SGMM parameters of high-resource languages
TL;DR: This paper investigates methods to improve the recognition performance of low-resource languages with limited training data by borrowing subspace parameters from a high-resource language in the subspace Gaussian mixture model (SGMM) framework, and obtains a consistent improvement in performance over the conventional monolingual SGMM of the low-resource language.
References
The HTK book version 3.4
Steve Young, Gunnar Evermann, M.J.F. Gales, D.J. Kershaw, G.L. Moore, J.J. Odell, D.G. Ollason, Daniel Povey, Valtchev, Philip C. Woodland
Book
Automatic speech recognition : the development of the SPHINX system
Kai-Fu Lee, Raj Reddy
TL;DR: This paper presents a meta-analysis of the SPHINX system and its applications to speech recognition, finding a good unit of speech and finding a Good Unit of Speech that learns and adapts to new environments.
Journal ArticleDOI
The subspace Gaussian mixture model-A structured model for speech recognition
Daniel Povey, Lukas Burget, Mohit Agarwal, Pinar Akyazi, Kai Feng, Arnab Ghoshal, Ondřej Glembek, Nagendra Kumar Goel, Martin Karafiat, Ariya Rastrow, Richard Rose, Petr Schwarz, Samuel Thomas
TL;DR: A new approach to speech recognition, in which all Hidden Markov Model states share the same Gaussian Mixture Model (GMM) structure with the same number of Gaussians in each state, appears to give better results than a conventional model.
Proceedings ArticleDOI
Multilingual acoustic modeling for speech recognition based on subspace Gaussian Mixture Models
Lukas Burget, Petr Schwarz, Mohit Agarwal, Pinar Akyazi, Kai Feng, Arnab Ghoshal, Ondrej Glembek, Nagendra Kumar Goel, Martin Karafiat, Daniel Povey, Ariya Rastrow, Richard Rose, Samuel Thomas
TL;DR: This work reports experiments on a different approach to multilingual speech recognition, in which the phone sets are entirely distinct but the model has parameters not tied to specific states that are shared across languages.