Journal Article

A large-vocabulary continuous speech recognition system for Hindi

TLDR
This paper presents two new techniques that have been used to build a large-vocabulary continuous Hindi speech recognition system and proposes a hybrid approach that combines rule-based and statistical approaches in a two-step fashion.
Abstract
In this paper we present two new techniques that have been used to build a large-vocabulary continuous Hindi speech recognition system. We present a technique for fast bootstrapping of initial phone models of a new language. The training data for the new language is aligned using an existing speech recognition engine for another language. This aligned data is used to obtain the initial acoustic models for the phones of the new language. Following this approach requires less training data. We also present a technique for generating baseforms (phonetic spellings) for phonetic languages such as Hindi. As is inherent in phonetic languages, rules generally capture the mapping of spelling to phonemes very well. However, deep linguistic knowledge is required to write all possible rules, and there are some ambiguities in the language that are difficult to capture with rules. On the other hand, purely statistical techniques for baseform generation require large amounts of training data that are not readily available. We propose a hybrid approach that combines rule-based and statistical approaches in a two-step fashion. We evaluate the performance of the proposed approaches through various phonetic classification and recognition experiments.
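The two-step hybrid idea for baseform generation can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the romanized `(consonant, vowel_or_None)` input format, the toy schwa rules (keep the inherent vowel in the first unit, drop it word-finally, treat it as ambiguous elsewhere), the smoothed phone-bigram scorer, and all function names. The sketch assumes that rules propose candidate baseforms and a small statistical model picks among them; the abstract does not say how the two steps are actually divided.

```python
import math
from collections import Counter
from itertools import product

SCHWA = "a"  # hypothetical romanized symbol for the inherent vowel

def rule_candidates(units):
    """Step 1 (rules): expand (consonant, vowel_or_None) units into
    every phone sequence the toy rules allow."""
    options = []
    last = len(units) - 1
    for i, (cons, vowel) in enumerate(units):
        if vowel is not None:
            options.append([[cons, vowel]])          # explicit vowel: no ambiguity
        elif i == 0:
            options.append([[cons, SCHWA]])          # assumed rule: keep first schwa
        elif i == last:
            options.append([[cons]])                 # assumed rule: drop final schwa
        else:
            options.append([[cons, SCHWA], [cons]])  # ambiguous: keep both readings
    return [[p for unit in combo for p in unit] for combo in product(*options)]

def train_bigrams(seed_lexicon):
    """Count phone bigrams over a small seed lexicon of known baseforms."""
    bigrams, context, phones = Counter(), Counter(), set()
    for pron in seed_lexicon:
        seq = ["<s>"] + pron + ["</s>"]
        phones.update(seq)
        for a, b in zip(seq, seq[1:]):
            bigrams[(a, b)] += 1
            context[a] += 1
    return bigrams, context, len(phones)

def avg_log_prob(pron, model):
    """Add-one smoothed bigram log-probability, averaged per transition
    so that shorter candidates are not favoured purely for their length."""
    bigrams, context, vocab = model
    seq = ["<s>"] + pron + ["</s>"]
    pairs = list(zip(seq, seq[1:]))
    total = sum(math.log((bigrams[p] + 1) / (context[p[0]] + vocab)) for p in pairs)
    return total / len(pairs)

def baseform(units, model):
    """Step 2 (statistics): choose the rule-generated candidate the model prefers."""
    return max(rule_candidates(units), key=lambda p: avg_log_prob(p, model))

# Toy seed lexicon (illustrative only, not real training data).
seed = [["k", "a", "l"], ["n", "a", "m", "a", "k"], ["r", "aa", "m"]]
model = train_bigrams(seed)
word = [("k", None), ("m", None), ("l", None)]
print(rule_candidates(word))
print(baseform(word, model))
```

With this split, the rules only need to be good enough to enumerate plausible pronunciations, and the statistical model only needs enough data to rank a handful of candidates rather than to generate baseforms from scratch.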


Citations
Patent

Determining text to speech pronunciation based on an utterance from a user

TL;DR: Systems and methods are described for automatically building a native phonetic lexicon for a speech-based application trained to process a native (base) language, where the native lexicon includes native phonetic transcriptions (base forms) for non-native (foreign) words, automatically derived from non-native phonetic transcriptions of those words.
Patent

Systems and methods for building a native language phoneme lexicon having native pronunciations of non-native words derived from non-native pronunciations

TL;DR: Systems and methods are presented for automatically building a native phonetic lexicon for a speech-based application trained to process a native (base) language, where the native lexicon includes native phonetic transcriptions (base forms) for non-native (foreign) words, automatically derived from non-native phonetic transcriptions of those words.
Journal Article

Indian Language Speech Database: A Review

TL;DR: Various speech databases developed in different Indian languages for speech recognition and text-to-speech systems are discussed.

Using Gaussian Mixtures for Hindi Speech Recognition System

TL;DR: The statistical framework is reviewed, and an iterative procedure is presented for selecting the number of Gaussian mixtures that gives maximum accuracy in a Hindi speech recognition system.
Proceedings Article

Speech recognition of Malayalam numbers

TL;DR: The speaker-independent speech recognition system for Malayalam digits uses Mel-frequency cepstral coefficients (MFCCs) as features and a hidden Markov model (HMM) for recognition.
References
Book

Statistical methods for speech recognition

TL;DR: Topics covered: the speech recognition problem; hidden Markov models; the acoustic model; basic language modelling; the Viterbi search; hypothesis search on a tree and the fast match; elements of information theory.
Journal Article

A tree-based statistical language model for natural language speech recognition

TL;DR: Algorithms are presented for automatically constructing a binary decision tree that estimates the probability that a given word will be the next word uttered; the resulting model is compared to an equivalent trigram model and shown to be superior.

Issues in Building General Letter to Sound Rules

TL;DR: A general framework is presented for building letter-to-sound (LTS) rules from a word list in a language; the process can be fully automatic, though a small amount of hand seeding can give better results.
Proceedings Article

Performance of the IBM large vocabulary continuous speech recognition system on the ARPA Wall Street Journal task

TL;DR: In this paper, the authors discuss various experimental results using their continuous speech recognition system on the Wall Street Journal task and report experiments with different feature extraction methods, varying amounts and type of training data, and different vocabulary sizes.
Proceedings Article

Translingual visual speech synthesis

TL;DR: This work presents a novel scheme for implementing a language-independent system for audio-driven facial animation, given a speech recognition system for just one language (in this case, English).