
Showing papers on "Hidden Markov model published in 1993"


Journal ArticleDOI
TL;DR: The results suggest the presence of an EF-hand calcium-binding motif in a highly conserved, evolutionarily preserved putative intracellular region of 155 residues in the alpha-1 subunit of L-type calcium channels, which play an important role in excitation-contraction coupling.

2,033 citations


Book
01 Oct 1993
TL;DR: Connectionist Speech Recognition: A Hybrid Approach describes the theory and implementation of a method to incorporate neural network approaches into state-of-the-art continuous speech recognition systems based on Hidden Markov Models (HMMs) to improve their performance.
Abstract: From the Publisher: Connectionist Speech Recognition: A Hybrid Approach describes the theory and implementation of a method to incorporate neural network approaches into state-of-the-art continuous speech recognition systems based on Hidden Markov Models (HMMs) to improve their performance. In this framework, neural networks (and in particular, multilayer perceptrons or MLPs) have been restricted to well-defined subtasks of the whole system, i.e., HMM emission probability estimation and feature extraction. The book describes a successful five year international collaboration between the authors. The lessons learned form a case study that demonstrates how hybrid systems can be developed to combine neural networks with more traditional statistical approaches. The book illustrates both the advantages and limitations of neural networks in the framework of a statistical system. Using standard databases and comparing with some conventional approaches, it is shown that MLP probability estimation can improve recognition performance. Other approaches are discussed, though there is no such unequivocal experimental result for these methods. Connectionist Speech Recognition: A Hybrid Approach is of use to anyone intending to use neural networks for speech recognition or within the framework provided by an existing successful statistical approach. This includes research and development groups working in the field of speech recognition, both with standard and neural network approaches, as well as other pattern recognition and/or neural network researchers. This book is also suitable as a text for advanced courses on neural networks or speech processing.

1,328 citations



PatentDOI
TL;DR: The invention is a system failure monitoring method and apparatus which learns the symptom-fault mapping directly from training data, takes advantage of temporal context, and estimates class probabilities conditioned on recent past history.

320 citations


Journal ArticleDOI
TL;DR: The online EM schemes have significantly reduced memory requirements and improved convergence, and they can estimate HMM parameters that vary slowly with time or undergo infrequent jump changes.
Abstract: Sequential or online hidden Markov model (HMM) signal processing schemes are derived, and their performance is illustrated by simulation. The online algorithms are sequential expectation maximization (EM) schemes and are derived by using stochastic approximations to maximize the Kullback-Leibler information measure. The schemes can be implemented either as filters or fixed-lag or sawtooth-lag smoothers. They yield estimates of the HMM parameters including transition probabilities, Markov state levels, and noise variance. In contrast to the offline EM algorithm (Baum-Welch scheme), which uses the fixed-interval forward-backward scheme, the online schemes have significantly reduced memory requirements and improved convergence, and they can estimate HMM parameters that vary slowly with time or undergo infrequent jump changes. Similar techniques are used to derive online schemes for extracting finite-state Markov chains imbedded in a mixture of white Gaussian noise (WGN) and deterministic signals of known functional form with unknown parameters.
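A minimal sketch of the online idea, for a scalar-observation Gaussian HMM: propagate the filtered state distribution one sample at a time and refresh exponentially forgotten sufficient statistics, so parameters can track slow drift or jumps. This is a generic sequential EM update, not the authors' exact Kullback-Leibler stochastic-approximation derivation; the function name and the fixed `decay` step size are illustrative assumptions.

```python
import numpy as np

def online_em_step(y, alpha_prev, counts, A, means, var, decay=0.01):
    """One sequential EM step for a scalar-observation Gaussian HMM.
    A: (K,K) transition matrix; means: (K,) Markov state levels; var: noise
    variance; alpha_prev: filtered state distribution after the previous sample;
    counts: dict of running sufficient statistics ('trans', 'occ', 'obs')."""
    b = np.exp(-0.5 * (y - means) ** 2 / var) / np.sqrt(2 * np.pi * var)
    joint = alpha_prev[:, None] * A * b[None, :]   # p(s_{t-1}, s_t | y_1..t), unnormalized
    joint /= joint.sum()
    alpha = joint.sum(axis=0)                      # new filtered state distribution
    # exponential forgetting replaces the fixed-interval forward-backward pass
    # and lets slowly varying or jump-changing parameters be tracked
    counts['trans'] = (1 - decay) * counts['trans'] + decay * joint
    counts['occ'] = (1 - decay) * counts['occ'] + decay * alpha
    counts['obs'] = (1 - decay) * counts['obs'] + decay * alpha * y
    A_new = counts['trans'] / counts['trans'].sum(axis=1, keepdims=True)  # M-step
    means_new = counts['obs'] / np.maximum(counts['occ'], 1e-12)
    return A_new, means_new, alpha
```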

289 citations


Journal ArticleDOI
TL;DR: The nontraditional approach to the problem of estimating the parameters of a stochastic linear system is presented and it is shown how the evolution of the dynamics as a function of the segment length can be modeled using alternative assumptions.
Abstract: A nontraditional approach to the problem of estimating the parameters of a stochastic linear system is presented. The method is based on the expectation-maximization algorithm and can be considered as the continuous analog of the Baum-Welch estimation algorithm for hidden Markov models. The algorithm is used for training the parameters of a dynamical system model that is proposed for better representing the spectral dynamics of speech for recognition. It is assumed that the observed feature vectors of a phone segment are the output of a stochastic linear dynamical system, and it is shown how the evolution of the dynamics as a function of the segment length can be modeled using alternative assumptions. A phoneme classification task using the TIMIT database demonstrates that the approach is the first effective use of an explicit model for statistical dependence between frames of speech.

238 citations


PatentDOI
TL;DR: The principle of minimum recognition error rate is applied by the present invention using discriminative training, and various issues related to the special structure of HMMs are presented.
Abstract: A system for pattern-based speech recognition, e.g., a hidden Markov model (HMM) based speech recognizer using Viterbi scoring. The principle of minimum recognition error rate is applied by the present invention using discriminative training. Various issues related to the special structure of HMMs are presented. Parameter update expressions for HMMs are provided.
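The core of minimum-error training is a smoothed count of recognition errors. A sketch of the outer loss only, assuming per-class discriminant scores (e.g. Viterbi log-likelihoods) are already computed; the patent's HMM-specific update expressions would then push this gradient into the model parameters along the best path. Names and the `gamma` smoothness constant are illustrative.

```python
import numpy as np

def mce_score_gradient(scores, correct, gamma=1.0):
    """Sigmoid-smoothed classification-error loss and its gradient with
    respect to per-class discriminant scores (generalized probabilistic
    descent operates on this outer quantity)."""
    others = np.delete(scores, correct)
    d = others.max() - scores[correct]            # misclassification measure
    best = int(others.argmax())
    best = best if best < correct else best + 1   # index back into 'scores'
    loss = 1.0 / (1.0 + np.exp(-gamma * d))       # smoothed 0/1 recognition error
    slope = gamma * loss * (1.0 - loss)
    grad = np.zeros_like(scores)
    # descending this gradient lowers the best competitor's score
    # and raises the correct class's score
    grad[best], grad[correct] = slope, -slope
    return loss, grad
```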

238 citations


Journal ArticleDOI
TL;DR: The method has two important advantages: the probability of each residue being in each of the modeled secondary structural elements is computed using the totality of the amino acid sequence, and these probabilities are consistent with prior knowledge of realizable domain folds as encoded in each model.
Abstract: A new method has been developed to compute the probability that each amino acid in a protein sequence is in a particular secondary structural element. Each of these probabilities is computed using the entire sequence and a set of predefined structural class models. This set of structural classes is patterned after Jane Richardson's taxonomy for the domains of globular proteins. For each structural class considered, a mathematical model is constructed to represent constraints on the pattern of secondary structural elements characteristic of that class. These are stochastic models having discrete state spaces (referred to as hidden Markov models by researchers in signal processing and automatic speech recognition). Each model is a mathematical generator of amino acid sequences; the sequence under consideration is modeled as having been generated by one model in the set of candidates. The probability that each model generated the given sequence is computed using a filtering algorithm. The protein is then classified as belonging to the structural class having the most probable model. The secondary structure of the sequence is then analyzed using a "smoothing" algorithm that is optimal for that structural class model. For each residue position in the sequence, the smoother computes the probability that the residue is contained within each of the defined secondary structural elements of the model. This method has two important advantages: (1) the probability of each residue being in each of the modeled secondary structural elements is computed using the totality of the amino acid sequence, and (2) these probabilities are consistent with prior knowledge of realizable domain folds as encoded in each model. As an example of the method's utility, we present its application to flavodoxin, a prototypical alpha/beta protein having a central beta-sheet, and to thioredoxin, which belongs to a similar structural class but shares no significant sequence similarity.
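The per-residue probabilities come from standard forward (filtering) and backward (smoothing) recursions. A compact sketch for discrete, integer-coded sequences; summing each row of the result over the states that belong to one secondary-structure element gives the probabilities described above.

```python
import numpy as np

def state_posteriors(pi, A, B, obs):
    """Scaled forward-backward: returns P(state at position t | whole sequence)
    for every t, plus the sequence log-likelihood under the model.
    pi: (K,) initial probabilities; A: (K,K) transitions; B: (K,M) emissions."""
    T, K = len(obs), len(pi)
    alpha = np.zeros((T, K)); beta = np.zeros((T, K)); c = np.zeros(T)
    alpha[0] = pi * B[:, obs[0]]
    c[0] = alpha[0].sum(); alpha[0] /= c[0]
    for t in range(1, T):                         # filtering pass
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        c[t] = alpha[t].sum(); alpha[t] /= c[t]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):                # smoothing pass
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / c[t + 1]
    gamma = alpha * beta                          # each row sums to one
    return gamma, np.log(c).sum()
```

Classification into structural classes then follows by comparing the returned log-likelihoods across the candidate class models.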

203 citations


Journal ArticleDOI
02 May 1993
TL;DR: The problem of how human skill can be represented as a parametric model using a hidden Markov model (HMM) and how an HMM-based skill model can be used to learn human skill are discussed.
Abstract: In this paper, we discuss the problem of how human skill can be represented as a parametric model using a hidden Markov model (HMM), and how an HMM-based skill model can be used to learn human skill. An HMM is well suited to characterizing the doubly stochastic process, comprising measurable actions and immeasurable mental states, that is involved in skill learning. We formulated the learning problem as a multidimensional HMM and developed a testbed for a variety of skill learning applications. Based on "the most likely performance" criterion, the best action sequence can be selected from all previously measured action data by modeling the skill as an HMM. The proposed method has been implemented in the teleoperation control of a space station robot system, and some important implementation issues have been discussed. The method allows a robot to learn human skill in certain tasks and to improve motion performance.

202 citations


Proceedings Article
01 Jul 1993
TL;DR: A Bayesian method for estimating the amino acid distributions in the states of a hidden Markov model (HMM) for a protein family or the columns of a multiple alignment of that family is introduced, which can improve the quality of HMMs produced from small training sets.
Abstract: A Bayesian method for estimating the amino acid distributions in the states of a hidden Markov model (HMM) for a protein family or the columns of a multiple alignment of that family is introduced. This method uses Dirichlet mixture densities as priors over amino acid distributions. These mixture densities are determined from examination of previously constructed HMMs or multiple alignments. It is shown that this Bayesian method can improve the quality of HMMs produced from small training sets. Specific experiments on the EF-hand motif are reported, for which these priors are shown to produce HMMs with higher likelihood on unseen data, and fewer false positives and false negatives in a database search task.
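For a single state or alignment column the posterior-mean estimate has a closed form: weight each Dirichlet component by its posterior probability given the observed counts, then mix the component means. A sketch under standard Dirichlet-multinomial algebra; the mixture weights `q` and parameters `alphas` stand in for a mixture fitted elsewhere.

```python
import numpy as np
from scipy.special import gammaln

def dirichlet_mixture_estimate(counts, q, alphas):
    """Posterior-mean amino acid distribution for one HMM state or column.
    counts: (20,) observed residue counts; q: (C,) prior mixture weights;
    alphas: (C,20) Dirichlet parameters of each component."""
    n = counts.sum()
    log_post = np.log(q).copy()
    for k in range(len(q)):
        a = alphas[k]
        # log marginal likelihood of the counts under component k
        log_post[k] += (gammaln(a.sum()) - gammaln(n + a.sum())
                        + np.sum(gammaln(counts + a) - gammaln(a)))
    log_post -= log_post.max()                    # stabilize before exponentiating
    post = np.exp(log_post); post /= post.sum()   # P(component | counts)
    p = sum(post[k] * (counts + alphas[k]) / (n + alphas[k].sum())
            for k in range(len(q)))
    return p / p.sum()
```

With few observed counts the estimate leans on the prior components; with many it approaches the raw frequencies, which is exactly why small training sets benefit.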

196 citations


Book ChapterDOI
27 Jun 1993
TL;DR: This paper presents a method by which a reinforcement learning agent can solve the incomplete perception problem using memory by using a hidden Markov model to represent its internal state space and creating memory capacity by splitting states of the HMM.
Abstract: This paper presents a method by which a reinforcement learning agent can solve the incomplete perception problem using memory. The agent uses a hidden Markov model (HMM) to represent its internal state space and creates memory capacity by splitting states of the HMM. The key idea is a test to determine when and how a state should be split: the agent only splits a state when doing so will help the agent predict utility. Thus the agent can create only as much memory as needed to perform the task at hand—not as much as would be required to model all the perceivable world. I call the technique UDM, for Utile Distinction Memory.

Journal ArticleDOI
TL;DR: This article describes an automatic procedure for the segmentation of speech: given either the linguistic or the phonetic content of a speech utterance, the system provides phone boundaries.

Proceedings ArticleDOI
E. Bocchieri
27 Apr 1993
TL;DR: The author presents an efficient method for the computation of the likelihoods defined by weighted sums (mixtures) of Gaussians, which uses vector quantization of the input feature vector to identify a subset of Gaussian neighbors.
Abstract: In speech recognition systems based on continuous observation density hidden Markov models, the computation of the state likelihoods is an intensive task. The author presents an efficient method for the computation of the likelihoods defined by weighted sums (mixtures) of Gaussians. This method uses vector quantization of the input feature vector to identify a subset of Gaussian neighbors. It is shown that, under certain conditions, instead of computing the likelihoods of all the Gaussians, one needs to compute the likelihoods of only the Gaussian neighbors. Significant (up to a factor of nine) likelihood computation reductions have been obtained on various databases, with only a small loss of recognition accuracy.
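A minimal sketch of the shortlist idea, assuming diagonal covariances and a `neighbors` table that maps each codeword to the Gaussians assigned to it at training time; all names here are illustrative.

```python
import numpy as np

def shortlist_log_likelihood(x, codebook, neighbors, weights, means, inv_vars, log_norms):
    """Mixture log-likelihood evaluated only over the Gaussians near the frame's
    VQ codeword. codebook: (V,D) centroids; neighbors[v]: Gaussian indices for
    codeword v; log_norms: precomputed Gaussian log normalization constants."""
    v = int(np.argmin(((codebook - x) ** 2).sum(axis=1)))  # quantize the frame
    idx = neighbors[v]                                     # Gaussian shortlist
    d = x - means[idx]
    log_g = log_norms[idx] - 0.5 * np.sum(d * d * inv_vars[idx], axis=1)
    return np.log(np.sum(weights[idx] * np.exp(log_g)) + 1e-300)
```

If the shortlists average a ninth of the mixture, the likelihood computation shrinks by roughly the reported factor, at the cost of occasionally missing a dominant Gaussian outside the neighborhood.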

PatentDOI
Masafumi Nishimura, Masaaki Okochi
TL;DR: Fenonic hidden Markov models for speech transformation candidates are combined with N-gram probabilities (where N is an integer greater than or equal to 2) to produce models of words.
Abstract: A word input from a speech input device 1 is analyzed for its features by a feature extractor 4 to obtain a feature vector sequence corresponding to said word, or a label sequence obtained by applying a further transformation in a labeler 8. Fenonic hidden Markov models for speech transformation candidates are combined with N-gram probabilities (where N is an integer greater than or equal to 2) to produce models of words. The recognizer determines the probability that the speech model composed for each candidate word would output the label sequence or feature vector sequence input as speech, and outputs the candidate word corresponding to the speech model having the highest probability to a display 19.

Proceedings ArticleDOI
27 Apr 1993
TL;DR: It was found that supervised speaker adaptation based on two gender-dependent models gave a better result than that obtained with a single SI seed, and that speaker adaptation achieved performance equal to or better than speaker-dependent training with the same amount of data.
Abstract: A number of issues related to the application of Bayesian learning techniques to speaker adaptation are investigated. It is shown that the seed models required to construct prior densities to obtain the MAP (maximum a posteriori) estimate can be a speaker-independent (SI) model, a set of female and male models, or even a task-independent acoustic model. Speaker-adaptive training algorithms are shown to be effective in improving the performance of both speaker-dependent and speaker-independent speech recognition systems. The segmental MAP estimation formulation is used to perform adaptive acoustic modeling for speaker adaptation applications. Tested on an RM (resource management) task, it was found that supervised speaker adaptation based on two gender-dependent models gave a better result than that obtained with a single SI seed. Compared with speaker-dependent training, speaker adaptation achieved an equal or better performance with the same amount of training/adaptation data.
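For Gaussian means, the MAP estimate reduces to an interpolation between the seed (prior) mean and the adaptation-data average, weighted by a prior mass `tau` against the soft occupation counts. A sketch of that closed form, assuming occupation probabilities from a forward-backward or Viterbi alignment; variances and mixture weights have analogous updates not shown here.

```python
import numpy as np

def map_adapt_means(prior_means, tau, gammas, frames):
    """MAP re-estimation of Gaussian means from adaptation data.
    prior_means: (K,D) seed means (e.g. a speaker-independent model);
    gammas: (T,K) occupation probabilities; frames: (T,D) feature vectors."""
    occ = gammas.sum(axis=0)            # soft frame counts per Gaussian
    obs = gammas.T @ frames             # occupation-weighted observation sums
    # with little data the estimate stays near the prior; with much data
    # it approaches the maximum-likelihood (speaker-dependent) mean
    return (tau * prior_means + obs) / (tau + occ)[:, None]
```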

Journal ArticleDOI
TL;DR: A new class of hidden Markov models is proposed for the acoustic representation of words in an automatic speech recognition system that is more flexible than previously reported fenone-based word models, which lead to an improved capability of modeling variations in pronunciation.
Abstract: A new class of hidden Markov models is proposed for the acoustic representation of words in an automatic speech recognition system. The models, built from combinations of acoustically based sub-word units called fenones, are derived automatically from one or more sample utterances of a word. Because they are more flexible than previously reported fenone-based word models, they lead to an improved capability of modeling variations in pronunciation. They are therefore particularly useful in the recognition of continuous speech. In addition, their construction is relatively simple, because it can be done using the well-known forward-backward algorithm for parameter estimation of hidden Markov models. Appropriate reestimation formulas are derived for this purpose. Experimental results obtained on a 5000-word vocabulary natural language continuous speech recognition task are presented to illustrate the enhanced power of discrimination of the new models.

Proceedings ArticleDOI
19 Oct 1993
TL;DR: A segment-based speech recognition scheme is proposed to explicitly model the correlations between successive frames of an acoustic segment by using features representing the contours of spectral parameters, namely several lower-order coefficients of discrete orthonormal polynomial expansions.
Abstract: A segment-based speech recognition scheme is proposed. The basic idea is to explicitly model the correlations between successive frames of an acoustic segment by using features representing the contours of spectral parameters. These segmental features are several lower-order coefficients of discrete orthonormal polynomial expansions. The performance of the proposed scheme was examined by simulations on multi-speaker speech recognition for all 408 highly confusing first-tone Mandarin syllables. A recognition rate of 77.4% was achieved for the case, using five 6-segment reference templates per syllable. This is 13.0% and 6.6% higher than the rates obtained by a conventional dynamic time warping (DTW) method and a conventional hidden Markov model (CHMM) method, respectively.
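A sketch of the segmental feature extraction: build a discrete orthonormal polynomial basis over the segment's time axis and project each parameter contour onto it. Orthogonalizing powers of a normalized time index by QR is one simple way to get such a basis; the paper's exact polynomial family may differ.

```python
import numpy as np

def segmental_features(frames, order=2):
    """Represent each spectral-parameter contour of a segment by the
    coefficients of a low-order discrete orthonormal polynomial expansion.
    frames: (T,D) spectral parameters; returns ((order+1), D) coefficients."""
    T = len(frames)
    t = np.linspace(-1.0, 1.0, T)
    V = np.vander(t, order + 1, increasing=True)  # columns 1, t, t^2, ...
    Q, _ = np.linalg.qr(V)          # orthonormal polynomial basis over T samples
    return Q.T @ frames             # projection = expansion coefficients
```

The zeroth coefficient captures the segment average, the first its slope, and so on, which is how correlations between successive frames enter the representation.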

Journal ArticleDOI
TL;DR: A new method for analyzing the amino acid sequences of proteins using the hidden Markov model (HMM), a type of stochastic model; the implementation is 'without grammar' (no rule for the appearance patterns of secondary structure).
Abstract: The purpose of this paper is to introduce a new method for analyzing the amino acid sequences of proteins using the hidden Markov model (HMM), which is a type of stochastic model. Secondary structures such as helix, sheet and turn are learned by HMMs, and these HMMs are applied to new sequences whose structures are unknown. The output probabilities from the HMMs are used to predict the secondary structures of the sequences. The authors tested this prediction system on approximately 100 sequences from a public database (Brookhaven PDB). Although the implementation is 'without grammar' (no rule for the appearance patterns of secondary structure), the result was reasonable.

Journal ArticleDOI
TL;DR: A shared-distribution hidden Markov model (HMM) is presented for speaker-independent continuous speech recognition that reduced the word error rate on the DARPA Resource Management task by 20% in comparison with the generalized-triphone model.
Abstract: A shared-distribution hidden Markov model (HMM) is presented for speaker-independent continuous speech recognition. The output distributions across different phonetic HMMs are shared with each other when they exhibit acoustic similarity. This sharing provides the freedom to use a larger number of Markov states for each phonetic model. Although an increase in the number of states will increase the total number of free parameters, with distribution sharing one can collapse redundant states while maintaining necessary ones. The shared-distribution model reduced the word error rate on the DARPA Resource Management task by 20% in comparison with the generalized-triphone model.

PatentDOI
TL;DR: A method for training a speech recognizer in a speech recognition system is described, which comprises the steps of providing a database containing acoustic speech units, generating a homoscedastic hidden Markov model from the acoustic speech units in the database, and loading the homoscedastic hidden Markov model into the speech recognizer.
Abstract: A method for training a speech recognizer in a speech recognition system is described. The method of the present invention comprises the steps of providing a database containing acoustic speech units, generating a homoscedastic hidden Markov model from the acoustic speech units in the database, and loading the homoscedastic hidden Markov model into the speech recognizer. The hidden Markov model loaded into the speech recognizer has a single covariance matrix which represents the tied covariance matrix of every Gaussian probability density function (PDF) for every state of every hidden Markov model structure in the homoscedastic hidden Markov model.
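The defining computation is pooling a single covariance across every Gaussian in the system. A sketch assuming occupation probabilities from a standard E-step; names are illustrative.

```python
import numpy as np

def tied_covariance(gammas, frames, means):
    """Single covariance shared by all Gaussians (the homoscedastic tie).
    gammas: (T,K) occupation probabilities; frames: (T,D); means: (K,D)."""
    D = frames.shape[1]
    S = np.zeros((D, D))
    for k in range(means.shape[0]):     # pool weighted scatter over every Gaussian
        d = frames - means[k]
        S += (gammas[:, k, None] * d).T @ d
    return S / gammas.sum()
```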

Proceedings ArticleDOI
27 Apr 1993
TL;DR: A segmental speech model is used to develop a secondary processing algorithm that rescores putative events hypothesized by a primary HMM word spotter to try to improve performance by discriminating true keywords from false alarms.
Abstract: The authors present a segmental speech model that explicitly models the dynamics in a variable-duration speech segment by using a time-varying trajectory model of the speech features in the segment. Each speech segment is represented by a set of statistics which includes a time-varying trajectory, a residual error covariance around the trajectory, and the number of frames in the segment. These statistics replace the frames in the segment and become the data that are modeled by either HMMs (hidden Markov models) or mixture models. This segment model is used to develop a secondary processing algorithm that rescores putative events hypothesized by a primary HMM word spotter to try to improve performance by discriminating true keywords from false alarms. This algorithm is evaluated on a keyword spotting task using the Road Rally Database, and performance is shown to improve significantly over that of the primary word spotter. The segmental model is also used on a TIMIT vowel classification task to evaluate its modeling capability.
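A minimal sketch of turning a variable-length segment into fixed statistics, assuming a linear (first-order) trajectory; the actual trajectory model in the paper may be richer.

```python
import numpy as np

def segment_statistics(frames):
    """Replace a segment's frames by (trajectory fit, residual covariance,
    frame count). frames: (T,D) feature vectors of one hypothesized segment."""
    T = len(frames)
    t = np.linspace(-1.0, 1.0, T)
    X = np.stack([np.ones(T), t], axis=1)             # mean + slope design
    coef, *_ = np.linalg.lstsq(X, frames, rcond=None) # (2,D) trajectory
    resid = frames - X @ coef
    return coef, resid.T @ resid / T, T               # trajectory, covariance, length
```

A secondary classifier can then model these statistics for putative keyword hits and rescore them, which is how false alarms get separated from true keywords.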

Proceedings ArticleDOI
27 Apr 1993
TL;DR: The authors present a sinusoidal partial tracking method for additive synthesis of sound by a purely combinatorial hidden Markov model which allows frequency line crossing and can be used for formant tracking.
Abstract: The authors present a sinusoidal partial tracking method for additive synthesis of sound. Partials are tracked by identifying time functions of parameters as underlying trajectories in a successive set of spectral peaks. This is done by a purely combinatorial hidden Markov model. A partial trajectory is considered as a time sequence of peaks which satisfies continuity constraints on parameter slopes. The method allows frequency line crossing and can be used for formant tracking.

Proceedings ArticleDOI
05 Jan 1993
TL;DR: A variant of the expectation maximization algorithm known as the Viterbi algorithm is used to obtain the statistical model from the unaligned sequences, and a multiple alignment of the 400 sequences and 225 other globin sequences was obtained that agrees almost perfectly with a structural alignment by D. Bashford et al. (1987).
Abstract: The authors apply hidden Markov models to the problem of statistical modeling and multiple sequence alignment of protein families. A variant of the expectation maximization algorithm known as the Viterbi algorithm is used to obtain the statistical model from the unaligned sequences. In a detailed series of experiments, they have taken 400 unaligned globin sequences, and produced a statistical model entirely automatically from the primary sequences. The authors used no prior knowledge of globin structure. Using this model, a multiple alignment of the 400 sequences and 225 other globin sequences was obtained that agrees almost perfectly with a structural alignment by D. Bashford et al. (1987). This model can also discriminate all these 625 globins from nonglobin protein sequences with greater than 99% accuracy, and can thus be used for database searches.
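Viterbi training replaces the full expectation over paths with the single best alignment: align every sequence, count transitions and emissions along the path, re-normalize, repeat. A generic sketch for discrete emissions; the globin work actually uses a profile-HMM topology (match, insert, and delete states), which this deliberately omits.

```python
import numpy as np

def viterbi_path(pi, A, B, obs):
    """Most likely state path (log domain, with flooring to avoid log(0))."""
    T, K = len(obs), len(pi)
    logA = np.log(A + 1e-300)
    logd = np.log(pi + 1e-300) + np.log(B[:, obs[0]] + 1e-300)
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        scores = logd[:, None] + logA
        back[t] = scores.argmax(axis=0)
        logd = scores.max(axis=0) + np.log(B[:, obs[t]] + 1e-300)
    path = [int(logd.argmax())]
    for t in range(T - 1, 0, -1):         # backtrace
        path.append(int(back[t, path[-1]]))
    return path[::-1]

def viterbi_train(pi, A, B, seqs, iters=10, eps=1e-3):
    """Segmental EM: re-estimate A and B from best-path counts."""
    K, M = B.shape
    for _ in range(iters):
        At = np.full((K, K), eps); Bt = np.full((K, M), eps)  # smoothed counts
        for obs in seqs:
            path = viterbi_path(pi, A, B, obs)
            for t in range(1, len(obs)):
                At[path[t - 1], path[t]] += 1
            for t, o in enumerate(obs):
                Bt[path[t], o] += 1
        A = At / At.sum(axis=1, keepdims=True)
        B = Bt / Bt.sum(axis=1, keepdims=True)
    return A, B
```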

Proceedings Article
29 Nov 1993
TL;DR: The algorithm is based on minimizing the statistical prediction error by extending the memory, or state length, adaptively, until the total prediction error is sufficiently small; using fewer than 3000 states, the model's performance is far superior to that of fixed memory models with a similar number of states.
Abstract: We propose a learning algorithm for a variable memory length Markov process. Human communication, whether given as text, handwriting, or speech, has multiple characteristic time scales. On short scales it is characterized mostly by the dynamics that generate the process, whereas on large scales, more syntactic and semantic information is carried. For that reason the conventionally used fixed memory Markov models cannot capture effectively the complexity of such structures. On the other hand, using long memory models uniformly is not practical even for a memory as short as four. The algorithm we propose is based on minimizing the statistical prediction error by extending the memory, or state length, adaptively, until the total prediction error is sufficiently small. We demonstrate the algorithm by learning the structure of natural English text and applying the learned model to the correction of corrupted text. Using fewer than 3000 states, the model's performance is far superior to that of fixed memory models with a similar number of states. We also show how the algorithm can be applied to intergenic E. coli DNA base prediction with results comparable to HMM based methods.
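A sketch of the adaptive memory extension, using a KL-divergence test between a longer context's next-symbol distribution and its parent's; the thresholds and Laplace smoothing are illustrative stand-ins for the paper's exact criterion.

```python
import numpy as np
from collections import defaultdict

def grow_vlmm(text, alphabet, max_depth=6, min_count=20, kl_thresh=0.05):
    """Variable-memory Markov model: keep a context only when lengthening it
    measurably improves prediction of the next symbol."""
    counts = defaultdict(lambda: defaultdict(int))
    for d in range(max_depth + 1):          # next-symbol counts per context
        for i in range(d, len(text) - 1):
            counts[text[i - d:i]][text[i]] += 1

    def dist(ctx):                          # Laplace-smoothed next-symbol distribution
        c = counts[ctx]; tot = sum(c.values())
        return {a: (c[a] + 1) / (tot + len(alphabet)) for a in alphabet}

    model, frontier = {'': dist('')}, ['']
    while frontier:
        ctx = frontier.pop()
        if len(ctx) == max_depth:
            continue
        for a in alphabet:                  # candidate longer context: a + ctx
            ext = a + ctx
            if sum(counts[ext].values()) < min_count:
                continue
            p, q = dist(ext), dist(ctx)
            kl = sum(p[s] * np.log(p[s] / q[s]) for s in alphabet)
            if kl > kl_thresh:              # extension pays for itself
                model[ext] = p
                frontier.append(ext)
    return model
```

Because most contexts never justify extension, the state count stays small while long memory is kept exactly where the data demand it.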

Journal ArticleDOI
TL;DR: The authors review the basic principles of their hybrid HMM/MLP approach and describe a series of improvements that are analogous to the system modifications instituted for the leading conventional HMM systems over the last few years.
Abstract: Over the period of 1987-1991, a series of theoretical and experimental results have suggested that multilayer perceptrons (MLP) are an effective family of algorithms for the smooth estimation of high-dimension probability density functions that are useful in continuous speech recognition. The early form of this work has focused on hidden Markov models (HMM) that are independent of phonetic context. More recently, the theory has been extended to context-dependent models. The authors review the basic principles of their hybrid HMM/MLP approach and describe a series of improvements that are analogous to the system modifications instituted for the leading conventional HMM systems over the last few years. Some of these methods directly trade off computational complexity for reduced requirements of memory and memory bandwidth. Results are presented on the widely used Resource Management speech database that has been distributed by the US National Institute of Standards and Technology.
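The link between the network and the HMM is a single change of variables: divide the MLP's posteriors P(q|x) by the class priors P(q) to get likelihoods scaled by a factor P(x) that is constant per frame. A one-line sketch:

```python
import numpy as np

def scaled_log_likelihoods(mlp_posteriors, state_priors, floor=1e-8):
    """Hybrid HMM/MLP emission scores for Viterbi decoding.
    mlp_posteriors: (T,K) softmax outputs; state_priors: (K,) training-set
    relative frequencies of the states."""
    return np.log(np.maximum(mlp_posteriors, floor)) - np.log(state_priors)
```

These scores substitute for log P(x|q) in the decoder; the dropped log P(x) term is the same for all paths at each frame, so the best path is unchanged.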

Proceedings ArticleDOI
27 Apr 1993
TL;DR: The authors consider the estimation of powerful statistical language models using a technique that scales from very small to very large amounts of domain-dependent data, and considers the problem of robustness of a model based on a small training corpus by grouping words into obvious semantic classes.
Abstract: The authors consider the estimation of powerful statistical language models using a technique that scales from very small to very large amounts of domain-dependent data. They begin with improved modeling of the grammar statistics, based on a combination of the backing-off technique and zero-frequency techniques. These are extended to be more amenable to the particular system considered here. The resulting technique is greatly simplified, more robust, and gives improved recognition performance over either of the previous techniques. The authors also consider the problem of robustness of a model based on a small training corpus by grouping words into obvious semantic classes. This significantly improves the robustness of the resulting statistical grammar. A technique that allows the estimation of a high-order model on modest computation resources is also presented. This makes it possible to run a 4-gram statistical model of a 50-million word corpus on a workstation of only modest capability and cost. Finally, the authors discuss results from applying a 2-gram statistical language model integrated in the HMM (hidden Markov model) search, obtaining a list of the N-best recognition results, and rescoring this list with a higher-order statistical model.
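One concrete member of the backing-off family described above is absolute discounting: subtract a small constant from every seen bigram count and give the freed mass to a renormalized unigram estimate. This sketch illustrates the mechanism; it is not the paper's exact combination of backing-off and zero-frequency techniques.

```python
def backoff_bigram(bigram_counts, unigram_counts, discount=0.5):
    """Absolute-discounting back-off bigram probability estimator.
    bigram_counts: {(w_prev, w): count}; unigram_counts: {w: count}."""
    total = sum(unigram_counts.values())
    uni = {w: c / total for w, c in unigram_counts.items()}

    def prob(w_prev, w):
        c_prev = unigram_counts.get(w_prev, 0)
        if c_prev == 0:
            return uni.get(w, 1.0 / total)       # unseen history: pure unigram
        c_big = bigram_counts.get((w_prev, w), 0)
        if c_big > 0:
            return (c_big - discount) / c_prev   # discounted direct estimate
        # a real system precomputes these per-history tables
        seen = [v for (u, v) in bigram_counts if u == w_prev]
        reserved = discount * len(seen) / c_prev # probability mass freed above
        denom = 1.0 - sum(uni[v] for v in seen)  # renormalize over unseen words
        return reserved * uni.get(w, 1.0 / total) / max(denom, 1e-12)

    return prob
```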

Proceedings ArticleDOI
20 Oct 1993
TL;DR: An adaptation of hidden Markov models (HMM) to automatic recognition of unrestricted handwritten words and many interesting details of a 50,000 vocabulary recognition system for US city names are described.
Abstract: The paper describes an adaptation of hidden Markov models (HMM) to automatic recognition of unrestricted handwritten words. Many interesting details of a 50,000 vocabulary recognition system for US city names are described. This system includes feature extraction, classification, estimation of model parameters, and word recognition. The feature extraction module transforms a binary image to a sequence of feature vectors. The classification module consists of a transformation based on linear discriminant analysis and Gaussian soft-decision vector quantizers which transform feature vectors into sets of symbols and associated likelihoods. Symbols and likelihoods form the input to both HMM training and recognition. HMM training performed in several successive steps requires only a small amount of gestalt labeled data on the level of characters for initialization. HMM recognition based on the Viterbi algorithm runs on subsets of the whole vocabulary.

Proceedings ArticleDOI
27 Apr 1993
TL;DR: In general, the performance of a single state HMM was comparable with that of the multistate HMMs, indicating that the sequential modeling capabilities of HMMs were not exploited.
Abstract: Ergodic, continuous-observation, hidden Markov models (HMMs) were used to perform automatic language classification and detection of speech messages. State observation probability densities were modeled as tied Gaussian mixtures. The algorithm was evaluated on four multilanguage speech databases: a three language subset of the Spoken Language Library, a three language subset of a five-language Rome Laboratory database, the 20-language CCITT database, and the ten-language OGI (Oregon Graduate Institute) telephone speech database. In general, the performance of a single state HMM (i.e., a static Gaussian mixture classifier) was comparable with that of the multistate HMMs, indicating that the sequential modeling capabilities of HMMs were not exploited.
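The observation that a one-state HMM degenerates to a static Gaussian mixture classifier can be made concrete: score each language's mixture over all frames independently and take the best total. A sketch with diagonal covariances and a log-sum-exp for stability; the container layout is an assumption.

```python
import numpy as np

def classify_language(frames, language_gmms):
    """frames: (T,D); language_gmms: {name: (weights (M,), means (M,D),
    variances (M,D))}. Returns the highest-likelihood language."""
    def gmm_loglik(x, w, mu, var):
        d = x[None, :] - mu
        log_g = -0.5 * (np.sum(d * d / var, axis=1)
                        + np.sum(np.log(2 * np.pi * var), axis=1))
        m = log_g.max()                      # log-sum-exp over mixture components
        return m + np.log(np.sum(w * np.exp(log_g - m)))
    totals = {lang: sum(gmm_loglik(x, *p) for x in frames)
              for lang, p in language_gmms.items()}
    return max(totals, key=totals.get)
```

That this matched the multistate HMMs is precisely the paper's evidence that the sequential structure was not being exploited.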

Journal ArticleDOI
W. Turin, M.M. Sondhi
TL;DR: Modifications of the Baum-Welch reestimation algorithm are developed and applied to estimating parameters of error source models that belong to the class of hidden Markov models (HMM) using the results of computer simulation.
Abstract: A modified Baum-Welch algorithm is developed and applied to estimating parameters of error source models that belong to the class of hidden Markov models (HMM). Such models arise in the description of bursty error statistics in communication channels. A key element used repeatedly for estimating parameters of such models is the computation of the likelihood of given sequences of observations. Several recursive methods are available for efficiently computing this likelihood. However, even recursive methods can require prohibitive amounts of computation if the observation sequences are very long. Modifications of the Baum-Welch reestimation algorithm that significantly reduce the computational requirements when the observation sequences contain long stretches of identical observations are discussed. The algorithms are used here to estimate parameters of a binary error source model using the results of computer simulation.
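One way to see the saving: the likelihood is a product of per-symbol update matrices, so a run of L identical observations collapses into an L-th matrix power, computable in O(log L) multiplications instead of L recursion steps. A sketch of that idea, with rescaling omitted for brevity (a production implementation needs it for long runs); this illustrates the flavor of the speedup rather than the authors' exact modified recursions.

```python
import numpy as np
from itertools import groupby

def likelihood_with_runs(pi, A, B, obs):
    """HMM likelihood with run-length compression of identical observations.
    Likelihood = pi^T diag(b(o_1)) M(o_2) ... M(o_T) 1, with M(o) = A diag(b(o)).
    pi: (K,) initial probs; A: (K,K); B: (K,M) emission probs; obs: int sequence."""
    alpha = pi * B[:, obs[0]]
    for o, run in groupby(obs[1:]):
        L = len(list(run))
        M = A * B[:, o][None, :]                       # one-step update for symbol o
        alpha = alpha @ np.linalg.matrix_power(M, L)   # L identical steps at once
    return alpha.sum()
```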

Proceedings Article
29 Nov 1993
TL;DR: A new approach for on-line recognition of handwritten words written in unconstrained mixed style by fitting a model of the word structure using the EM algorithm to minimize word-level errors.
Abstract: We introduce a new approach for on-line recognition of handwritten words written in unconstrained mixed style. The preprocessor performs a word-level normalization by fitting a model of the word structure using the EM algorithm. Words are then coded into low resolution "annotated images" where each pixel contains information about trajectory direction and curvature. The recognizer is a convolution network which can be spatially replicated. From the network output, a hidden Markov model produces word scores. The entire system is globally trained to minimize word-level errors.