Showing papers on "Hidden Markov model" published in 2003


Journal ArticleDOI
TL;DR: An important result is that refined alignment models with a first-order dependence and a fertility model yield significantly better results than simple heuristic models.
Abstract: We present and compare various methods for computing word alignments using statistical or heuristic models. We consider the five alignment models presented in Brown, Della Pietra, Della Pietra, and Mercer (1993), the hidden Markov alignment model, smoothing techniques, and refinements. These statistical models are compared with two heuristic models based on the Dice coefficient. We present different methods for combining word alignments to perform a symmetrization of directed statistical alignment models. As evaluation criterion, we use the quality of the resulting Viterbi alignment compared to a manually produced reference alignment. We evaluate the models on the German-English Verbmobil task and the French-English Hansards task. We perform a detailed analysis of various design decisions of our statistical alignment system and evaluate these on training corpora of various sizes. An important result is that refined alignment models with a first-order dependence and a fertility model yield significantly better results than simple heuristic models. In the Appendix, we present an efficient training algorithm for the alignment models presented.

4,402 citations
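
The Dice-coefficient heuristic mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes sentence-level co-occurrence counts and a greedy per-word link, and the three-sentence corpus and the function names `dice_scores` and `align` are made up for the example.

```python
from collections import Counter
from itertools import product


def dice_scores(sentence_pairs):
    """Dice coefficient 2*C(s,t) / (C(s) + C(t)) for each co-occurring word pair.

    Counts are per sentence pair (a word is counted once per sentence),
    which is an assumption of this sketch.
    """
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in sentence_pairs:
        src_set, tgt_set = set(src), set(tgt)
        src_count.update(src_set)
        tgt_count.update(tgt_set)
        pair_count.update(product(src_set, tgt_set))
    return {
        (s, t): 2.0 * c / (src_count[s] + tgt_count[t])
        for (s, t), c in pair_count.items()
    }


def align(src, tgt, scores):
    """Link each target word to the source position with the highest Dice score."""
    return [
        max(range(len(src)), key=lambda i: scores.get((src[i], t), 0.0))
        for t in tgt
    ]


# Hypothetical toy corpus (German-English), for illustration only.
corpus = [
    (["das", "haus"], ["the", "house"]),
    (["das", "buch"], ["the", "book"]),
    (["ein", "buch"], ["a", "book"]),
]
scores = dice_scores(corpus)
alignment = align(["das", "buch"], ["the", "book"], scores)
```

Here `alignment[j]` is the source position linked to target word `j`. A heuristic like this scores each word pair independently, which is the kind of baseline the abstract contrasts with models that have a first-order dependence and a fertility component.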


Journal ArticleDOI
TL;DR: This paper implemented and tested the ALIP (Automatic Linguistic Indexing of Pictures) system on a photographic image database of 600 different concepts, each with about 40 training images and demonstrated the good accuracy of the system and its high potential in linguistic indexing of photographic images.
Abstract: Automatic linguistic indexing of pictures is an important but highly challenging problem for researchers in computer vision and content-based image retrieval. In this paper, we introduce a statistical modeling approach to this problem. Categorized images are used to train a dictionary of hundreds of statistical models each representing a concept. Images of any given concept are regarded as instances of a stochastic process that characterizes the concept. To measure the extent of association between an image and the textual description of a concept, the likelihood of the occurrence of the image based on the characterizing stochastic process is computed. A high likelihood indicates a strong association. In our experimental implementation, we focus on a particular group of stochastic processes, that is, the two-dimensional multiresolution hidden Markov models (2D MHMMs). We implemented and tested our ALIP (Automatic Linguistic Indexing of Pictures) system on a photographic image database of 600 different concepts, each with about 40 training images. The system is evaluated quantitatively using more than 4,600 images outside the training database and compared with a random annotation scheme. Experiments have demonstrated the good accuracy of the system and its high potential in linguistic indexing of photographic images.

1,163 citations


Journal ArticleDOI
TL;DR: This work introduces and tests different Bayesian network classifiers for classifying expressions from video, focusing on changes in distribution assumptions and feature dependency structures, and proposes a new architecture of hidden Markov models (HMMs) for automatically segmenting and recognizing human facial expression from video sequences.

930 citations


Journal ArticleDOI
TL;DR: This paper proposes a text independent method of emotion classification of speech that makes use of short time log frequency power coefficients (LFPC) to represent the speech signals and a discrete hidden Markov model (HMM) as the classifier.

737 citations


Proceedings ArticleDOI
06 Jul 2003
TL;DR: The paper addresses the design of working recognition engines and results achieved with respect to the alluded alternatives and describes a speech corpus consisting of acted and spontaneous emotion samples in German and English language.
Abstract: In this contribution we introduce speech emotion recognition by use of continuous hidden Markov models. Two methods are propagated and compared throughout the paper. Within the first method a global statistics framework of an utterance is classified by Gaussian mixture models using derived features of the raw pitch and energy contour of the speech signal. A second method introduces increased temporal complexity applying continuous hidden Markov models considering several states using low-level instantaneous features instead of global statistics. The paper addresses the design of working recognition engines and results achieved with respect to the alluded alternatives. A speech corpus consisting of acted and spontaneous emotion samples in German and English language is described in detail. Both engines have been tested and trained using this equivalent speech corpus. Results in recognition of seven discrete emotions exceeded 86% recognition rate. As a basis of comparison the similar judgment of human deciders classifying the same corpus at 79.8% recognition rate was analyzed.

599 citations


Proceedings Article
21 Aug 2003
TL;DR: This paper presents a novel discriminative learning technique for label sequences that combines Support Vector Machines and Hidden Markov Models, called the Hidden Markov Support Vector Machine (HM-SVM), and handles dependencies between neighboring labels using Viterbi decoding.
Abstract: This paper presents a novel discriminative learning technique for label sequences based on a combination of the two most successful learning algorithms, Support Vector Machines and Hidden Markov Models which we call Hidden Markov Support Vector Machine. The proposed architecture handles dependencies between neighboring labels using Viterbi decoding. In contrast to standard HMM training, the learning procedure is discriminative and is based on a maximum/soft margin criterion. Compared to previous methods like Conditional Random Fields, Maximum Entropy Markov Models and label sequence boosting, HM-SVMs have a number of advantages. Most notably, it is possible to learn non-linear discriminant functions via kernel functions. At the same time, HM-SVMs share the key advantages with other discriminative methods, in particular the capability to deal with overlapping features. We report experimental evaluations on two tasks, named entity recognition and part-of-speech tagging, that demonstrate the competitiveness of the proposed approach.

538 citations
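
As a rough illustration of the Viterbi decoding step mentioned above, here is a minimal sketch for a standard generative HMM in log space. It is not the HM-SVM itself, which scores label sequences with a learned margin-based discriminant; the two-tag model and all of its probabilities are invented for the example.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return the most probable state sequence for obs (all scores are logs)."""
    # best[t][s]: score of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []  # back[t-1][s]: predecessor of s on that best path
    for o in obs[1:]:
        prev_best = best[-1]
        scores, pointers = {}, {}
        for s in states:
            p = max(states, key=lambda q: prev_best[q] + log_trans[q][s])
            scores[s] = prev_best[p] + log_trans[p][s] + log_emit[s][o]
            pointers[s] = p
        best.append(scores)
        back.append(pointers)
    # Trace the back-pointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def logs(table):
    """Convert a dict of probabilities to log-probabilities."""
    return {k: math.log(v) for k, v in table.items()}


# Hypothetical two-tag model (N = noun, V = verb); numbers are illustrative.
states = ["N", "V"]
log_start = logs({"N": 0.8, "V": 0.2})
log_trans = {"N": logs({"N": 0.3, "V": 0.7}), "V": logs({"N": 0.6, "V": 0.4})}
log_emit = {
    "N": logs({"fish": 0.7, "swim": 0.3}),
    "V": logs({"fish": 0.2, "swim": 0.8}),
}
path = viterbi(["fish", "swim"], states, log_start, log_trans, log_emit)
```

The same dynamic program applies whenever the sequence score decomposes into per-position and neighboring-label terms, which is why it can be reused with discriminatively trained scores.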


Journal ArticleDOI
TL;DR: This paper presents a hand gesture recognition system that recognizes continuous gestures against a stationary background; it consists of real-time hand tracking and extraction, feature extraction, hidden Markov model (HMM) training, and gesture recognition.

511 citations


Proceedings ArticleDOI
31 May 2003
TL;DR: A classifier-combination experimental framework for named entity recognition in which four diverse classifiers (robust linear classifier, maximum entropy, transformation-based learning, and hidden Markov model) are combined under different conditions is presented.
Abstract: This paper presents a classifier-combination experimental framework for named entity recognition in which four diverse classifiers (robust linear classifier, maximum entropy, transformation-based learning, and hidden Markov model) are combined under different conditions. When no gazetteer or other additional training resources are used, the combined system attains a performance of 91.6F on the English development data; integrating name, location and person gazetteers, and named entity systems trained on additional, more general, data reduces the F-measure error by a factor of 15 to 21% on the English data.

467 citations


Proceedings ArticleDOI
28 Jul 2003
TL;DR: The authors used conditional random fields (CRFs) for table extraction and compared them with hidden Markov models (HMMs) and showed that CRFs support the use of many rich and overlapping layout and language features, and as a result, they perform significantly better than HMMs.
Abstract: The ability to find tables and extract information from them is a necessary component of data mining, question answering, and other information retrieval tasks. Documents often contain tables in order to communicate densely packed, multi-dimensional information. Tables do this by employing layout patterns to efficiently indicate fields and records in two-dimensional form. Their rich combination of formatting and content presents difficulties for traditional language modeling techniques, however. This paper presents the use of conditional random fields (CRFs) for table extraction, and compares them with hidden Markov models (HMMs). Unlike HMMs, CRFs support the use of many rich and overlapping layout and language features, and as a result, they perform significantly better. We show experimental results on plain-text government statistical reports in which tables are located with 92% F1, and their constituent lines are classified into 12 table-related categories with 94% accuracy. We also discuss future work on undirected graphical models for segmenting columns, finding cells, and classifying them as data cells or label cells.

421 citations


Journal ArticleDOI
TL;DR: This paper adopts an anomaly detection approach by detecting possible intrusions based on program or user profiles built from normal usage data using a scheme that can be justified from the perspective of hypothesis testing.

370 citations


Proceedings ArticleDOI
18 Jun 2003
TL;DR: This paper proposes to use adaptive hidden Markov models (HMM) to perform video-based face recognition and shows that the proposed algorithm results in better performance than using majority voting of image-based recognition results.
Abstract: While traditional face recognition is typically based on still images, face recognition from video sequences has become popular. In this paper, we propose to use adaptive hidden Markov models (HMM) to perform video-based face recognition. During the training process, the statistics of training video sequences of each subject, and the temporal dynamics, are learned by an HMM. During the recognition process, the temporal characteristics of the test video sequence are analyzed over time by the HMM corresponding to each subject. The likelihood scores provided by the HMMs are compared, and the highest score provides the identity of the test video sequence. Furthermore, with unsupervised learning, each HMM is adapted with the test video sequence, which results in better modeling over time. Based on extensive experiments with various databases, we show that the proposed algorithm results in better performance than using majority voting of image-based recognition results.
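
The recognition step described above (score the test sequence under each subject's HMM and pick the highest likelihood) can be sketched as below. This is a simplified illustration under stated assumptions: observations are discrete symbols rather than face features, there is no unsupervised adaptation, and the subject names and all probabilities are hypothetical.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | HMM) for discrete observations."""
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    total = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        # Rescale to avoid underflow; the log of each scale factor
        # accumulates to the log-likelihood of the whole sequence.
        scale = sum(alpha)
        total += math.log(scale)
        alpha = [a / scale for a in alpha]
    return total


# Hypothetical per-subject models: (start, transition, emission) tables.
start = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
models = {
    "alice": (start, trans, [[0.9, 0.1], [0.7, 0.3]]),
    "bob": (start, trans, [[0.1, 0.9], [0.3, 0.7]]),
}

sequence = [0, 0, 1, 0]
identity = max(models, key=lambda name: log_likelihood(sequence, *models[name]))
```

In this toy setup the sequence is dominated by symbol 0, which the "alice" model emits with higher probability in every state, so that model obtains the higher score.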

01 Jan 2003
TL;DR: This paper describes an HMM-based speech synthesis system (HTS), in which speech waveform is generated from HMMs themselves, and applies it to English speech synthesis using the general speech synthesis architecture of Festival.
Abstract: This paper describes an HMM-based speech synthesis system (HTS), in which speech waveform is generated from HMMs themselves, and applies it to English speech synthesis using the general speech synthesis architecture of Festival. Similarly to other data-driven speech synthesis approaches, HTS has a compact language dependent module: a list of contextual factors. Thus, it could easily be extended to other languages, though the first version of HTS was implemented for Japanese. The resulting run-time engine of HTS has the advantage of being small: less than 1 M bytes, excluding text analysis part. Furthermore, HTS can easily change voice characteristics of synthesized speech by using a speaker adaptation technique developed for speech recognition. The relation between the HMM-based approach and other unit selection approaches is also discussed.

Journal ArticleDOI
TL;DR: This work presents a robust algorithm for multipitch tracking of noisy speech that combines an improved channel and peak selection method, a new method for extracting periodicity information across different channels, and a hidden Markov model for forming continuous pitch tracks.
Abstract: An effective multipitch tracking algorithm for noisy speech is critical for acoustic signal processing. However, the performance of existing algorithms is not satisfactory. We present a robust algorithm for multipitch tracking of noisy speech. Our approach integrates an improved channel and peak selection method, a new method for extracting periodicity information across different channels, and a hidden Markov model (HMM) for forming continuous pitch tracks. The resulting algorithm can reliably track single and double pitch tracks in a noisy environment. We suggest a pitch error measure for the multipitch situation. The proposed algorithm is evaluated on a database of speech utterances mixed with various types of interference. Quantitative comparisons show that our algorithm significantly outperforms existing ones.

Journal ArticleDOI
TL;DR: A general hidden Markov model is presented for simultaneously estimating transition rates and probabilities of stage misclassification in chronic diseases, illustrated on data from a trial of aortic aneurysm screening in which the screening measurements are subject to error.
Abstract: Many chronic diseases have a natural interpretation in terms of staged progression. Multistate models based on Markov processes are a well-established method of estimating rates of transition between stages of disease. However, diagnoses of disease stages are sometimes subject to error. The paper presents a general hidden Markov model for simultaneously estimating transition rates and probabilities of stage misclassification. Covariates can be fitted to both the transition rates and the misclassification probabilities. For example, in the study of abdominal aortic aneurysms by ultrasonography, the disease is staged by severity, according to successive ranges of aortic diameter. The model is illustrated on data from a trial of aortic aneurysm screening, in which the screening measurements are subject to error. General purpose software for model implementation has been developed in the form of an R package and is made freely available.

Proceedings ArticleDOI
13 Oct 2003
TL;DR: A Dynamically Multi-Linked Hidden Markov Model (DML-HMM) is developed to interpret group activities involving multiple objects captured in an outdoor scene based on the discovery of salient dynamic interlinks among multiple temporal events using DPNs.
Abstract: Dynamic Probabilistic Networks (DPNs) are exploited for modeling the temporal relationships among a set of different object temporal events in the scene for a coherent and robust scene-level behaviour interpretation. In particular, we develop a Dynamically Multi-Linked Hidden Markov Model (DML-HMM) to interpret group activities involving multiple objects captured in an outdoor scene. The model is based on the discovery of salient dynamic interlinks among multiple temporal events using DPNs. Object temporal events are detected and labeled using Gaussian Mixture Models with automatic model order selection. A DML-HMM is built using Schwarz's Bayesian Information Criterion based factorisation resulting in its topology being intrinsically determined by the underlying causality and temporal order among different object events. Our experiments demonstrate that its performance on modelling group activities in a noisy outdoor scene is superior compared to that of a Multi-Observation Hidden Markov Model (MOHMM), a Parallel Hidden Markov Model (PaHMM) and a Coupled Hidden Markov Model (CHMM).

Journal ArticleDOI
03 Jul 2003
TL;DR: This work proposes to use Hidden Markov Models (HMMs) to account for the horizontal dependencies along the time axis in time course data and to cope with the prevalent errors and missing values and evaluates the method on published yeast cell cycle and fibroblasts serum response datasets.
Abstract: Motivation: Cellular processes cause changes over time. Observing and measuring those changes over time allows insights into the how and why of regulation. The experimental platform for doing the appropriate large-scale experiments to obtain time-courses of expression levels is provided by microarray technology. However, the proper way of analyzing the resulting time course data is still very much an issue under investigation. The inherent time dependencies in the data suggest that clustering techniques which reflect those dependencies yield improved performance. Results: We propose to use Hidden Markov Models (HMMs) to account for the horizontal dependencies along the time axis in time course data and to cope with the prevalent errors and missing values. The HMMs are used within a model-based clustering framework. We are given a number of clusters, each represented by one Hidden Markov Model from a finite collection encompassing typical qualitative behavior. Then, our method finds in an iterative procedure cluster models and an assignment of data points to these models that maximizes the joint likelihood of clustering and models. Partially supervised learning (adding groups of labeled data to the initial collection of clusters) is supported. A graphical user interface allows querying an expression profile dataset for time courses similar to a prototype graphically defined as a sequence of levels and durations. We also propose a heuristic approach to automate determination of the number of clusters. We evaluate the method on published yeast cell cycle and fibroblasts serum response datasets, and compare them, with favorable results, to the autoregressive curves method. Availability: The software is freely available at http://

Proceedings ArticleDOI
26 Oct 2003
TL;DR: This work builds a system for automatic chord transcription using speech recognition tools, and uses “pitch class profile” vectors to emphasize the tonal content of the signal, and shows that these features far outperform cepstral coefficients for this task.
Abstract: Automatic extraction of content description from commercial audio recordings has a number of important applications, from indexing and retrieval through to novel musicological analyses based on very large corpora of recorded performances. Chord sequences are a description that captures much of the character of a piece in a compact form and using a modest lexicon. Chords also have the attractive property that a piece of music can (mostly) be segmented into time intervals that consist of a single chord, much as recorded speech can (mostly) be segmented into time intervals that correspond to specific words. In this work, we build a system for automatic chord transcription using speech recognition tools. For features we use “pitch class profile” vectors to emphasize the tonal content of the signal, and we show that these features far outperform cepstral coefficients for our task. Sequence recognition is accomplished with hidden Markov models (HMMs) directly analogous to subword models in a speech recognizer, and trained by the same Expectation-Maximization (EM) algorithm. Crucially, this allows us to use as input only the chord sequences for our training examples, without requiring the precise timings of the chord changes — which are determined automatically during training. Our results on a small set of 20 early Beatles songs show frame-level accuracy of around 75% on a forced-alignment task.
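
To make the "pitch class profile" feature concrete, the sketch below folds spectral peaks into 12 pitch-class bins. It is only an illustration of the general idea: the paper's exact feature computation is not given here, so the reference pitch of 440 Hz, the use of squared magnitude as energy, and the input format of `(frequency, magnitude)` peaks are all assumptions.

```python
import math


def pitch_class(freq_hz, ref_hz=440.0):
    """Map a frequency to a pitch class 0..11, where 0 is the class of ref_hz (A)."""
    return round(12 * math.log2(freq_hz / ref_hz)) % 12


def pitch_class_profile(peaks):
    """Fold (freq_hz, magnitude) peaks into a normalized 12-bin energy profile."""
    profile = [0.0] * 12
    for freq_hz, magnitude in peaks:
        profile[pitch_class(freq_hz)] += magnitude ** 2
    total = sum(profile)
    return [p / total for p in profile] if total else profile


# Hypothetical peaks for an A major chord: A3, A4, C#5, E5.
peaks = [(220.0, 1.0), (440.0, 1.0), (554.37, 0.8), (659.26, 0.6)]
profile = pitch_class_profile(peaks)
active = [i for i, p in enumerate(profile) if p > 0]
```

Because octaves are folded together, the two A peaks land in the same bin, and `active` lists only the three pitch classes of the triad. A vector like this, computed per frame, would play the role that cepstral coefficients play in a speech recognizer.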

Journal ArticleDOI
TL;DR: This work considers the on-line Bayesian analysis of data by using a hidden Markov model, where inference is tractable conditional on the history of the state of the hidden component, and introduces a new particle filter algorithm that produces promising results when analysing data of this type.
Abstract: We consider the on-line Bayesian analysis of data by using a hidden Markov model, where inference is tractable conditional on the history of the state of the hidden component. A new particle filter algorithm is introduced and shown to produce promising results when analysing data of this type. The algorithm is similar to the mixture Kalman filter but uses a different resampling algorithm. We prove that this resampling algorithm is computationally efficient and optimal, among unbiased resampling algorithms, in terms of minimizing a squared error loss function. In a practical example, that of estimating break points from well-log data, our new particle filter outperforms two other particle filters, one of which is the mixture Kalman filter, by between one and two orders of magnitude.

Journal ArticleDOI
TL;DR: A new forward-backward algorithm is proposed whose computational complexity is only O((MD + M^2)T), a reduction by almost a factor of D when D > M, and whose memory requirement is O(MT).
Abstract: Existing algorithms for estimating the model parameters of an explicit-duration hidden Markov model (HMM) usually require computations as large as O((MD^2 + M^2)T) or O(M^2 DT), where M is the number of states; D is the maximum possible interval between state transitions; and T is the period of observations used to estimate the model parameters. Because of such computational requirements, these algorithms are not practical when we wish to construct an HMM model with large state space and large explicit state duration and process a large amount of measurement data to obtain high accuracy. We propose a new forward-backward algorithm whose computational complexity is only O((MD + M^2)T), a reduction by almost a factor of D when D > M, and whose memory requirement is O(MT). As an application example, we discuss an HMM characterization of access traffic observed at a large-scale Web site: we formulate the Web access pattern in terms of an HMM with explicit duration and estimate the model parameters using our algorithm.
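
The "almost a factor of D" claim can be checked with a little arithmetic on the leading terms quoted in the abstract. The sketch below ignores the constants hidden by the O-notation, and the values of M, D and T are arbitrary choices for illustration.

```python
def ops_existing_a(M, D, T):
    """Leading term of the O((M*D^2 + M^2) * T) bound."""
    return (M * D**2 + M**2) * T


def ops_existing_b(M, D, T):
    """Leading term of the O(M^2 * D * T) bound."""
    return M**2 * D * T


def ops_proposed(M, D, T):
    """Leading term of the O((M*D + M^2) * T) bound."""
    return (M * D + M**2) * T


# Hypothetical sizes: 10 states, durations up to 100, 1000 observations.
M, D, T = 10, 100, 1000
ratio_a = ops_existing_a(M, D, T) / ops_proposed(M, D, T)  # (D^2 + M) / (D + M)
ratio_b = ops_existing_b(M, D, T) / ops_proposed(M, D, T)  # M * D / (D + M)
```

With these values `ratio_a` is 91.0, close to D = 100, while `ratio_b` is about 9.1, close to M = 10. In general the first ratio simplifies to (D^2 + M)/(D + M), which approaches D as D grows relative to M.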

Journal ArticleDOI
TL;DR: A fast algorithm to approximate the Kullback-Leibler distance (KLD) between two dependence tree models is presented, which offers a saving of hundreds of times in computational complexity compared to the commonly used Monte Carlo method.
Abstract: We present a fast algorithm to approximate the Kullback-Leibler distance (KLD) between two dependence tree models. The algorithm uses the "upward" (or "forward") procedure to compute an upper bound for the KLD. For hidden Markov models, this algorithm is reduced to a simple expression. Numerical experiments show that for a similar accuracy, the proposed algorithm offers a saving of hundreds of times in computational complexity compared to the commonly used Monte Carlo method. This makes the proposed algorithm important for real-time applications, such as image retrieval.

Journal ArticleDOI
TL;DR: A computational method that uses Hidden Markov Models and an Expectation Maximization algorithm to detect cis-regulatory modules, given the weight matrices of a set of transcription factors known to work together is developed.
Abstract: Motivation: The discovery of cis-regulatory modules in metazoan genomes is crucial for understanding the connection between genes and organism diversity. Results: We develop a computational method that uses Hidden Markov Models and an Expectation Maximization algorithm to detect such modules, given the weight matrices of a set of transcription factors known to work together. Two novel features of our probabilistic model are: (i) correlations between binding sites, known to be required for module activity, are exploited, and (ii) phylogenetic comparisons among sequences from multiple species are made to highlight a regulatory module. The novel features are shown to improve detection of modules, in experiments on synthetic as well as biological data. Availability: The source code for the programs is available upon request from the authors.

Proceedings Article
Hung Bui
09 Aug 2003
TL;DR: It is shown that the AHMEM can represent a richer class of probabilistic plans, and an efficient algorithm for plan recognition is derived based on the Rao-Blackwellised Particle Filter approximate inference method.
Abstract: We present a new general framework for online probabilistic plan recognition called the Abstract Hidden Markov Memory Model (AHMEM). The new model is an extension of the existing Abstract Hidden Markov Model to allow the policy to have internal memory which can be updated in a Markov fashion. We show that the AHMEM can represent a richer class of probabilistic plans, and at the same time derive an efficient algorithm for plan recognition in the AHMEM based on the Rao-Blackwellised Particle Filter approximate inference method.

Proceedings ArticleDOI
31 May 2003
TL;DR: Two named-entity recognition models that use characters and character n-grams either exclusively or as an important part of their data representation are discussed: a character-level HMM with minimal context information, and a maximum-entropy conditional Markov model with substantially richer context features.
Abstract: We discuss two named-entity recognition models which use characters and character n-grams either exclusively or as an important part of their data representation. The first model is a character-level HMM with minimal context information, and the second model is a maximum-entropy conditional Markov model with substantially richer context features. Our best model achieves an overall F1 of 86.07% on the English test data (92.31% on the development data). This number represents a 25% error reduction over the same model without word-internal (substring) features.

Proceedings ArticleDOI
18 Jun 2003
TL;DR: An approach is proposed for clustering time-series data that allows each sequence to belong to more than a single HMM with some probability, and the hard decision about the sequence class membership can be deferred until a later time when such a decision is required.
Abstract: An approach is proposed for clustering time-series data. The approach can be used to discover groupings of similar object motions that were observed in a video collection. A finite mixture of hidden Markov models (HMMs) is fitted to the motion data using the expectation maximization (EM) framework. Previous approaches for HMM-based clustering employ a k-means formulation, where each sequence is assigned to only a single HMM. In contrast, the formulation presented in this paper allows each sequence to belong to more than a single HMM with some probability, and the hard decision about the sequence class membership can be deferred until a later time when such a decision is required. Experiments with simulated data demonstrate the benefit of using this EM-based approach when there is more "overlap" in the processes generating the data. Experiments with real data show the promising potential of HMM-based motion clustering in a number of applications.
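
The difference between the k-means style hard assignment and the soft, EM-style assignment described above can be sketched as follows. This covers only the membership computation (the E-step of a finite mixture); it assumes the per-HMM log-likelihoods of a sequence have already been computed elsewhere, for example with a forward algorithm, and the numbers used are invented.

```python
import math


def responsibilities(log_liks, weights):
    """Posterior probability that a sequence was generated by each mixture component.

    log_liks[k] is log P(sequence | HMM k); weights[k] is the mixing weight of HMM k.
    """
    joint = [math.log(w) + ll for w, ll in zip(weights, log_liks)]
    m = max(joint)  # subtract the maximum for a numerically stable log-sum-exp
    log_norm = m + math.log(sum(math.exp(j - m) for j in joint))
    return [math.exp(j - log_norm) for j in joint]


def hard_assignment(log_liks, weights):
    """k-means style alternative: index of the single most probable component."""
    joint = [math.log(w) + ll for w, ll in zip(weights, log_liks)]
    return max(range(len(joint)), key=joint.__getitem__)


# Hypothetical log-likelihoods of one motion sequence under two HMMs.
log_liks = [-10.0, -10.5]
weights = [0.5, 0.5]
soft = responsibilities(log_liks, weights)
hard = hard_assignment(log_liks, weights)
```

For these values the soft memberships are roughly 0.62 and 0.38, whereas the hard rule commits fully to the first HMM. Keeping the fractional memberships is what lets the class decision be deferred when the generating processes overlap.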

Journal ArticleDOI
TL;DR: This paper investigates the use of features derived from multiresolution analysis of speech and the Teager energy operator for classification of drivers' speech under stressed conditions, and addresses the problem of choosing a suitable temporal scale for representing categorical differences in the data.

Journal ArticleDOI
TL;DR: A new forward-backward algorithm is proposed whose complexity is similar to that of the Viterbi algorithm in terms of sequence length (quadratic in the worst case in time and linear in space) and opens the way to the maximum likelihood estimation of hidden semi-Markov chains from long sequences.
Abstract: This article addresses the estimation of hidden semi-Markov chains from nonstationary discrete sequences. Hidden semi-Markov chains are particularly useful to model the succession of homogeneous zones or segments along sequences. A discrete hidden semi-Markov chain is composed of a nonobservable state process, which is a semi-Markov chain, and a discrete output process. Hidden semi-Markov chains generalize hidden Markov chains and enable the modeling of various durational structures. From an algorithmic point of view, a new forward-backward algorithm is proposed whose complexity is similar to that of the Viterbi algorithm in terms of sequence length (quadratic in the worst case in time and linear in space). This opens the way to the maximum likelihood estimation of hidden semi-Markov chains from long sequences. This statistical modeling approach is illustrated by the analysis of branching and flowering patterns in plants.

Journal ArticleDOI
TL;DR: A statistical approach based on a hidden Markov model (HMM) is used, which takes into account several features of the band-limited speech, and enhanced speech exhibits a significantly improved quality without objectionable artifacts.

Patent
04 Mar 2003
TL;DR: A pattern recognition system and method are provided that process Hidden Markov Model blocks, so that more operations are performed on data while it is in cache memory; this increased cache locality significantly improves pattern recognition speed.
Abstract: A pattern recognition system and method are provided. Aspects of the invention are particularly useful in combination with multi-state Hidden Markov Models. Pattern recognition is effected by processing Hidden Markov Model Blocks. This block-processing allows the processor to perform more operations upon data while such data is in cache memory. By so increasing cache locality, aspects of the invention provide significantly improved pattern recognition speed.

Proceedings ArticleDOI
24 Nov 2003
TL;DR: In this paper, the binarized background-subtracted image is used as the feature vector, and different distance metrics, such as those based on the L1 and L2 norms of the vector difference and the normalized inner product of the vectors, are used to measure the similarity between feature vectors.
Abstract: In this paper we propose a generic framework based on hidden Markov models (HMMs) for recognition of individuals from their gait. The HMM framework is suitable, because the gait of an individual can be visualized as his adopting postures from a set, in a sequence which has an underlying structured probabilistic nature. The postures that the individual adopts can be regarded as the states of the HMM and are typical to that individual and provide a means of discrimination. The framework assumes that, during gait, the individual transitions between N discrete postures or states but it is not dependent on the particular feature vector used to represent the gait information contained in the postures. The framework, thus, provides flexibility in the selection of the feature vector. The statistical nature of the HMM lends robustness to the model. In this paper we use the binarized background-subtracted image as the feature vector and use different distance metrics, such as those based on the L1 and L2 norms of the vector difference, and the normalized inner product of the vectors, to measure the similarity between feature vectors. The results we obtain are better than the baseline recognition rates reported before.

Journal ArticleDOI
TL;DR: A new Bayesian formulation for parametric image segmentation is presented, based on the key idea of using a doubly stochastic prior model for the label field, which allows one to find exact optimal estimators for both this field and the model parameters by the minimization of a differentiable function.
Abstract: Parametric image segmentation consists of finding a label field that defines a partition of an image into a set of nonoverlapping regions and the parameters of the models that describe the variation of some property within each region. A new Bayesian formulation for the solution of this problem is presented, based on the key idea of using a doubly stochastic prior model for the label field, which allows one to find exact optimal estimators for both this field and the model parameters by the minimization of a differentiable function. An efficient minimization algorithm and comparisons with existing methods on synthetic images are presented, as well as examples of realistic applications to the segmentation of Magnetic Resonance volumes and to motion segmentation.