
Showing papers on "Hidden Markov model" published in 1999


Journal ArticleDOI
TL;DR: An effective hang-over scheme which considers the previous observations by a first-order Markov process modeling of speech occurrences is proposed which shows significantly better performances than the G.729B VAD in low signal-to-noise ratio (SNR) and vehicular noise environments.
Abstract: In this letter, we develop a robust voice activity detector (VAD) for the application to variable-rate speech coding. The developed VAD employs the decision-directed parameter estimation method for the likelihood ratio test. In addition, we propose an effective hang-over scheme which considers the previous observations by a first-order Markov process modeling of speech occurrences. According to our simulation results, the proposed VAD shows significantly better performances than the G.729B VAD in low signal-to-noise ratio (SNR) and vehicular noise environments.

1,341 citations


Proceedings Article
29 Nov 1999
TL;DR: This paper presents an infinite Gaussian mixture model which neatly sidesteps the difficult problem of finding the "right" number of mixture components and uses an efficient parameter-free Markov Chain that relies entirely on Gibbs sampling.
Abstract: In a Bayesian mixture model it is not necessary a priori to limit the number of components to be finite. In this paper an infinite Gaussian mixture model is presented which neatly sidesteps the difficult problem of finding the "right" number of mixture components. Inference in the model is done using an efficient parameter-free Markov Chain that relies entirely on Gibbs sampling.

1,278 citations


Proceedings ArticleDOI
01 Jan 1999
TL;DR: This work compares the ability of different data modeling methods to represent normal behavior accurately and to recognize intrusions and concludes that for this particular problem, weaker methods than HMMs are likely sufficient.
Abstract: Intrusion detection systems rely on a wide variety of observable data to distinguish between legitimate and illegitimate activities. We study one such observable: sequences of system calls into the kernel of an operating system. Using system-call data sets generated by several different programs, we compare the ability of different data modeling methods to represent normal behavior accurately and to recognize intrusions. We compare the following methods: simple enumeration of observed sequences; comparison of relative frequencies of different sequences; a rule induction technique; and hidden Markov models (HMMs). We discuss the factors affecting the performance of each method and conclude that for this particular problem, weaker methods than HMMs are likely sufficient.

1,245 citations
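
The "simple enumeration of observed sequences" method mentioned in the abstract above can be illustrated with a minimal sketch. The window length `k`, the function names, and the mismatch-rate score are assumptions made for illustration; the abstract does not specify them.

```python
from typing import List, Set, Tuple


def build_normal_db(trace: List[str], k: int) -> Set[Tuple[str, ...]]:
    """Collect every length-k window of system calls seen in a normal trace."""
    return {tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)}


def mismatch_rate(trace: List[str], db: Set[Tuple[str, ...]], k: int) -> float:
    """Fraction of length-k windows in `trace` that never occurred in `db`."""
    windows = [tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)]
    if not windows:
        return 0.0
    return sum(w not in db for w in windows) / len(windows)


# Hypothetical traces, invented for this example only.
normal = ["open", "read", "mmap", "mmap", "open", "read", "mmap"]
suspect = ["open", "read", "mmap", "exec", "open", "read"]

db = build_normal_db(normal, k=3)
print(mismatch_rate(suspect, db, k=3))  # 3 of the 4 windows are unseen -> 0.75
```

A trace would be flagged when its mismatch rate exceeds some threshold; choosing that threshold is left open here.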


Journal ArticleDOI
TL;DR: A new model for static data is introduced, known as sensible principal component analysis, as well as a novel concept of spatially adaptive observation noise, which shows how independent component analysis is also a variation of the same basic generative model.
Abstract: Factor analysis, principal component analysis, mixtures of gaussian clusters, vector quantization, Kalman filter models, and hidden Markov models can all be unified as variations of unsupervised learning under a single basic generative model. This is achieved by collecting together disparate observations and derivations made by many previous authors and introducing a new way of linking discrete and continuous state models using a simple nonlinearity. Through the use of other nonlinearities, we show how independent component analysis is also a variation of the same basic generative model. We show that factor analysis and mixtures of gaussians can be implemented in autoencoder neural networks and learned using squared error plus the same regularization term. We introduce a new model for static data, known as sensible principal component analysis, as well as a novel concept of spatially adaptive observation noise. We also review some of the literature involving global and local mixtures of the basic models and provide pseudocode for inference and learning for all the basic models.

986 citations


Journal ArticleDOI
TL;DR: IdentiFinder™, a hidden Markov model that learns to recognize and classify names, dates, times, and numerical quantities, is evaluated and is competitive with approaches based on handcrafted rules on mixed case text and superior on text where case information is not available.
Abstract: In this paper, we present IdentiFinder™, a hidden Markov model that learns to recognize and classify names, dates, times, and numerical quantities. We have evaluated the model in English (based on data from the Sixth and Seventh Message Understanding Conferences [MUC-6, MUC-7] and broadcast news) and in Spanish (based on data distributed through the First Multilingual Entity Task [MET-1]), and on speech input (based on broadcast news). We report results here on standard materials only to quantify performance on data available to the community, namely, MUC-6 and MET-1. Results have been consistently better than reported by any other learning algorithm. IdentiFinder's performance is competitive with approaches based on handcrafted rules on mixed case text and superior on text where case information is not available. We also present a controlled experiment showing the effect of training set size on performance, demonstrating that as little as 100,000 words of training data is adequate to get performance around 90% on newswire. Although we present our understanding of why this algorithm performs so well on this class of problems, we believe that significant improvement in performance may still be possible.

905 citations


Proceedings Article
01 Jan 1999
TL;DR: An HMM-based speech synthesis system in which spectrum, pitch and state duration are modeled simultaneously in a unified framework of HMM is described.
Abstract: In this paper, we describe an HMM-based speech synthesis system in which spectrum, pitch and state duration are modeled simultaneously in a unified framework of HMM. In the system, pitch and state duration are modeled by multi-space probability distribution HMMs and multi-dimensional Gaussian distributions, respectively. The distributions for spectral parameter, pitch parameter and the state duration are clustered independently by using a decision-tree based context clustering technique. Synthetic speech is generated by using a speech parameter generation algorithm from HMM and a mel-cepstrum based vocoding technique. Through informal listening tests, we have confirmed that the proposed system successfully synthesizes natural-sounding speech which resembles the speaker in the training database.

759 citations


Journal ArticleDOI
Hyeon-Kyu Lee1, Jin Hyung Kim2
TL;DR: A new method is developed using the hidden Markov model (HMM) based technique that calculates the likelihood threshold of an input pattern and provides a confirmation mechanism for the provisionally matched gesture patterns.
Abstract: A new method is developed using the hidden Markov model (HMM) based technique. To handle nongesture patterns, we introduce the concept of a threshold model that calculates the likelihood threshold of an input pattern and provides a confirmation mechanism for the provisionally matched gesture patterns. The threshold model is a weak model for all trained gestures in the sense that its likelihood is smaller than that of the dedicated gesture model for a given gesture. Consequently, the likelihood can be used as an adaptive threshold for selecting proper gesture model. It has, however, a large number of states and needs to be reduced because the threshold model is constructed by collecting the states of all gesture models in the system. To overcome this problem, the states with similar probability distributions are merged, utilizing the relative entropy measure. Experimental results show that the proposed method can successfully extract trained gestures from continuous hand motion with 93.14% reliability.

704 citations


Journal ArticleDOI
TL;DR: The approach is to extend the standard hidden Markov model method of gesture recognition by including a global parametric variation in the output probabilities of the HMM states by forming an expectation-maximization (EM) method for training the parametric HMM.
Abstract: A method for the representation, recognition, and interpretation of parameterized gesture is presented. By parameterized gesture we mean gestures that exhibit a systematic spatial variation; one example is a point gesture where the relevant parameter is the two-dimensional direction. Our approach is to extend the standard hidden Markov model method of gesture recognition by including a global parametric variation in the output probabilities of the HMM states. Using a linear model of dependence, we formulate an expectation-maximization (EM) method for training the parametric HMM. During testing, a similar EM algorithm simultaneously maximizes the output likelihood of the PHMM for the given sequence and estimates the quantifying parameters. Using visually derived and directly measured three-dimensional hand position measurements as input, we present results that demonstrate the recognition superiority of the PHMM over standard HMM techniques, as well as greater robustness in parameter estimation with respect to noise in the input features. Finally, we extend the PHMM to handle arbitrary smooth (nonlinear) dependencies. The nonlinear formulation requires the use of a generalized expectation-maximization (GEM) algorithm for both training and the simultaneous recognition of the gesture and estimation of the value of the parameter. We present results on a pointing gesture, where the nonlinear approach permits the natural spherical coordinate parameterization of pointing direction.

646 citations


Journal ArticleDOI
TL;DR: A Markov chain to sample from the posterior distribution for a phylogenetic tree given sequence information from the corresponding set of organisms, a stochastic model for these data, and a prior distribution on the space of trees are derived.
Abstract: We derive a Markov chain to sample from the posterior distribution for a phylogenetic tree given sequence information from the corresponding set of organisms, a stochastic model for these data, and a prior distribution on the space of trees. A transformation of the tree into a canonical cophenetic matrix form suggests a simple and effective proposal distribution for selecting candidate trees close to the current tree in the chain. We illustrate the algorithm with restriction site data on 9 plant species, then extend to DNA sequences from 32 species of fish. The algorithm mixes well in both examples from random starting trees, generating reproducible estimates and credible sets for the path of evolution.

510 citations


Proceedings ArticleDOI
01 Aug 1999
TL;DR: A novel method for performing blind feedback in the HMM framework, a more complex HMM that models bigram production, and several other algorithmic refinements form a state-of-the-art retrieval system that ranked among the best on the TREC-7 ad hoc retrieval task.
Abstract: We present a new method for information retrieval using hidden Markov models (HMMs). We develop a general framework for incorporating multiple word generation mechanisms within the same model. We then demonstrate that an extremely simple realization of this model substantially outperforms standard tf.idf ranking on both the TREC-6 and TREC-7 ad hoc retrieval tasks. We go on to present a novel method for performing blind feedback in the HMM framework, a more complex HMM that models bigram production, and several other algorithmic refinements. Together, these methods form a state-of-the-art retrieval system that ranked among the best on the TREC-7 ad hoc retrieval task.

480 citations


01 Jan 1999
TL;DR: It is demonstrated that a manually-constructed model that contains multiple states per extraction field outperforms a model with one state per field, and the use of distantly-labeled data to set model parameters provides a significant improvement in extraction accuracy.
Abstract: Statistical machine learning techniques, while well proven in fields such as speech recognition, are just beginning to be applied to the information extraction domain. We explore the use of hidden Markov models for information extraction tasks, specifically focusing on how to learn model structure from data and how to make the best use of labeled and unlabeled data. We show that a manually-constructed model that contains multiple states per extraction field outperforms a model with one state per field, and discuss strategies for learning the model structure automatically from data. We also demonstrate that the use of distantly-labeled data to set model parameters provides a significant improvement in extraction accuracy. Our models are applied to the task of extracting important fields from the headers of computer science research papers, and achieve an extraction accuracy of 92.9%.

Proceedings ArticleDOI
01 Jan 1999
TL;DR: This work introduces a framework for recognizing actions and objects by measuring image-, object- and action-based information from video, which is appropriate for locating and classifying objects under a variety of conditions including full occlusion.
Abstract: Our goal is to exploit human motion and object context to perform action recognition and object classification. Towards this end, we introduce a framework for recognizing actions and objects by measuring image-, object- and action-based information from video. Hidden Markov models are combined with object context to classify hand actions, which are aggregated by a Bayesian classifier to summarize activities. We also use Bayesian methods to differentiate the class of unknown objects by evaluating detected actions along with low-level, extracted object features. Our approach is appropriate for locating and classifying objects under a variety of conditions including full occlusion. We show experiments where both familiar and previously unseen objects are recognized using action and context information.

Proceedings ArticleDOI
15 Mar 1999
TL;DR: A hidden Markov model based on multi-space probability distribution (MSD) can model pitch patterns without heuristic assumption and a reestimation algorithm is derived that can find a critical point of the likelihood function.
Abstract: This paper discusses a hidden Markov model (HMM) based on multi-space probability distribution (MSD). The HMMs are widely-used statistical models to characterize the sequence of speech spectra and have successfully been applied to speech recognition systems. From these facts, it is considered that the HMM is useful for modeling pitch patterns of speech. However, we cannot apply the conventional discrete or continuous HMMs to pitch pattern modeling since the observation sequence of the pitch pattern is composed of one-dimensional continuous values and a discrete symbol which represents "unvoiced". MSD-HMM includes discrete HMMs and continuous mixture HMMs as special cases, and further can model the sequence of observation vectors with variable dimension including zero-dimensional observations, i.e., discrete symbols. As a result, MSD-HMMs can model pitch patterns without heuristic assumption. We derive a reestimation algorithm for the extended HMM and show that it can find a critical point of the likelihood function.

01 Jan 1999
TL;DR: A statistical technique called shrinkage is used that significantly improves parameter estimation of the HMM emission probabilities in the face of sparse training data and the resulting HMM outperforms a state-of-the-art rule-learning system.
Abstract: Hidden Markov models (HMMs) are a powerful probabilistic tool for modeling time series data, and have been applied with success to many language-related tasks such as part of speech tagging, speech recognition, text segmentation and topic detection. This paper describes the application of HMMs to another language related task--information extraction--the problem of locating textual sub-segments that answer a particular information need. In our work, the HMM state transition probabilities and word emission probabilities are learned from labeled training data. As in many machine learning problems, however, the lack of sufficient labeled training data hinders the reliability of the model. The key contribution of this paper is the use of a statistical technique called "shrinkage" that significantly improves parameter estimation of the HMM emission probabilities in the face of sparse training data. In experiments on seminar announcements and Reuters acquisitions articles, shrinkage is shown to reduce error by up to 40%, and the resulting HMM outperforms a state-of-the-art rule-learning system.
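
As a rough illustration of the shrinkage idea described above, the sketch below smooths each state's sparse word-emission estimate by mixing it with a pooled estimate over all states and a uniform distribution. The three-level hierarchy, the fixed mixing weights, and all names are assumptions for illustration; the abstract does not say how the weights are chosen, and a real system would likely estimate them from held-out data.

```python
from collections import Counter
from typing import Dict, List


def shrunk_emissions(
    state_words: Dict[str, List[str]],
    lam_state: float = 0.6,    # weight on the state's own (sparse) estimate
    lam_global: float = 0.3,   # weight on the estimate pooled over all states
    lam_uniform: float = 0.1,  # weight on the uniform distribution
) -> Dict[str, Dict[str, float]]:
    """Interpolate per-state emission estimates toward less specific ones.

    Assumes every state has at least one training word and that the three
    weights sum to 1, so each returned distribution sums to 1.
    """
    vocab = sorted({w for words in state_words.values() for w in words})
    pooled = Counter(w for words in state_words.values() for w in words)
    total = sum(pooled.values())
    result = {}
    for state, words in state_words.items():
        counts = Counter(words)
        result[state] = {
            w: lam_state * counts[w] / len(words)
            + lam_global * pooled[w] / total
            + lam_uniform / len(vocab)
            for w in vocab
        }
    return result


# Hypothetical labeled data: words observed in each HMM state.
data = {"speaker": ["dr", "smith", "dr"], "location": ["room", "hall"]}
probs = shrunk_emissions(data)
# "room" was never seen in the "speaker" state but still gets nonzero mass.
print(probs["speaker"]["room"])
```

The point of the interpolation is that words unseen in a state no longer receive zero probability, which is the failure mode of maximum-likelihood estimates on sparse data.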

Proceedings ArticleDOI
01 Sep 1999
TL;DR: A novel approach to ASL recognition that aspires to being a solution to the scalability problems, based on parallel HMMs (PaHMMs), which model the parallel processes independently and can be trained independently, and do not require consideration of the different combinations at training time.
Abstract: The major challenge that faces American Sign Language (ASL) recognition now is to develop methods that will scale well with increasing vocabulary size. Unlike in spoken languages, phonemes can occur simultaneously in ASL. The number of possible combinations of phonemes after enforcing linguistic constraints is approximately 5.5 × 10^8. Gesture recognition, which is less constrained than ASL recognition, suffers from the same problem. Thus, it is not feasible to train conventional hidden Markov models (HMMs) for large-scale ASL applications. Factorial HMMs and coupled HMMs are two extensions to HMMs that explicitly attempt to model several processes occurring in parallel. Unfortunately, they still require consideration of the combinations at training time. In this paper we present a novel approach to ASL recognition that aspires to being a solution to the scalability problems. It is based on parallel HMMs (PaHMMs), which model the parallel processes independently. Thus, they can also be trained independently, and do not require consideration of the different combinations at training time. We develop the recognition algorithm for PaHMMs and show that it runs in time polynomial in the number of states, and in time linear in the number of parallel processes. We run several experiments with a 22 sign vocabulary and demonstrate that PaHMMs can improve the robustness of HMM-based recognition even on a small scale. Thus, PaHMMs are a very promising general recognition scheme with applications in both gesture and ASL recognition.

Journal ArticleDOI
TL;DR: An omnifont, unlimited-vocabulary OCR system for English and Arabic based on hidden Markov models (HMM), an approach that has proven to be very successful in the area of automatic speech recognition, is presented.
Abstract: We present an omnifont, unlimited-vocabulary OCR system for English and Arabic. The system is based on hidden Markov models (HMM), an approach that has proven to be very successful in the area of automatic speech recognition. We focus on two aspects of the OCR system. First, we address the issue of how to perform OCR on omnifont and multi-style data, such as plain and italic, without the need to have a separate model for each style. The amount of training data from each style, which is used to train a single model, becomes an important issue in the face of the conditional independence assumption inherent in the use of HMMs. We demonstrate mathematically and empirically how to allocate training data among the different styles to alleviate this problem. Second, we show how to use a word-based HMM system to perform character recognition with unlimited vocabulary. The method includes the use of a trigram language model on character sequences. Using all these techniques, we have achieved character error rates of 1.1 percent on data from the University of Washington English Document Image Database and 3.3 percent on data from the DARPA Arabic OCR Corpus.

Journal ArticleDOI
TL;DR: A hidden Markov model-based approach designed to recognize off-line unconstrained handwritten words for large vocabularies and can be successfully used for handwritten word recognition.
Abstract: Describes a hidden Markov model-based approach designed to recognize off-line unconstrained handwritten words for large vocabularies. After preprocessing, a word image is segmented into letters or pseudoletters and represented by two feature sequences of equal length, each consisting of an alternating sequence of shape-symbols and segmentation-symbols, which are both explicitly modeled. The word model is made up of the concatenation of appropriate letter models consisting of elementary HMMs and an HMM-based interpolation technique is used to optimally combine the two feature sets. Two rejection mechanisms are considered depending on whether or not the word image is guaranteed to belong to the lexicon. Experiments carried out on real-life data show that the proposed approach can be successfully used for handwritten word recognition.

Book
01 Aug 1999
TL;DR: Automatic Speech and Speaker Recognition: Advanced Topics groups together in a single volume a number of important topics on speech and speaker recognition, topics which are of fundamental importance, but not yet covered in detail in existing textbooks.
Abstract: Research in the field of automatic speech and speaker recognition has made a number of significant advances in the last two decades, influenced by advances in signal processing, algorithms, architectures, and hardware. These advances include: the adoption of a statistical pattern recognition paradigm; the use of the hidden Markov modeling framework to characterize both the spectral and the temporal variations in the speech signal; the use of a large set of speech utterance examples from a large population of speakers to train the hidden Markov models of some fundamental speech units; the organization of speech and language knowledge sources into a structural finite state network; and the use of dynamic programming-based heuristic search methods to find the best word sequence in the lexical network corresponding to the spoken utterance. Automatic Speech and Speaker Recognition: Advanced Topics groups together in a single volume a number of important topics on speech and speaker recognition, topics which are of fundamental importance, but not yet covered in detail in existing textbooks. Although no explicit partition is given, the book is divided into five parts: Chapters 1-2 are devoted to technology overviews; Chapters 3-12 discuss acoustic modeling of fundamental speech units and lexical modeling of words and pronunciations; Chapters 13-15 address the issues related to flexibility and robustness; Chapters 16-18 concern the theoretical and practical issues of search; Chapters 19-20 give two examples of algorithm and implementational aspects for recognition system realization. Audience: A reference book for speech researchers and graduate students interested in pursuing potential research on the topic. May also be used as a text for advanced courses on the subject.

Proceedings Article
Matthew E. Brand1
20 Sep 1999
TL;DR: A closed-form maximum a posteriori solution for geodesics through the learned density space, thereby obtaining optimal paths over the dynamical manifold gives a completely general way to perform inference over time-series.
Abstract: The mapping between 3D body poses and 2D shadows is fundamentally many-to-many and defeats regression methods, even with windowed context. We show how to learn a function between paths in the two systems, resolving ambiguities by integrating information over the entire length of a sequence. The basis of this function is a configural and dynamical manifold that summarizes the target system's behaviour. This manifold can be modeled from data with a hidden Markov model having special topological properties that we obtain via entropy minimization. Inference is then a matter of solving for the geodesic on the manifold that best explains the evidence in the cue sequence. We give a closed-form maximum a posteriori solution for geodesics through the learned density space, thereby obtaining optimal paths over the dynamical manifold. These methods give a completely general way to perform inference over time-series; in vision they support analysis, recognition, classification and synthesis of behaviours in linear time. We demonstrate with a prototype that infers 3D from monocular monochromatic sequences (e.g., back-subtractions), without using any articulatory body model. The framework readily accommodates multiple cameras and other sources of evidence such as optical flow or feature tracking.

Journal ArticleDOI
TL;DR: This paper addresses an important step toward the goal of automatic musical accompaniment: the segmentation problem, given a score to a piece of monophonic music and a sampled recording of a performance of that score, by designing a hidden Markov model for segmentation.
Abstract: In this paper, we address an important step toward our goal of automatic musical accompaniment: the segmentation problem. Given a score to a piece of monophonic music and a sampled recording of a performance of that score, we attempt to segment the data into a sequence of contiguous regions corresponding to the notes and rests in the score. Within the framework of a hidden Markov model, we model our prior knowledge, perform unsupervised learning of the data model parameters, and compute the segmentation that globally minimizes the posterior expected number of segmentation errors. We also show how to produce "online" estimates of score position. We present examples of our experimental results, and readers are encouraged to access actual sound data we have made available from these experiments.
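
One generic way to minimize an expected count of labeling errors in an HMM is posterior decoding: run forward-backward and pick, for each frame, the state with the largest posterior marginal. The sketch below shows that technique on a toy two-state model. It is an assumption that this matches the paper's error measure; the abstract does not define it, and the names and numbers here are invented for illustration.

```python
from typing import List


def posterior_decode(
    obs_lik: List[List[float]],  # obs_lik[t][s] = p(observation at t | state s)
    init: List[float],           # initial state probabilities
    trans: List[List[float]],    # trans[r][s] = p(next state s | state r)
) -> List[int]:
    """Return the per-frame state that maximizes the posterior marginal."""
    T, n = len(obs_lik), len(init)

    # Forward pass, renormalized at every frame to avoid underflow.
    fwd = []
    cur = [init[s] * obs_lik[0][s] for s in range(n)]
    for t in range(T):
        if t > 0:
            prev = fwd[-1]
            cur = [
                sum(prev[r] * trans[r][s] for r in range(n)) * obs_lik[t][s]
                for s in range(n)
            ]
        z = sum(cur)
        fwd.append([c / z for c in cur])

    # Backward pass, also renormalized; only ratios within a frame matter.
    bwd = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        cur = [
            sum(trans[s][r] * obs_lik[t + 1][r] * bwd[t + 1][r] for r in range(n))
            for s in range(n)
        ]
        z = sum(cur)
        bwd[t] = [c / z for c in cur]

    path = []
    for t in range(T):
        post = [fwd[t][s] * bwd[t][s] for s in range(n)]
        path.append(post.index(max(post)))
    return path


# Toy left-to-right model: state 0 = first note, state 1 = second note.
lik = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
path = posterior_decode(lik, init=[1.0, 0.0], trans=[[0.9, 0.1], [0.0, 1.0]])
print(path)  # [0, 0, 1, 1]
```

Unlike Viterbi decoding, which returns the single most probable state sequence, this choice optimizes each frame separately, so in general the resulting path need not be a valid sequence under the transition structure.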

Book
04 May 1999
TL;DR: HMM Training.
Abstract: Statistical Speech Recognition. Speech Database. Speech Signal Analysis. HMMs and Initialization. HMM Training. Language Models. Recognition. Evaluation and Parameter Setting. References. Index.

Proceedings ArticleDOI
20 Sep 1999
TL;DR: This work has developed a dual on/off database, named IRONOFF, that contains a large number of isolated characters, digits, and cursive words written by French writers and has been designed so that, given an online point, it can be mapped at the correct location in the corresponding scanned image, and conversely, each offline pixel can be temporally indexed.
Abstract: Databases for character recognition algorithms are of fundamental interest for the training of statistics based recognition methods (neural networks, hidden Markov models) as well as for benchmarking existing recognition systems. Such databases currently exist, but none of them gives access to the online data (pen trajectory) and offline data (digital images) for the same writing signal. We have developed such a dual on/off database, named IRONOFF. Currently, it contains a large number of isolated characters, digits, and cursive words written by French writers. We have designed this database so that, given an online point, it can be mapped at the correct location in the corresponding scanned image, and conversely, each offline pixel can be temporally indexed. Since we think this database is of interest for a large part of the research community, it is publicly available.

Journal ArticleDOI
TL;DR: The model characterizes the sequence of measurements by assuming that its probability density function depends on the state of an underlying Markov chain, and the parameter vector includes distribution parameters and transition probabilities between the states.
Abstract: The analysis of routinely collected surveillance data is an important challenge in public health practice. We present a method based on a hidden Markov model for monitoring such time series. The model characterizes the sequence of measurements by assuming that its probability density function depends on the state of an underlying Markov chain. The parameter vector includes distribution parameters and transition probabilities between the states. Maximum likelihood estimates are obtained with a modified EM algorithm. Extensions are provided to take into account trend and seasonality in the data. The method is demonstrated on two examples: the first seeks to characterize influenza-like illness incidence rates with a mixture of Gaussian distributions, and the other, poliomyelitis counts with a mixture of Poisson distributions. The results justify a wider use of this method for analysing surveillance data.
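
To make the model structure concrete, here is a minimal sketch of the likelihood computation for a hidden Markov model with Poisson-distributed counts, using the standard scaled forward algorithm. It covers only likelihood evaluation, not the modified EM fitting or the trend and seasonality extensions mentioned in the abstract. All parameter values and names are hypothetical.

```python
import math
from typing import List, Sequence


def poisson_logpmf(k: int, rate: float) -> float:
    """Log of the Poisson probability mass function at count k."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)


def forward_loglik(
    counts: Sequence[int],
    init: Sequence[float],          # initial state probabilities
    trans: Sequence[Sequence[float]],  # trans[r][s] = p(next state s | state r)
    rates: Sequence[float],         # Poisson rate of each hidden state
) -> float:
    """Log-likelihood of a count series under a Poisson-emission HMM."""
    n = len(rates)
    alpha = list(init)
    loglik = 0.0
    for t, k in enumerate(counts):
        if t > 0:
            # Propagate the filtered state distribution one step forward.
            alpha = [sum(alpha[r] * trans[r][s] for r in range(n)) for s in range(n)]
        # Weight by the emission probability of the observed count.
        alpha = [alpha[s] * math.exp(poisson_logpmf(k, rates[s])) for s in range(n)]
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik


# Hypothetical weekly case counts with a low-rate and a high-rate state.
weekly = [1, 0, 2, 9, 12, 8, 1, 0]
ll = forward_loglik(
    weekly,
    init=[0.9, 0.1],
    trans=[[0.9, 0.1], [0.2, 0.8]],
    rates=[1.0, 10.0],
)
print(ll)
```

This sketch exponentiates each emission log-probability, so it can underflow for very large counts or badly mismatched rates; a log-space implementation would be more robust.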

Proceedings ArticleDOI
15 Mar 1999
TL;DR: A new approach to content-based video indexing using hidden Markov models (HMMs), in which one feature vector is calculated for each image of the video sequence, that allows the classification of complex video sequences.
Abstract: This paper presents a new approach to content-based video indexing using hidden Markov models (HMMs). In this approach one feature vector is calculated for each image of the video sequence. These feature vectors are modeled and classified using HMMs. This approach has many advantages compared to other video indexing approaches. The system has automatic learning capabilities. It is trained by presenting manually indexed video sequences. To improve the system we use a video model, that allows the classification of complex video sequences. The presented approach works three times faster than real-time. We tested our system on TV broadcast news. The rate of 97.3% correctly classified frames shows the efficiency of our system.

Proceedings ArticleDOI
15 Mar 1999
TL;DR: An embedded hidden Markov model (HMM)-based approach for face detection and recognition that uses an efficient set of observation vectors obtained from the 2D-DCT coefficients that can model the two dimensional data better than the one-dimensional HMM and is computationally less complex than the two-dimensional model.
Abstract: We describe an embedded hidden Markov model (HMM)-based approach for face detection and recognition that uses an efficient set of observation vectors obtained from the 2D-DCT coefficients. The embedded HMM can model the two dimensional data better than the one-dimensional HMM and is computationally less complex than the two-dimensional HMM. This model is appropriate for face images since it exploits an important facial characteristic: frontal faces preserve the same structure of "super states" from top to bottom, and also the same left-to-right structure of "states" inside each of these "super states".

Proceedings ArticleDOI
01 Aug 1999
TL;DR: This work defines the problem of decomposing human-written summary sentences and proposes a novel Hidden Markov Model solution to the problem and sheds light on the generation of summary text by cutting and pasting.
Abstract: We define the problem of decomposing human-written summary sentences and propose a novel Hidden Markov Model solution to the problem. Human summarizers often rely on cutting and pasting of the full document to generate summaries. Decomposing a human-written summary sentence requires determining: (1) whether it is constructed by cutting and pasting, (2) what components in the sentence come from the original document, and (3) where in the document the components come from. Solving the decomposition problem can potentially lead to the automatic acquisition of large corpora for summarization. It also sheds light on the generation of summary text by cutting and pasting. The evaluation shows that the proposed decomposition algorithm performs well.

Proceedings ArticleDOI
20 Jun 1999
TL;DR: An extension to the hidden Markov model for part-of-speech tagging using second-order approximations for both contextual and lexical probabilities that increases the accuracy of the tagger to state of the art levels is described.
Abstract: This paper describes an extension to the hidden Markov model for part-of-speech tagging using second-order approximations for both contextual and lexical probabilities. This model increases the accuracy of the tagger to state-of-the-art levels. These approximations make use of more contextual information than standard statistical systems. New methods of smoothing the estimated probabilities are also introduced to address the sparse data problem.
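
To make "second-order approximations for both contextual and lexical probabilities" concrete, here is a minimal Viterbi sketch. It assumes the contextual probability is P(tag | two previous tags) and the lexical probability is P(word | previous tag, current tag). That reading, the table layout, the `<s>` start symbol, and the crude probability floor used in place of the paper's smoothing methods are all assumptions for illustration, not the paper's formulation.

```python
import math

def viterbi_second_order(words, tags, trans, emit, start="<s>"):
    """Most likely tag sequence under a second-order (trigram) HMM.

    trans[(t2, t1, t)] = P(t | t2, t1)   contextual probability
    emit[(t1, t, w)]   = P(w | t1, t)    lexical probability
    Missing entries fall back to a tiny floor, a crude stand-in for smoothing.
    """
    floor = 1e-12

    def log_prob(table, key):
        return math.log(table.get(key, floor))

    # A Viterbi state is the pair (previous tag, current tag).
    best = {(start, start): 0.0}
    back = []
    for w in words:
        new_best, pointers = {}, {}
        for (t2, t1), score in best.items():
            for t in tags:
                s = score + log_prob(trans, (t2, t1, t)) + log_prob(emit, (t1, t, w))
                if (t1, t) not in new_best or s > new_best[(t1, t)]:
                    new_best[(t1, t)] = s
                    pointers[(t1, t)] = (t2, t1)
        best = new_best
        back.append(pointers)

    # Trace back from the best final state.
    state = max(best, key=best.get)
    path = []
    for pointers in reversed(back):
        path.append(state[1])
        state = pointers[state]
    return path[::-1]
```

Tracking tag pairs as states is what lets an ordinary first-order Viterbi recursion score trigram transitions; the cost is a state space that grows with the square of the tag set size.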

Dissertation
01 Jan 1999
TL;DR: This dissertation extends and applies the modern tools of latent variable modeling to problems in neural data analysis, proposing a new, extremely general optimization algorithm that may be used to learn the optimal parameter values of arbitrary latent variable models.
Abstract: The brain is perhaps the most complex system to have ever been subjected to rigorous scientific investigation. The scale is staggering: over 10^11 neurons, each making an average of 10^3 synapses, with computation occurring on scales ranging from a single dendritic spine to an entire cortical area. Slowly, we are beginning to acquire experimental tools that can gather the massive amounts of data needed to characterize this system. However, to understand and interpret these data will also require substantial strides in inferential and statistical techniques. This dissertation attempts to meet this need, extending and applying the modern tools of latent variable modeling to problems in neural data analysis. It is divided into two parts. The first begins with an exposition of the general techniques of latent variable modeling. A new, extremely general optimization algorithm, called Relaxation Expectation Maximization (REM), is proposed that may be used to learn the optimal parameter values of arbitrary latent variable models. This algorithm appears to alleviate the common problem of convergence to local, sub-optimal likelihood maxima. REM leads to a natural framework for model size selection; in combination with standard model selection techniques, the quality of fits may be further improved, while the appropriate model size is automatically and efficiently determined. Next, a new latent variable model, the mixture of sparse hidden Markov models, is introduced, and approximate inference and learning algorithms are derived for it. This model is applied in the second part of the thesis. The second part brings the technology of part I to bear on two important problems in experimental neuroscience. The first is known as spike sorting; this is the problem of separating the spikes from different neurons embedded within an extracellular recording. The dissertation offers the first thorough statistical analysis of this problem, which then yields the first powerful probabilistic solution. The second problem addressed is that of characterizing the distribution of spike trains recorded from the same neuron under identical experimental conditions. A latent variable model is proposed. Inference and learning in this model lead to new principled algorithms for smoothing and clustering of spike data.

Journal ArticleDOI
01 Jan 1999-Proteins
TL;DR: The SAM‐T98 method is an iterative hidden Markov model–based method for constructing protein family profiles that is purely sequence‐based, using no structural information, and yet was able to predict structures as well as all but five of the structure‐based methods in CASP3.
Abstract: This paper presents results of blind predictions submitted to the CASP3 protein structure prediction experiment. We made predictions using the SAM-T98 method, an iterative hidden Markov model-based method for constructing protein family profiles. The method is purely sequence-based, using no structural information, and yet was able to predict structures as well as all but five of the structure-based methods in CASP3.

Proceedings Article
01 Jan 1999
TL;DR: A Bayesian counterpart of the well-known maximum likelihood linear regression (MLLR) adaptation is formulated based on maximum a posteriori (MAP) estimation, where a prior distribution of the transformation parameters is used.
Abstract: In the past few years, transformation-based model adaptation techniques have been widely used to help reduce acoustic mismatch between training and testing conditions of automatic speech recognizers. The estimation of the transformation parameters is usually carried out using estimation paradigms based on classical statistics such as maximum likelihood, mainly because of their conceptual and computational simplicity. However, it appears necessary to introduce some constraints on the possible values of the transformation parameters to avoid getting unreasonable estimates that might perturb the underlying structure of the acoustic space. In this paper, we propose to introduce such constraints using Bayesian statistics, where a prior distribution of the transformation parameters is used. A Bayesian counterpart of the well-known maximum likelihood linear regression (MLLR) adaptation is formulated based on maximum a posteriori (MAP) estimation. Supervised, unsupervised and incremental non-native speaker adaptation experiments are carried out to compare the proposed MAPLR approach to MLLR. Experimental results show that MAPLR outperforms MLLR.
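
The effect of placing a prior on a transformation parameter can be seen in a deliberately tiny case: estimating a single scalar bias b from residuals r_i = b + noise, with Gaussian noise of known variance and a Gaussian prior on b. This toy setup, the function names, and the parameter names are assumptions chosen for illustration; it is not the MAPLR derivation, which concerns full regression transforms of acoustic model parameters.

```python
def ml_bias(residuals):
    """Maximum likelihood estimate of the bias: the sample mean of the residuals."""
    return sum(residuals) / len(residuals)

def map_bias(residuals, noise_var, prior_mean, prior_var):
    """MAP estimate of the bias under a Gaussian prior N(prior_mean, prior_var).

    With Gaussian noise of variance noise_var, the posterior mode is a
    precision-weighted blend of the data and the prior mean.
    """
    n = len(residuals)
    return (prior_var * sum(residuals) + noise_var * prior_mean) / (n * prior_var + noise_var)
```

With little adaptation data the MAP estimate stays close to the prior mean, which is the kind of constraint on "unreasonable estimates" the abstract describes; as the amount of data grows or the prior variance widens, it approaches the maximum likelihood estimate.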