
Showing papers on "Hidden Markov model published in 2007"


Journal ArticleDOI
01 May 2007
TL;DR: A survey on gesture recognition with particular emphasis on hand gestures and facial expressions is provided, and applications involving hidden Markov models, particle filtering and condensation, finite-state machines, optical flow, skin color, and connectionist models are discussed in detail.
Abstract: Gesture recognition pertains to recognizing meaningful expressions of motion by a human, involving the hands, arms, face, head, and/or body. It is of utmost importance in designing an intelligent and efficient human-computer interface. The applications of gesture recognition are manifold, ranging from sign language through medical rehabilitation to virtual reality. In this paper, we provide a survey on gesture recognition with particular emphasis on hand gestures and facial expressions. Applications involving hidden Markov models, particle filtering and condensation, finite-state machines, optical flow, skin color, and connectionist models are discussed in detail. Existing challenges and future research possibilities are also highlighted.

1,797 citations


Journal ArticleDOI
TL;DR: A hidden Markov model, Phobius, is designed that combines transmembrane topology and signal peptide predictions, and also allows constrained and homology-enriched predictions.
Abstract: When using conventional transmembrane topology and signal peptide predictors, such as TMHMM and SignalP, there is a substantial overlap between these two types of predictions. Applying these methods to five complete proteomes, we found that 30–65% of all predicted signal peptides and 25–35% of all predicted transmembrane topologies overlap. This impairs predictions of 5–10% of the proteome, hence this is an important issue in protein annotation. To address this problem, we previously designed a hidden Markov model, Phobius, that combines transmembrane topology and signal peptide predictions. The method makes an optimal choice between transmembrane segments and signal peptides, and also allows constrained and homology-enriched predictions. We here present a web interface (http://phobius.cgb.ki.se and http://phobius.binf.ku.dk) to access Phobius.

1,410 citations


Reference EntryDOI
TL;DR: In this paper, the concept of hidden Markov models in computational biology is introduced and described using simple biological examples, requiring as little mathematical knowledge as possible, and an overview of their current applications is presented.
Abstract: This unit introduces the concept of hidden Markov models in computational biology. It describes them using simple biological examples, requiring as little mathematical knowledge as possible. The unit also presents a brief history of hidden Markov models and an overview of their current applications before concluding with a discussion of their limitations.
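The HMM machinery this unit introduces can be illustrated with a minimal forward-algorithm sketch: a toy two-state model ("AT-rich" vs. "GC-rich" region), a standard textbook example. All probabilities below are invented for the illustration and are not from the unit itself.

```python
import itertools

# Toy two-state HMM over DNA symbols; all numbers are made up.
states = ["AT", "GC"]
start = {"AT": 0.5, "GC": 0.5}
trans = {"AT": {"AT": 0.9, "GC": 0.1},
         "GC": {"AT": 0.1, "GC": 0.9}}
emit = {"AT": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},
        "GC": {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15}}

def forward(seq):
    """Total probability of `seq` under the HMM (forward algorithm)."""
    alpha = {s: start[s] * emit[s][seq[0]] for s in states}
    for sym in seq[1:]:
        alpha = {s: sum(alpha[r] * trans[r][s] for r in states) * emit[s][sym]
                 for s in states}
    return sum(alpha.values())

def brute_force(seq):
    """Same quantity by enumerating every hidden-state path (check only)."""
    total = 0.0
    for path in itertools.product(states, repeat=len(seq)):
        p = start[path[0]] * emit[path[0]][seq[0]]
        for i in range(1, len(seq)):
            p *= trans[path[i - 1]][path[i]] * emit[path[i]][seq[i]]
        total += p
    return total
```

The forward recursion computes in O(len·states²) time what the brute-force enumeration computes in exponential time, which is exactly why HMMs scale to genomic sequences.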

1,305 citations


Proceedings ArticleDOI
26 Dec 2007
TL;DR: A real-time, non-intrusive liveness detection approach against photograph spoofing in face recognition, based on recognizing spontaneous eyeblinks, that outperforms cascaded AdaBoost and HMM approaches in the task of eyeblink detection.
Abstract: We present a real-time liveness detection approach against photograph spoofing in face recognition, by recognizing spontaneous eyeblinks, in a non-intrusive manner. The approach requires no extra hardware except for a generic webcamera. Eyeblink sequences often have a complex underlying structure. We formulate blink detection as inference in an undirected conditional graphical framework, and are able to learn compact and efficient observation and transition potentials from data. For quick and accurate recognition of the blink behavior, eye closity, an easily computed discriminative measure derived from the adaptive boosting algorithm, is developed and then smoothly embedded into the conditional model. An extensive set of experiments is presented to show the effectiveness of our approach and how it outperforms cascaded AdaBoost and HMM in the task of eyeblink detection.

611 citations


Journal ArticleDOI
TL;DR: This article reports significant gains in recognition performance and model compactness as a result of discriminative MCE training applied to HMMs, in the context of three challenging large-vocabulary speech recognition tasks.
Abstract: The minimum classification error (MCE) framework for discriminative training is a simple and general formalism for directly optimizing recognition accuracy in pattern recognition problems. The framework applies directly to the optimization of hidden Markov models (HMMs) used for speech recognition problems. However, few if any studies have reported results for the application of MCE training to large-vocabulary, continuous-speech recognition tasks. This article reports significant gains in recognition performance and model compactness as a result of discriminative training based on MCE training applied to HMMs, in the context of three challenging large-vocabulary (up to 100k-word) speech recognition tasks: the Corpus of Spontaneous Japanese lecture speech transcription task, a telephone-based name recognition task, and the MIT Jupiter telephone-based conversational weather information task. On these tasks, starting from maximum likelihood (ML) baselines, MCE training yielded relative reductions in word error ranging from 7% to 20%. Furthermore, this paper evaluates the use of different methods for optimizing the MCE criterion function, as well as the use of precomputed recognition lattices to speed up training. An overview of the MCE framework is given, with an emphasis on practical implementation issues.
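For reference, the MCE misclassification measure and smoothed loss that this framework optimizes are commonly written as follows; this is the standard formulation from the MCE literature, not necessarily this paper's exact notation:

```latex
% g_j(x;\Lambda): discriminant (e.g., HMM log-likelihood) of class j
% M: number of classes; \eta, \gamma, \theta: smoothing constants
d_c(x) = -g_c(x;\Lambda)
       + \frac{1}{\eta}\log\!\left[\frac{1}{M-1}\sum_{j \neq c} e^{\eta\, g_j(x;\Lambda)}\right],
\qquad
\ell(d_c) = \frac{1}{1 + e^{-\gamma d_c + \theta}}
```

The sigmoid loss is a differentiable surrogate for the 0–1 error count, which is what makes gradient-based optimization of recognition accuracy possible.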

581 citations


Journal ArticleDOI
TL;DR: A discriminative latent variable model for classification problems in structured domains where inputs can be represented by a graph of local observations and a hidden-state conditional random field framework learns a set of latent variables conditioned on local features.
Abstract: We present a discriminative latent variable model for classification problems in structured domains where inputs can be represented by a graph of local observations. A hidden-state conditional random field framework learns a set of latent variables conditioned on local features. Observations need not be independent and may overlap in space and time.

578 citations


01 Jan 2007
TL;DR: This paper describes HTS version 2.0 in detail, as well as future release plans, which include a number of new features which are useful for both speech synthesis researchers and developers.
Abstract: A statistical parametric speech synthesis system based on hidden Markov models (HMMs) has grown in popularity over the last few years. This system simultaneously models spectrum, excitation, and duration of speech using context-dependent HMMs and generates speech waveforms from the HMMs themselves. Since December 2002, we have publicly released an open-source software toolkit named HMM-based speech synthesis system (HTS) to provide a research and development platform for the speech synthesis community. In December 2006, HTS version 2.0 was released. This version includes a number of new features which are useful for both speech synthesis researchers and developers. This paper describes HTS version 2.0 in detail, as well as future release plans.

546 citations


Proceedings ArticleDOI
26 Dec 2007
TL;DR: A new framework is proposed in which actions are modeled using three-dimensional occupancy grids, built from multiple viewpoints, in an exemplar-based HMM; a 3D reconstruction is not required during the recognition phase, as learned 3D exemplars are instead used to produce 2D image information that is compared to the observations.
Abstract: In this paper, we address the problem of learning compact, view-independent, realistic 3D models of human actions recorded with multiple cameras, for the purpose of recognizing those same actions from a single or few cameras, without prior knowledge about the relative orientations between the cameras and the subjects. To this end, we propose a new framework where we model actions using three-dimensional occupancy grids, built from multiple viewpoints, in an exemplar-based HMM. The novelty is that a 3D reconstruction is not required during the recognition phase; instead, learned 3D exemplars are used to produce 2D image information that is compared to the observations. Parameters that describe image projections are added as latent variables in the recognition process. In addition, the temporal Markov dependency applied to view parameters allows them to evolve during recognition as with a smoothly moving camera. The effectiveness of the framework is demonstrated with experiments on real datasets and with challenging recognition scenarios.

509 citations


Journal ArticleDOI
TL;DR: A novel parameter generation algorithm for HMM-based speech synthesis that alleviates the over-smoothing of generated trajectories, which usually causes muffled sounds, by also maximizing a likelihood for the global variance of the generated trajectory.
Abstract: This paper describes a novel parameter generation algorithm for an HMM-based speech synthesis technique. The conventional algorithm generates a parameter trajectory of static features that maximizes the likelihood of a given HMM for the parameter sequence consisting of the static and dynamic features under an explicit constraint between those two features. The generated trajectory is often excessively smoothed due to the statistical processing. Using the over-smoothed speech parameters usually causes muffled sounds. In order to alleviate the over-smoothing effect, we propose a generation algorithm considering not only the HMM likelihood maximized in the conventional algorithm but also a likelihood for a global variance (GV) of the generated trajectory. The latter likelihood works as a penalty for the over-smoothing, i.e., a reduction of the GV of the generated trajectory. The result of a perceptual evaluation demonstrates that the proposed algorithm yields considerable improvements in the naturalness of synthetic speech.
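The objective the abstract describes is commonly written, per utterance, as a weighted combination of the HMM likelihood and the GV likelihood. The notation below is the usual one from the GV literature and may differ from the paper's exact symbols:

```latex
% c: static-feature trajectory; W: window matrix appending dynamics (O = Wc)
% v(c): per-dimension global variance of c over the utterance
% (\mu_v, \Sigma_v): Gaussian GV model; \omega: balancing weight
\mathcal{L}(c) = \omega \log P(Wc \mid q, \lambda)
               + \log \mathcal{N}\bigl(v(c);\, \mu_v, \Sigma_v\bigr)
```

The second term penalizes trajectories whose variance collapses below what is observed in natural speech, directly counteracting the over-smoothing the abstract describes.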

469 citations


Proceedings ArticleDOI
01 Apr 2007
TL;DR: It is found that the CDP-based detector and the HMM-based classifier can detect and classify incoming signals at a range of low SNRs.
Abstract: Spectrum awareness is currently one of the most challenging problems in cognitive radio (CR) design. Detection and classification of very low SNR signals with relaxed information on the signal parameters being detected is critical for proper CR functionality as it enables the CR to react and adapt to the changes in its radio environment. In this work, the cycle frequency domain profile (CDP) is used for signal detection and preprocessing for signal classification. Signal features are extracted from CDP using a threshold-test method. For classification, a Hidden Markov Model (HMM) has been used to process extracted signal features due to its robust pattern-matching capability. We also investigate the effects of varied observation length on signal detection and classification. It is found that the CDP-based detector and the HMM-based classifier can detect and classify incoming signals at a range of low SNRs.

432 citations


Proceedings ArticleDOI
17 Jun 2007
TL;DR: A discriminative framework for simultaneous sequence segmentation and labeling which can capture both intrinsic and extrinsic class dynamics and incorporates hidden state variables which model the sub-structure of a class sequence and learn dynamics between class labels.
Abstract: Many problems in vision involve the prediction of a class label for each frame in an unsegmented sequence. In this paper, we develop a discriminative framework for simultaneous sequence segmentation and labeling which can capture both intrinsic and extrinsic class dynamics. Our approach incorporates hidden state variables which model the sub-structure of a class sequence and learn dynamics between class labels. Each class label has a disjoint set of associated hidden states, which enables efficient training and inference in our model. We evaluated our method on the task of recognizing human gestures from unsegmented video streams and performed experiments on three different datasets of head and eye gestures. Our results demonstrate that our model compares favorably to Support Vector Machines, Hidden Markov Models, and Conditional Random Fields on visual gesture recognition tasks.

Proceedings ArticleDOI
14 May 2007
TL;DR: It is found that the discriminatively trained CRF performs as well as or better than an HMM even when the model features do not violate the independence assumptions of the HMM, and that CRFs remain robust when features depend on observations from many time steps.
Abstract: Activity recognition is a key component for creating intelligent, multi-agent systems. Intrinsically, activity recognition is a temporal classification problem. In this paper, we compare two models for temporal classification: hidden Markov models (HMMs), which have long been applied to the activity recognition problem, and conditional random fields (CRFs). CRFs are discriminative models for labeling sequences. They condition on the entire observation sequence, which avoids the need for independence assumptions between observations. Conditioning on the observations vastly expands the set of features that can be incorporated into the model without violating its assumptions. Using data from a simulated robot tag domain, chosen because it is multi-agent and produces complex interactions between observations, we explore the differences in performance between the discriminatively trained CRF and the generative HMM. Additionally, we examine the effect of incorporating features which violate independence assumptions between observations; such features are typically necessary for high classification accuracy. We find that the discriminatively trained CRF performs as well as or better than an HMM even when the model features do not violate the independence assumptions of the HMM. In cases where features depend on observations from many time steps, we confirm that CRFs are robust against any degradation in performance.

Proceedings Article
01 Jun 2007
TL;DR: This model has the structure of a standard trigram HMM, yet its accuracy is closer to that of a state-of-the-art discriminative model (Smith and Eisner, 2005), up to 14 percentage points better than MLE.
Abstract: Unsupervised learning of linguistic structure is a difficult problem. A common approach is to define a generative model and maximize the probability of the hidden structure given the observed data. Typically, this is done using maximum-likelihood estimation (MLE) of the model parameters. We show using part-of-speech tagging that a fully Bayesian approach can greatly improve performance. Rather than estimating a single set of parameters, the Bayesian approach integrates over all possible parameter values. This difference ensures that the learned structure will have high probability over a range of possible parameters, and permits the use of priors favoring the sparse distributions that are typical of natural language. Our model has the structure of a standard trigram HMM, yet its accuracy is closer to that of a state-of-the-art discriminative model (Smith and Eisner, 2005), up to 14 percentage points better than MLE. We find improvements both when training from data alone, and using a tagging dictionary.
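With symmetric Dirichlet priors over the HMM's transition and emission distributions, integrating out the parameters yields Gibbs-sampling predictive probabilities of the familiar additive-smoothing form. The sketch below uses standard notation, omits the correction terms for the trigram contexts that position i itself participates in, and is not the paper's exact equation:

```latex
% n(\cdot): counts over the current tag assignment excluding position i
% T: tag-set size; W: vocabulary size; \alpha, \beta: Dirichlet hyperparameters
P(t_i \mid \mathbf{t}_{-i}, \mathbf{w}) \propto
  \frac{n(t_{i-2}, t_{i-1}, t_i) + \alpha}{n(t_{i-2}, t_{i-1}) + T\alpha}
  \cdot
  \frac{n(t_i, w_i) + \beta}{n(t_i) + W\beta}
```

Small values of α and β favor the sparse transition and emission distributions that the abstract argues are typical of natural language.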

Journal ArticleDOI
TL;DR: A fusion model by combining the Hidden Markov Model (HMM), Artificial Neural Networks (ANN) and Genetic Algorithms (GA) to forecast financial market behaviour is proposed and implemented.
Abstract: In this paper we propose and implement a fusion model by combining the Hidden Markov Model (HMM), Artificial Neural Networks (ANN) and Genetic Algorithms (GA) to forecast financial market behaviour. The developed tool can be used for in depth analysis of the stock market. Using ANN, the daily stock prices are transformed to independent sets of values that become input to HMM. We draw on GA to optimize the initial parameters of HMM. The trained HMM is used to identify and locate similar patterns in the historical data. The price differences between the matched days and the respective next day are calculated. Finally, a weighted average of the price differences of similar patterns is obtained to prepare a forecast for the required next day. Forecasts are obtained for a number of securities in the IT sector and are compared with a conventional forecast method.

Journal ArticleDOI
TL;DR: A function-based approach to on-line signature verification using a set of time sequences and Hidden Markov Models (HMMs) is presented and compared to other state-of-the-art systems based on the results of SVC 2004.

Journal ArticleDOI
TL;DR: A statistical modelling methodology for performing both diagnosis and prognosis in a unified framework based on segmental hidden semi-Markov models (HSMMs), which can be used to predict the useful remaining life of a system.

Proceedings Article
11 Mar 2007
TL;DR: This paper proposes modeling the topics of words in the document as a Markov chain, and shows that incorporating this dependency allows us to learn better topics and to disambiguate words that can belong to different topics.
Abstract: Algorithms such as Latent Dirichlet Allocation (LDA) have achieved significant progress in modeling word document relationships. These algorithms assume each word in the document was generated by a hidden topic and explicitly model the word distribution of each topic as well as the prior distribution over topics in the document. Given these parameters, the topics of all words in the same document are assumed to be independent. In this paper, we propose modeling the topics of words in the document as a Markov chain. Specifically, we assume that all words in the same sentence have the same topic, and successive sentences are more likely to have the same topics. Since the topics are hidden, this leads to using the well-known tools of Hidden Markov Models for learning and inference. We show that incorporating this dependency allows us to learn better topics and to disambiguate words that can belong to different topics. Quantitatively, we show that we obtain better perplexity in modeling documents with only a modest increase in learning and inference complexity.
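The generative story described above (one topic per sentence, with successive sentences tending to retain the previous topic) can be sketched as a toy sampler; the function and parameter names are ours, not the paper's:

```python
import random

def generate_doc(n_sentences, words_per_sentence, stay_prob,
                 topic_word_weights, vocab, rng):
    """Sample a document under a sentence-level topic Markov chain:
    every word in a sentence shares one topic, and each sentence keeps
    the previous sentence's topic with probability `stay_prob`."""
    n_topics = len(topic_word_weights)
    topic = rng.randrange(n_topics)          # initial topic
    doc = []
    for _ in range(n_sentences):
        if rng.random() >= stay_prob:        # occasionally switch topic
            topic = rng.randrange(n_topics)
        words = rng.choices(vocab, weights=topic_word_weights[topic],
                            k=words_per_sentence)
        doc.append((topic, words))
    return doc
```

Because the topic sequence is a hidden Markov chain over sentences, inference over it can use the standard HMM forward-backward machinery, which is the connection the paper exploits.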

Book ChapterDOI
09 Sep 2007
TL;DR: A discriminative keyword spotting system based solely on recurrent neural networks, which uses information from long time spans to estimate word-level posterior probabilities, is presented.
Abstract: The goal of keyword spotting is to detect the presence of specific spoken words in unconstrained speech. The majority of keyword spotting systems are based on generative hidden Markov models and lack discriminative capabilities. However, discriminative keyword spotting systems are currently based on frame-level posterior probabilities of sub-word units. This paper presents a discriminative keyword spotting system based solely on recurrent neural networks, which uses information from long time spans to estimate word-level posterior probabilities. In a keyword spotting task on a large database of unconstrained speech the system achieved a keyword spotting accuracy of 84.5%.

Journal ArticleDOI
TL;DR: This paper presents novel classification algorithms for recognizing object activity using object motion trajectory, and uses hidden Markov models (HMMs) with a data-driven design in terms of number of states and topology.
Abstract: Motion trajectories provide rich spatiotemporal information about an object's activity. This paper presents novel classification algorithms for recognizing object activity using object motion trajectory. In the proposed classification system, trajectories are segmented at points of change in curvature, and the subtrajectories are represented by their principal component analysis (PCA) coefficients. We first present a framework to robustly estimate the multivariate probability density function based on PCA coefficients of the subtrajectories using Gaussian mixture models (GMMs). We show that GMM-based modeling alone cannot capture the temporal relations and ordering between underlying entities. To address this issue, we use hidden Markov models (HMMs) with a data-driven design in terms of number of states and topology (e.g., left-right versus ergodic). Experiments using a database of over 5700 complex trajectories (obtained from UCI-KDD data archives and Columbia University Multimedia Group) subdivided into 85 different classes demonstrate the superiority of our proposed HMM-based scheme using PCA coefficients of subtrajectories in comparison with other techniques in the literature.

Proceedings Article
03 Dec 2007
TL;DR: A system capable of directly transcribing raw online handwriting data is described, consisting of an advanced recurrent neural network with an output layer designed for sequence labelling, combined with a probabilistic language model.
Abstract: In online handwriting recognition the trajectory of the pen is recorded during writing. Although the trajectory provides a compact and complete representation of the written output, it is hard to transcribe directly, because each letter is spread over many pen locations. Most recognition systems therefore employ sophisticated preprocessing techniques to put the inputs into a more localised form. However these techniques require considerable human effort, and are specific to particular languages and alphabets. This paper describes a system capable of directly transcribing raw online handwriting data. The system consists of an advanced recurrent neural network with an output layer designed for sequence labelling, combined with a probabilistic language model. In experiments on an unconstrained online database, we record excellent results using either raw or preprocessed data, well outperforming a state-of-the-art HMM based system in both cases.

Journal ArticleDOI
TL;DR: A discriminative model for polyphonic piano transcription is presented and a frame-level transcription accuracy of 68% was achieved on a newly generated test set, and direct comparisons to previous approaches are provided.
Abstract: We present a discriminative model for polyphonic piano transcription. Support vector machines trained on spectral features are used to classify frame-level note instances. The classifier outputs are temporally constrained via hidden Markov models, and the proposed system is used to transcribe both synthesized and real piano recordings. A frame-level transcription accuracy of 68% was achieved on a newly generated test set, and direct comparisons to previous approaches are provided.

Journal ArticleDOI
TL;DR: Experimental results reveal that the first proposed combination of VQ and DTW (by means of score fusion) outperforms the other algorithms and achieves a minimum detection cost function (DCF) value equal to 1.37% for random forgeries and 5.42% for skilled forgeries.


Journal ArticleDOI
TL;DR: An integrated platform for multi-sensor equipment diagnosis and prognosis based on hidden semi-Markov models (HSMMs), which achieves a very promising increase in correct diagnostic rate and implements equipment prognosis within the same integrated framework.

Posted Content
01 Jan 2007
TL;DR: This research constructs and estimates a nonhomogeneous hidden Markov model to model the transitions among latent relationship states and effects on buying behavior, and uses a hierarchical Bayes approach to capture the unobserved heterogeneity across customers.
Abstract: This research models the dynamics of customer relationships using typical transaction data. It permits the evaluation of the effectiveness of customer-brand encounters on the dynamics of customer relationships and the subsequent buying behavior. Our approach to modeling relationship dynamics is structurally different from existing approaches. In the proposed model, customer-brand encounters may have an enduring impact by shifting the customer to a different (unobservable) relationship state. We constructed and estimated a hidden Markov model (HMM) to model the transitions among latent relationship states and effects on buying behavior. This model enables us to dynamically segment the firm's customer base, and to examine methods by which the firm can alter the long-term buying behavior. We use a hierarchical Bayes approach to capture the unobserved heterogeneity across customers. We calibrate the model in the context of alumni relations using a longitudinal gift-giving dataset. Using the proposed model, we are able to probabilistically classify the alumni base into three relationship states, and estimate the marginal impact of alumni-university interactions on moving the alumni between these states. The application of the model for marketing decisions is illustrated using a "what-if" analysis of a reunion marketing campaign. Additionally, we demonstrate improved prediction ability on a validation sample.

Journal ArticleDOI
TL;DR: The technical details, building processes, and performance of the basic HMM-based speech synthesis system, and new features integrated into Nitech-HTS 2005 such as STRAIGHT-based vocoding, HSMM-based acoustic modeling, and a speech parameter generation algorithm considering GV are described.
Abstract: In January 2005, an open evaluation of corpus-based text-to-speech synthesis systems using common speech datasets, named Blizzard Challenge 2005, was conducted. The Nitech group participated in this challenge, entering an HMM-based speech synthesis system called Nitech-HTS 2005. This paper describes the technical details, building processes, and performance of our system. We first give an overview of the basic HMM-based speech synthesis system, and then describe new features integrated into Nitech-HTS 2005 such as STRAIGHT-based vocoding, HSMM-based acoustic modeling, and a speech parameter generation algorithm considering GV. Constructed Nitech-HTS 2005 voices can generate speech waveforms at 0.3×RT (real-time ratio) on a 1.6 GHz Pentium 4 machine, and footprints of these voices are less than 2 Mbytes. Subjective listening tests showed that the naturalness and intelligibility of the Nitech-HTS 2005 voices were much better than expected.

Journal ArticleDOI
TL;DR: Subjective listening test results show that the use of HSMMs, which can be viewed as HMMs with explicit state duration PDFs, improves the naturalness of synthesized speech.
Abstract: A statistical speech synthesis system based on the hidden Markov model (HMM) was recently proposed. In this system, spectrum, excitation, and duration of speech are modeled simultaneously by context-dependent HMMs, and speech parameter vector sequences are generated from the HMMs themselves. This system defines a speech synthesis problem in a generative model framework and solves it based on the maximum likelihood (ML) criterion. However, there is an inconsistency: although state duration probability density functions (PDFs) are explicitly used in the synthesis part of the system, they have not been incorporated into its training part. This inconsistency can make the synthesized speech sound less natural. In this paper, we propose a statistical speech synthesis system based on a hidden semi-Markov model (HSMM), which can be viewed as an HMM with explicit state duration PDFs. The use of HSMMs can solve the above inconsistency because we can incorporate the state duration PDFs explicitly into both the synthesis and the training parts of the system. Subjective listening test results show that use of HSMMs improves the reported naturalness of synthesized speech.
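The key difference the abstract describes, explicit state duration PDFs versus the geometric durations implied by HMM self-loops, can be sketched as follows. This is a toy illustration only: the paper's duration models are parametric PDFs, whereas the table PMF below is an assumption made for the sketch.

```python
import random

def hmm_duration(self_loop_prob, rng):
    """State duration implied by a plain HMM: geometric, because the
    model can only self-loop with a fixed probability each frame."""
    d = 1
    while rng.random() < self_loop_prob:
        d += 1
    return d

def hsmm_duration(duration_pmf, rng):
    """HSMM-style duration: drawn once from an explicit PMF over
    {1, 2, ...} by inverse-transform sampling."""
    u, acc = rng.random(), 0.0
    for d, p in enumerate(duration_pmf, start=1):
        acc += p
        if u < acc:
            return d
    return len(duration_pmf)  # guard against floating-point rounding
```

The geometric distribution forced by self-loops puts its mode at duration 1, which is a poor fit for phone durations; an explicit PMF or PDF can place probability mass wherever the data demands, and the paper's point is to use it consistently in both training and synthesis.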

Proceedings Article
11 Mar 2007
TL;DR: A new family of non-linear sequence models that are substantially more powerful than hidden Markov models or linear dynamical systems are described, and their performance is demonstrated using synthetic video sequences of two balls bouncing in a box.
Abstract: We describe a new family of non-linear sequence models that are substantially more powerful than hidden Markov models or linear dynamical systems. Our models have simple approximate inference and learning procedures that work well in practice. Multilevel representations of sequential data can be learned one hidden layer at a time, and adding extra hidden layers improves the resulting generative models. The models can be trained with very high-dimensional, very non-linear data such as raw pixel sequences. Their performance is demonstrated using synthetic video sequences of two balls bouncing in a box.

Proceedings Article
01 Apr 2007
TL;DR: This work presents a novel technique of training with many-to-many alignments of letters and phonemes, and applies an HMM method in conjunction with a local classification model to predict a global phoneme sequence given a word.
Abstract: Letter-to-phoneme conversion generally requires aligned training data of letters and phonemes. Typically, the alignments are limited to one-to-one alignments. We present a novel technique of training with many-to-many alignments. A letter chunking bigram prediction manages double letters and double phonemes automatically as opposed to preprocessing with fixed lists. We also apply an HMM method in conjunction with a local classification model to predict a global phoneme sequence given a word. The many-to-many alignments result in significant improvements over the traditional one-to-one approach. Our system achieves state-of-the-art performance on several languages and data sets.

Journal ArticleDOI
TL;DR: A new class of models, mixed HMMs (MHMMs), where both covariates and random effects are used to capture differences among processes, is presented, and it is shown that the model can describe the heterogeneity among patients.
Abstract: Hidden Markov models (HMMs) are a useful tool for capturing the behavior of overdispersed, autocorrelated data. These models have been applied to many different problems, including speech recognition, precipitation modeling, and gene finding and profiling. Typically, HMMs are applied to individual stochastic processes; HMMs for simultaneously modeling multiple processes—as in the longitudinal data setting—have not been widely studied. In this article I present a new class of models, mixed HMMs (MHMMs), where I use both covariates and random effects to capture differences among processes. I define the models using the framework of generalized linear mixed models and discuss their interpretation. I then provide algorithms for parameter estimation and illustrate the properties of the estimators via a simulation study. Finally, to demonstrate the practical uses of MHMMs, I provide an application to data on lesion counts in multiple sclerosis patients. I show that my model, while parsimonious, can describe the heterogeneity among patients.