Showing papers on "Hidden Markov model" published in 2017


Journal ArticleDOI
TL;DR: The proposed hybrid CTC/attention end-to-end ASR is applied to two large-scale ASR benchmarks, and exhibits performance that is comparable to conventional DNN/HMM ASR systems based on the advantages of both multiobjective learning and joint decoding without linguistic resources.
Abstract: Conventional automatic speech recognition (ASR) based on a hidden Markov model (HMM)/deep neural network (DNN) is a very complicated system consisting of various modules such as acoustic, lexicon, and language models. It also requires linguistic resources, such as a pronunciation dictionary, tokenization, and phonetic context-dependency trees. On the other hand, end-to-end ASR has become a popular alternative to greatly simplify the model-building process of conventional ASR systems by representing complicated modules with a single deep network architecture, and by replacing the use of linguistic resources with a data-driven learning method. There are two major types of end-to-end architectures for ASR; attention-based methods use an attention mechanism to perform alignment between acoustic frames and recognized symbols, and connectionist temporal classification (CTC) uses Markov assumptions to efficiently solve sequential problems by dynamic programming. This paper proposes hybrid CTC/attention end-to-end ASR, which effectively utilizes the advantages of both architectures in training and decoding. During training, we employ the multiobjective learning framework to improve robustness and achieve fast convergence. During decoding, we perform joint decoding by combining both attention-based and CTC scores in a one-pass beam search algorithm to further eliminate irregular alignments. Experiments with English (WSJ and CHiME-4) tasks demonstrate the effectiveness of the proposed multiobjective learning over both the CTC and attention-based encoder–decoder baselines. Moreover, the proposed method is applied to two large-scale ASR benchmarks (spontaneous Japanese and Mandarin Chinese), and exhibits performance that is comparable to conventional DNN/HMM ASR systems based on the advantages of both multiobjective learning and joint decoding without linguistic resources.
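
The joint decoding idea described above can be illustrated with a small sketch. The abstract does not give the scoring formula, so the linear interpolation of log-probabilities, the weight `lam`, and the function names `joint_score` and `rerank` are assumptions made for illustration. The sketch also rescores a fixed list of hypotheses, which is simpler than the one-pass beam search the abstract describes.

```python
def joint_score(log_p_ctc: float, log_p_att: float, lam: float = 0.3) -> float:
    """Interpolate CTC and attention log-probabilities for one hypothesis.

    lam is a hypothetical interpolation weight in [0, 1]; it is not taken
    from the paper.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * log_p_ctc + (1.0 - lam) * log_p_att


def rerank(hypotheses, lam: float = 0.3):
    """Sort (text, log_p_ctc, log_p_att) triples by joint score, best first."""
    return sorted(
        hypotheses,
        key=lambda h: joint_score(h[1], h[2], lam),
        reverse=True,
    )


# Made-up scores: the second hypothesis is slightly preferred by the
# attention decoder but strongly penalized by CTC (an irregular alignment).
hyps = [("hello world", -4.0, -3.0), ("hello word", -9.0, -2.5)]
best = rerank(hyps)[0][0]
```

With `lam=0.0` the ranking falls back to the attention score alone, which makes it easy to see what the CTC term contributes on a given list.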

724 citations


Proceedings ArticleDOI
05 Mar 2017
TL;DR: This paper proposes a joint Connectionist Temporal Classification (CTC) and attention-based encoder-decoder model for end-to-end speech recognition, trained within a multi-task learning framework to improve robustness and achieve fast convergence.
Abstract: Recently, there has been an increasing interest in end-to-end speech recognition that directly transcribes speech to text without any predefined alignments. One approach is the attention-based encoder-decoder framework that learns a mapping between variable-length input and output sequences in one step using a purely data-driven method. The attention model has often been shown to improve the performance over another end-to-end approach, the Connectionist Temporal Classification (CTC), mainly because it explicitly uses the history of the target character without any conditional independence assumptions. However, we observed that the attention model performs poorly in noisy conditions and is hard to train in the initial training stage with long input sequences. This is because the attention model is too flexible to predict proper alignments in such cases due to the lack of left-to-right constraints as used in CTC. This paper presents a novel method for end-to-end speech recognition to improve robustness and achieve fast convergence by using a joint CTC-attention model within the multi-task learning framework, thereby mitigating the alignment issue. An experiment on the WSJ and CHiME-4 tasks demonstrates its advantages over both the CTC and attention-based encoder-decoder baselines, showing 5.4–14.6% relative improvements in Character Error Rate (CER).

645 citations


Journal ArticleDOI
TL;DR: The design of an asynchronous controller, which covers the well-known mode-independent controller and synchronous controller as special cases, is addressed, and a DC motor device is used to demonstrate the practicability of the derived asynchronous synthesis scheme.
Abstract: The issue of asynchronous passive control is addressed for Markov jump systems in this technical note. The asynchronization phenomenon appears between the system modes and controller modes, which is described by a hidden Markov model. Accordingly, the resultant closed-loop system is referred to as a hidden Markov jump model. By utilizing the matrix inequality technique, three equivalent sufficient conditions are obtained, which guarantee that the hidden Markov jump systems are stochastically passive. Based on the established conditions, the design of an asynchronous controller, which covers the well-known mode-independent controller and synchronous controller as special cases, is addressed. A DC motor device is used to demonstrate the practicability of the derived asynchronous synthesis scheme.

413 citations


Proceedings ArticleDOI
29 Mar 2017
TL;DR: In this article, a variational generative adversarial network (GAN) is proposed to generate images in a specific category with randomly drawn values on a latent attribute vector, which can be applied to other tasks, such as image inpainting, super-resolution, and data augmentation for training better face recognition models.
Abstract: We present variational generative adversarial networks, a general learning framework that combines a variational auto-encoder with a generative adversarial network, for synthesizing images in fine-grained categories, such as faces of a specific person or objects in a category. Our approach models an image as a composition of label and latent attributes in a probabilistic model. By varying the fine-grained category label fed into the resulting generative model, we can generate images in a specific category with randomly drawn values on a latent attribute vector. Our approach has two novel aspects. First, we adopt a cross entropy loss for the discriminative and classifier network, but a mean discrepancy objective for the generative network. This kind of asymmetric loss function makes the GAN training more stable. Second, we adopt an encoder network to learn the relationship between the latent space and the real image space, and use pairwise feature matching to keep the structure of generated images. We experiment with natural images of faces, flowers, and birds, and demonstrate that the proposed models are capable of generating realistic and diverse samples with fine-grained category labels. We further show that our models can be applied to other tasks, such as image inpainting, super-resolution, and data augmentation for training better face recognition models.

345 citations


Journal ArticleDOI
TL;DR: This paper uses two finite mixture models to capture the structural information of the data from binary classification and proposes a structural MPM, which can be interpreted as a large margin classifier and can be transformed to support vector machine and maxi–min margin machine under certain special conditions.
Abstract: Minimax probability machine (MPM) is an interesting discriminative classifier based on generative prior knowledge. It can directly estimate the probabilistic accuracy bound by minimizing the maximum probability of misclassification. The structural information of data is an effective way to represent prior knowledge, and has been found to be vital for designing classifiers in real-world problems. However, MPM only considers the prior probability distribution of each class with a given mean and covariance matrix, which does not efficiently exploit the structural information of data. In this paper, we use two finite mixture models to capture the structural information of the data from binary classification. For each subdistribution in a finite mixture model, only its mean and covariance matrix are assumed to be known. Based on the finite mixture models, we propose a structural MPM (SMPM). SMPM can be solved effectively by a sequence of the second-order cone programming problems. Moreover, we extend a linear model of SMPM to a nonlinear model by exploiting kernelization techniques. We also show that the SMPM can be interpreted as a large margin classifier and can be transformed to support vector machine and maxi–min margin machine under certain special conditions. Experimental results on both synthetic and real-world data sets demonstrate the effectiveness of SMPM.

337 citations


Journal ArticleDOI
TL;DR: This research trains Hidden Markov Models (HMMs) on both static and dynamic feature sets and compares the resulting detection rates over a substantial number of malware families, finding a fully dynamic approach generally yields the best detection rates.
Abstract: In this research, we compare malware detection techniques based on static, dynamic, and hybrid analysis. Specifically, we train Hidden Markov Models (HMMs) on both static and dynamic feature sets and compare the resulting detection rates over a substantial number of malware families. We also consider hybrid cases, where dynamic analysis is used in the training phase, with static techniques used in the detection phase, and vice versa. In our experiments, a fully dynamic approach generally yields the best detection rates. We discuss the implications of this research for malware detection based on hybrid techniques.
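
HMM-based detection of this kind typically scores a sample by how likely its feature sequence is under a trained model. The sketch below shows only that scoring step, using the scaled forward algorithm for a discrete HMM. The model parameters, the per-symbol threshold, and the helper name `looks_like_family` are hypothetical; the paper's feature sets and training procedure are not reproduced here.

```python
import math


def log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM.

    obs is a non-empty list of symbol indices, pi the initial distribution,
    A[i][j] the transition probabilities, B[i][k] the emission probabilities.
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under the model
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


def looks_like_family(obs, model, threshold):
    """Flag a sequence whose per-symbol log-likelihood reaches the threshold."""
    pi, A, B = model
    return log_likelihood(obs, pi, A, B) / len(obs) >= threshold


# Hypothetical two-state model over a two-symbol alphabet.
MODEL = (
    [0.6, 0.4],
    [[0.7, 0.3], [0.4, 0.6]],
    [[0.1, 0.9], [0.8, 0.2]],
)
```

Dividing by the sequence length keeps scores comparable across samples of different sizes, which matters when one threshold is applied to every sample.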

306 citations


Journal ArticleDOI
TL;DR: The experimental results on three challenging depth video datasets demonstrate that the proposed online HAR method using the proposed multi-fused features outperforms the state-of-the-art HAR methods in terms of recognition accuracy.

281 citations


Journal ArticleDOI
TL;DR: An advance is presented that enables the HMM to handle very large amounts of data, making possible the inference of very reproducible and interpretable dynamic brain networks in a range of different datasets, including task, rest, MEG and fMRI, with potentially thousands of subjects.

255 citations


Journal ArticleDOI
TL;DR: The MSMBuilder package is designed with a particular focus on the analysis of atomistic simulations of biomolecular dynamics such as protein folding and conformational change and includes complementary algorithms for understanding time-series data such as hidden Markov models and time-structure based independent component analysis.

238 citations


Journal ArticleDOI
TL;DR: The results reported in this paper provide the first systematic and uniform evaluation of surgical activity recognition techniques on the benchmark database, a public dataset that is created to support comparative research benchmarking.
Abstract: Objective: State-of-the-art techniques for surgical data analysis report promising results for automated skill assessment and action recognition. The contributions of many of these techniques, however, are limited to study-specific data and validation metrics, making assessment of progress across the field extremely challenging. Methods: In this paper, we address two major problems for surgical data analysis: First, lack of uniform-shared datasets and benchmarks, and second, lack of consistent validation processes. We address the former by presenting the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS), a public dataset that we have created to support comparative research benchmarking. JIGSAWS contains synchronized video and kinematic data from multiple performances of robotic surgical tasks by operators of varying skill. We address the latter by presenting a well-documented evaluation methodology and reporting results for six techniques for automated segmentation and classification of time-series data on JIGSAWS. These techniques comprise four temporal approaches for joint segmentation and classification: hidden Markov model (HMM), sparse HMM, Markov semi-Markov conditional random field, and skip-chain conditional random field; and two feature-based ones that aim to classify fixed segments: bag of spatiotemporal features and linear dynamical systems. Results: Most methods recognize gesture activities with approximately 80% overall accuracy under both leave-one-super-trial-out and leave-one-user-out cross-validation settings. Conclusion: Current methods show promising results on this shared dataset, but room for significant progress remains, particularly for consistent prediction of gesture activities across different surgeons. Significance: The results reported in this paper provide the first systematic and uniform evaluation of surgical activity recognition techniques on the benchmark database.

211 citations


Journal ArticleDOI
TL;DR: It is demonstrated why well-established formal procedures for model selection, such as those based on standard information criteria, tend to favor models with numbers of states that are undesirably large in situations where states shall be meaningful entities.
Abstract: We discuss the notorious problem of order selection in hidden Markov models, that is of selecting an adequate number of states, highlighting typical pitfalls and practical challenges arising when analyzing real data. Extensive simulations are used to demonstrate the reasons that render order selection particularly challenging in practice despite the conceptual simplicity of the task. In particular, we demonstrate why well-established formal procedures for model selection, such as those based on standard information criteria, tend to favor models with numbers of states that are undesirably large in situations where states shall be meaningful entities. We also offer a pragmatic step-by-step approach together with comprehensive advice for how practitioners can implement order selection. Our proposed strategy is illustrated with a real-data case study on muskox movement. Supplementary materials accompanying this paper appear online.
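
For readers unfamiliar with the "standard information criteria" mentioned above, the following sketch computes AIC and BIC for HMMs with different numbers of states. It assumes categorical emissions, so the parameter count in `n_free_params` applies only to that case. The log-likelihood values are made up for illustration and are not results from the paper.

```python
import math


def n_free_params(n_states: int, n_symbols: int) -> int:
    """Free parameters of an HMM with categorical emissions (an assumption)."""
    initial = n_states - 1                    # initial distribution
    transitions = n_states * (n_states - 1)   # each row sums to one
    emissions = n_states * (n_symbols - 1)    # each row sums to one
    return initial + transitions + emissions


def aic(log_lik: float, k: int) -> float:
    return -2.0 * log_lik + 2.0 * k


def bic(log_lik: float, k: int, n_obs: int) -> float:
    return -2.0 * log_lik + k * math.log(n_obs)


# Hypothetical maximized log-likelihoods for 2, 3 and 4 states.
fitted = {2: -1250.0, 3: -1215.0, 4: -1209.0}
N_OBS, N_SYMBOLS = 500, 4

best_by_bic = min(
    fitted,
    key=lambda n: bic(fitted[n], n_free_params(n, N_SYMBOLS), N_OBS),
)
```

Because the penalty grows only with the parameter count, a criterion like this says nothing about whether the selected states are interpretable, which is the gap the abstract highlights.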

Journal ArticleDOI
TL;DR: A novel multi-sensor fusion framework for Sign Language Recognition (SLR) using a Coupled Hidden Markov Model (CHMM) is proposed; the CHMM models interaction between modalities in state space rather than in observation space, as the classical HMM does, which fails to capture inter-modal correlations.

Journal ArticleDOI
TL;DR: A NILM algorithm based on the joint use of active and reactive power in the Additive Factorial Hidden Markov Model framework is proposed, which outperforms AFAMAP, Hart’s algorithm, and Hart’s algorithm with MAP.

Journal ArticleDOI
TL;DR: A Bayesian framework for face sketch synthesis is proposed, which provides a systematic interpretation for understanding the common properties and intrinsic difference in different methods from the perspective of probabilistic graphical models.
Abstract: Exemplar-based face sketch synthesis has been widely applied to both digital entertainment and law enforcement. In this paper, we propose a Bayesian framework for face sketch synthesis, which provides a systematic interpretation for understanding the common properties and intrinsic difference in different methods from the perspective of probabilistic graphical models. The proposed Bayesian framework consists of two parts: the neighbor selection model and the weight computation model. Within the proposed framework, we further propose a Bayesian face sketch synthesis method. The essential rationale behind the proposed Bayesian method is that we take the spatial neighboring constraint between adjacent image patches into consideration for both aforementioned models, while the state-of-the-art methods neglect the constraint either in the neighbor selection model or in the weight computation model. Extensive experiments on the Chinese University of Hong Kong face sketch database demonstrate that the proposed Bayesian method could achieve superior performance compared with the state-of-the-art methods in terms of both subjective perceptions and objective evaluations.

Proceedings ArticleDOI
01 Oct 2017
TL;DR: This paper proposes a multimodal gesture recognition method based on a ResC3D network, which leverages the advantages of both residual and C3D model, together with a canonical correlation analysis based fusion scheme for blending features.
Abstract: Gesture recognition is an important issue in computer vision. Recognizing gestures with videos remains a challenging task due to the barriers of gesture-irrelevant factors. In this paper, we propose a multimodal gesture recognition method based on a ResC3D network. One key idea is to find a compact and effective representation of video sequences. Therefore, the video enhancement techniques, such as Retinex and median filter are applied to eliminate the illumination variation and noise in the input video, and a weighted frame unification strategy is utilized to sample key frames. Upon these representations, a ResC3D network, which leverages the advantages of both residual and C3D model, is developed to extract features, together with a canonical correlation analysis based fusion scheme for blending features. The performance of our method is evaluated in the Chalearn LAP isolated gesture recognition challenge. It reaches 67.71% accuracy and ranks the 1st place in this challenge.

Journal ArticleDOI
TL;DR: It is observed that fusing data from both sensors improves accuracy over single-sensor recognition, and the results from the two sensors are combined to boost recognition performance.

Book ChapterDOI
22 Sep 2017
TL;DR: The goal is to determine the average annual temperature at a particular location on earth over a series of years, before thermometers were invented, by looking for indirect evidence of the temperature.
Abstract: Suppose we want to determine the average annual temperature at a particular location on earth over a series of years. To make it interesting, suppose the years we are concerned with lie in the distant past, before thermometers were invented. Since we can’t go back in time, we instead look for indirect evidence of the temperature. To simplify the problem, we only consider two annual temperatures, “hot” and “cold”. Suppose that modern evidence indicates that the probability of a hot year followed by another hot year is 0.7 and the probability that a cold year is followed by another cold year is 0.6. We’ll assume that these probabilities held in the distant past as well. The information so far can be summarized as H C
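
The transition probabilities stated above can be written out as a small sketch. Only the two "stay" probabilities (0.7 and 0.6) come from the text; the two "switch" entries follow because each row of a transition matrix must sum to one. The names `STATES`, `A`, and `step` are illustrative.

```python
STATES = ("H", "C")  # hot, cold

# A[i][j] = probability that a year of type i is followed by a year of type j.
A = {
    "H": {"H": 0.7, "C": 0.3},
    "C": {"H": 0.4, "C": 0.6},
}


def step(dist):
    """Advance a distribution over {H, C} by one year."""
    return {j: sum(dist[i] * A[i][j] for i in STATES) for j in STATES}


# Distribution two years after a year known to be hot.
after_two_years = step(step({"H": 1.0, "C": 0.0}))
```

This covers only the hidden temperature chain. The indirect evidence mentioned in the text would enter through an emission model, which the truncated abstract does not specify.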

Journal ArticleDOI
TL;DR: momentuHMM is an open-source R package for inferring latent animal behaviors from telemetry data using discrete-time hidden Markov models (HMMs), with user-specified probability distributions for an unlimited number of data streams.
Abstract: Discrete-time hidden Markov models (HMMs) have become an immensely popular tool for inferring latent animal behaviors from telemetry data. Here we introduce an open-source R package, momentuHMM, that addresses many of the deficiencies in existing HMM software. Features include: 1) data pre-processing and visualization; 2) user-specified probability distributions for an unlimited number of data streams and latent behavior states; 3) biased and correlated random walk movement models, including "activity centers" associated with attractive or repulsive forces; 4) user-specified design matrices and constraints for covariate modelling of parameters using formulas familiar to most R users; 5) multiple imputation methods that account for measurement error and temporally-irregular or missing data; 6) seamless integration of spatio-temporal covariate raster data; 7) cosinor and spline models for cyclical and other complicated patterns; 8) model checking and selection; and 9) simulation. momentuHMM considerably extends the capabilities of existing HMM software while accounting for common challenges associated with telemetry data. It therefore facilitates more realistic hypothesis-driven animal movement analyses that have hitherto been largely inaccessible to non-statisticians. While motivated by telemetry data, the package can be used for analyzing any type of data that is amenable to HMMs. Practitioners interested in additional features are encouraged to contact the authors.

Proceedings ArticleDOI
01 Jul 2017
TL;DR: This paper proposes a joint decoding algorithm for end-to-end ASR with a hybrid CTC/attention architecture, which effectively utilizes the advantages of both in decoding.
Abstract: End-to-end automatic speech recognition (ASR) has become a popular alternative to conventional DNN/HMM systems because it avoids the need for linguistic resources such as a pronunciation dictionary, tokenization, and context-dependency trees, leading to a greatly simplified model-building process. There are two major types of end-to-end architectures for ASR: attention-based methods use an attention mechanism to perform alignment between acoustic frames and recognized symbols, and connectionist temporal classification (CTC) uses Markov assumptions to efficiently solve sequential problems by dynamic programming. This paper proposes a joint decoding algorithm for end-to-end ASR with a hybrid CTC/attention architecture, which effectively utilizes the advantages of both in decoding. We have applied the proposed method to two ASR benchmarks (spontaneous Japanese and Mandarin Chinese), showing performance comparable to conventional state-of-the-art DNN/HMM ASR systems without linguistic resources.

Journal ArticleDOI
TL;DR: This paper investigates the effect of living habits on the models of spatio-temporal prediction and next-place prediction, and selects one from these two models for an individual to achieve effective mobility prediction at users’ points of interest.
Abstract: With the emergence of smartphones and location-based services, user mobility prediction has become a critical enabler for a wide range of applications, like location-based advertising, early warning systems, and citywide traffic planning. A number of techniques have been proposed to either conduct spatio-temporal mobility prediction or forecast the next-place. However, both produce diverse prediction performance for different users and display poor performance for some users. This paper focuses on investigating the effect of living habits on the models of spatio-temporal prediction and next-place prediction, and selects one from these two models for an individual to achieve effective mobility prediction at users’ points of interest. Based on the hidden Markov model (HMM), a spatio-temporal predictor and a next-place predictor are proposed. Living habits are analyzed in terms of entropy, upon which users are clustered into distinct groups. With large-scale factual mobile data captured from a big city, we compare the proposed HMM-based predictors with existing state-of-the-art predictors and apply them to different user groups. The results demonstrate the robust performance of the two proposed mobility predictors, which outperform the state of the art for various user groups.
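
The abstract says living habits are analyzed in terms of entropy but does not state which definition is used. As a minimal sketch, the code below assumes the Shannon entropy of a user's visit frequencies over places; `location_entropy` and the sample visit histories are hypothetical.

```python
import math
from collections import Counter


def location_entropy(visits):
    """Shannon entropy (in bits) of the empirical distribution of visited places."""
    if not visits:
        raise ValueError("visits must be non-empty")
    total = len(visits)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(visits).values()
    )


# A regular user alternating between two places versus a more varied user.
regular = ["home", "work", "home", "work"]
varied = ["home", "work", "gym", "cafe"]
```

Users with low entropy have regular habits, so a value like this could serve as the feature for clustering users into groups before choosing a predictor for each group.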

Journal ArticleDOI
TL;DR: In this paper, an unsupervised learning approach is used to infer new aspects of animal behaviour when biologically meaningful response variables are used, with the caveat that the states may not map to specific behaviours.
Abstract: 1. Use of accelerometers is now widespread within animal biotelemetry as they provide a means of measuring an animal's activity in a meaningful and quantitative way where direct observation is not possible. In sequential acceleration data there is a natural dependence between observations of behaviour, a fact that has been largely ignored in most analyses. 2. Analyses of acceleration data where serial dependence has been explicitly modelled have largely relied on hidden Markov models (HMMs). Depending on the aim of an analysis, an HMM can be used for state prediction or to make inferences about drivers of behaviour. For state prediction, a supervised learning approach can be applied. That is, an HMM is trained to classify unlabelled acceleration data into a finite set of pre-specified categories. An unsupervised learning approach can be used to infer new aspects of animal behaviour when biologically meaningful response variables are used, with the caveat that the states may not map to specific behaviours. 3. We will provide the details necessary to implement and assess an HMM in both the supervised and unsupervised learning context and discuss the data requirements of each case. We outline two applications to marine and aerial systems (shark and eagle) taking the unsupervised learning approach, which is more readily applicable to animal activity measured in the field. HMMs were used to infer the effects of temporal, atmospheric and tidal inputs on animal behaviour. 4. Animal accelerometer data allow ecologists to identify important correlates and drivers of animal activity (and hence behaviour). The HMM framework is well suited to deal with the main features commonly observed in accelerometer data, and can easily be extended to suit a wide range of types of animal activity data. The ability to combine direct observations of animal activity with statistical models, which account for the features of accelerometer data, offers a new way to quantify animal behaviour, energetic expenditure and deepen our insights into individual behaviour as a constituent of populations and ecosystems.

Journal ArticleDOI
TL;DR: A Hidden Markov Model (HMM), one of the commonly encountered statistical methods, is engaged here to detect anomalies in multivariate time series.

Proceedings ArticleDOI
01 Sep 2017
TL;DR: This paper proposes several ways of using context information for DA classification, all in the deep learning framework, and demonstrates that incorporating context information significantly improves DA classification and achieves new state-of-the-art performance.
Abstract: Previous work on dialog act (DA) classification has investigated different methods, such as hidden Markov models, maximum entropy, conditional random fields, graphical models, and support vector machines. A few recent studies explored using deep learning neural networks for DA classification; however, it is not yet clear what the best method is for using dialog context or DA sequential information, and how much gain it brings. This paper proposes several ways of using context information for DA classification, all in the deep learning framework. The baseline system classifies each utterance using convolutional neural networks (CNN). Our proposed methods include using hierarchical models (recurrent neural networks (RNN) or CNN) for DA sequence tagging where the bottom layer takes the sentence CNN representation as input, concatenating predictions from the previous utterances with the CNN vector for classification, and performing sequence decoding based on the predictions from the sentence CNN model. We conduct thorough experiments and comparisons on the Switchboard corpus, demonstrate that incorporating context information significantly improves DA classification, and show that we achieve new state-of-the-art performance for this task.

Journal ArticleDOI
TL;DR: A novel depth video-based method for HAR is presented, using robust multi-features and embedded Hidden Markov Models (HMMs) to recognize daily life activities of elderly people living alone in indoor environments such as smart homes.
Abstract: The growing number of elderly people living independently calls for special care in the form of healthcare monitoring systems. Recent advancements in depth video technologies have made human activity recognition (HAR) realizable for elderly healthcare applications. In this paper, a novel depth video-based method for HAR is presented, using robust multi-features and embedded Hidden Markov Models (HMMs) to recognize daily life activities of elderly people living alone in indoor environments such as smart homes. In the proposed HAR framework, depth maps are first analyzed by a temporal motion identification method to segment human silhouettes from the noisy background and to compute the depth silhouette area for each activity, so that human movements in a scene can be tracked. Several representative features, including invariant, multi-view differentiation and spatiotemporal body joints features, were fused together to explore gradient orientation change, intensity differentiation, temporal variation and local motion of specific body parts. Then, these features are processed by the dynamics of their respective class and learned, modeled, trained and recognized with a specific embedded HMM having active feature values. Furthermore, we construct a new online human activity dataset by a depth sensor to evaluate the proposed features. Our experiments on three depth datasets demonstrated that the proposed multi-features are efficient and robust compared with state-of-the-art features for human action and activity recognition.

Journal ArticleDOI
TL;DR: It is shown that an unnormalized joint state and attack distribution conditioned on the sensor measurement information evolves in a linear recursive form, based on which the optimal estimates can be further calculated by evaluating the normalized marginal conditional distributions.
Abstract: The problem of secure state estimation and attack detection in cyber-physical systems is considered in this paper. A stochastic modeling framework is first introduced, based on which the attacked system is modeled as a finite-state hidden Markov model with switching transition probability matrices controlled by a Markov decision process. Based on this framework, a joint state and attack estimation problem is formulated and solved. Utilizing the change of probability measure approach, we show that an unnormalized joint state and attack distribution conditioned on the sensor measurement information evolves in a linear recursive form, based on which the optimal estimates can be further calculated by evaluating the normalized marginal conditional distributions. The estimation results are further applied to secure estimation of stable linear Gaussian systems, and extensions to more general systems are also discussed. The effectiveness of the results is illustrated by numerical examples and comparative simulation.

Journal ArticleDOI
TL;DR: In this paper, an offline, iterated particle filter is presented to facilitate statistical inference in general state space hidden Markov models, given a model and a sequence of observations, the associated margina...
Abstract: We present an offline, iterated particle filter to facilitate statistical inference in general state space hidden Markov models. Given a model and a sequence of observations, the associated margina...

Journal ArticleDOI
TL;DR: A novel weighted hidden Markov model (HMM)-based approach is proposed for tool wear monitoring and tool life prediction, using the signals provided by TCM techniques, and it outperforms the conventional HMM approach.
Abstract: Tool wear is one of the important indicators of the health status of a machining system. To obtain the tool’s wear status, tool condition monitoring (TCM) utilizes advanced sensor techniques to infer the wear status from the sensor signals. In this paper, a novel weighted hidden Markov model (HMM)-based approach is proposed for tool wear monitoring and tool life prediction, using the signals provided by TCM techniques. To describe the dynamic nature of wear evolution, a weighted HMM is first developed, which takes wear rate as the hidden state and formulates multiple HMMs in a weighted manner to include sufficient historical information. Explicit formulas to estimate the model parameters are also provided. Then, a particular probabilistic approach using the weighted HMM is proposed to estimate tool wear and predict the tool’s remaining useful life during tool operation. The proposed weighted HMM-based approach is tested on a real dataset from high-speed CNC milling machine cutters. The experimental results show that this approach is effective in estimating tool wear and predicting tool life, and it outperforms the conventional HMM approach.

Journal ArticleDOI
TL;DR: This paper presents a novel map-matching solution that combines the widely used approach based on a hidden Markov model (HMM) with the concept of drivers’ route choice, which uses an HMM tailored for noisy and sparse data to generate partial map-matched paths in an online manner.
Abstract: With the growing use of crowdsourced location data from smartphones for transportation applications, the task of map-matching raw location sequence data to travel paths in the road network becomes more important. High-frequency sampling of smartphone locations using accurate but power-hungry positioning technologies is not practically feasible as it consumes an undue amount of the smartphone’s bandwidth and battery power. Hence, there exists a need to develop robust algorithms for map-matching inaccurate and sparse location data in an accurate and timely manner. This paper addresses the above-mentioned need by presenting a novel map-matching solution that combines the widely used approach based on a hidden Markov model (HMM) with the concept of drivers’ route choice. Our algorithm uses an HMM tailored for noisy and sparse data to generate partial map-matched paths in an online manner. We use a route choice model, estimated from real drive data, to reassess each HMM-generated partial path along with a set of feasible alternative paths. We evaluated the proposed algorithm with real world as well as synthetic location data under varying levels of measurement noise and temporal sparsity. The results show that the map-matching accuracy of our algorithm is significantly higher than that of the state of the art, especially at high levels of noise.
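
HMM-based map-matching treats candidate road segments as hidden states and location fixes as observations, and then decodes the most likely segment sequence. The sketch below is a plain offline Viterbi decoder in log space, not the online algorithm with route-choice reassessment that the abstract describes. The Gaussian-style emission scores, `SIGMA`, the distances, and the transition probabilities are all assumptions made for illustration.

```python
import math


def viterbi(log_init, log_trans, log_emit):
    """Most likely state path; log_emit[t][s] scores observation t under state s."""
    n = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n)]
    back = []  # back[t-1][s] = best predecessor of state s at time t
    for t in range(1, len(log_emit)):
        prev, score, pointers = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + log_trans[r][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + log_emit[t][s])
        back.append(pointers)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical distances (metres) from three location fixes to two segments.
distances = [[5.0, 40.0], [30.0, 35.0], [45.0, 4.0]]
SIGMA = 20.0  # assumed positioning noise
log_emit = [[-0.5 * (d / SIGMA) ** 2 for d in row] for row in distances]
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [
    [math.log(0.8), math.log(0.2)],
    [math.log(0.2), math.log(0.8)],
]
path = viterbi(log_init, log_trans, log_emit)
```

The middle fix is ambiguous on its own, and the transition term is what keeps it on the first segment. This is the smoothing effect that makes the HMM formulation useful for noisy, sparse data.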

Proceedings ArticleDOI
01 Jul 2017
TL;DR: In this article, a temporal attention-gated model (TAGM) is proposed to measure the relevance of each observation (time step) of a sequence and then use a novel gated recurrent network to learn the hidden representation for the final prediction.
Abstract: Typical techniques for sequence classification are designed for well-segmented sequences which have been edited to remove noisy or irrelevant parts. Therefore, such methods cannot be easily applied on noisy sequences expected in real-world applications. In this paper, we present the Temporal Attention-Gated Model (TAGM) which integrates ideas from attention models and gated recurrent networks to better deal with noisy or unsegmented sequences. Specifically, we extend the concept of attention model to measure the relevance of each observation (time step) of a sequence. We then use a novel gated recurrent network to learn the hidden representation for the final prediction. An important advantage of our approach is interpretability since the temporal attention weights provide a meaningful value for the salience of each time step in the sequence. We demonstrate the merits of our TAGM approach, both for prediction accuracy and interpretability, on three different tasks: spoken digit recognition, text-based sentiment analysis and visual event recognition.

Journal ArticleDOI
TL;DR: A two-stage continuous hidden Markov model framework is proposed that takes advantage of the innate hierarchical structure of basic activities, which not only reduces feature computation overhead but also allows for a varying number of states and iterations.
Abstract: Human activity recognition has been gaining more and more attention from researchers in recent years, particularly with the use of widespread and commercially available devices such as smartphones....