
Showing papers on "Dynamic time warping published in 2009"


Journal ArticleDOI
TL;DR: The dtw package allows R users to compute time series alignments mixing freely a variety of continuity constraints, restriction windows, endpoints, local distance definitions, and so on.
Abstract: Dynamic time warping is a popular technique for comparing time series, providing both a distance measure that is insensitive to local compression and stretches and the warping which optimally deforms one of the two input series onto the other. A variety of algorithms and constraints have been discussed in the literature. The dtw package provides a unification of them; it allows R users to compute time series alignments mixing freely a variety of continuity constraints, restriction windows, endpoints, local distance definitions, and so on. The package also provides functions for visualizing alignments and constraints using several classic diagram types.
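The core recurrence behind such alignments can be sketched in a few lines of Python. This is an illustrative, unconstrained implementation and not the dtw package's API; the package layers step patterns, restriction windows, and open ends on top of this basic idea.

```python
def dtw_distance(x, y):
    """Unconstrained DTW with symmetric steps and absolute local distance."""
    n, m = len(x), len(y)
    INF = float("inf")
    # D[i][j] = cost of the best alignment of x[:i] and y[:j]
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]

print(dtw_distance([0, 1, 2], [0, 1, 2]))     # identical series -> 0.0
print(dtw_distance([0, 0, 1, 2], [0, 1, 2]))  # local stretch is absorbed -> 0.0
```

Everything that follows in this collection (lower bounds, penalties, sparsity, averaging) is a variation on this quadratic-time table.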

833 citations


Journal ArticleDOI
TL;DR: A unified framework for simultaneously performing spatial segmentation, temporal segmentation, and recognition is introduced and can be applied to continuous image streams where gestures are performed in front of moving, cluttered backgrounds.
Abstract: Within the context of hand gesture recognition, spatiotemporal gesture segmentation is the task of determining, in a video sequence, where the gesturing hand is located and when the gesture starts and ends. Existing gesture recognition methods typically assume either known spatial segmentation or known temporal segmentation, or both. This paper introduces a unified framework for simultaneously performing spatial segmentation, temporal segmentation, and recognition. In the proposed framework, information flows both bottom-up and top-down. A gesture can be recognized even when the hand location is highly ambiguous and when information about when the gesture begins and ends is unavailable. Thus, the method can be applied to continuous image streams where gestures are performed in front of moving, cluttered backgrounds. The proposed method consists of three novel contributions: a spatiotemporal matching algorithm that can accommodate multiple candidate hand detections in every frame, a classifier-based pruning framework that enables accurate and early rejection of poor matches to gesture models, and a subgesture reasoning algorithm that learns which gesture models can falsely match parts of other longer gestures. The performance of the approach is evaluated on two challenging applications: recognition of hand-signed digits gestured by users wearing short-sleeved shirts, in front of a cluttered background, and retrieval of occurrences of signs of interest in a video database containing continuous, unsegmented signing in American sign language (ASL).

392 citations


Proceedings ArticleDOI
01 Dec 2009
TL;DR: An unsupervised learning framework is presented for detecting spoken keywords: segmental dynamic time warping compares Gaussian posteriorgrams between keyword samples and test utterances, and the detection result is obtained by ranking the distortion scores.
Abstract: In this paper, we present an unsupervised learning framework to address the problem of detecting spoken keywords. Without any transcription information, a Gaussian Mixture Model is trained to label speech frames with a Gaussian posteriorgram. Given one or more spoken examples of a keyword, we use segmental dynamic time warping to compare the Gaussian posteriorgrams between keyword samples and test utterances. The keyword detection result is then obtained by ranking the distortion scores of all the test utterances. We examine the TIMIT corpus as a development set to tune the parameters in our system, and the MIT Lecture corpus for more substantial evaluation. The results demonstrate the viability and effectiveness of our unsupervised learning framework on the keyword spotting task.
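A common frame-level distance for posteriorgram-based DTW is the negative log inner product of two posterior vectors; the sketch below uses that choice as a plausible stand-in for the distance used here, and the three-component frames are toy values, not trained GMM output.

```python
import math

def posteriorgram_distance(p, q, eps=1e-12):
    """-log dot product: small when two frames put mass on the same components."""
    return -math.log(sum(pi * qi for pi, qi in zip(p, q)) + eps)

frame_a = [0.9, 0.05, 0.05]   # frame dominated by component 0
frame_b = [0.8, 0.1, 0.1]     # similar frame
frame_c = [0.05, 0.05, 0.9]   # frame dominated by component 2

# similar frames are closer than dissimilar ones
print(posteriorgram_distance(frame_a, frame_b) < posteriorgram_distance(frame_a, frame_c))  # True
```

DTW (segmental or otherwise) then runs over a matrix of such frame distances exactly as over raw feature distances.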

350 citations


Proceedings ArticleDOI
01 Dec 2009
TL;DR: A query-by-example approach to spoken term detection in audio files designed for low-resource situations in which limited or no in-domain training material is available and accurate word-based speech recognition capability is unavailable.
Abstract: This paper examines a query-by-example approach to spoken term detection in audio files. The approach is designed for low-resource situations in which limited or no in-domain training material is available and accurate word-based speech recognition capability is unavailable. Instead of using word or phone strings as search terms, the user presents the system with audio snippets of desired search terms to act as the queries. Query and test materials are represented using phonetic posteriorgrams obtained from a phonetic recognition system. Query matches in the test data are located using a modified dynamic time warping search between query templates and test utterances. Experiments using this approach are presented using data from the Fisher corpus.

305 citations


Journal ArticleDOI
TL;DR: It is shown that the similarity provided by TWED is a potentially useful metric in time series retrieval applications since it could benefit from the triangular inequality property to speed up the retrieval process while tuning the parameters of the elastic measure.
Abstract: In a way similar to the string-to-string correction problem, we address discrete time series similarity in light of a time-series-to-time-series-correction problem for which the similarity between two time series is measured as the minimum cost sequence of edit operations needed to transform one time series into another. To define the edit operations, we use the paradigm of a graphical editing process and end up with a dynamic programming algorithm that we call time warp edit distance (TWED). TWED is slightly different in form from dynamic time warping (DTW), longest common subsequence (LCSS), or edit distance with real penalty (ERP) algorithms. In particular, it highlights a parameter that controls a kind of stiffness of the elastic measure along the time axis. We show that the similarity provided by TWED is a potentially useful metric in time series retrieval applications since it could benefit from the triangular inequality property to speed up the retrieval process while tuning the parameters of the elastic measure. In that context, a lower bound is derived to link the matching of time series in downsampled representation spaces to the matching in the original space. The empirical quality of the TWED distance is evaluated on a simple classification task. Compared to edit distance, DTW, LCSS, and ERP, TWED has proved to be quite effective on the considered experimental task.
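The TWED recurrence can be sketched directly: a match costs the local distance at both current and previous samples plus a stiffness term in time, while deletions in either series pay a fixed penalty. This follows the published recursion with an L1 local distance; the parameter defaults and zero-padding convention below are illustrative.

```python
def twed(a, ta, b, tb, nu=0.001, lam=1.0):
    """Time warp edit distance between series a (timestamps ta) and b (tb).

    nu is the stiffness parameter along the time axis, lam the deletion penalty.
    """
    # pad each series with a zero sample at time 0 (a common convention)
    a, ta = [0.0] + list(a), [0.0] + list(ta)
    b, tb = [0.0] + list(b), [0.0] + list(tb)
    INF = float("inf")
    D = [[INF] * len(b) for _ in range(len(a))]
    D[0][0] = 0.0
    for i in range(1, len(a)):
        for j in range(1, len(b)):
            match = (D[i-1][j-1] + abs(a[i] - b[j]) + abs(a[i-1] - b[j-1])
                     + nu * (abs(ta[i] - tb[j]) + abs(ta[i-1] - tb[j-1])))
            delete_a = D[i-1][j] + abs(a[i] - a[i-1]) + nu * (ta[i] - ta[i-1]) + lam
            delete_b = D[i][j-1] + abs(b[j] - b[j-1]) + nu * (tb[j] - tb[j-1]) + lam
            D[i][j] = min(match, delete_a, delete_b)
    return D[-1][-1]

print(twed([1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3]))  # identical series -> 0.0
```

Setting nu large forces near-rigid time alignment; nu near zero makes the measure behave more like an edit distance.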

298 citations


Proceedings Article
07 Dec 2009
TL;DR: Canonical time warping (CTW), an extension of canonical correlation analysis (CCA) for spatio-temporal alignment of human motion between two subjects, is presented; it addresses the challenge of aligning subjects performing similar activities despite large temporal scale differences between human actions and inter/intra-subject variability.
Abstract: Alignment of time series is an important problem to solve in many scientific disciplines. In particular, temporal alignment of two or more subjects performing similar activities is a challenging problem due to the large temporal scale difference between human actions as well as the inter/intra subject variability. In this paper we present canonical time warping (CTW), an extension of canonical correlation analysis (CCA) for spatio-temporal alignment of human motion between two subjects. CTW extends previous work on CCA in two ways: (i) it combines CCA with dynamic time warping (DTW), and (ii) it extends CCA by allowing local spatial deformations. We show CTW's effectiveness in three experiments: alignment of synthetic data, alignment of motion capture data of two subjects performing similar actions, and alignment of similar facial expressions made by two people. Our results demonstrate that CTW provides both visually and qualitatively better alignment than state-of-the-art techniques based on DTW.

241 citations


Journal ArticleDOI
TL;DR: The open-end variant of the DTW algorithm is suitable for the classification of truncated quantitative time series, even in the presence of noise.

231 citations


Journal ArticleDOI
TL;DR: This work uses bounding techniques to find closest neighbors quickly and finds that LB_Improved-based search is faster than LB_Keogh over random-walk and shape time series.
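Both bounds build on the envelope idea of LB_Keogh, sketched below; LB_Improved adds a second, tighter pass that is not shown here. The window radius and data are illustrative.

```python
def lb_keogh(query, candidate, r):
    """Lower bound on DTW(query, candidate) under a warping window of radius r."""
    total = 0.0
    for i, q in enumerate(query):
        lo = max(0, i - r)
        hi = min(len(candidate), i + r + 1)
        window = candidate[lo:hi]
        u, l = max(window), min(window)   # envelope around the candidate
        if q > u:
            total += (q - u) ** 2         # only out-of-envelope excess counts
        elif q < l:
            total += (l - q) ** 2
    return total

print(lb_keogh([1, 2, 3, 4], [1, 2, 3, 4], r=1))  # identical -> 0.0
```

Because the bound never exceeds the true windowed DTW distance, candidates whose bound already beats the best-so-far can be discarded without computing DTW at all.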

193 citations


Journal ArticleDOI
TL;DR: A statistical framework for the word-spotting problem which employs hidden Markov models (HMMs) to model keywords and a Gaussian mixture model (GMM) for score normalization is introduced.

181 citations


Proceedings Article
01 Dec 2009
TL;DR: SparseDTW, a space-efficient approach that exploits similarity and/or correlation between two time series to compute their dynamic time warping distance while always yielding the optimal result.
Abstract: We present a new space-efficient approach, (SparseDTW), to compute the Dynamic Time Warping (DTW) distance between two time series that always yields the optimal result. This is in contrast to other known approaches which typically sacrifice optimality to attain space efficiency. The main idea behind our approach is to dynamically exploit the existence of similarity and/or correlation between the time series. The more the similarity between the time series the less space required to compute the DTW between them. To the best of our knowledge, all other techniques to speedup DTW, impose apriori constraints and do not exploit similarity characteristics that may be present in the data. We conduct experiments and demonstrate that SparseDTW outperforms previous approaches.

131 citations


Journal ArticleDOI
TL;DR: This study shows that human actions can be represented simply by pose, without a complex representation of dynamics, and that with this simple and compact representation the system achieves robust recognition of human actions compared to more complex representations.

Journal ArticleDOI
01 Jun 2009
TL;DR: The indexing technique can be used to index star light curves, an important type of astronomical data, without modification and with all the most popular distance measures including Euclidean distance, dynamic time warping and Longest Common Subsequence.
Abstract: Shape matching and indexing is an important topic in its own right, and is a fundamental subroutine in most shape data mining algorithms. Given the ubiquity of shape, shape matching is an important problem with applications in domains as diverse as biometrics, industry, medicine, zoology and anthropology. The distance/similarity measure used for shape matching must be invariant to many distortions, including scale, offset, noise, articulation, partial occlusion, etc. Most of these distortions are relatively easy to handle, either in the representation of the data or in the similarity measure used. However, rotation invariance is noted in the literature as being an especially difficult challenge. Current approaches typically try to achieve rotation invariance in the representation of the data, at the expense of discrimination ability, or in the distance measure, at the expense of efficiency. In this work, we show that we can take the slow but accurate approaches and dramatically speed them up. On real world problems our technique can take current approaches and make them four orders of magnitude faster without false dismissals. Moreover, our technique can be used with any of the dozens of existing shape representations and with all the most popular distance measures including Euclidean distance, dynamic time warping and Longest Common Subsequence. We further show that our indexing technique can be used to index star light curves, an important type of astronomical data, without modification.
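The rotation problem can be made concrete with shape signatures: rotating a closed shape cyclically shifts its centroid-distance time series, so the naive rotation-invariant distance tries every shift, as sketched below with Euclidean distance (DTW or LCSS could be substituted). This brute force is exactly the slow-but-accurate baseline the paper accelerates; none of the paper's actual indexing machinery appears here.

```python
def rotation_invariant_dist(a, b):
    """Min Euclidean distance over all cyclic rotations of b (brute force)."""
    best = float("inf")
    for s in range(len(b)):
        rotated = b[s:] + b[:s]           # one candidate rotation of the shape
        d = sum((x - y) ** 2 for x, y in zip(a, rotated)) ** 0.5
        best = min(best, d)
    return best

print(rotation_invariant_dist([1, 2, 3, 4], [3, 4, 1, 2]))  # a rotated copy -> 0.0
```

The cost is one full distance computation per rotation, which is what makes lower-bound pruning so valuable at scale.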

Journal ArticleDOI
TL;DR: As an approach using global features, the FFT system possesses many advantages and shows promise both as a stand-alone system and especially in combination with approaches that are based on local features.
Abstract: We present a novel online signature verification system based on the Fast Fourier Transform. The advantage of using the Fourier domain is the ability to compactly represent an online signature using a fixed number of coefficients. The fixed-length representation leads to fast matching algorithms and is essential in certain applications. The challenge on the other hand is to find the right preprocessing steps and matching algorithm for this representation. We report on the effectiveness of the proposed method, along with the effects of individual preprocessing and normalization steps, based on comprehensive tests over two public signature databases. We also propose to use the pen-up duration information in identifying forgeries. The best results obtained on the SUSIG-Visual subcorpus and the MCYT-100 database are 6.2% and 12.1% error rate on skilled forgeries, respectively. The fusion of the proposed system with our state-of-the-art Dynamic Time Warping (DTW) system lowers the error rate of the DTW system by up to about 25%. While the current error rates are higher than state-of-the-art results for these databases, as an approach using global features, the system possesses many advantages. Considering also the suggested improvements, the FFT system shows promise both as a stand-alone system and especially in combination with approaches that are based on local features.

Journal ArticleDOI
TL;DR: A systematic model-based approach is presented to learn the nature of such temporal variations (time warps) while simultaneously allowing for spatial variations in the descriptors; the relative advantages and disadvantages of two variants of the approach are discussed.
Abstract: Pattern recognition in video is a challenging task because of the multitude of spatio-temporal variations that occur in different videos capturing the exact same event. While traditional pattern-theoretic approaches account for the spatial changes that occur due to lighting and pose, very little has been done to address the effect of temporal rate changes in the executions of an event. In this paper, we provide a systematic model-based approach to learn the nature of such temporal variations (time warps) while simultaneously allowing for the spatial variations in the descriptors. We illustrate our approach for the problem of action recognition and provide experimental justification for the importance of accounting for rate variations in action recognition. The model is composed of a nominal activity trajectory and a function space capturing the probability distribution of activity-specific time warping transformations. We use the square-root parameterization of time warps to derive geodesics, distance measures, and probability distributions on the space of time warping functions. We then design a Bayesian algorithm which treats the execution rate function as a nuisance variable and integrates it out using Monte Carlo sampling, to generate estimates of class posteriors. This approach allows us to learn the space of time warps for each activity while simultaneously capturing other intra- and interclass variations. Next, we discuss a special case of this approach which assumes a uniform distribution on the space of time warping functions and show how computationally efficient inference algorithms may be derived for this special case. We discuss the relative advantages and disadvantages of both approaches and show their efficacy using experiments on gait-based person identification and activity recognition.

Journal ArticleDOI
TL;DR: A novel variation on dynamic time warping (DTW) for aligning chromatogram signals is highlighted, with a variable penalty based on a morphological dilation of the two signals; it significantly reduces the number of nondiagonal moves.
Abstract: In this article we highlight a novel variation on dynamic time warping (DTW) for aligning chromatogram signals. We are interested in sets of signals that can be aligned well locally, but not globally, by shifting individual signals in time. This kind of alignment is often sufficient for aligning gas chromatography data. Regular DTW often “over-warps” signals and introduces artificial features into the aligned data. To overcome this we introduce a variable penalty into the DTW process. The penalty is added to the distance metric whenever a nondiagonal step is taken. We select our penalty based on a morphological dilation of the two signals. We showcase our method by aligning GC/MS datafiles from 712 blood plasma samples processed in 23 batches over the course of 6 months. The use of variable penalty DTW significantly reduces the number of nondiagonal moves. In the examples presented here, this reduction is by a factor of 30, with no cost to visual quality of the alignment.
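The mechanism is easy to sketch: add a penalty to the local cost whenever the path leaves the diagonal. The sketch below uses a fixed penalty for clarity; the paper's variable penalty, derived from a morphological dilation of the two signals, would replace the constant and is not reproduced here.

```python
def penalized_dtw(x, y, penalty):
    """DTW where every non-diagonal step pays an extra additive penalty."""
    n, m = len(x), len(y)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i][j] = min(D[i - 1][j - 1] + cost,            # diagonal: no penalty
                          D[i - 1][j] + cost + penalty,      # vertical step penalized
                          D[i][j - 1] + cost + penalty)      # horizontal step penalized
    return D[n][m]

print(penalized_dtw([0, 0, 1], [0, 1], 1.0))  # one off-diagonal step is unavoidable -> 1.0
```

With penalty 0 this reduces to ordinary DTW; raising the penalty progressively discourages warping and suppresses the "over-warping" artifacts described above.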

Journal ArticleDOI
TL;DR: This work shows that the newly introduced multidimensional DTW concept requires significantly less decoding time while providing the same data fusion flexibility as the AHMM, and can be applied in a wide range of real-time multimodal classification tasks.

Journal ArticleDOI
TL;DR: This paper attempts to define and segment sign-language subunits using computer vision techniques; the resulting subunits can largely be explained by sign language linguistics, correlate highly with the definition of syllables in sign language, and share characteristics of syllables in spoken languages.

Journal ArticleDOI
TL;DR: The proposed system outperforms DTW-based and HMM-based ones, even though these have proved to be very efficient in on-line signature recognition, with storage requirements between 9 and 90 times lower and a processing speed between 181 and 713 times greater than the DTW-based systems.

Journal ArticleDOI
TL;DR: The Derivative time series Segment Approximation (DSA) representation model, which combines derivative estimation, segmentation and segment approximation, is presented to provide both high sensitivity in capturing the main trends of time series and data compression.

Journal ArticleDOI
01 Aug 2009
TL;DR: This paper proposes a novel filter-and-refine DTW algorithm called Anticipatory DTW, which exploits previously unused information from the filter step during the refinement, allowing for faster rejection of false candidates.
Abstract: Time series arise in many different applications in the form of sensor data, stocks data, videos, and other time-related information. Analysis of this data typically requires searching for similar time series in a database. Dynamic Time Warping (DTW) is a widely used high-quality distance measure for time series. As DTW is computationally expensive, efficient algorithms for fast computation are crucial. In this paper, we propose a novel filter-and-refine DTW algorithm called Anticipatory DTW. Existing algorithms aim at efficiently finding similar time series by filtering the database and computing the DTW in the refinement step. Unlike these algorithms, our approach exploits previously unused information from the filter step during the refinement, allowing for faster rejection of false candidates. We characterize a class of applicable filters for our approach, which comprises state-of-the-art lower bounds of the DTW. Our novel anticipatory pruning incurs hardly any overhead and no false dismissals. We demonstrate substantial efficiency improvements in thorough experiments on synthetic and real world time series databases and show that our technique is highly scalable to multivariate, long time series and wide DTW bands.
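The filter-and-refine pattern itself is simple to sketch. The version below is generic, taking any admissible lower bound and exact distance; it does not model the anticipatory reuse of filter information that is this paper's contribution, and the example bound and distance are illustrative.

```python
def filter_and_refine(query, database, lower_bound, exact_dist):
    """1-NN search: prune with an admissible lower bound, refine survivors."""
    best, best_d = None, float("inf")
    # cheap filter step: lower-bound every candidate, then visit best-first
    bounds = sorted((lower_bound(query, c), i) for i, c in enumerate(database))
    for lb, i in bounds:
        if lb >= best_d:
            break                           # no remaining candidate can win
        d = exact_dist(query, database[i])  # expensive refinement (e.g. full DTW)
        if d < best_d:
            best, best_d = i, d
    return best, best_d

def lb_first(q, c):
    return abs(q[0] - c[0])        # admissible: never exceeds the exact distance

def manhattan(q, c):
    return sum(abs(a - b) for a, b in zip(q, c))

print(filter_and_refine([1, 1], [[5, 5], [1, 2], [1, 1]], lb_first, manhattan))  # -> (2, 0)
```

As long as the bound never exceeds the exact distance, this search returns the true nearest neighbour with no false dismissals.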

Proceedings ArticleDOI
06 May 2009
TL;DR: This work emphasizes the importance of the correctness of this averaging subroutine and proposes a novel shape averaging method, called Prioritized Shape Averaging (PSA), using a hierarchical clustering approach, which achieves a lower discrepancy distance between an averaged sequence and every original sequence than the existing method on various domains.
Abstract: Dynamic Time Warping (DTW) distance measure has increasingly been used as a similarity measurement for various data mining tasks in place of the traditional Euclidean distance metric due to its superiority in sequence-alignment flexibility. However, in some tasks where shape averaging is required, e.g., in template matching and k-means clustering problems, current averaging methods are inaccurate in that they produce undesired templates and cluster representatives. In this work, we emphasize the importance of the correctness of this averaging subroutine and propose a novel shape averaging method, called Prioritized Shape Averaging (PSA), using a hierarchical clustering approach. In experimental evaluation, our proposed method, PSA, achieves a lower discrepancy distance between an averaged sequence and every original sequence than the existing method on various domains.
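The basic building block, averaging one pair of sequences along their DTW warping path, can be sketched as below. PSA's contribution is the prioritized, hierarchical order in which pairs are merged and the weighting of sequences, neither of which is shown.

```python
def dtw_path(x, y):
    """Optimal DTW warping path as a list of (i, j) index pairs."""
    n, m = len(x), len(y)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = abs(x[i-1] - y[j-1]) + min(D[i-1][j], D[i][j-1], D[i-1][j-1])
    # backtrack from (n, m), preferring the cheapest predecessor
    path, i, j = [(n - 1, m - 1)], n, m
    while (i, j) != (1, 1):
        _, i, j = min((D[i-1][j-1], i-1, j-1),
                      (D[i-1][j],   i-1, j),
                      (D[i][j-1],   i,   j-1))
        path.append((i - 1, j - 1))
    return path[::-1]

def dtw_average(x, y):
    """Average two sequences point-by-point along their warping path."""
    return [(x[i] + y[j]) / 2 for i, j in dtw_path(x, y)]

print(dtw_average([0, 2, 4], [0, 2, 4]))  # averaging a series with itself -> [0.0, 2.0, 4.0]
```

Note that the averaged sequence can be longer than either input (up to the path length), which is one reason naive pairwise averaging orders produce the undesirable templates the paper describes.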

Proceedings ArticleDOI
09 Nov 2009
TL;DR: An adaptive solution to secure the authentication process of cellular phones is proposed; gait and location tracks of the owner are used as the metrics for authentication.
Abstract: In this paper an adaptive solution to secure the authentication process of cellular phones has been proposed. Gait and location tracks of the owner are used as the metrics for authentication. The cellular phone is envisioned to become as adaptive as a pet animal of the owner. The cellular phone learns various intrinsic attributes of the owner like his voice, face, hand and fingerprint geometry and interesting patterns in the owner's daily life and remembers those to continually check against any anomalous behavior that may occur due to the stealing of the phone. The checking is done level wise. Higher level of authentication is more stringent. Only when the cellular phone recognizes significant anomaly in a lower level, it goes one level up in the security hierarchy. The iPhone's accelerometer and A-GPS module have been utilized to record gait and location signatures. A fast and memory efficient variation of Dynamic Time Warping (DTW) algorithm called FastDTW has been used to compute the similarity score between gait samples.

Proceedings ArticleDOI
31 Mar 2009
TL;DR: A comparison between isolated HMMs and the HMM/ANN hybrid shows that the approach introduced is more effective; the average recognition rate over five emotion states reaches 81.7%.
Abstract: This paper proposes a new approach for emotion recognition based on a hybrid of hidden Markov models (HMMs) and an artificial neural network (ANN), using both utterance- and segment-level information from speech. To combine the dynamic time warping capability of HMMs with the pattern recognition capability of the ANN, the utterance is viewed as a series of voiced segments, and feature vectors extracted from the segments are normalized into fixed coefficients using orthogonal polynomial methods; the resulting distortions are calculated as one input to the ANN. Meanwhile, the utterance as a whole is modeled by HMMs, and likelihood probabilities derived from the HMMs are normalized to form another input to the ANN. Adopting the Beihang University Database of Emotional Speech (BHUDES) and the Berlin database of emotional speech, a comparison between isolated HMMs and the HMM/ANN hybrid shows that the approach introduced in this paper is more effective, and the average recognition rate over five emotion states has reached 81.7%.

Journal ArticleDOI
27 Oct 2009-Sensors
TL;DR: A comparative study on the different techniques of classifying human leg motions that are performed using two low-cost uniaxial piezoelectric gyroscopes worn on the leg indicates that BDM results in the highest correct classification rate with relatively small computational cost.
Abstract: This paper provides a comparative study on the different techniques of classifying human leg motions that are performed using two low-cost uniaxial piezoelectric gyroscopes worn on the leg. A number of feature sets, extracted from the raw inertial sensor data in different ways, are used in the classification process. The classification techniques implemented and compared in this study are: Bayesian decision making (BDM), a rule-based algorithm (RBA) or decision tree, least-squares method (LSM), k-nearest neighbor algorithm (k-NN), dynamic time warping (DTW), support vector machines (SVM), and artificial neural networks (ANN). A performance comparison of these classification techniques is provided in terms of their correct differentiation rates, confusion matrices, computational cost, and training and storage requirements. Three different cross-validation techniques are employed to validate the classifiers. The results indicate that BDM, in general, results in the highest correct classification rate with relatively small computational cost.

Journal ArticleDOI
TL;DR: The proposed method showed good potential for the non-invasive diagnosis and monitoring of joint disorders such as osteoarthritis; a back-propagation neural network was used to classify the normal and abnormal VAG signals.

Journal ArticleDOI
01 Jan 2009
TL;DR: A new approach to static handwritten signature verification based on Dynamic Time Warping (DTW), using only five genuine signatures for training, is proposed in this paper; it is observed that the False Acceptance Rate (FAR) of the proposed system decreases as the number of genuine training samples increases.
Abstract: Static signature verification has a significant use in establishing the authenticity of bank checks, insurance and legal documents based on the signatures they carry. As an individual signs only a few times on the forms for opening an account with any bank or for insurance related purposes, the number of genuine signature templates available in banking and insurance applications is limited. A new approach to static handwritten signature verification based on Dynamic Time Warping (DTW), using only five genuine signatures for training, is therefore proposed in this paper. Initially the genuine and test signatures belonging to an individual are normalized after calculating the aspect ratios of the genuine signatures. The horizontal and vertical projection features of a signature are extracted using the discrete Radon transform and the two vectors are combined to form a combined projection feature vector. The feature vectors of two signatures are matched using the DTW algorithm. The closed area formed by the matching path around the diagonal of the DTW grid is computed and is multiplied with the difference cost between the feature vectors. A threshold is calculated for each genuine sample during training. The test signature is compared with each genuine sample and a matching score is calculated. A decision to accept or reject is made on the average of such scores. The experiments were performed on a global signature database (GPDS-Signature Database) of 2106 signatures with 936 genuine signatures and 1170 skilled forgeries. To evaluate the performance, experiments were carried out with 4 to 5 genuine samples for training and with different 'scores'. The proposed as well as the existing DTW method were implemented and compared. It is observed that the proposed method is superior in terms of Equal Error Rate (EER) and Total Error Rate (TER) when 4 or 5 genuine signatures are used for training. It is also observed that the False Acceptance Rate (FAR) of the proposed system decreases as the number of genuine training samples increases.

Book ChapterDOI
27 Aug 2009
TL;DR: A positive semidefinite similarity function with the same intuitive appeal as cross-correlation is introduced to facilitate the application of machine learning algorithms to the automatic classification of astronomy star surveys using time series of star brightness.
Abstract: We present a method for applying machine learning algorithms to the automatic classification of astronomy star surveys using time series of star brightness. Currently such classification requires a large amount of domain expert time. We show that a combination of phase invariant similarity and explicit features extracted from the time series provide domain expert level classification. To facilitate this application, we investigate the cross-correlation as a general phase invariant similarity function for time series. We establish several theoretical properties of cross-correlation showing that it is intuitively appealing and algorithmically tractable, but not positive semidefinite, and therefore not generally applicable with kernel methods. As a solution we introduce a positive semidefinite similarity function with the same intuitive appeal as cross-correlation. An experimental evaluation in the astronomy domain as well as several other data sets demonstrates the performance of the kernel and related similarity functions.
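Phase-invariant similarity via cross-correlation is straightforward to sketch: normalize each series, then take the maximum inner product over all cyclic shifts (brute force below; an FFT makes it O(n log n)). The paper's positive semidefinite kernel replaces this max, which is precisely what the sketch omits; series are assumed non-constant.

```python
import math

def max_cross_correlation(x, y):
    """Max over all cyclic shifts of the normalized inner product (brute force)."""
    def normalize(v):
        mu = sum(v) / len(v)
        centered = [a - mu for a in v]
        norm = math.sqrt(sum(a * a for a in centered))
        return [a / norm for a in centered]   # assumes v is not constant
    x, y = normalize(x), normalize(y)
    n = len(y)
    return max(sum(x[i] * y[(i + s) % n] for i in range(n)) for s in range(n))

a = [0.0, 1.0, 0.0, -1.0]
b = a[2:] + a[:2]                              # a cyclically shifted copy of a
print(round(max_cross_correlation(a, b), 6))   # shift-invariant match -> 1.0
```

The max over shifts is exactly what breaks positive semidefiniteness, which is why a kernel substitute is needed for SVM-style methods.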

Proceedings ArticleDOI
01 Jan 2009
TL;DR: This paper proposes time-adaptive descriptors that capture the structure of self-similarity matrices while being invariant to the impact of time warps between views, and presents quantitative comparison results between time-fixed and time- adaptations for image sequences with different frame rates.
Abstract: This paper deals with the temporal synchronization of image sequences. Two instances of this problem are considered: (a) synchronization of human actions and (b) synchronization of dynamic scenes with view changes. To address both tasks and to reliably handle large view variations, we use self-similarity matrices which remain stable across views. We propose time-adaptive descriptors that capture the structure of these matrices while being invariant to the impact of time warps between views. Synchronizing two sequences is then performed by aligning their temporal descriptors using the Dynamic Time Warping algorithm. We present quantitative comparison results between time-fixed and time-adaptive descriptors for image sequences with different frame rates. We also illustrate the performance of the approach on several challenging videos with large view variations, drastic independent camera motions and within-class variability of human actions.

Book ChapterDOI
04 Jun 2009
TL;DR: A new DTW-based on-line signature verification system is presented and evaluated and its performance is among the best systems reported in the state of the art.
Abstract: A new DTW-based on-line signature verification system is presented and evaluated. The system is specially designed to operate under realistic conditions: it needs only a small number of genuine signatures to operate and it can be deployed in almost any signature-capable capture device. Optimal feature sets have been obtained experimentally, in order to adapt the system to environments with different levels of security. The system has been evaluated using four on-line signature databases (MCYT, SVC2004, BIOMET and MyIDEA) and its performance is among the best systems reported in the state of the art. Average EERs over these databases lie between 0.41% and 2.16% for random and skilled forgeries respectively.

Proceedings ArticleDOI
07 Nov 2009
TL;DR: The distance between two signatures is computed by the dynamic time warping (DTW) method, and the reference signatures are used to assign signer-specific parameters, which allows the system to cover intra-signer variation.
Abstract: This work describes an enhanced technique for on-line signature verification. The distance between two signatures is computed by the dynamic time warping (DTW) method. The reference signatures are used to assign signer-specific parameters, which allows the system to cover intra-signer variation. Several features are extracted, and systems with single and multiple features are tested. Curvature change and speed features improve the verification success rate. The experiments were carried out using the SUSIG online signature database. The best result for ROC area under the curve is 99.5 with an equal error rate of 3.48%, and the best equal error rate is 3.06% with an ROC area under the curve of 99.43.