
Showing papers on "Dynamic time warping" published in 2008


01 Jan 2008

544 citations


Journal ArticleDOI
TL;DR: A new technique for audio signal comparison based on tonal subsequence alignment and its application to detect cover versions (i.e., different performances of the same underlying musical piece) is presented.
Abstract: We present a new technique for audio signal comparison based on tonal subsequence alignment and its application to detect cover versions (i.e., different performances of the same underlying musical piece). Cover song identification is a task whose popularity has increased in the music information retrieval (MIR) community over the past years, as it provides a direct and objective way to evaluate music similarity algorithms. This paper first presents a series of experiments carried out with two state-of-the-art methods for cover song identification. We have studied several components of these (such as chroma resolution and similarity, transposition, beat tracking or dynamic time warping constraints), in order to discover which characteristics would be desirable for a competitive cover song identifier. After analyzing many cross-validated results, the importance of these characteristics is discussed, and the best performing ones are finally applied to the newly proposed method. Multiple evaluations of this method confirm a large increase in identification accuracy when comparing it with alternative state-of-the-art approaches.

274 citations


Proceedings ArticleDOI
01 Sep 2008
TL;DR: In this paper, the authors propose aligned cluster analysis (ACA), a robust method to temporally segment streams of motion capture data into actions, which extends standard kernel k-means clustering in two ways: the cluster means contain a variable number of features, and a dynamic time warping (DTW) kernel is used to achieve temporal invariance.
Abstract: Temporal segmentation of human motion into actions is a crucial step for understanding and building computational models of human motion. Several issues contribute to the challenge of this task. These include the large variability in the temporal scale and periodicity of human actions, as well as the exponential nature of all possible movement combinations. We formulate the temporal segmentation problem as an extension of standard clustering algorithms. In particular, this paper proposes aligned cluster analysis (ACA), a robust method to temporally segment streams of motion capture data into actions. ACA extends standard kernel k-means clustering in two ways: (1) the cluster means contain a variable number of features, and (2) a dynamic time warping (DTW) kernel is used to achieve temporal invariance. Experimental results, reported on synthetic data and the Carnegie Mellon Motion Capture database, demonstrate its effectiveness.

197 citations


Journal ArticleDOI
TL;DR: It is found that combining likelihoods of multiple models in a second classification stage degrades performance of the proposed classifiers, while improving performance with HMM and SDTW, and combining DFFM mappings of multiple SDTW models with SDTW likelihoods can provide significant improvement over SDTW.
Abstract: To recognize speech, handwriting, or sign language, many hybrid approaches have been proposed that combine dynamic time warping (DTW) or hidden Markov models (HMMs) with discriminative classifiers. However, all methods rely directly on the likelihood models of DTW/HMM. We hypothesize that time warping and classification should be separated because of conflicting likelihood modeling demands. To overcome these restrictions, we propose using statistical DTW (SDTW) only for time warping, while classifying the warped features with a different method. Two novel statistical classifiers are proposed - combined discriminative feature detectors (CDFDs) and quadratic classification on DF Fisher mapping (Q-DFFM) - both using a selection of discriminative features (DFs), and are shown to outperform HMM and SDTW. However, we have found that combining likelihoods of multiple models in a second classification stage degrades performance of the proposed classifiers, while improving performance with HMM and SDTW. A proof-of-concept experiment, combining DFFM mappings of multiple SDTW models with SDTW likelihoods, shows that, also for model-combining, hybrid classification can provide significant improvement over SDTW. Although recognition is mainly based on 3D hand motion features, these results can be expected to generalize to recognition with more detailed measurements such as hand/body pose and facial expression.

178 citations


Book ChapterDOI
12 Oct 2008
TL;DR: This paper considers the existing shapes as a group and studies their similarity measures to the query shape in a graph structure. It learns a better metric through graph transduction by propagating the model through existing shapes, in a way similar to computing geodesics in shape manifold.
Abstract: Shape retrieval/matching is a very important topic in computer vision. The recent progress in this domain has been mostly driven by designing smart features for providing better similarity measure between pairs of shapes. In this paper, we provide a new perspective to this problem by considering the existing shapes as a group, and study their similarity measures to the query shape in a graph structure. Our method is general and can be built on top of any existing shape matching algorithms. It learns a better metric through graph transduction by propagating the model through existing shapes, in a way similar to computing geodesics in shape manifold. However, the proposed method does not require learning the shape manifold explicitly and it does not require knowing any class labels of existing shapes. The presented experimental results demonstrate that the proposed approach yields significant improvements over the state-of-art shape matching algorithms. We obtained a retrieval rate of 91% on the MPEG-7 data set, which is the highest ever reported in the literature.

151 citations


Journal ArticleDOI
TL;DR: In this article, a curve-synchronization method is presented that uses every trajectory in the sample as a reference to obtain pairwise warping functions in the first step. These initial pairwise warping functions are then used to create improved estimators of the underlying individual warping functions in the second step.
Abstract: Data collected by scientists are increasingly in the form of trajectories or curves. Often these can be viewed as realizations of a composite process driven by both amplitude and time variation. We consider the situation in which functional variation is dominated by time variation, and develop a curve-synchronization method that uses every trajectory in the sample as a reference to obtain pairwise warping functions in the first step. These initial pairwise warping functions are then used to create improved estimators of the underlying individual warping functions in the second step. A truncated averaging process is used to obtain robust estimation of individual warping functions. The method compares well with other available time-synchronization approaches and is illustrated with Berkeley growth data and gene expression data for multiple sclerosis.

149 citations


Journal ArticleDOI
TL;DR: In this article, the authors define a new type of registration process, in which the warping functions optimize the fit of a principal components decomposition to the aligned curves; the principal components are effectively the features that this process aligns.
Abstract: A registration method can be defined as a process of aligning features of a sample of curves by monotone transformations of their domain. The aligned curves exhibit only amplitude variation, and the domain transformations, called warping functions, capture the phase variation in the original curves. In this article we precisely define a new type of registration process, in which the warping functions optimize the fit of a principal components decomposition to the aligned curves. The principal components are effectively the features that this process aligns. We discuss the relationship of registration to closure of a function space under convex operations, and define consistency for registration methods. We define an explicit decomposition of functional variation into amplitude and phase partitions, and develop an algorithm for combining registration with principal components analysis, and apply it to simulated and real data.

144 citations


Proceedings ArticleDOI
01 Jun 2008
TL;DR: The feasibility of the similarity based approach for DTW is investigated by applying the method to a large set of time-series classification problems.
Abstract: Effective use of support vector machines (SVMs) in classification necessitates the appropriate choice of a kernel. Designing problem specific kernels involves the definition of a similarity measure, with the condition that kernels are positive semi-definite (PSD). An alternative approach which places no such restrictions on the similarity measure is to construct a set of inputs and let each example be represented by its similarity to all the examples in this set and then apply a conventional SVM to this transformed data. Dynamic time warping (DTW) is a well established distance measure for time series but has been of limited use in SVMs since it is not obvious how it can be used to derive a PSD kernel. The feasibility of the similarity based approach for DTW is investigated by applying the method to a large set of time-series classification problems.
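The similarity-based representation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `dtw_distance` and `similarity_features`, the absolute-difference local cost, and the choice of reference set are all assumptions. Each series is mapped to the vector of its DTW distances to a fixed set of reference series, and that vector could then be passed to any conventional SVM, which is not shown here.

```python
def dtw_distance(a, b):
    """DTW distance between two numeric sequences (two-row dynamic program)."""
    inf = float("inf")
    # prev[j] holds the best cumulative cost of aligning the prefix of `a`
    # processed so far with b[:j]; only the empty/empty alignment costs 0.
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            # Assumed local cost: absolute difference of the two samples.
            curr[j] = abs(x - y) + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


def similarity_features(series, reference_set):
    """Represent `series` by its DTW distance to every reference series."""
    return [dtw_distance(series, ref) for ref in reference_set]


# Hypothetical reference set; in practice it might be the training examples.
references = [[0, 1, 0], [1, 1, 1]]
features = similarity_features([0, 0, 1, 0], references)
```

Because the classifier only ever sees fixed-length feature vectors, the DTW measure itself does not need to yield a PSD kernel, which is the point made in the abstract.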

106 citations


Journal ArticleDOI
TL;DR: A new methodology in which a spatial correlogram is used to detect the presence of spatial autocorrelation and to classify defect patterns on the wafer map; the method is shown to be robust to random noise and to perform robustly regardless of defect location and size.
Abstract: A wafer map is a graphical illustration of the locations of defective chips on a wafer. Defective chips are likely to exhibit a spatial dependence across the wafer map, which contains useful information on the process of integrated circuit (IC) fabrication. An analysis of wafer map data helps to better understand ongoing process problems. This paper proposes a new methodology in which spatial correlogram is used for the detection of the presence of spatial autocorrelations and for the classification of defect patterns on the wafer map. After the detection of spatial autocorrelation based on our proposed spatial randomness test using spatial correlogram, the dynamic time warping algorithm which provides nonlinear alignments between two sequences to find optimal warping path is adopted for the automatic classification of spatial patterns based on spatial correlogram. We also develop generalized join-count (JC)-based statistics and then propose a procedure to determine the optimal weights of JC-based statistics. The proposed method is illustrated using real-life examples and simulated data sets. The experimental results show that our method is robust to random noise and has a robust performance regardless of defect location and size.

105 citations


Book Chapter
01 Jan 2008
TL;DR: DTW is considered an effective method in speech pattern recognition; its drawback is that it requires long processing time and large storage capacity, especially for real-time recognition.
Abstract: Template matching is one approach to speech recognition. However, template matching runs into problems due to speaking-rate variability, which causes timing differences between two utterances. Speech is a constantly changing signal, so it is almost impossible to obtain the same signal for two instances of the same utterance. The problem of timing differences can be solved with the DTW algorithm, which warps the template to the test utterance based on their similarities. The DTW algorithm is thus a procedure that combines warping and distance measurement. DTW is considered an effective method in speech pattern recognition; its drawback is that it requires long processing time and large storage capacity, especially for real-time recognition.
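As an illustration of how DTW combines warping and distance measurement, here is a minimal sketch of the textbook dynamic-programming formulation. It assumes one-dimensional numeric features and an absolute-difference local cost; the function name `dtw` and the toy sequences are hypothetical, and a real recognizer would typically compare feature vectors per frame.

```python
def dtw(template, utterance):
    """Return (total_cost, warping_path) for two numeric sequences."""
    n, m = len(template), len(utterance)
    inf = float("inf")
    # D[i][j] = minimal cumulative cost of aligning template[:i] with utterance[:j].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(template[i - 1] - utterance[j - 1])  # distance measurement
            D[i][j] = cost + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])  # warping

    # Backtrack from the end to recover which samples were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [(D[i - 1][j - 1], i - 1, j - 1),
                      (D[i - 1][j], i - 1, j),
                      (D[i][j - 1], i, j - 1)]
        _, i, j = min(candidates, key=lambda c: c[0])
    path.reverse()
    return D[n][m], path


# The second sequence is a time-stretched copy of the first, so the cost is 0.0.
cost, path = dtw([0, 1, 2, 1, 0], [0, 0, 1, 2, 2, 1, 0])
```

The table `D` has (n + 1) x (m + 1) entries and every entry is visited once, which is where the processing-time and storage cost mentioned in the abstract comes from.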

102 citations


Book ChapterDOI
01 Jan 2008
TL;DR: The intrinsic dependence that the lexical content of the password phrase has on the accuracy is demonstrated and several research results will be presented and analyzed to show key techniques used in text-dependent speaker recognition systems from different sites.
Abstract: Text-dependent speaker recognition characterizes a speaker recognition task, such as verification or identification, in which the set of words (or lexicon) used during the testing phase is a subset of the ones present during the enrollment phase. The restricted lexicon enables very short enrollment (or registration) and testing sessions to deliver an accurate solution but, at the same time, represents scientific and technical challenges. Because of the short enrollment and testing sessions, text-dependent speaker recognition technology is particularly well suited for deployment in large-scale commercial applications. These are the bases for presenting an overview of the state of the art in text-dependent speaker recognition as well as emerging research avenues. In this chapter, we will demonstrate the intrinsic dependence that the lexical content of the password phrase has on the accuracy. Several research results will be presented and analyzed to show key techniques used in text-dependent speaker recognition systems from different sites. Among these, we mention multichannel speaker model synthesis and continuous adaptation of speaker models with threshold tracking. Since text-dependent speaker recognition is the most widely used voice biometric in commercial deployments, several

Journal ArticleDOI
TL;DR: A robust and efficient framework that uses dynamic time warping (DTW) as the core recognizer to perform online temporal fusion on either the raw data or the features is proposed and performance results are compared with a Hidden Markov Model (HMM) based system.

Journal ArticleDOI
TL;DR: A novel approach for estimating the global motion between frames using a curve warping technique known as dynamic time warping, which guarantees robustness even in the presence of sharp illumination changes and moving objects.
Abstract: The widespread diffusion of hand-held devices with video recording capabilities requires the adoption of reliable digital stabilization methods to enjoy the acquired sequences without disturbing jerkiness. In order to effectively get rid of the unwanted camera movements, an estimate of the global motion between adjacent frames is necessary. This paper presents a novel approach for estimating the global motion between frames using a curve warping technique known as dynamic time warping. The proposed algorithm guarantees robustness even in the presence of sharp illumination changes and moving objects.

Proceedings ArticleDOI
Hairong Lv, Zhonglin Lin, Wenjun Yin, Jin Dong
26 Aug 2008
TL;DR: Three methods (global features of pressure sequences, dynamic time warping and traditional keystroke dynamics) are proposed for the emotion recognition task; then the three methods are combined together using a classifier fusion technique.
Abstract: This paper describes a new approach to emotion recognition based on pressure sensor keyboards. The pressure sensor keyboard is a new product that recently appeared on the market; it produces a pressure sequence whenever a keystroke occurs. The analysis of the pressure sequence is a novel research area. It has been used for identity verification in our previous research. In this paper, we use the pressure sequence for emotion recognition. Three methods (global features of pressure sequences, dynamic time warping and traditional keystroke dynamics) are proposed for the emotion recognition task; the three methods are then combined using a classifier fusion technique. Several experiments were performed on a database containing 3000 samples (from 50 individuals, including six emotions: neutral, anger, fear, happiness, sadness and surprise), and the best results were achieved using all three methods, obtaining an overall accuracy of 93.4%. Our technique of emotion recognition has been used for intelligent game controlling and several other applications.

Proceedings ArticleDOI
09 Jun 2008
TL;DR: A method that significantly improves the efficiency of subsequence matching in large time series data sets under the dynamic time warping (DTW) distance measure is introduced, called EBSM, shorthand for Embedding-Based Subsequence Matching.
Abstract: A method for approximate subsequence matching is introduced, that significantly improves the efficiency of subsequence matching in large time series data sets under the dynamic time warping (DTW) distance measure. Our method is called EBSM, shorthand for Embedding-Based Subsequence Matching. The key idea is to convert subsequence matching to vector matching using an embedding. This embedding maps each database time series into a sequence of vectors, so that every step of every time series in the database is mapped to a vector. The embedding is computed by applying full dynamic time warping between reference objects and each database time series. At runtime, given a query object, an embedding of that object is computed in the same manner, by running dynamic time warping between the reference objects and the query. Comparing the embedding of the query with the database vectors is used to efficiently identify relatively few areas of interest in the database sequences. Those areas of interest are then fully explored using the exact DTW-based subsequence matching algorithm. Experiments on a large, public time series data set produce speedups of over one order of magnitude compared to brute-force search, with very small losses (

Journal ArticleDOI
TL;DR: The current results suggest that DTW is a valid and objective technique for letter-form analysis in handwriting and may hence be useful to evaluate the rehabilitation treatments of children suffering from poor handwriting.

Proceedings ArticleDOI
23 Jun 2008
TL;DR: The use of the average-template with multiple features, first used in speech recognition, to better capture the intra-class variations for each action is proposed and the efficacy of this algorithm using the low dimensional feature to robustly recognize human actions is demonstrated.
Abstract: In this paper, we propose a fast method to recognize human actions which accounts for intra-class variability in the way an action is performed. We propose the use of a low dimensional feature vector which consists of (a) the projections of the width profile of the actor on to an "action basis" and (b) simple spatio-temporal features. The action basis is built using eigenanalysis of walking sequences of different people. Given the limited amount of training data, Dynamic Time Warping (DTW) is used to perform recognition. We propose the use of the average-template with multiple features, first used in speech recognition, to better capture the intra-class variations for each action. We demonstrate the efficacy of this algorithm using our low dimensional feature to robustly recognize human actions. Furthermore, we show that view-invariant recognition can be performed by using a simple data fusion of two orthogonal views. For the actions that are still confusable, a temporal discriminative weighting scheme is used to distinguish between them. The effectiveness of our method is demonstrated by conducting experiments on the multi-view IXMAS dataset of persons performing various actions.

Patent
Mert Cevik, Fuliang Weng
13 Jun 2008
TL;DR: In this article, a dynamic time warping (DTW) based pattern comparison algorithm is used to find the best matching parts between a correction utterance and an original utterance.
Abstract: Embodiments of a method and system for detecting repeated patterns in dialog systems are described. The system includes a dynamic time warping (DTW) based pattern comparison algorithm that is used to find the best matching parts between a correction utterance and an original utterance. Reference patterns are generated from the correction utterance by an unsupervised segmentation scheme. No significant information about the position of the repeated parts in the correction utterance is assumed, as each reference pattern is compared with the original utterance from the beginning of the utterance to the end. A pattern comparison process with DTW is executed without knowledge of fixed end-points. A recursive DTW computation is executed to find the best matching parts that are considered as the repeated parts as well as the end-points of the utterance.
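Matching without fixed end-points can be illustrated with the generic subsequence variant of DTW, in which the reference pattern may start and end anywhere inside the longer utterance. The sketch below is that generic variant under assumed one-dimensional features and an absolute-difference cost. It is not the recursive computation or the segmentation scheme described in the patent, and the name `subsequence_dtw` is hypothetical.

```python
def subsequence_dtw(pattern, signal):
    """Best match of `pattern` inside `signal`.

    Returns (cost, start, end), with `start` and `end` inclusive indices
    into `signal`. Both sequences must be non-empty.
    """
    n, m = len(pattern), len(signal)
    D = [[0.0] * m for _ in range(n)]  # cumulative alignment cost
    S = [[0] * m for _ in range(n)]    # signal index where each partial match began

    # Free start point: the first pattern sample may align with any signal sample.
    for j in range(m):
        D[0][j] = abs(pattern[0] - signal[j])
        S[0][j] = j

    for i in range(1, n):
        D[i][0] = abs(pattern[i] - signal[0]) + D[i - 1][0]
        S[i][0] = 0
        for j in range(1, m):
            prev_cost, prev_start = min(
                (D[i - 1][j - 1], S[i - 1][j - 1]),
                (D[i - 1][j], S[i - 1][j]),
                (D[i][j - 1], S[i][j - 1]),
                key=lambda t: t[0],
            )
            D[i][j] = abs(pattern[i] - signal[j]) + prev_cost
            S[i][j] = prev_start

    # Free end point: take the cheapest cell in the last pattern row.
    end = min(range(m), key=lambda j: D[n - 1][j])
    return float(D[n - 1][end]), S[n - 1][end], end


# The pattern 1, 2, 3 occurs (stretched) at signal indices 2..5.
match = subsequence_dtw([1, 2, 3], [0, 0, 1, 2, 2, 3, 0, 0])
```

Carrying the start index alongside the cost avoids a separate backtracking pass when only the matched region, and not the full warping path, is needed.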

Book ChapterDOI
06 Sep 2008
TL;DR: In this paper, Hidden Markov Models (HMM) are used to model a laparoscopic cholecystectomy and the use of a model merging approach is proposed to build the HMM topology and compared to other methods of initializing a HMM.
Abstract: The number of signals that can be recorded during a surgery, like tracking data or state of instruments, is constantly growing. These signals can be used to better understand surgical workflow and to build surgical assist systems that are aware of the current state of a surgery. This is a crucial issue for designing future systems that provide context-sensitive information and user interfaces. In this paper, Hidden Markov Models (HMM) are used to model a laparoscopic cholecystectomy. Seventeen signals, representing tool usage, from twelve surgeries are used to train the model. The use of a model merging approach is proposed to build the HMM topology and compared to other methods of initializing an HMM. The merging method allows building a model at a very fine level of detail that also reveals the workflow of a surgery in a human-understandable way. Results for detecting the current phase of a surgery and for predicting the remaining time of the procedure are presented.

Proceedings ArticleDOI
01 Nov 2008
TL;DR: The paper shows the memory efficiency offered by using speech detection for separating the words from silence and the improved system performance achieved by using Dynamic Time Warping while keeping in view the overall design process, supported by experimental results.
Abstract: Speech Recognition is a technology enabling human interaction with machines. The design of a speech recognition system capable of 100% accuracy is far from solved. This paper describes an isolated word, speaker dependent speech recognition system capable of recognizing spoken words at sufficiently high accuracy. The system has been tested and verified on MATLAB as well as the TMS320 C6713 DSK with an overall accuracy exceeding 90%. The paper shows the memory efficiency offered by using speech detection for separating the words from silence and the improved system performance achieved by using Dynamic Time Warping while keeping in view the overall design process, supported by experimental results. In future, speech recognition can serve as a means of data interoperability and distribution by allowing a mobile user (client) to retrieve information from the data networks (GPRS, WEB) using a client server architecture. The satellite system can be used as a wireless medium for accessing the data network.

Journal ArticleDOI
01 May 2008
TL;DR: A new method to align individual lines in a sequence of images based on dynamic time warping, finding a continuous path through a cost matrix that measures the similarity between regions of two frames being aligned.
Abstract: Imaging modalities that use a mechanically rotated endoscopic probe to scan a tubular volume, such as an artery, often suffer from image degradation due to nonuniform rotation distortion (NURD). In this paper, we present a new method to align individual lines in a sequence of images. It is based on dynamic time warping, finding a continuous path through a cost matrix that measures the similarity between regions of two frames being aligned. The path represents the angular mismatch corresponding to the NURD. The prime advantage of this novel approach compared to earlier work is the line-to-line continuity, which accurately captures slow intraframe variations in rotational velocity of the probe. The algorithm is optimized using data from a clinically available intravascular optical coherence tomography (OCT) instrument in a realistic vessel phantom. Its efficacy is demonstrated on an in vivo recording, and compared with conventional global rotation block matching. Intravascular OCT is a particularly challenging modality for motion correction because, in clinical situations, the image is generally undersampled, and correlation between the speckle in different lines or frames is absent. The algorithm can be adapted to ingest data frame-by-frame, and can be implemented to work in real time.

Journal ArticleDOI
TL;DR: A method to adapt the DTW technique in order to deal with the length and the density profile, which are common features used in classifying chromosomes, has the main advantage of requiring only a small training set in comparison with the conventional methods based on Bayesian classifiers or neural networks.

Journal ArticleDOI
TL;DR: A full-body layered deformable model (LDM) inspired by manually labeled silhouettes for automatic model-based gait recognition from part-level gait dynamics in monocular video sequences and can serve as an analysis tool for studying factors affecting the gait under various conditions.
Abstract: This paper proposes a full-body layered deformable model (LDM) inspired by manually labeled silhouettes for automatic model-based gait recognition from part-level gait dynamics in monocular video sequences. The LDM is defined for the fronto-parallel gait with 22 parameters describing the human body part shapes (widths and lengths) and dynamics (positions and orientations). There are four layers in the LDM and the limbs are deformable. Algorithms for LDM-based human body pose recovery are then developed to estimate the LDM parameters from both manually labeled and automatically extracted silhouettes, where the automatic silhouette extraction is through a coarse-to-fine localization and extraction procedure. The estimated LDM parameters are used for model-based gait recognition by employing the dynamic time warping for matching and adopting the combination scheme in AdaBoost.M2. While the existing model-based gait recognition approaches focus primarily on the lower limbs, the estimated LDM parameters enable us to study full-body model-based gait recognition by utilizing the dynamics of the upper limbs, the shoulders and the head as well. In the experiments, the LDM-based gait recognition is tested on gait sequences with differences in shoe-type, surface, carrying condition and time. The results demonstrate that the recognition performance benefits from not only the lower limb dynamics, but also the dynamics of the upper limbs, the shoulders and the head. In addition, the LDM can serve as an analysis tool for studying factors affecting the gait under various conditions.

Journal ArticleDOI
TL;DR: A method for the automatic handwritten signature verification (AHSV) that relies on global features that summarize different aspects of signature shape and dynamics of signature production and shows that the correctness of the algorithm detecting the signature is more acceptable.

Journal ArticleDOI
TL;DR: A novel clustering method is developed that combines an initial pairwise curve alignment to adjust for time variation within likely clusters and shows excellent performance over standard clustering methods in terms of cluster quality measures in simulations and for yeast and human fibroblast data sets.
Abstract: Current clustering methods are routinely applied to gene expression time course data to find genes with similar activation patterns and ultimately to understand the dynamics of biological processes. As the dynamic unfolding of a biological process often involves the activation of genes at different rates, successful clustering in this context requires dealing with varying time and shape patterns simultaneously. This motivates the combination of a novel pairwise warping with a suitable clustering method to discover expression shape clusters. We develop a novel clustering method that combines an initial pairwise curve alignment to adjust for time variation within likely clusters. The cluster-specific time synchronization method shows excellent performance over standard clustering methods in terms of cluster quality measures in simulations and for yeast and human fibroblast data sets. In the yeast example, the discovered clusters have high concordance with the known biological processes.

Journal ArticleDOI
TL;DR: A new data mining technique used to classify normal and pre-seizure electroencephalograms is proposed that is superior to the standard SVM and improves the brain activity classification.
Abstract: A new data mining technique used to classify normal and pre-seizure electroencephalograms is proposed. The technique is based on a dynamic time warping kernel combined with support vector machines (SVMs). The experimental results show that the technique is superior to the standard SVM and improves the brain activity classification.

Proceedings ArticleDOI
05 Nov 2008
TL;DR: Results indicate that only a small set of examples is required to perform reliable motion classification in a motion-capture database on the basis of a dynamic time warping (DTW) approach.
Abstract: Automatic generation of metadata is an important component of multimedia search-by-content systems as it both avoids the need for manual annotation as well as minimising subjective descriptions and human errors. This paper explores the automatic attachment of basic descriptions (or 'Tags') to human motion held in a motion-capture database on the basis of a dynamic time warping (DTW) approach. The captured motion is held in the Acclaim ASF/AMC format commonly used in game and movie motion capture work and the approach allows for the comparison and classification of motion from different subjects. The work analyses the bone rotations important to a small set of movements and results indicate that only a small set of examples is required to perform reliable motion classification.

Journal Article
TL;DR: A Dynamic Time Warping technique which reduces significantly the data processing time and memory size of multi-dimensional time series sampled by the biometric smart pen device BiSP is presented.
Abstract: The purpose of this paper is to present a Dynamic Time Warping technique which significantly reduces the data processing time and memory size of multi-dimensional time series sampled by the biometric smart pen device BiSP. The acquisition device is a novel ballpoint pen equipped with a diversity of sensors for monitoring the kinematics and dynamics of handwriting movement. The DTW algorithm has been applied for time series analysis of five different sensor channels providing pressure, acceleration and tilt data of the pen generated during handwriting on a paper pad. However, the standard DTW has processing time and memory space problems which limit its practical use for online handwriting recognition. To address this problem, the DTW has been applied to the sum of the five sensor signals after an adequate down-sampling of the data. Preliminary results have shown that processing time and memory size could be reduced significantly without deterioration of performance in single character and word recognition. Furthermore, excellent recognition accuracy was achieved, which is mainly due to the reduced dynamic time warping (RDTW) technique and the novel pen device BiSP.
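The reduction step described above (sum the sensor channels, down-sample, then run DTW on the resulting one-dimensional series) can be sketched as follows. The abstract does not specify the down-sampling scheme, so plain decimation (keeping every `factor`-th sample) is an assumption here, as are the names `reduce_channels` and `dtw_distance` and the absolute-difference cost.

```python
def reduce_channels(channels, factor):
    """Sum equal-length channels sample by sample, then keep every `factor`-th sample."""
    summed = [sum(samples) for samples in zip(*channels)]
    return summed[::factor]  # assumed scheme: plain decimation, no filtering


def dtw_distance(a, b):
    """DTW distance between two numeric sequences (two-row dynamic program)."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            curr[j] = abs(x - y) + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


# Two toy recordings with two channels each (the BiSP pen provides five).
recording_a = [[0, 1, 2, 3, 2, 1, 0, 0], [0, 0, 1, 1, 1, 0, 0, 0]]
recording_b = [[0, 0, 1, 2, 3, 2, 1, 0], [0, 0, 0, 1, 1, 1, 0, 0]]
distance = dtw_distance(reduce_channels(recording_a, 2),
                        reduce_channels(recording_b, 2))
```

Down-sampling both series by a factor k shrinks the DTW cost table by roughly k squared, and summing the channels replaces a multi-dimensional local cost with a scalar one; both contribute to the time and memory savings the abstract reports.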

Journal Article
TL;DR: This work proposes a model-based method for tracking hand motion in space, thereby estimating the hand motion trajectory and demonstrates that the proposed trajectory estimator and classifier is suitable for Human Computer Interaction (HCI) platform.
Abstract: One very interesting field of research in Pattern Recognition that has gained much attention in recent times is Gesture Recognition. In this paper, we consider a form of dynamic hand gestures that are characterized by total movement of the hand (arm) in space. For these types of gestures, the shape of the hand (palm) during gesturing does not bear any significance. In our work, we propose a model-based method for tracking hand motion in space, thereby estimating the hand motion trajectory. We employ the dynamic time warping (DTW) algorithm for time alignment and normalization of spatio-temporal variations that exist among samples belonging to the same gesture class. During training, one template trajectory and one prototype feature vector are generated for every gesture class. Features used in our work include some static and dynamic motion trajectory features. Recognition is accomplished in two stages. In the first stage, all unlikely gesture classes are eliminated by comparing the input gesture trajectory to all the template trajectories. In the next stage, feature vector extracted from the input gesture is compared to all the class prototype feature vectors using a distance classifier. Experimental results demonstrate that our proposed trajectory estimator and classifier is suitable for Human Computer Interaction (HCI) platform.