Showing papers on "Dynamic time warping" published in 2013


Journal ArticleDOI
TL;DR: Experimental studies show that TSF using simple features such as mean, standard deviation and slope is computationally efficient and outperforms strong competitors such as one-nearest-neighbor classifiers with dynamic time warping.

403 citations
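
The three features named in the TL;DR above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `interval_features` and the choice of interval are assumptions, and the slope is taken to be the least-squares slope of the values against their time index.

```python
from statistics import fmean, pstdev


def interval_features(series, start, end):
    """Return (mean, standard deviation, slope) of series[start:end].

    Hypothetical helper: the interval bounds are supplied by the caller.
    """
    window = series[start:end]
    n = len(window)
    if n < 2:
        raise ValueError("interval needs at least two points")
    mean = fmean(window)
    std = pstdev(window)  # population standard deviation
    # Least-squares slope of value against time index 0..n-1.
    t_mean = (n - 1) / 2
    num = sum((t - t_mean) * (v - mean) for t, v in enumerate(window))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return mean, std, num / den
```

Each call costs time linear in the interval length, which is consistent with the efficiency claim, although the classifier built on top of these features is not shown here.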


Journal ArticleDOI
TL;DR: A framework that classifies time series using a bag-of-features representation (TSBF) offers a feature-based approach that can handle warping (although differently from DTW); experimental results show that TSBF provides better results than competitive methods on benchmark datasets from the UCR time series database.
Abstract: Time series classification is an important task with many challenging applications. A nearest neighbor (NN) classifier with dynamic time warping (DTW) distance is a strong solution in this context. On the other hand, feature-based approaches have been proposed as both classifiers and to provide insight into the series, but these approaches have problems handling translations and dilations in local patterns. Considering these shortcomings, we present a framework to classify time series based on a bag-of-features representation (TSBF). Multiple subsequences selected from random locations and of random lengths are partitioned into shorter intervals to capture the local information. Consequently, features computed from these subsequences measure properties at different locations and dilations when viewed from the original series. This provides a feature-based approach that can handle warping (although differently from DTW). Moreover, a supervised learner (that handles mixed data types, different units, etc.) integrates location information into a compact codebook through class probability estimates. Additionally, relevant global features can easily supplement the codebook. TSBF is compared to NN classifiers and other alternatives (bag-of-words strategies, sparse spatial sample kernels, shapelets). Our experimental results show that TSBF provides better results than competitive methods on benchmark datasets from the UCR time series database.

320 citations


Journal ArticleDOI
TL;DR: In this paper, the nonlinear accumulation of alignment errors from a classic dynamic time warping algorithm is applied repeatedly along all sampled axes of a multidimensional image to estimate relative time (or depth) shifts between two seismic images, including cases where shifts are large and vary rapidly with time and space and where images are contaminated with noise or are otherwise not shifted versions of one another.
Abstract: The problem of estimating relative time (or depth) shifts between two seismic images is ubiquitous in seismic data processing. This problem is especially difficult where shifts are large and vary rapidly with time and space, and where images are contaminated with noise or for other reasons are not shifted versions of one another. A new solution to this problem requires only simple extensions of a classic dynamic time warping algorithm for speech recognition. A key component of that classic algorithm is a nonlinear accumulation of alignment errors. By applying the same nonlinear accumulator repeatedly in all directions along all sampled axes of a multidimensional image, I obtain a new and effective method for dynamic image warping (DIW). In tests where known shifts vary rapidly, this new method is more accurate than methods based on crosscorrelations of windowed images. DIW also aligns seismic reflectors well in examples where shifts are unknown, for images with differences not limited to time shifts.

264 citations
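
The "nonlinear accumulation of alignment errors" mentioned in the abstract above is the min-based recurrence of classic DTW. The sketch below shows only that textbook one-dimensional recurrence, assuming absolute difference as the alignment error; it is not the multidimensional dynamic image warping method itself, and the name `dtw_distance` is illustrative.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # acc[i][j] holds the cheapest accumulated error aligning a[:i] with b[:j].
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            err = abs(a[i - 1] - b[j - 1])  # local alignment error
            # Nonlinear step: keep only the cheapest of the three predecessors.
            acc[i][j] = err + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the repeated sample is absorbed by the warp.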


Journal ArticleDOI
14 Jun 2013-Sensors
TL;DR: In this article, a novel method for fully automatic facial expression recognition in facial image sequences is presented, where facial landmarks are automatically tracked in consecutive video frames using displacements estimated by elastic bunch graph matching.
Abstract: Facial expressions are widely used in the behavioral interpretation of emotions, cognitive science, and social interactions. In this paper, we present a novel method for fully automatic facial expression recognition in facial image sequences. As the facial expression evolves over time, facial landmarks are automatically tracked in consecutive video frames, using displacements estimated by elastic bunch graph matching. Feature vectors from the tracking results of individual landmarks, as well as of pairs of landmarks, are extracted and normalized with respect to the first frame in the sequence. The prototypical expression sequence for each class of facial expression is formed by taking the median of the landmark tracking results from the training facial expression sequences. Multi-class AdaBoost, with the dynamic time warping similarity distance between the feature vector of the input facial expression and that of the prototypical facial expression used as a weak classifier, selects the subset of discriminative feature vectors. Finally, two methods for facial expression recognition are presented: multi-class AdaBoost with dynamic time warping, and a support vector machine on the boosted feature vectors. The results on the Cohn-Kanade (CK+) facial expression database show a recognition accuracy of 95.17% and 97.35% using multi-class AdaBoost and support vector machines, respectively.

233 citations


Journal ArticleDOI
TL;DR: This work shows that by using a combination of four novel ideas the authors can search and mine massive time series for the first time, and demonstrates the following unintuitive fact: in large datasets they can exactly search under Dynamic Time Warping much more quickly than the current state-of-the-art Euclidean distance search algorithms.
Abstract: Most time series data mining algorithms use similarity search as a core subroutine, and thus the time taken for similarity search is the bottleneck for virtually all time series data mining algorithms, including classification, clustering, motif discovery, anomaly detection, and so on. The difficulty of scaling a search to large datasets explains to a great extent why most academic work on time series data mining has plateaued at considering a few millions of time series objects, while much of industry and science sits on billions of time series objects waiting to be explored. In this work we show that by using a combination of four novel ideas we can search and mine massive time series for the first time. We demonstrate the following unintuitive fact: in large datasets we can exactly search under Dynamic Time Warping (DTW) much more quickly than the current state-of-the-art Euclidean distance search algorithms. We demonstrate our work on the largest set of time series experiments ever attempted. In particular, the largest dataset we consider is larger than the combined size of all of the time series datasets considered in all data mining papers ever published. We explain how our ideas allow us to solve higher-level time series data mining problems such as motif discovery and clustering at scales that would otherwise be untenable. Moreover, we show how our ideas allow us to efficiently support the uniform scaling distance measure, a measure whose utility seems to be underappreciated, but which we demonstrate here. In addition to mining massive datasets with up to one trillion datapoints, we will show that our ideas also have implications for real-time monitoring of data streams, allowing us to handle much faster arrival rates and/or use cheaper and lower powered devices than are currently possible.

196 citations


Proceedings ArticleDOI
01 Dec 2013
TL;DR: Several supervised and unsupervised approaches to the problem of embedding speech segments of arbitrary length into fixed-dimensional spaces in which simple distances serve as a proxy for linguistically meaningful (phonetic, lexical, etc.) dissimilarities are explored.
Abstract: Measures of acoustic similarity between words or other units are critical for segmental exemplar-based acoustic models, spoken term discovery, and query-by-example search. Dynamic time warping (DTW) alignment cost has been the most commonly used measure, but it has well-known inadequacies. Some recently proposed alternatives require large amounts of training data. In the interest of finding more efficient, accurate, and low-resource alternatives, we consider the problem of embedding speech segments of arbitrary length into fixed-dimensional spaces in which simple distances (such as cosine or Euclidean) serve as a proxy for linguistically meaningful (phonetic, lexical, etc.) dissimilarities. Such embeddings would enable efficient audio indexing and permit application of standard distance learning techniques to segmental acoustic modeling. In this paper, we explore several supervised and unsupervised approaches to this problem and evaluate them on an acoustic word discrimination task. We identify several embedding algorithms that match or improve upon the DTW baseline in low-resource settings.

140 citations


Journal ArticleDOI
TL;DR: A novel metric for time series, called Move-Split-Merge (MSM), is proposed, which uses as building blocks three fundamental operations: Move, Split, and Merge, which can be applied in sequence to transform any time series into any other time series.
Abstract: A novel metric for time series, called Move-Split-Merge (MSM), is proposed. This metric uses as building blocks three fundamental operations: Move, Split, and Merge, which can be applied in sequence to transform any time series into any other time series. A Move operation changes the value of a single element, a Split operation converts a single element into two consecutive elements, and a Merge operation merges two consecutive elements into one. Each operation has an associated cost, and the MSM distance between two time series is defined to be the cost of the cheapest sequence of operations that transforms the first time series into the second one. An efficient, quadratic-time algorithm is provided for computing the MSM distance. MSM has the desirable properties of being metric, in contrast to the Dynamic Time Warping (DTW) distance, and invariant to the choice of origin, in contrast to the Edit Distance with Real Penalty (ERP) metric. At the same time, experiments with public time series data sets demonstrate that MSM is a meaningful distance measure, that oftentimes leads to lower nearest neighbor classification error rate compared to DTW and ERP.

136 citations
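
The quadratic-time computation described in the abstract above can be sketched with a dynamic program. The recurrence below follows the commonly cited formulation of MSM, in which a Move costs the absolute change in value and every Split or Merge costs a constant `c` plus a penalty when the new value does not lie between its neighbours; treat it as an assumption-laden sketch rather than the authors' reference code.

```python
def msm_distance(x, y, c=1.0):
    """Move-Split-Merge distance between non-empty numeric sequences.

    `c` is the cost of a single Split or Merge (an assumed default).
    """
    if not x or not y:
        raise ValueError("both series must be non-empty")

    def split_merge_cost(new, prev, other):
        # Cheap when `new` lies between its predecessor and the other series' value.
        if prev <= new <= other or prev >= new >= other:
            return c
        return c + min(abs(new - prev), abs(new - other))

    m, n = len(x), len(y)
    cost = [[0.0] * n for _ in range(m)]
    cost[0][0] = abs(x[0] - y[0])
    for i in range(1, m):
        cost[i][0] = cost[i - 1][0] + split_merge_cost(x[i], x[i - 1], y[0])
    for j in range(1, n):
        cost[0][j] = cost[0][j - 1] + split_merge_cost(y[j], x[0], y[j - 1])
    for i in range(1, m):
        for j in range(1, n):
            cost[i][j] = min(
                cost[i - 1][j - 1] + abs(x[i] - y[j]),                        # Move
                cost[i - 1][j] + split_merge_cost(x[i], x[i - 1], y[j]),      # Merge in x
                cost[i][j - 1] + split_merge_cost(y[j], x[i], y[j - 1]),      # Split in x
            )
    return cost[m - 1][n - 1]
```

The table has one cell per pair of elements, which gives the quadratic running time the abstract refers to.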


Proceedings Article
21 Feb 2013
TL;DR: This work proposes a weighted DTW method that weights joints by optimizing a discriminant ratio and demonstrates the recognition performance of the proposed weighted DTW with respect to the conventional DTW and state-of-the-art Kinect.
Abstract: With Microsoft’s launch of Kinect in 2010, and release of Kinect SDK in 2011, numerous applications and research projects exploring new ways in human-computer interaction have been enabled. Gesture recognition is a technology often used in human-computer interaction applications. Dynamic time warping (DTW) is a template matching algorithm and is one of the techniques used in gesture recognition. To recognize a gesture, DTW warps a time sequence of joint positions to reference time sequences and produces a similarity value. However, not all body joints are equally important in computing the similarity of two sequences. We propose a weighted DTW method that weights joints by optimizing a discriminant ratio. Finally, we demonstrate the recognition performance of our proposed weighted DTW with respect to the conventional DTW and state-of-

131 citations
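
One way to read the joint-weighting idea above is as a weighted local cost inside the usual DTW recurrence. The sketch below assumes each frame is a list of 3D joint positions and that the per-joint weights are already given; how the weights are obtained (the discriminant-ratio optimization) is not shown, and the names `weighted_frame_cost` and `weighted_dtw` are hypothetical.

```python
import math


def weighted_frame_cost(frame_a, frame_b, weights):
    """Weighted sum of per-joint Euclidean distances between two frames."""
    return sum(w * math.dist(p, q) for w, p, q in zip(weights, frame_a, frame_b))


def weighted_dtw(seq_a, seq_b, weights):
    """DTW cost between two gesture sequences using the weighted frame cost."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = weighted_frame_cost(seq_a[i - 1], seq_b[j - 1], weights)
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]
```

Setting a joint's weight to zero removes its influence entirely, which illustrates why unequal weights can change which reference sequence is the closest match.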


Journal ArticleDOI
TL;DR: A new distance function based on a derivative is proposed, which considers the general shape of a time series rather than point-to-point function comparison, and is used in classification with the nearest neighbor rule.
Abstract: Over recent years the popularity of time series has soared. Given the widespread use of modern information technology, a large number of time series may be collected during business, medical or biological operations, for example. As a consequence there has been a dramatic increase in the amount of interest in querying and mining such data, which in turn has resulted in a large number of works introducing new methodologies for indexing, classification, clustering and approximation of time series. In particular, many new distance measures between time series have been introduced. In this paper, we propose a new distance function based on a derivative. In contrast to well-known measures from the literature, our approach considers the general shape of a time series rather than point-to-point function comparison. The new distance is used in classification with the nearest neighbor rule. In order to provide a comprehensive comparison, we conducted a set of experiments, testing effectiveness on 20 time series datasets from a wide variety of application domains. Our experiments show that our method provides a higher quality of classification on most of the examined datasets.

117 citations


Proceedings ArticleDOI
06 May 2013
TL;DR: A recognition procedure based on Dynamic Time Warping and Mahalanobis distance is found to ensure good classification results and enhance the consistency of the recognition via the use of a classifier allowing unknown as an answer.
Abstract: The automatic assessment of the level of independence of a person, based on the recognition of a set of Activities of Daily Living, is among the most challenging research fields in Ambient Intelligence. The article proposes a framework for the recognition of motion primitives, relying on Gaussian Mixture Modeling and Gaussian Mixture Regression for the creation of activity models. A recognition procedure based on Dynamic Time Warping and Mahalanobis distance is found to: (i) ensure good classification results; (ii) exploit the properties of GMM and GMR modeling to allow for an easy run-time recognition; (iii) enhance the consistency of the recognition via the use of a classifier allowing unknown as an answer.

114 citations


Proceedings ArticleDOI
25 Aug 2013
TL;DR: This work presents a holistic word recognition framework that represents the scene text image and synthetic images generated from lexicon words using gradient-based features, and recognizes the text in the image by matching the scene and synthetic image features with the novel weighted Dynamic Time Warping (wDTW) approach.
Abstract: Recognizing text in images taken in the wild is a challenging problem that has received great attention in recent years. Previous methods addressed this problem by first detecting individual characters, and then forming them into words. Such approaches often suffer from weak character detections, due to large intra-class variations, even more so than characters from scanned documents. We take a different view of the problem and present a holistic word recognition framework. In this, we first represent the scene text image and synthetic images generated from lexicon words using gradient-based features. We then recognize the text in the image by matching the scene and synthetic image features with our novel weighted Dynamic Time Warping (wDTW) approach. We perform experimental analysis on challenging public datasets, such as Street View Text and ICDAR 2003. Our proposed method significantly outperforms our earlier work in Mishra et al. (CVPR 2012), as well as many other recent works, such as Novikova et al. (ECCV 2012), Wang et al. (ICPR 2012), and Wang et al. (ICCV 2011).

Proceedings Article
01 Jan 2013
TL;DR: The experimental results involving 35 signatures from 18 subjects and a brute-force attacker have shown that KinWrite can achieve a 100% precision and a 70% recall for verifying honest users, encouraging us to carry out a much larger scale study towards designing a foolproof system.
Abstract: Password-based authentication is easy to use but its security is bounded by how much a user can remember. Biometrics-based authentication requires no memorization but ‘resetting’ a biometric password may not always be possible. In this paper, we propose a user-friendly authentication system (KinWrite) that allows users to choose arbitrary, short and easy-to-memorize passwords while providing resilience to password cracking and password theft. KinWrite lets users write their passwords in 3D space and captures the handwriting motion using a low cost motion input sensing device—Kinect. The low resolution and noisy data captured by Kinect, combined with low consistency of in-space handwriting, have made it challenging to verify users. To overcome these challenges, we exploit the Dynamic Time Warping (DTW) algorithm to quantify similarities between handwritten passwords. Our experimental results involving 35 signatures from 18 subjects and a brute-force attacker have shown that KinWrite can achieve a 100% precision and a 70% recall (the worst case) for verifying honest users, encouraging us to carry out a much larger scale study towards designing a foolproof system.

Journal ArticleDOI
TL;DR: The new Cross Dynamic Time Warping (DTW) Metric gives the best performance for gait recognition where users are identified correctly in 89.3% of the cases and the false positive probability is as low as 1.4%.

Proceedings ArticleDOI
09 Dec 2013
TL;DR: Experimental results have shown that the precision and recall scores for 20 classes of the proposed multi-modal gesture recognition framework can achieve 0.8829 and 0.8890 respectively, which proves that the method is able to correctly reject false detection caused by single classifier.
Abstract: This paper proposes a novel multi-modal gesture recognition framework and introduces its application to continuous sign language recognition. A Hidden Markov Model is used to construct the audio feature classifier. A skeleton feature classifier is trained to provide complementary information based on the Dynamic Time Warping model. The confidence scores generated by the two classifiers are first normalized and then combined to produce a weighted sum for the final recognition. Experimental results have shown that the precision and recall scores for 20 classes of our multi-modal recognition framework can achieve 0.8829 and 0.8890 respectively, which proves that our method is able to correctly reject false detections caused by a single classifier. Our approach scored 0.12756 in mean Levenshtein distance and was ranked 1st in the Multi-modal Gesture Recognition Challenge in 2013.
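
The normalize-then-weighted-sum step in the abstract above can be illustrated with a few lines. The abstract does not state which normalization or weights were used, so the min-max scaling, the default `audio_weight=0.5`, and the assumption that higher scores mean higher confidence are all illustrative choices here.

```python
def min_max_normalize(scores):
    """Scale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(audio_scores, skeleton_scores, audio_weight=0.5):
    """Return (index of best class, fused per-class scores).

    Both inputs hold one confidence per class, higher meaning more likely.
    """
    audio = min_max_normalize(audio_scores)
    skeleton = min_max_normalize(skeleton_scores)
    fused = [
        audio_weight * a + (1.0 - audio_weight) * s
        for a, s in zip(audio, skeleton)
    ]
    best = max(range(len(fused)), key=fused.__getitem__)
    return best, fused
```

Normalizing first keeps one classifier from dominating the sum simply because its raw scores live on a larger scale.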

Proceedings ArticleDOI
01 Dec 2013
TL;DR: Although recurrence plots cannot provide the best accuracy rates for all data sets, it is demonstrated that it can be predicted ahead of time that the method will outperform the time representation with Euclidean and Dynamic Time Warping distances.
Abstract: There is a huge increase of interest for time series methods and techniques. Virtually every piece of information collected from human, natural, and biological processes is susceptible to changes over time, and the study of how these changes occur is a central issue in fully understanding such processes. Among all time series mining tasks, classification is likely to be the most prominent one. In time series classification there is a significant body of empirical research that indicates that k-nearest neighbor rule in the time domain is very effective. However, certain time series features are not easily identified in this domain and a change in representation may reveal some significant and unknown features. In this work, we propose the use of recurrence plots as representation domain for time series classification. Our approach measures the similarity between recurrence plots using Campana-Keogh (CK-1) distance, a Kolmogorov complexity-based distance that uses video compression algorithms to estimate image similarity. We show that recurrence plots allied to CK-1 distance lead to significant improvements in accuracy rates compared to Euclidean distance and Dynamic Time Warping in several data sets. Although recurrence plots cannot provide the best accuracy rates for all data sets, we demonstrate that we can predict ahead of time that our method will outperform the time representation with Euclidean and Dynamic Time Warping distances.

Proceedings ArticleDOI
01 Dec 2013
TL;DR: A novel discriminative learning-based temporal alignment method, called maximum margin temporal warping (MMTW), to align two action sequences and measure their matching score, based on the latent structure SVM formulation.
Abstract: Temporal misalignment and duration variation in video actions largely influence the performance of action recognition, but it is very difficult to specify effective temporal alignment on action sequences. To address this challenge, this paper proposes a novel discriminative learning-based temporal alignment method, called maximum margin temporal warping (MMTW), to align two action sequences and measure their matching score. Based on the latent structure SVM formulation, the proposed MMTW method is able to learn a phantom action template to represent an action class for maximum discrimination against other classes. The recognition of this action class is based on the associated learned alignment of the input action. Extensive experiments on five benchmark datasets have demonstrated that this MMTW model is able to significantly improve the accuracy and robustness of action recognition under temporal misalignment and variations.

Proceedings ArticleDOI
02 Dec 2013
TL;DR: A technique for gait cycle extraction by incorporating the Piecewise Linear Approximation (PLA) technique is presented and two new approaches to classify gait features extracted from the cycle-based segmentation by using Support Vector Machines (SVMs) are presented.
Abstract: Biometric gait authentication using Personal Mobile Device (PMD) based accelerometer sensors offers a user-friendly, unobtrusive, and periodic way of authenticating individuals on PMD. In this paper, we present a technique for gait cycle extraction by incorporating the Piecewise Linear Approximation (PLA) technique. We also present two new approaches to classify gait features extracted from the cycle-based segmentation by using Support Vector Machines (SVMs): a) pre-computed data matrix, b) pre-computed kernel matrix. In the first approach, we used Dynamic Time Warping (DTW) distance to compute data matrices, and in the latter, DTW is used for constructing an elastic similarity measure based kernel function called the Gaussian Dynamic Time Warp (GDTW) kernel. Both approaches utilize the DTW similarity measure and can be used for classifying equal length gait cycles, as well as different length gait cycles. To evaluate our approaches we used normal walk biometric gait data of 51 participants. This gait data is collected by attaching a PMD to the belt around the waist, on the right-hand side of the hip. Results show that these new approaches need to be studied more, and could potentially lead us to design more robust and reliable gait authentication systems using PMD based accelerometer sensors.

Journal Article
TL;DR: An approach to isolated speech recognition using Mel-Scale Frequency Cepstral Coefficients (MFCC) and Dynamic Time Warping (DTW), in which DTW is used to cope with different speaking speeds.
Abstract: This paper describes an approach to isolated speech recognition using Mel-Scale Frequency Cepstral Coefficients (MFCC) and Dynamic Time Warping (DTW). Several features are extracted from the speech signal of spoken words. An experimental database of five speakers, each speaking 10 digits, was collected in an acoustically controlled room. MFCC are extracted from the speech signal of spoken words. To cope with different speaking speeds in speech recognition, Dynamic Time Warping (DTW) is used. DTW is an algorithm for measuring similarity between two sequences that may vary in time or speed.

Journal ArticleDOI
TL;DR: A new approach to trace the dynamic patterns of task-based functional connectivity, by combining signal segmentation, dynamic time warping (DTW), and Quality Threshold (QT) clustering techniques, is presented.

Book ChapterDOI
10 Sep 2013
TL;DR: A new robust method for inertial MEM (MicroElectroMechanical systems) 3D gesture recognition based on Bidirectional Long Short-Term Memory Recurrent Neural Networks (BLSTM-RNN) for gesture classification from raw MEM data is presented.
Abstract: This paper presents a new robust method for inertial MEM (MicroElectroMechanical systems) 3D gesture recognition. The linear acceleration and the angular velocity, respectively provided by the accelerometer and the gyrometer, are sampled in time resulting in 6D values at each time step which are used as inputs for the gesture recognition system. We propose to build a system based on Bidirectional Long Short-Term Memory Recurrent Neural Networks (BLSTM-RNN) for gesture classification from raw MEM data. We also compare this system to a geometric approach using DTW (Dynamic Time Warping) and a statistical method based on HMM (Hidden Markov Model) from filtered and denoised MEM data. Experimental results on 22 individuals producing 14 gestures in the air show that the proposed approach outperforms classical classification methods with a classification mean rate of 95.57% and a standard deviation of 0.50 for 616 test gestures. Furthermore, these experiments underline that combining accelerometer and gyrometer information gives better results than using a single inertial description.

Journal ArticleDOI
TL;DR: In this article, the authors used dynamic time warping (DTW) to process the motor current signals for detecting and quantifying common faults in a downstream two-stage reciprocating compressor.

Proceedings ArticleDOI
26 May 2013
TL;DR: This work quantitatively represents emotion flow within an utterance by estimating short-time affective characteristics and shows that the similarity-based pattern modeling outperforms both a feature-based baseline and static modeling and provides insight into typical high-level patterns of emotion.
Abstract: Human emotion changes continuously and sequentially. This results in dynamics intrinsic to affective communication. One of the goals of automatic emotion recognition research is to computationally represent and analyze these dynamic patterns. In this work, we focus on the global utterance-level dynamics. We are motivated by the hypothesis that global dynamics have emotion-specific variations that can be used to differentiate between emotion classes. Consequently, classification systems that focus on these patterns will be able to make accurate emotional assessments. We quantitatively represent emotion flow within an utterance by estimating short-time affective characteristics. We compare time-series estimates of these characteristics using Dynamic Time Warping, a time-series similarity measure. We demonstrate that this similarity can effectively recognize the affective label of the utterance. The similarity-based pattern modeling outperforms both a feature-based baseline and static modeling. It also provides insight into typical high-level patterns of emotion. We visualize these dynamic patterns and the similarities between the patterns to gain insight into the nature of emotion expression.

01 Jan 2013
TL;DR: A novel way to control and interact with computers by moving fingers in the air is proposed and a fast algorithm using dynamic time warping to recognize characters in online fashion is proposed.
Abstract: Recent technologies in vision sensors are capable of capturing 3D finger positions and movements. We propose a novel way to control and interact with computers by moving fingers in the air. The positions of fingers are precisely captured by a computer vision device. By tracking the moving patterns of fingers, we can then recognize users’ intended control commands or input information. We demonstrate this human input approach through an example application of handwriting recognition. By treating the input as a time series of 3D positions, we propose a fast algorithm using dynamic time warping to recognize characters in online fashion. We employ various optimization techniques to recognize in real time as one writes. Experiments show promising recognition performance and speed.

Proceedings ArticleDOI
26 May 2013
TL;DR: A dynamic time-warping (DTW) based framework applied to the Kinect's skeletal information for user access control is proposed and yields promising results.
Abstract: The Kinect has primarily been used as a gesture-driven device for motion-based controls. To date, Kinect-based research has predominantly focused on improving tracking and gesture recognition across a wide base of users. In this paper, we propose to use the Kinect for biometrics; rather than accommodating a wide range of users we exploit each user's uniqueness in terms of gestures. Unlike pure biometrics, such as iris scanners, face detectors, and fingerprint recognition which depend on irrevocable biometric data, the Kinect can provide additional revocable gesture information. We propose a dynamic time-warping (DTW) based framework applied to the Kinect's skeletal information for user access control. Our approach is validated in two scenarios: user identification, and user authentication on a dataset of 20 individuals performing 8 unique gestures. We obtain an overall 4.14%, and 1.89% Equal Error Rate (EER) in user identification, and user authentication, respectively, for a gesture and consistently outperform related work on this dataset. Given the natural noise present in the real-time depth sensor this yields promising results.

Journal ArticleDOI
TL;DR: A genetic algorithm coupled with Dynamic Time Warping (DTW) is proposed to solve the issues of misalignment among the reference systems and the lack of synchronization among the devices in a Vicon environment.
Abstract: This paper presents a methodology for a reliable comparison among Inertial Measurement Units or attitude estimation devices in a Vicon environment. The misalignment among the reference systems and the lack of synchronization among the devices are the main problems for the correct performance evaluation using Vicon as reference measurement system. We propose a genetic algorithm coupled with Dynamic Time Warping (DTW) to solve these issues. To validate the efficacy of the methodology, a performance comparison is implemented between the WB-3 ultra-miniaturized Inertial Measurement Unit (IMU), developed by our group, with the commercial IMU InertiaCube3 by InterSense.

Journal ArticleDOI
TL;DR: A three-phase gait recognition method is presented that analyses the spatio-temporal shape and dynamic motion characteristics of a human subject's silhouettes to identify the subject in the presence of most of the challenging factors that affect existing gait recognition systems.

Journal ArticleDOI
23 Oct 2013-PLOS ONE
TL;DR: For each of the three clustering techniques, a seven-level Parsons algorithm provided better clustering than the correlation and dynamic time warping algorithms, and was closer to the near-perfect visual categorisations of human judges.
Abstract: Bottlenose dolphins (Tursiops truncatus) produce many vocalisations, including whistles that are unique to the individual producing them. Such “signature whistles” play a role in individual recognition and maintaining group integrity. Previous work has shown that humans can successfully group the spectrographic representations of signature whistles according to the individual dolphins that produced them. However, attempts at using mathematical algorithms to perform a similar task have been less successful. A greater understanding of the encoding of identity information in signature whistles is important for assessing similarity of whistles and thus social influences on the development of these learned calls. We re-examined 400 signature whistles from 20 individual dolphins used in a previous study, and tested the performance of new mathematical algorithms. We compared the measure used in the original study (correlation matrix of evenly sampled frequency measurements) to one used in several previous studies (similarity matrix of time-warped whistles), and to a new algorithm based on the Parsons code, used in music retrieval databases. The Parsons code records the direction of frequency change at each time step, and is effective at capturing human perception of music. We analysed similarity matrices from each of these three techniques, as well as a random control, by unsupervised clustering using three separate techniques: k-means clustering, hierarchical clustering, and an adaptive resonance theory neural network. For each of the three clustering techniques, a seven-level Parsons algorithm provided better clustering than the correlation and dynamic time warping algorithms, and was closer to the near-perfect visual categorisations of human judges. Thus, the Parsons code captures much of the individual identity information present in signature whistles, and may prove useful in studies requiring quantification of whistle similarity.
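
The abstract above describes the Parsons code as recording the direction of frequency change at each time step. A minimal sketch of the basic three-symbol form (up, down, repeat) is given below; the seven-level variant used in the study is not specified in the abstract and is not reproduced here, and the `tolerance` parameter is an illustrative assumption.

```python
def parsons_code(freqs, tolerance=0.0):
    """Encode a frequency contour as a string of U (up), D (down), R (repeat).

    Changes whose magnitude is at most `tolerance` are treated as repeats.
    """
    symbols = []
    for prev, cur in zip(freqs, freqs[1:]):
        delta = cur - prev
        if delta > tolerance:
            symbols.append("U")
        elif delta < -tolerance:
            symbols.append("D")
        else:
            symbols.append("R")
    return "".join(symbols)
```

Because only the direction of change is kept, two contours that differ in absolute frequency but share the same shape receive the same code.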

Proceedings ArticleDOI
19 Oct 2013
TL;DR: This paper empirically compares 48 measures on 42 time series data sets and shows that Complex Invariant Distance DTW (CIDDTW) significantly outperforms DTW and that CIDDTW, DTW, CID, Minkowski L-p (p-norm difference with data set-crafted "p" parameter), Lorentzian L-infinity, Manhattan L-1, Average L-1/L-infinity, Dice distance, and Jaccard distance outperform ED.
Abstract: Distance and dissimilarity functions are of undoubted importance to Time Series Data Mining. There are literally hundreds of methods proposed in the literature that rely on a dissimilarity measure as the main way to compare objects. One notable example is the 1-Nearest Neighbor classification algorithm. These methods frequently outperform more complex methods in tasks such as classification, clustering, prediction, and anomaly detection. All these methods leave open the choice of distance or dissimilarity function, with Euclidean distance (ED) and Dynamic Time Warping (DTW) being the two most used dissimilarity measures in the literature. This paper empirically compares 48 measures on 42 time series data sets. Our objective is to draw the research community's attention to dissimilarity measures other than ED and DTW, some of which are able to significantly outperform these measures in classification. Our results show that Complex Invariant Distance DTW (CIDDTW) significantly outperforms DTW and that CIDDTW, DTW, CID, Minkowski L-p (p-norm difference with data set-crafted "p" parameter), Lorentzian L-infinity, Manhattan L-1, Average L-1/L-infinity (arithmetic average), Dice distance, and Jaccard distance outperform ED, but only CIDDTW, DTW, and CID outperform ED with statistical significance.
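The abstract names CID and CIDDTW without defining them. The sketch below assumes the commonly cited formulation of the complexity-invariant distance, in which a base distance is multiplied by the ratio of the larger to the smaller "complexity estimate" of the two series (the root of the summed squared successive differences). The function names, the unconstrained DTW variant, and the handling of flat series are illustrative assumptions, not details taken from the paper.

```python
import math


def euclidean(q, c):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, c)))


def dtw(q, c):
    """Unconstrained DTW with squared point costs, using two rows of memory."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(c)
    for i in range(1, len(q) + 1):
        curr = [inf] * (len(c) + 1)
        for j in range(1, len(c) + 1):
            cost = (q[i - 1] - c[j - 1]) ** 2
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[len(c)])


def complexity_estimate(t):
    """Length of the series when 'stretched out': sqrt of summed squared steps."""
    return math.sqrt(sum((t[i + 1] - t[i]) ** 2 for i in range(len(t) - 1)))


def cid(q, c, base=euclidean):
    """Base distance scaled by the complexity ratio (assumed CID formulation)."""
    ce_q, ce_c = complexity_estimate(q), complexity_estimate(c)
    lo, hi = min(ce_q, ce_c), max(ce_q, ce_c)
    if lo == 0.0:
        # Assumption: two flat series get no correction; flat vs. non-flat is infinite.
        factor = 1.0 if hi == 0.0 else float("inf")
    else:
        factor = hi / lo
    return base(q, c) * factor


zigzag = [0.0, 1.0, 0.0, 1.0, 0.0]
bump = [0.0, 0.5, 1.0, 0.5, 0.0]
cid_ed = cid(zigzag, bump)             # Euclidean base
cid_dtw = cid(zigzag, bump, base=dtw)  # DTW base, i.e. a CIDDTW-style measure
```

Passing `base=dtw` swaps the base distance while keeping the same correction factor, which is the relationship between CID and CIDDTW assumed in this sketch.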

Journal ArticleDOI
01 Jun 2013
TL;DR: The experimental evaluation and case studies show that CrossMatch can incrementally discover common local patterns in data streams within constant time (per update) and space, and a theoretical analysis proves that the algorithm does not sacrifice accuracy.
Abstract: Subsequence matching is a basic problem in the field of data stream mining. In recent years, there has been significant research effort spent on efficiently finding subsequences similar to a query sequence. Another challenging issue in relation to subsequence matching is how we identify common local patterns when both sequences are evolving. This problem arises in trend detection, clustering, and outlier detection. Dynamic time warping (DTW) is often used for subsequence matching and is a powerful similarity measure. However, the straightforward method using DTW incurs a high computation cost for this problem. In this paper, we propose a one-pass algorithm, CrossMatch, that achieves the above goal. CrossMatch addresses two important challenges: (1) how can we identify common local patterns efficiently without any omission? (2) how can we find common local patterns in data stream processing? To tackle these challenges, CrossMatch incorporates three ideas: (1) a scoring function, which computes the DTW distance indirectly to reduce the computation cost, (2) a position matrix, which stores starting positions to keep track of common local patterns in a streaming fashion, and (3) a streaming algorithm, which identifies common local patterns efficiently and outputs them on the fly. We provide a theoretical analysis and prove that our algorithm does not sacrifice accuracy. Our experimental evaluation and case studies show that CrossMatch can incrementally discover common local patterns in data streams within constant time (per update) and space.

Proceedings ArticleDOI
15 Jul 2013
TL;DR: This paper proposes a fast and memory-efficient Dynamic Time Warping (MES-DTW) algorithm for the task of Query-by-Example Spoken Term Detection (QbE-STD) and describes the system used to perform it, including an energy-based quantification for speech/non-speech detection and an overlap detector for putative matches.
Abstract: In this paper we propose a fast and memory-efficient Dynamic Time Warping (MES-DTW) algorithm for the task of Query-by-Example Spoken Term Detection (QbE-STD). The proposed algorithm is based on the subsequence-DTW (S-DTW) algorithm, which allows the search for small spoken queries within a much bigger search collection of spoken documents by considering fixed start-end points in the query and discovering optimal matching subsequences along the search collection. The proposed algorithm applies some modifications to S-DTW that make it better suited for the QbE-STD task, including a way to perform the matching with virtually no system memory, which is optimal when querying large-scale databases. We also describe the system used to perform QbE-STD, including an energy-based quantification for speech/non-speech detection and an overlap detector for putative matches. We test the proposed system using the MediaEval 2012 spoken-web-search dataset and show that, in addition to the memory savings, the proposed algorithm brings an advantage in terms of matching accuracy (up to 0.235 absolute MTWV increase) and speed (around 25% faster) in comparison to the original S-DTW.
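The subsequence-DTW idea that this abstract builds on (the whole query must be matched, but the match may start and end anywhere in the search collection) can be sketched with memory proportional to the query length only. This is a textbook-style illustration, not the MES-DTW algorithm itself. The function name, the absolute-difference local cost, and the scalar toy data are assumptions; a real QbE-STD system would compare feature vectors and would normalise costs by path length.

```python
def subsequence_dtw(query, stream, dist=lambda a, b: abs(a - b)):
    """Return (best_cost, end_index) of the best match of `query` inside `stream`.

    Only two columns of the cost matrix are kept, so memory is O(len(query))
    regardless of how long `stream` is.
    """
    m = len(query)
    inf = float("inf")
    best_cost, best_end = inf, -1
    # prev[i]: cost of aligning query[:i] with a path ending at the previous frame.
    prev = [0.0] + [inf] * m
    for j, frame in enumerate(stream):
        # curr[0] = 0 lets a match start at any frame of the stream for free.
        curr = [0.0] + [inf] * m
        for i in range(1, m + 1):
            curr[i] = dist(query[i - 1], frame) + min(
                prev[i],      # same query element, previous stream frame
                curr[i - 1],  # previous query element, same stream frame
                prev[i - 1],  # diagonal step
            )
        # curr[m] is the cost of matching the full query ending at frame j.
        if curr[m] < best_cost:
            best_cost, best_end = curr[m], j
        prev = curr
    return best_cost, best_end


# Toy example: the query occurs exactly at stream positions 2..4.
print(subsequence_dtw([1, 2, 3], [5, 5, 1, 2, 3, 5, 5]))  # (0.0, 4)
```

Recovering the start position of the match would require either storing the path or carrying a start index alongside each cost, which this sketch omits.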