Topic
Dynamic time warping
About: Dynamic time warping is a research topic. Over its lifetime, 6013 publications have appeared within this topic, receiving 133130 citations.
Papers
25 Mar 2012
TL;DR: Experimental results show that the ASM tokenizer outperforms a conventional GMM tokenizer and a language-mismatched phoneme recognizer, and the performance is significantly improved by applying unsupervised speaker normalization techniques.
Abstract: The framework of posteriorgram-based template matching has been shown to be successful for query-by-example spoken term detection (STD). This framework employs a tokenizer to convert query examples and test utterances into frame-level posteriorgrams, and applies dynamic time warping to match the query posteriorgrams with test posteriorgrams to locate possible occurrences of the query term. It is not trivial to design a reliable tokenizer due to heterogeneous test conditions and the limitation of training resources. This paper presents a study of using acoustic segment models (ASMs) as the tokenizer. ASMs can be obtained following an unsupervised iterative procedure without any training transcriptions. The STD performance of the ASM tokenizer is evaluated on Fisher Corpus with comparison to three alternative tokenizers. Experimental results show that the ASM tokenizer outperforms a conventional GMM tokenizer and a language-mismatched phoneme recognizer. In addition, the performance is significantly improved by applying unsupervised speaker normalization techniques.
66 citations
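The matching step described in the abstract above (dynamic time warping of a query posteriorgram against a longer test posteriorgram) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the local distance (negative log of the inner product of two posterior vectors) is an assumed choice that the abstract does not specify, and the tokenizer that produces the posteriorgrams is out of scope.

```python
import math


def frame_distance(p, q, eps=1e-10):
    """Local distance between two posterior vectors.

    Assumption: -log of the inner product; the abstract does not name
    the distance that was actually used.
    """
    dot = sum(a * b for a, b in zip(p, q))
    return -math.log(max(dot, eps))


def subsequence_dtw(query, test):
    """Find the best-matching region of `test` for `query`.

    Returns (accumulated_cost, start_frame, end_frame), where the frame
    indices refer to `test`. The query may start and end at any test frame.
    """
    n, m = len(query), len(test)
    inf = float("inf")
    # cost[i][j]: best accumulated cost of aligning query[0..i] to a
    # region of test that ends at frame j.
    cost = [[inf] * m for _ in range(n)]
    # start[i][j]: test frame where that best alignment begins.
    start = [[0] * m for _ in range(n)]

    # Free start: the first query frame may align to any test frame.
    for j in range(m):
        cost[0][j] = frame_distance(query[0], test[j])
        start[0][j] = j

    for i in range(1, n):
        for j in range(m):
            candidates = [(cost[i - 1][j], start[i - 1][j])]
            if j > 0:
                candidates.append((cost[i][j - 1], start[i][j - 1]))
                candidates.append((cost[i - 1][j - 1], start[i - 1][j - 1]))
            best_cost, best_start = min(candidates)
            cost[i][j] = best_cost + frame_distance(query[i], test[j])
            start[i][j] = best_start

    # Free end: pick the test frame where the full query ends most cheaply.
    end = min(range(m), key=lambda j: cost[n - 1][j])
    return cost[n - 1][end], start[n - 1][end], end


# Toy two-class posteriorgrams (made-up values for illustration only).
query = [[0.9, 0.1], [0.1, 0.9]]
test = [[0.5, 0.5], [0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
match_cost, match_start, match_end = subsequence_dtw(query, test)
```

In a detection setting, a low `match_cost` would mark frames `match_start` through `match_end` as a candidate occurrence of the query term; how costs are normalized and thresholded is left open here.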
TL;DR: A novel time-analysis framework is presented for large-scale comorbidity studies and the proposed methodology for the temporal assessment of common disease trajectories could serve as the preliminary basis of a disease prediction system.
Abstract: Time is a crucial parameter in the assessment of comorbidities in population-based studies, as it makes it possible to identify more complex disease patterns beyond pairwise disease associations. So far, it has either been completely ignored or taken into account only by assessing the temporal directionality of identified comorbidity pairs. In this work, a novel time-analysis framework is presented for large-scale comorbidity studies. The disease-history vectors of patients of a regional Spanish health dataset are represented as time sequences of ordered disease diagnoses. Statistically significant pairwise disease associations are identified and their temporal directionality is assessed. Subsequently, an unsupervised clustering algorithm, based on Dynamic Time Warping, is applied to the common disease trajectories in order to group them according to the temporal patterns that they share. The proposed methodology for the temporal assessment of such trajectories could serve as the preliminary basis of a disease prediction system.
65 citations
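The idea of grouping diagnosis sequences by a DTW-based distance, as described in the abstract above, might look roughly like the sketch below. Several things here are assumptions rather than details from the paper: the 0/1 mismatch cost between diagnosis labels, the single-linkage threshold grouping (the abstract does not name the clustering algorithm), and the toy labels `"A"`, `"B"`, `"X"` and so on, which are placeholders and not real disease codes.

```python
def dtw_distance(a, b):
    """Classic DTW between two label sequences with a 0/1 mismatch cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # acc[i][j]: cost of the best warping path aligning a[:i] with b[:j].
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            mismatch = 0.0 if a[i - 1] == b[j - 1] else 1.0
            acc[i][j] = mismatch + min(
                acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1]
            )
    return acc[n][m]


def cluster_by_threshold(sequences, threshold):
    """Single-linkage grouping: link any pair whose DTW distance is at
    most `threshold`, then return the connected components as lists of
    sequence indices."""
    parent = list(range(len(sequences)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(sequences)):
        for j in range(i + 1, len(sequences)):
            if dtw_distance(sequences[i], sequences[j]) <= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(sequences)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Placeholder trajectories: ordered diagnosis labels per patient group.
trajectories = [
    ["A", "B", "C"],
    ["A", "B", "B", "C"],  # same order as the first, one repeated step
    ["X", "Y"],
    ["X", "X", "Y"],
]
clusters = cluster_by_threshold(trajectories, threshold=0.0)
```

Because DTW absorbs repeated steps, the first two trajectories are at distance zero from each other, as are the last two, so they fall into two groups even though their lengths differ.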
TL;DR: A novel multi-scale gesture model is presented as a set of 3D spatio-temporal surfaces of a time-varying contour; three approaches, which differ mainly in endpoint localization, are proposed, and high recognition rates are reported.
65 citations
01 Jan 2003
TL;DR: Investigations into a number of different matching techniques for word images, including shape context matching, SSD correlation, Euclidean Distance Mapping and dynamic time warping are described.
Abstract: Indexing and searching collections of handwritten archival documents and manuscripts has always been a challenge because handwriting recognizers do not perform well on such noisy documents. Given a collection of documents written by a single author (or a few authors), one can apply a technique called word spotting. The approach is to cluster word images based on their visual appearance, after segmenting them from the documents. Annotation can then be performed for clusters rather than documents. Given segmented pages, matching handwritten word images in historical documents is a great challenge due to the variations in handwriting and the noise in the images. We describe investigations into a number of different matching techniques for word images. These include shape context matching, SSD correlation, Euclidean Distance Mapping and dynamic time warping. Experimental results show that dynamic time warping works best and gives an average precision of around 70% on a test set of 2000 word images (from ten pages) from the George Washington corpus. Dynamic time warping is relatively expensive and we will describe approaches to speeding up the computation so that the approach scales. Our immediate goal is to process a set of 100 page images with a longer term goal of processing all 6000 available pages.
65 citations
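The abstract above notes that dynamic time warping is relatively expensive and that speed-ups are needed, without saying which ones. One widely used general technique, shown here purely as an illustration and not as the authors' method, is to restrict the warping path to a band around the diagonal (a Sakoe-Chiba band). The sketch assumes each word image has already been reduced to a 1-D feature sequence (for example, one value per image column); the function name and the squared-difference cost are hypothetical choices.

```python
def banded_dtw(a, b, band):
    """DTW between two numeric sequences, restricted to cells with
    |i - j| <= band. Only O(len(a) * band) cells are filled instead of
    the full len(a) * len(b) grid."""
    n, m = len(a), len(b)
    inf = float("inf")
    # Widen the band if needed so the final cell (n, m) stays reachable
    # when the two sequences differ in length.
    band = max(band, abs(n - m))
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        lo = max(1, i - band)
        hi = min(m, i + band)
        for j in range(lo, hi + 1):
            local = (a[i - 1] - b[j - 1]) ** 2
            acc[i][j] = local + min(
                acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1]
            )
    return acc[n][m]


# Made-up column profiles for two renderings of the same word.
profile_a = [1.0, 2.0, 3.0]
profile_b = [1.0, 2.0, 2.0, 3.0]
band_cost = banded_dtw(profile_a, profile_b, band=1)
```

A narrow band trades accuracy for speed: paths that would need to stray far from the diagonal are excluded, so the banded cost is never lower than the unconstrained DTW cost.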
26 Apr 1985
TL;DR: This study compared several different spectral distortion measures including the Itakura-Saito (IS), the log likelihood ratio (LLR), the likelihood ratio (LR), the cepstral (CEP), and two perceptually based distortion measures, the weighted likelihood ratio (WLR) and the weighted slope metric (WSM), in terms of their effects on the performance of a standard dynamic time warping (DTW) based, isolated word, speech recognizer.
Abstract: In this study we compared several different spectral distortion measures including the Itakura-Saito (IS), the log likelihood ratio (LLR), the likelihood ratio (LR), the cepstral (CEP), and two perceptually based distortion measures, the weighted likelihood ratio (WLR) and the weighted slope metric (WSM) distortion measures, in terms of their effects on the performance of a standard dynamic time warping (DTW) based, isolated word, speech recognizer. Two modifications of the basic forms of each measure were also investigated, namely a Bark-scale frequency warping and the incorporation of suprasegmental energy information. All distortion measures and their modifications were tested on an alpha-digit vocabulary, 4-talker, telephone recording data base. The results can be summarized as: (1) All LPC-based distortion measures performed reasonably well. The LLR and WSM distortion measures gave the highest recognition accuracy, while the IS distortion measure gave the lowest score; (2) Whereas the addition of suprasegmental energy information helped the recognition performance, the use of gain and absolute loudness degraded the performance; (3) Bark-scale frequency warping did not perform as well as its unwarped counterpart; (4) The WLR distortion measure did not perform as well as its unweighted counterpart.
65 citations
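The setup compared in the abstract above, a DTW-based isolated word recognizer whose frame-level distortion measure can be swapped, can be sketched as a nearest-template classifier with a pluggable local distance. Only the cepstral distance is shown, taken here as the squared Euclidean distance between cepstral coefficient vectors. The function names, the length normalization, and the two-coefficient toy templates are assumptions for illustration; the other measures from the study (IS, LLR, LR, WLR, WSM) would need LPC parameters and are not implemented.

```python
def cepstral_distance(c1, c2):
    """Squared Euclidean distance between two cepstral coefficient vectors."""
    return sum((x - y) ** 2 for x, y in zip(c1, c2))


def dtw_score(seq_a, seq_b, local_distance):
    """Accumulated DTW cost between two frame sequences, divided by the
    total number of frames (a simple normalization chosen for this sketch)."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = local_distance(seq_a[i - 1], seq_b[j - 1])
            acc[i][j] = local + min(
                acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1]
            )
    return acc[n][m] / (n + m)


def recognize(utterance, templates, local_distance=cepstral_distance):
    """Return the vocabulary word whose template is closest under DTW."""
    return min(
        templates,
        key=lambda word: dtw_score(utterance, templates[word], local_distance),
    )


# Toy reference templates: one frame sequence per vocabulary word.
templates = {
    "one": [[0.0, 0.0], [1.0, 1.0]],
    "two": [[5.0, 5.0], [6.0, 6.0]],
}
utterance = [[0.1, 0.0], [0.1, 0.0], [0.9, 1.0]]
recognized = recognize(utterance, templates)
```

Passing a different function as `local_distance` is the point of the design: the warping and the decision rule stay fixed while the distortion measure varies, which mirrors the kind of comparison the study describes.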