Showing papers on "Dynamic time warping" published in 2002

Proceedings Article•DOI•

[...]

Michail Vlachos¹, George Kollios², Dimitrios Gunopulos¹•Institutions (2)

University of California, Riverside¹, Boston University²

26 Feb 2002

TL;DR: This work formalizes non-metric similarity functions based on the longest common subsequence (LCSS), which are very robust to noise and furthermore provide an intuitive notion of similarity between trajectories by giving more weight to similar portions of the sequences.

Abstract: We investigate techniques for analysis and retrieval of object trajectories in two or three dimensional space. Such data usually contain a large amount of noise, which has caused previously used metrics to fail. Therefore, we formalize non-metric similarity functions based on the longest common subsequence (LCSS), which are very robust to noise and furthermore provide an intuitive notion of similarity between trajectories by giving more weight to similar portions of the sequences. Stretching of sequences in time is allowed, as well as global translation of the sequences in space. Efficient approximate algorithms that compute these similarity measures are also provided. We compare these new methods to the widely used Euclidean and time warping distance functions (for real and synthetic data) and show the superiority of our approach, especially in the strong presence of noise. We prove a weaker version of the triangle inequality and employ it in an indexing structure to answer nearest neighbor queries. Finally, we present experimental results that validate the accuracy and efficiency of our approach.
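
To make the LCSS idea concrete, here is a minimal sketch of an LCSS-based similarity for 2-D trajectories. The abstract does not give the exact definition, so the matching rule is an assumption: two points match when both coordinates differ by less than `eps` and their positions in time differ by at most `delta`. The names `lcss_length`, `lcss_similarity`, `eps` and `delta` are illustrative, and the spatial translation mentioned in the abstract is not handled.

```python
def lcss_length(a, b, eps, delta):
    """Longest common subsequence length under a tolerance-based match rule."""
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            (ax, ay), (bx, by) = a[i - 1], b[j - 1]
            close_in_space = abs(ax - bx) < eps and abs(ay - by) < eps
            close_in_time = abs(i - j) <= delta
            if close_in_space and close_in_time:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]


def lcss_similarity(a, b, eps, delta):
    """Matched fraction of the shorter trajectory, between 0 and 1."""
    return lcss_length(a, b, eps, delta) / min(len(a), len(b))


clean = [(0, 0), (1, 1), (2, 2), (3, 3)]
noisy = [(0, 0), (1, 1), (50, 50), (2, 2), (3, 3)]  # one outlier inserted

print(lcss_similarity(clean, noisy, eps=0.5, delta=2))  # outlier is skipped
print(lcss_similarity(clean, noisy, eps=0.5, delta=0))  # no stretching in time
```

In this toy case the outlier simply goes unmatched, so it does not reduce the score when some stretching in time is allowed, which is the kind of noise robustness the abstract refers to.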

1,504 citations

Book Chapter•DOI•

Chapter 36 – Exact Indexing of Dynamic Time Warping

Eamonn Keogh¹•Institutions (1)

University of California, Riverside¹

01 Jan 2002

TL;DR: Dynamic time warping (DTW) is a much more robust distance measure for time series, allowing similar shapes to match even if they are out of phase in the time axis, but does not obey the triangular inequality and, thus, has resisted attempts at exact indexing.

Abstract: The indexing of very large time series databases has attracted much research interest in the database community in recent years. Most algorithms that are used to index time series utilize the Euclidean distance or some variation thereof. However, it has been forcefully shown that the Euclidean distance is a very brittle distance measure. Dynamic time warping (DTW) is a much more robust distance measure for time series, allowing similar shapes to match even if they are out of phase in the time axis. Because of this flexibility, DTW is widely used in science, medicine, industry, and finance. Unfortunately, however, DTW does not obey the triangular inequality and, thus, has resisted attempts at exact indexing. Instead, many researchers have introduced approximate indexing techniques, or abandoned the idea of indexing and concentrated on speeding up sequential search.
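
The claim that DTW does not obey the triangular inequality can be checked on a small example. The sketch below uses the textbook dynamic-programming formulation of DTW with absolute difference as the local cost. That cost and the three toy series are assumptions chosen for illustration and are not taken from the chapter.

```python
def dtw(x, y):
    """Classic O(len(x) * len(y)) dynamic time warping distance."""
    inf = float("inf")
    n, m = len(x), len(y)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # x[i-1] also matches an earlier y
                cost[i][j - 1],      # y[j-1] also matches an earlier x
                cost[i - 1][j - 1],  # both series advance together
            )
    return cost[n][m]


x = [0, 1, 1, 2]
y = [0, 1, 2]
z = [0, 2, 2]

print(dtw(x, y), dtw(y, z), dtw(x, z))
# The direct distance exceeds the detour through y:
print(dtw(x, z) > dtw(x, y) + dtw(y, z))  # True
```

Here going from `x` to `z` through `y` is cheaper than going directly, so index structures that rely on the triangle inequality for pruning cannot be applied to DTW as they stand.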

1,033 citations

Proceedings Article•

Exact indexing of dynamic time warping

Eamonn Keogh¹•Institutions (1)

University of California, Riverside¹

20 Aug 2002

TL;DR: A novel technique for the exact indexing of dynamic time warping (DTW) is introduced; the method is proved to guarantee no false dismissals and is shown experimentally to outperform competing approaches.

Abstract: The problem of indexing time series has attracted much research interest in the database community. Most algorithms used to index time series utilize the Euclidean distance or some variation thereof. However, it has been forcefully shown that the Euclidean distance is a very brittle distance measure. Dynamic Time Warping (DTW) is a much more robust distance measure for time series, allowing similar shapes to match even if they are out of phase in the time axis. Because of this flexibility, DTW is widely used in science, medicine, industry and finance. Unfortunately, however, DTW does not obey the triangular inequality, and thus has resisted attempts at exact indexing. Instead, many researchers have introduced approximate indexing techniques, or abandoned the idea of indexing and concentrated on speeding up sequential search. In this work we introduce a novel technique for the exact indexing of DTW. We prove that our method guarantees no false dismissals and we demonstrate its vast superiority over all competing approaches in the largest and most comprehensive set of time series indexing experiments ever undertaken.
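
The abstract does not describe the technique itself. As a hedged illustration of how a search can avoid false dismissals, the sketch below uses the general lower-bounding idea: a cheap bound that never exceeds the true DTW distance can safely discard candidates. The envelope bound, the band width `r`, the equal-length requirement and all function names are assumptions made for this example, not details confirmed by the listing.

```python
def dtw_band(q, c, r):
    """DTW between equal-length series, warping limited to |i - j| <= r."""
    inf = float("inf")
    n = len(q)
    cost = [[inf] * (n + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - r), min(n, i + r) + 1):
            local = abs(q[i - 1] - c[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][n]


def envelope_lower_bound(q, c, r):
    """Sum of how far each c[i] falls outside the min/max envelope of q."""
    n = len(q)
    total = 0.0
    for i, value in enumerate(c):
        window = q[max(0, i - r): min(n, i + r + 1)]
        upper, lower = max(window), min(window)
        if value > upper:
            total += value - upper
        elif value < lower:
            total += lower - value
    return total


def range_query(query, database, radius, r):
    """Return names of series within `radius` of the query under banded DTW."""
    hits = []
    for name, series in database.items():
        if envelope_lower_bound(query, series, r) > radius:
            continue  # true distance is at least the bound, so skip safely
        if dtw_band(query, series, r) <= radius:
            hits.append(name)
    return hits


query = [0, 1, 2, 1, 0]
database = {"shifted": [0, 0, 1, 2, 1], "flat": [5, 5, 5, 5, 5]}
print(range_query(query, database, radius=2, r=1))  # ['shifted']
```

Because every point of a candidate must be matched to some query point inside the band, the envelope bound can never exceed the banded DTW distance. Pruning on it therefore removes work without removing true answers.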

668 citations

Proceedings Article•DOI•

Online handwriting recognition with support vector machines - a kernel approach

Claus Bahlmann¹, Bernard Haasdonk¹, Hans Burkhardt¹•Institutions (1)

University of Freiburg¹

06 Aug 2002

TL;DR: A novel classification approach for online handwriting recognition is described that combines dynamic time warping (DTW) and support vector machines (SVMs) by establishing a new SVM kernel; it directly addresses the problem of discrimination by creating class boundaries and thus is less sensitive to modeling assumptions.

Abstract: In this paper we describe a novel classification approach for online handwriting recognition. The technique combines dynamic time warping (DTW) and support vector machines (SVMs) by establishing a new SVM kernel. We call this kernel Gaussian DTW (GDTW) kernel. This kernel approach has a main advantage over common HMM techniques. It does not assume a model for the generative class conditional densities. Instead, it directly addresses the problem of discrimination by creating class boundaries and thus is less sensitive to modeling assumptions. By incorporating DTW in the kernel function, general classification problems with variable-sized sequential data can be handled. In this respect the proposed method can be straightforwardly applied to all classification problems, where DTW gives a reasonable distance measure, e.g., speech recognition or genome processing. We show experiments with this kernel approach on the UNIPEN handwriting data, achieving results comparable to an HMM-based technique.
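
As a rough illustration of plugging DTW into a Gaussian-style kernel, the sketch below uses the form exp(-gamma * DTW(x, y)). The abstract does not state the exact kernel or any normalisation of the DTW distance, so this form, the parameter `gamma`, and the function names are assumptions.

```python
import math


def dtw(x, y):
    """Plain DTW distance with absolute difference as local cost."""
    inf = float("inf")
    n, m = len(x), len(y)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def gdtw_kernel(x, y, gamma=0.5):
    """Gaussian-style similarity: 1.0 for zero DTW distance, decaying to 0."""
    return math.exp(-gamma * dtw(x, y))


def gram_matrix(sequences, gamma=0.5):
    """Pairwise kernel values for sequences of possibly different lengths."""
    return [[gdtw_kernel(a, b, gamma) for b in sequences] for a in sequences]


strokes = [[0, 1, 2, 1], [0, 1, 1, 2, 1], [3, 3, 0]]
for row in gram_matrix(strokes):
    print([round(v, 3) for v in row])
```

A Gram matrix like this could be handed to an SVM implementation that accepts precomputed kernels, for instance scikit-learn's `SVC(kernel="precomputed")`. Since DTW is not a metric, a kernel built this way is not guaranteed to be positive semidefinite, which is worth keeping in mind when training.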

377 citations

Proceedings Article•

Iterative Deepening Dynamic Time Warping for Time Series.

Selina Chu, Eamonn Keogh, David M. Hart, Michael J. Pazzani

01 Jan 2002

TL;DR: Almost all algorithms that operate on time series data need to compute the similarity between them, and Euclidean distance, or some extension or modification thereof, is typically used.

Abstract: Time series are a ubiquitous form of data occurring in virtually every scientific discipline and business application. There has been much recent work on adapting data mining algorithms to time series databases. For example, Das et al. attempt to show how association rules can be learned from time series [7]. Debregeas and Hebrail [8] demonstrate a technique for scaling up time series clustering algorithms to massive datasets. Keogh and Pazzani introduced a new, scalable time series classification algorithm [16]. Almost all algorithms that operate on time series data need to compute the similarity between them. Euclidean distance, or some extension or modification thereof, is typically used. However as we will demonstrate in Section 2.1, Euclidean distance can be an extremely brittle distance measure.

285 citations

Journal Article•DOI•

Chromatographic alignment by warping and dynamic programming as a pre-processing tool for PARAFAC modelling of liquid chromatography - mass spectrometry data

Dan Bylund¹, Rolf Danielsson¹, Gunnar Malmquist², Karin E. Markides¹•Institutions (2)

Uppsala University¹, Amersham plc²

05 Jul 2002-Journal of Chromatography A

TL;DR: A time warping algorithm for alignment of LC-MS data in the chromatographic direction has been examined and with moderate time shifts present in the data, pre-processing with this algorithm yields approximately trilinear data for which reasonable models can be made.

226 citations

Journal Article•DOI•

A comparison of two algorithms for warping of analytical signals

V. Pravdova, Beata Walczak, Desire Massart

01 Apr 2002-Analytica Chimica Acta

TL;DR: Two techniques for alignment of profiles, namely dynamic time warping (DTW) and correlation optimized warping (COW), were tested and compared, with attention focused on chromatographic and spectroscopic profiles.
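
To show what aligning a profile with DTW can look like in practice, the sketch below recovers the warping path by backtracking and resamples a sample profile onto the reference time axis. This illustrates the DTW side only. The averaging rule for several sample points mapped to one reference point, and the names `dtw_path` and `align_to_reference`, are assumptions and not details from the paper.

```python
def dtw_path(ref, sample):
    """Return the optimal warping path as (ref_index, sample_index) pairs."""
    inf = float("inf")
    n, m = len(ref), len(sample)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(ref[i - 1] - sample[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    # Walk back from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = {
            (i - 1, j - 1): cost[i - 1][j - 1],
            (i - 1, j): cost[i - 1][j],
            (i, j - 1): cost[i][j - 1],
        }
        i, j = min(steps, key=steps.get)
    path.reverse()
    return path


def align_to_reference(ref, sample):
    """Warp `sample` onto the time axis of `ref`, averaging merged points."""
    buckets = [[] for _ in ref]
    for i, j in dtw_path(ref, sample):
        buckets[i].append(sample[j])
    return [sum(b) / len(b) for b in buckets]


reference = [0, 0, 5, 0, 0]   # peak at position 2
shifted = [0, 5, 0, 0, 0]     # same peak, eluting one step earlier
print(align_to_reference(reference, shifted))  # [0.0, 0.0, 5.0, 0.0, 0.0]
```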

212 citations

Book Chapter•DOI•

Gait Sequence Analysis Using Frieze Patterns

Yanxi Liu¹, Robert T. Collins¹, Yanghai Tsin¹•Institutions (1)

Carnegie Mellon University¹

28 May 2002

TL;DR: This work analyzes walking people using a gait sequence representation that bypasses the need for frame-to-frame tracking of body parts and finds that the frieze groups of the gait patterns and their canonical tiles enable us to estimate viewing direction of human walking videos.

Abstract: We analyze walking people using a gait sequence representation that bypasses the need for frame-to-frame tracking of body parts. The gait representation maps a video sequence of silhouettes into a pair of two-dimensional spatio-temporal patterns that are near-periodic along the time axis. Mathematically, such patterns are called "frieze" patterns and associated symmetry groups "frieze groups". With the help of a walking humanoid avatar, we explore variation in gait frieze patterns with respect to viewing angle, and find that the frieze groups of the gait patterns and their canonical tiles enable us to estimate viewing direction of human walking videos. In addition, analysis of periodic patterns allows us to determine the dynamic time warping and affine scaling that aligns two gait sequences from similar viewpoints. We also show how gait alignment can be used to perform human identification and model-based body part segmentation.

204 citations

Proceedings Article•

Measuring the similarity of Rhythmic Patterns.

Jouni Paulus¹, Anssi Klapuri¹•Institutions (1)

Tampere University of Technology¹

01 Jan 2002

TL;DR: A system is described which measures the similarity of two arbitrary rhythmic patterns; in simulations, it behaved consistently by assigning high similarity measures to similar musical rhythms, even when performed using different sound sets.

Abstract: A system is described which measures the similarity of two arbitrary rhythmic patterns. The patterns are represented as acoustic signals, and are not assumed to have been performed with similar sound sets. Two novel methods are presented that constitute the algorithmic core of the system. First, a probabilistic musical meter estimation process is described, which segments a continuous musical signal into patterns. As a side-product, the method outputs tatum, tactus (beat), and measure lengths. A subsequent process performs the actual similarity measurements. Acoustic features are extracted which model the fluctuation of loudness and brightness within the pattern, and dynamic time warping is then applied to align the patterns to be compared. In simulations, the system behaved consistently by assigning high similarity measures to similar musical rhythms, even when performed using different sound sets.

135 citations

Proceedings Article•DOI•

[...]

Michail Vlachos¹, Dimitrios Gunopulos¹, George Kollios²•Institutions (2)

University of California, Riverside¹, Boston University²

02 Sep 2002

TL;DR: This work proposes the use of non-metric distance functions based on the longest common subsequence (LCSS), in conjunction with a sigmoidal matching function for similarity analysis of spatio-temporal trajectories for mobile objects.

Abstract: We investigate techniques for similarity analysis of spatio-temporal trajectories for mobile objects. Such data may contain a large number of outliers, which degrade the performance of Euclidean and time warping distance. Therefore, we propose the use of non-metric distance functions based on the longest common subsequence (LCSS), in conjunction with a sigmoidal matching function. Finally, we compare these new methods to various L_p norms and also to time warping distance (for real and synthetic data) and present experimental results that validate the accuracy and efficiency of our approach, especially in the presence of noise.

95 citations

Proceedings Article•DOI•

ECG frame classification using dynamic time warping

B. Huang¹, Witold Kinsner¹•Institutions (1)

University of Winnipeg¹

07 Aug 2002

TL;DR: An electrocardiogram (ECG) frame classification technique is presented, realized by a dynamic time warping (DTW) matching technique that has been used successfully in speech recognition; DTW is used to classify ECG frames because ECG and speech signals have similar nonstationary characteristics.

Abstract: Presents an electrocardiogram (ECG) frame classification technique realized by a dynamic time warping (DTW) matching technique, which has been used successfully in speech recognition. We use the DTW to classify ECG frames because ECG and speech signals have similar nonstationary characteristics. The DTW mapping function is obtained by searching the frame from its end to start. A threshold is set up for the DTW matching residual either to classify an ECG frame or to add a new class. Classification and establishment of a template set are carried out simultaneously. A frame is classified into a category with a minimal residual and satisfying a threshold requirement. A classification residual of 1.33% is achieved by the DTW for a 10-minute ECG recording.
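
The classify-or-add-a-class loop described in this abstract can be sketched as follows. The residual is taken here to be the plain DTW distance to a template, and each new template is the first frame that failed to match. Both choices, along with the names and the toy frames, are assumptions, since the abstract does not define them.

```python
def dtw(x, y):
    """Plain DTW distance with absolute difference as local cost."""
    inf = float("inf")
    n, m = len(x), len(y)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def classify_frames(frames, threshold):
    """Label each frame and grow the template set in a single pass."""
    templates, labels = [], []
    for frame in frames:
        residuals = [dtw(frame, t) for t in templates]
        if residuals and min(residuals) <= threshold:
            # Closest existing class satisfies the threshold requirement.
            labels.append(residuals.index(min(residuals)))
        else:
            # No template is close enough: this frame starts a new class.
            templates.append(frame)
            labels.append(len(templates) - 1)
    return labels, templates


frames = [
    [0, 1, 5, 1, 0],       # first beat becomes template 0
    [0, 0, 1, 5, 1, 0],    # same shape, slightly delayed
    [0, -3, -3, 0, 0],     # different morphology
    [0, 1, 5, 5, 1, 0],    # same shape, wider peak
]
labels, templates = classify_frames(frames, threshold=1.0)
print(labels)  # [0, 0, 1, 0]
```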

Journal Article•DOI•

Development of Isolated Word Speech Recognition System

Antanas Lipeika, Joana Lipeikienė, Laimutis Telksnys

01 Jan 2002-Informatica (Lithuanian Academy of Sciences)

TL;DR: An isolated word speech recognition system based on dynamic time warping (DTW) has been developed, and its performance is evaluated using 12 words of the Lithuanian language pronounced ten times by ten speakers.

Abstract: An isolated word speech recognition system based on dynamic time warping (DTW) has been developed. Speaker adaptation is performed using speaker recognition techniques. Vector quantization is used to create reference templates for speaker recognition. Linear predictive coding (LPC) parameters are used as features for recognition. Performance is evaluated using 12 words of the Lithuanian language pronounced ten times by ten speakers.

Patent•

Signal modification based on continuous time warping for low bitrate celp coding

Ajit V. Rao¹•Institutions (1)

Microsoft¹

27 Jun 2002

TL;DR: In this patent, the optimum warp contour is determined using only a subset of possible contours contained within a sub-range of the range of possible contours: the relative correlation strengths from these contours are modeled as points on a polynomial trace, and the optimum is calculated by maximizing the modeling function.

Abstract: A signal modification technique facilitates compact voice coding by employing a continuous, rather than piece-wise continuous, time warp contour to modify an original residual signal to match an idealized contour, avoiding edge effects caused by prior art techniques. Warping is executed using a continuous warp contour lacking spatial discontinuities which does not invert or overly distend the positions of adjacent end points in adjacent frames. The linear shift implemented by the warp contour is derived via quadratic approximation or other method, to reduce the complexity of coding to allow for practical and economical implementation. In particular, the algorithm for determining the warp contour uses only a subset of possible contours contained within a sub-range of the range of possible contours. The relative correlation strengths from these contours are modeled as points on a polynomial trace and the optimum warp contour is calculated by maximizing the modeling function.

Journal Article•DOI•

On the use of nearest feature line for speaker identification

Ke Chen¹, Tingyao Wu², Hong-Jiang Zhang³•Institutions (3)

University of Birmingham¹, Peking University², Microsoft³

01 Dec 2002-Pattern Recognition Letters

TL;DR: This paper explores the use of the nearest feature line (NFL) method for speaker identification with limited data, examines how NFL performs under various mismatches between training and test, and proposes an alternative method for similarity measure.

Book Chapter•DOI•

Robust Face Recognition Using Dynamic Space Warping

Hichem Sahbi, Nozha Boujemaa

01 Jun 2002

TL;DR: A complete scheme for face recognition is presented, based on salient feature extraction in challenging conditions and performed without an a priori or learned model; it makes face recognition robust to low frequency variations as well as to high frequency variations.

Abstract: The utility of face recognition for multimedia indexing is enhanced by using accurate detection and alignment of salient invariant face features. The face recognition can be performed using template matching or a feature-based approach, but both these methods suffer from occlusion and require an a priori model for extracting information. To avoid these drawbacks, we present in this paper a complete scheme for face recognition based on salient feature extraction in challenging conditions, which is performed without an a priori or learned model. These features are used in a matching process that overcomes occlusion effects and facial expressions using the dynamic space warping which aligns each feature in the query image, if possible, with its corresponding feature in the gallery set. Thus, we make face recognition robust to low frequency variations (like the presence of occlusion, etc.) as well as to high frequency variations (like expression, gender, etc.). A maximum likelihood scheme is used to make the recognition process more precise, as is shown in the experiments.

Book Chapter•DOI•

Unsupervised Learning Motion Models Using Dynamic Time Warping

Marek Kulbacki¹, Jakub Segen², Artur Bak³•Institutions (3)

Polish Academy of Sciences¹, Bell Labs², University of Wrocław³

03 Jun 2002

TL;DR: This paper constructs motion models to make it easier to extract features of given motions and proposes a measure of discrepancy between motions, which shows how similar two motions are, normalizes the length of motions, and reduces the high dimensionality of the motion data, so that clustering can take place in a dimensionally reduced space.

Abstract: This paper concerns an essential, practical problem in the automatic animation of human-like figures with the support of information technologies connected with the motion capture domain. The main problem we want to solve is to partition a set of primitive motions into appropriate groups according to the similarity between motions. Up to now, experiments with systems of this kind have appeared not to be adequate to the needs. In this situation, we were faced with the necessity of creating new methods to support the process of managing motion data. We construct motion models to make it easier to extract features of given motions. Using these models, we propose a measure of discrepancy between motions. It shows how similar two motions are to each other, normalizes the length of motions, and reduces the high dimensionality of the considered motion data, so that clustering can take place in a dimensionally reduced space.

Proceedings Article•DOI•

A comparison of techniques for automatic clustering of handwritten characters

V. Vuori¹, Jorma Laaksonen¹•Institutions (1)

Helsinki University of Technology¹

11 Aug 2002

TL;DR: It is claimed that a good set of prototypes can be formed from the combined results of different clustering algorithms; however, the number of clusters cannot be determined automatically, and some human intervention is required.

Abstract: This work reports experiments with four hierarchical clustering algorithms and two clustering indices for online handwritten character recognition. The main motivation of the work is to develop an automatic method for finding a set of prototypical characters which would represent well the different writing styles present in a large international database. One of the major obstacles in achieving this goal is the uneven representation of different writing styles in the database. On the basis of the results of the experiments, we claim that a good set of prototypes can be formed from the combined results of different clustering algorithms. However, the number of clusters cannot be determined automatically; some human intervention is required.

Soundspotter – a prototype system for content-based audio retrieval

Christian Spevak, Emmanuel Favreau

01 Jan 2002

TL;DR: Soundspotter is presented, which allows the user to select a specific passage within an audio file and retrieve perceptually similar passages, and comprises several alternative retrieval algorithms, including dynamic time warping and trajectory matching based on a self-organizing map.

Abstract: We present the audio retrieval system “Soundspotter,” which allows the user to select a specific passage within an audio file and retrieve perceptually similar passages. The system extracts frame-based features from the sound signal and performs pattern matching on the resulting sequences of feature vectors. Finally, an adjustable number of best matches is returned, ranked by their similarity to the reference passage. Soundspotter comprises several alternative retrieval algorithms, including dynamic time warping and trajectory matching based on a self-organizing map. We explain the algorithms and report initial results of a comparative evaluation.

Journal Article•DOI•

Recognition of critical situations from time series of laboratory results by case-based reasoning.

Lutz Fritsche¹, Alexander Schlaefer, Klemens Budde, Kay Schroeter, Hans-Hellmut Neumayer, +1 more•Institutions (1)

Charité¹

01 Sep 2002-Journal of the American Medical Informatics Association

TL;DR: The new case-based reasoning algorithm with dynamic time warping as the measure of similarity allows extension of the use of automatic laboratory alerting systems to conditions in which abnormal laboratory results are the norm and critical states can be detected only by recognition of pathological changes over time.

Journal Article•DOI•

Statistical lip-appearance models trained automatically using audio information

Philippe Daubias, Paul Deléglise

01 Jan 2002-EURASIP Journal on Advances in Signal Processing

TL;DR: A neural network based statistical appearance model of the lips, which classifies pixels as belonging to the lips, skin, or inner mouth classes, is presented; lip-shape estimates mapped from blue marked-up sequences reduce the parameter space dimensionality in the red-hue energy minimization, thus yielding better contour shape and location estimates.

Abstract: We aim at modeling the appearance of the lower face region to assist visual feature extraction for audio-visual speech processing applications. In this paper, we present a neural network based statistical appearance model of the lips which classifies pixels as belonging to the lips, skin, or inner mouth classes. This model requires labeled examples to be trained, and we propose to label images automatically by employing a lip-shape model and a red-hue energy function. To improve the performance of lip-tracking, we propose to use blue marked-up image sequences of the same subject uttering the identical sentences as natural nonmarked-up ones. The easily extracted lip shapes from blue images are then mapped to the natural ones using acoustic information. The lip-shape estimates obtained simplify lip-tracking on the natural images, as they reduce the parameter space dimensionality in the red-hue energy minimization, thus yielding better contour shape and location estimates. We applied the proposed method to a small audio-visual database of three subjects, achieving errors in pixel classification around 6%, compared to 3% for hand-placed contours and 20% for filtered red-hue.

Journal Article•DOI•

Influence of Erroneous Learning Samples on Adaptation in On-Line Handwriting Recognition

V. Vuori¹, Jorma Laaksonen¹, Jari Kangas²•Institutions (2)

Helsinki University of Technology¹, Nokia²

01 Apr 2002-Pattern Recognition

TL;DR: The results of the simulations showed that the adaptation strategies are able to improve the system's recognition rate and the prototype inactivation methods do reduce the harmful effects of erroneous learning samples.

Book Chapter•DOI•

Recognition of Environmental Sounds Using Speech Recognition Techniques

Michael A. Cowling¹, Renate Sitte¹•Institutions (1)

Griffith University¹

01 Jan 2002

TL;DR: This paper analyses the different techniques used for speech recognition, identifies those that can be used for non-speech sound recognition, performs benchmarks on these techniques, and determines which technique is better suited for non-speech sound recognition.

Abstract: This paper discusses the use of speech recognition techniques in non-speech sound recognition. It analyses the different techniques used for speech recognition and identifies those that can be used for non-speech sound recognition. It then performs benchmarks on these techniques and determines which technique is better suited for non-speech sound recognition. As a comparison, it also gives results for the use of learning vector quantization (LVQ) and artificial neural network (ANN) techniques in speech recognition.

Journal Article•

Shape-Based Retrieval of Similar Subsequences in Time-Series Databases

Ji-Hui Yun, Sang-Uk Kim, Tae-hun Kim, Sang-Hyeon Park

01 Jan 2002-Journal of KIISE:Databases

TL;DR: This paper proposes an effective and efficient approach for shape-based retrieval of subsequences, with a similarity model that supports various combinations of transformations such as shifting, scaling, moving average, and time warping.

Abstract: This paper deals with the problem of shape-based retrieval in time-series databases. The shape-based retrieval is defined as the operation that searches for the (sub)sequences whose shapes are similar to that of a given query sequence regardless of their actual element values. In this paper, we propose an effective and efficient approach for shape-based retrieval of subsequences. We first introduce a new similarity model for shape-based retrieval that supports various combinations of transformations such as shifting, scaling, moving average, and time warping. For efficient processing of the shape-based retrieval based on the similarity model, we also propose the indexing and query processing methods. To verify the superiority of our approach, we perform extensive experiments with the real-world S&P 500 stock data. The results reveal that our approach successfully finds all the subsequences that have the shapes similar to that of the query sequence, and also achieves significant speedup up to around 66 times compared with the sequential scan method.

Proceedings Article•DOI•

Shape-based retrieval of similar subsequences in time-series databases

Sang-Wook Kim¹, Jeehee Yoon², Sanghyun Park³, Tae-Hoon Kim²•Institutions (3)

Kangwon National University¹, Hallym University², IBM³

11 Mar 2002

TL;DR: This paper introduces a new similarity model for shape-based retrieval that supports various combinations of transformations such as shifting, scaling, moving average, and time warping and proposes the indexing and query processing methods.

Abstract: This paper deals with the problem of shape-based retrieval in time-series databases. The shape-based retrieval is defined as the operation that searches for the (sub)sequences whose shapes are similar to that of a given query sequence. In this paper, we propose an effective and efficient approach for shape-based retrieval of subsequences. We first introduce a new similarity model for shape-based retrieval that supports various combinations of transformations such as shifting, scaling, moving average, and time warping. For efficient processing of the shape-based retrieval, we also propose the indexing and query processing methods. To verify the superiority of our approach, we perform extensive experiments with the real-world S&P 500 stock data. The results reveal that our approach successfully finds all the subsequences that have the shapes similar to that of the query sequence, and also achieves significant speedup over the sequential scan method.
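
One way to picture a similarity model that tolerates shifting, scaling, moving average and time warping is to chain the four steps, as in the sketch below. The abstract does not define the transformations or how they are combined, so the order used here (smooth, remove offset and amplitude, then compare with DTW), the window size `k`, and all function names are assumptions.

```python
import math


def moving_average(series, k):
    """Smooth with a sliding window of k points."""
    return [sum(series[i:i + k]) / k for i in range(len(series) - k + 1)]


def normalise(series):
    """Remove vertical shift (mean) and amplitude scaling (spread)."""
    mean = sum(series) / len(series)
    centred = [v - mean for v in series]
    scale = math.sqrt(sum(v * v for v in centred) / len(series))
    if scale == 0:
        return centred  # constant series: nothing to rescale
    return [v / scale for v in centred]


def dtw(x, y):
    """Plain DTW distance with absolute difference as local cost."""
    inf = float("inf")
    n, m = len(x), len(y)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def shape_distance(a, b, k=2):
    """Compare shapes regardless of offset, amplitude and local timing."""
    return dtw(normalise(moving_average(a, k)),
               normalise(moving_average(b, k)))


base = [1, 2, 3, 2, 1, 2, 3, 2, 1]
moved = [5 * v + 100 for v in base]     # same shape, shifted and scaled
other = [1, 1, 1, 5, 1, 1, 1, 1, 1]     # a single spike

print(shape_distance(base, moved))
print(shape_distance(base, other))
```

The first distance is essentially zero because the two series differ only in offset and amplitude, while the spike series stays clearly apart.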

Dissertation•

Modèles a posteriori de la forme et de l'apparence des lèvres pour la reconnaissance automatique de la parole audiovisuelle

Philippe Daubias

01 Jan 2002

TL;DR: This manuscript presents research on model-based parameter extraction from video sequences for automatic speechreading in natural, weakly constrained conditions and describes the proposed a posteriori lip shape and appearance models learnt from corpora.

Abstract: In this manuscript, we present our research on model-based parameter extraction from video sequences for automatic speechreading in natural, weakly constrained conditions. More precisely, we describe the a posteriori lip shape and appearance models learnt from corpora that we propose. To be trained, these models require that lips can be located easily on images, which is not the case on natural images. As manually labelling images is time-consuming, and hardly possible on a large corpus, we propose to use automatic methods instead through the use of make-up and the bimodality of speech. First, we defined a shape model for the lips containing two polygons: one for the outer lip contour and the other for the inner lip contour. This model gives the opportunity to extract most lipreading information according to an in-depth bibliographical study. To train this model statistically, we use video sequences where the speakers wear blue lipstick on their lips, which enables easy boundary extraction. We learn the mean shape and the main deformations. Next, we studied statistical appearance models which can only be trained on natural images. On these images, automatic lip location without external constraints is still unsolved. To label lips automatically, we use two repetitions of the same sentence by the same subject, with and without blue make-up: once again, the blue sequence enables easy lip location, and dynamic time warping (DTW) allows us to estimate lip shape on natural images using the extracted shapes on blue images. The appearance model obtained is very similar to the one obtained when training the same initial model with hand-labeled images and is quite a bit better than other models relying on hue. Moreover, the model we built can be adapted to any subject.

Book Chapter•DOI•

Automatic Segmentation of Speech at the Phonetic Level

Jon Ander Gómez¹, María José Castro¹•Institutions (1)

Polytechnic University of Valencia¹

06 Aug 2002-Lecture Notes in Computer Science

TL;DR: A complete automatic speech segmentation technique is studied in order to eliminate the need for manually segmented sentences; the usefulness of the approach is that manually segmented data is not needed in order to train acoustic models.

Abstract: A complete automatic speech segmentation technique has been studied in order to eliminate the need for manually segmented sentences. The goal is to fix the phoneme boundaries using only the speech waveform and the phonetic sequence of the sentences. The phonetic boundaries are established using a Dynamic Time Warping algorithm that uses the a posteriori probabilities of each phonetic unit given the acoustic frame. These a posteriori probabilities are calculated by combining the probabilities of acoustic classes, which are obtained from a clustering procedure on the feature space, and the conditional probabilities of each acoustic class with respect to each phonetic unit. The usefulness of the approach presented here is that manually segmented data is not needed in order to train acoustic models. The results of the obtained segmentation are similar to those obtained using the HTK toolkit with the "flat-start" option activated. Finally, results using Artificial Neural Networks and manually segmented data are also reported for comparison purposes.
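
A posterior-driven alignment of the kind described in this abstract can be sketched as a small dynamic program. This is an illustration under stated assumptions, not the authors' algorithm: the posteriors are assumed to be precomputed, every phonetic unit is assumed to span at least one frame, and the function name `align_phones` is hypothetical.

```python
import math

def align_phones(post):
    """Monotonic alignment of frames to a known phone sequence.

    post[t][k] is the posterior probability of the k-th phone of the
    sentence's phonetic sequence given frame t. Returns the start frame
    of each phone on the path that maximizes the summed log-posterior.
    """
    T, K = len(post), len(post[0])
    if T < K:
        raise ValueError("need at least one frame per phone")
    neg = float("-inf")
    eps = 1e-12  # floor to avoid log(0); an assumption of this sketch
    score = [[neg] * K for _ in range(T)]
    stay = [[True] * K for _ in range(T)]  # True: frame t-1 had the same phone
    score[0][0] = math.log(max(post[0][0], eps))
    for t in range(1, T):
        for k in range(min(t + 1, K)):
            same = score[t - 1][k]
            prev = score[t - 1][k - 1] if k > 0 else neg
            stay[t][k] = same >= prev
            best = max(same, prev)
            if best > neg:
                score[t][k] = best + math.log(max(post[t][k], eps))
    # Backtrack from the last frame and last phone to recover boundaries.
    starts = [0] * K
    k = K - 1
    for t in range(T - 1, 0, -1):
        if not stay[t][k]:
            starts[k] = t
            k -= 1
    return starts

# Toy example: 6 frames, 3 phones in sequence.
post = [
    [0.90, 0.05, 0.05],
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.10, 0.80, 0.10],
    [0.10, 0.80, 0.10],
    [0.10, 0.10, 0.80],
]
boundaries = align_phones(post)
```

Each entry of `boundaries` is the first frame assigned to the corresponding phone, so consecutive entries delimit the segments.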

Journal Article•DOI•

A DTW-based probability model for speaker feature analysis and data mining

[...]

Jingwei Liu¹, Qiansheng Cheng¹, Zhongguo Zheng¹, Minping Qian¹•Institutions (1)

Peking University¹

01 Sep 2002-Pattern Recognition Letters

TL;DR: A DTW-based statistical model is proposed to explore the subspace structures of the speaker feature space for feature evaluation, dimension reduction and inter-class information discovery in pattern space; its usefulness is demonstrated in isolated-digit speaker identification.

Proceedings Article•

Probabilistic models for joint clustering and time-warping of multidimensional curves

[...]

Darya Chudova¹, Scott Gaffney¹, Padhraic Smyth¹•Institutions (1)

University of California, Irvine¹

07 Aug 2002

TL;DR: In this article, a generative mixture model is proposed to align and cluster sets of multidimensional curves measured on a discrete time grid, which allows both local nonlinear time warping and global linear shifts of the observed curves in both time and measurement spaces relative to the mean curves within the clusters.

Abstract: In this paper we present a family of models and learning algorithms that can simultaneously align and cluster sets of multidimensional curves measured on a discrete time grid. Our approach is based on a generative mixture model that allows both local nonlinear time warping and global linear shifts of the observed curves in both time and measurement spaces relative to the mean curves within the clusters. The resulting model can be viewed as a form of Bayesian network with a special temporal structure. The Expectation-Maximization (EM) algorithm is used to simultaneously recover both the curve models for each cluster, and the most likely alignments and cluster membership for each curve. We evaluate the methodology on two real-world data sets, and show that the Bayesian network models provide systematic improvements in predictive power over more conventional clustering approaches.

Book Chapter•DOI•

Learning Intrinsic Video Content Using Levenshtein Distance in Graph Partitioning

[...]

Jeffrey Ng¹, Shaogang Gong¹•Institutions (1)

Queen Mary University of London¹

28 May 2002

TL;DR: The graph partitioning method, and in particular the Normalised Cut model originally introduced for static image segmentation, is extended to unsupervised clustering of temporal trajectories with fully automated model order selection.

Abstract: We present a novel approach for automatically learning models of temporal trajectories extracted from video data. Instead of using a representation of linearly time-normalised vectors of fixed length, our approach makes use of Dynamic Time Warp distance as a similarity measure to capture the underlying ordered structure of variable-length temporal data while removing the non-linear warping of the time scale. We reformulate the structure learning problem as an optimal graph partitioning of the dataset to solely exploit Dynamic Time Warp similarity weights without the need for intermediate cluster centroid representations. We extend the graph partitioning method, and in particular the Normalised Cut model originally introduced for static image segmentation, to unsupervised clustering of temporal trajectories with fully automated model order selection. By computing hierarchical average Dynamic Time Warp for each cluster, we learn warp-free trajectory models and recover the time warp profiles and structural variance in the data. We demonstrate the approach on modelling trajectories of continuous hand gestures and moving objects in an indoor environment.
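
The idea of using DTW distances directly as graph edge weights can be sketched as follows. The exponential kernel and the `sigma` scale are assumptions made for illustration; the abstract does not specify how distances are turned into similarity weights, and the Normalised Cut step itself is not shown. The function names are hypothetical.

```python
import math

def dtw_distance(a, b):
    """DTW distance between two variable-length trajectories.

    a and b are lists of points (tuples of equal dimension).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def affinity_matrix(trajectories, sigma=1.0):
    """Pairwise similarity weights w = exp(-dtw / sigma) (assumed kernel)."""
    return [
        [math.exp(-dtw_distance(p, q) / sigma) for q in trajectories]
        for p in trajectories
    ]

# Toy example: the same path at two speeds, plus an unrelated trajectory.
slow = [(0, 0), (0, 0), (1, 1), (2, 2), (2, 2)]
fast = [(0, 0), (1, 1), (2, 2)]
other = [(5, 5), (5, 5)]
W = affinity_matrix([slow, fast, other])
```

Because the weights are computed pairwise, no cluster centroid or fixed-length resampling is needed; the resulting matrix `W` could then be passed to any graph partitioning routine.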

Patent•

Gaussian model-based dynamic time warping system and method for speech processing

[...]

Jean-François Bonastre, Philippe Morin, Jean-Claude Junqua

18 Dec 2002

TL;DR: The Gaussian Dynamic Time Warping model, as discussed by the authors, provides a hierarchical statistical model for representing an acoustic pattern; it is useful in speech processing applications, particularly word and speaker recognition.

Abstract: The Gaussian Dynamic Time Warping model provides a hierarchical statistical model for representing an acoustic pattern. The first layer of the model represents the general acoustic space; the second layer represents each speaker space; and the third layer represents the temporal structure information contained in each enrollment speech utterance, based on equally spaced time intervals. These three layers are hierarchically developed: the second layer is derived from the first, and the third layer is derived from the second. The model is useful in speech processing applications, particularly word and speaker recognition using a spotting recognition mode.
