
Showing papers on "Dynamic time warping" published in 2001


Proceedings Article
01 Jan 2001
TL;DR: Dynamic time warping (DTW) is a technique for efficiently warping the time axis of sequences that have approximately the same overall component shapes but whose shapes do not line up on the X-axis.
Abstract: Time series are a ubiquitous form of data occurring in virtually every scientific discipline. A common task with time series data is comparing one sequence with another. In some domains a very simple distance measure, such as Euclidean distance, will suffice. However, it is often the case that two sequences have approximately the same overall component shapes, but these shapes do not line up on the X-axis. Figure 1 shows this with a simple example. In order to find the similarity between such sequences, or as a preprocessing step before averaging them, we must "warp" the time axis of one (or both) sequences to achieve a better alignment. Dynamic time warping (DTW) is a technique for efficiently achieving this warping. In addition to data mining (Keogh & Pazzani 2000, Yi et al. 1998, Berndt & Clifford 1994), DTW has been used in gesture recognition (Gavrila & Davis 1995), robotics (Schmill et al. 1999), speech processing (Rabiner & Juang 1993), manufacturing (Gollmer & Posten 1995) and medicine (Caiani et al. 1998).
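The warping the abstract describes is computed with a simple dynamic program over all pairings of points between the two sequences. Below is a minimal sketch of that classic recurrence; the squared-difference local cost, the function name, and the example sequences are illustrative choices rather than details taken from the paper.

```python
def dtw_distance(x, y):
    """Return the DTW distance between two numeric sequences x and y."""
    n, m = len(x), len(y)
    INF = float("inf")
    # cost[i][j] = cheapest warping path aligning x[:i] with y[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (x[i - 1] - y[j - 1]) ** 2           # local distance
            cost[i][j] = d + min(cost[i - 1][j],     # expand y
                                 cost[i][j - 1],     # expand x
                                 cost[i - 1][j - 1]) # advance both
    return cost[n][m]

# Example: two sequences with similar shapes that are shifted in time.
print(dtw_distance([1, 2, 3, 4, 3, 2, 1], [1, 1, 2, 3, 4, 3, 2, 1]))
```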

1,131 citations


Journal ArticleDOI
TL;DR: Time warping is shown to be superior to simple clustering at mapping corresponding time states, and directions for algorithm improvement are discussed, including development of multiple time series alignments and possible applications to causality searches and non-temporal processes.
Abstract: Motivation: Increasingly, biological processes are being studied through time series of RNA expression data collected for large numbers of genes. Because common processes may unfold at varying rates in different experiments or individuals, methods are needed that will allow corresponding expression states in different time series to be mapped to one another. Results: We present implementations of time warping algorithms applicable to RNA and protein expression data and demonstrate their application to published yeast RNA expression time series. Programs executing two warping algorithms are described, a simple warping algorithm and an interpolative algorithm, along with programs that generate graphics that visually present alignment information. We show time warping to be superior to simple clustering at mapping corresponding time states. We document the impact of statistical measurement noise and sample size on the quality of time alignments, and present issues related to statistical assessment of alignment quality through alignment scores. We also discuss directions for algorithm improvement including development of multiple time series alignments and possible applications to causality searches and non-temporal processes (‘concentration warping’). Availability: Academic implementations of alignment programs genewarp and genewarpi and the graphics generation programs grphwarp and grphwarpi are available as Win32 system DOS box executables on our web site along with documentation on their use. The publicly available data on which they were demonstrated may be found at http://genome-www.stanford.edu/cellcycle/. Postscript files generated by grphwarp and grphwarpi may be directly printed or viewed using GhostView software available at http://www.cs.wisc.edu/∼ghost/.

511 citations


Proceedings ArticleDOI
02 Apr 2001
TL;DR: A new distance function D_tw-lb that consistently underestimates the time warping distance and also satisfies the triangular inequality is devised; it achieves significant speedup of up to 43 times with real-world S&P 500 stock data and up to 720 times with very large synthetic data.
Abstract: This paper proposes a novel method for similarity search that supports time warping in large sequence databases. Time warping enables finding sequences with similar patterns even when they are of different lengths. Previous methods for processing similarity search that supports time warping fail to employ multi-dimensional indexes without false dismissal, since the time warping distance does not satisfy the triangular inequality. Our primary goal is to innovate on search performance without permitting any false dismissal. To attain this goal, we devise a new distance function D_tw-lb that consistently underestimates the time warping distance and also satisfies the triangular inequality. D_tw-lb uses a 4-tuple feature vector that is extracted from each sequence and is invariant to time warping. For efficient processing of similarity search, we employ a multi-dimensional index that uses the 4-tuple feature vector as indexing attributes and D_tw-lb as a distance function. The extensive experimental results reveal that our method achieves significant speedup of up to 43 times with real-world S&P 500 stock data and up to 720 times with very large synthetic data.
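To illustrate how a lower bound of this kind can work, the sketch below builds a 4-tuple feature from the first, last, largest and smallest elements of each sequence and takes the maximum absolute difference of corresponding features. This particular feature choice is an assumption in the spirit of the abstract, not necessarily the paper's exact D_tw-lb; its value never exceeds a time warping distance that accumulates absolute local differences, because each of the four differences must be paid by some matched pair on any warping path.

```python
def feature_4tuple(seq):
    # (first, last, max, min) -- unchanged when elements are replicated in time
    return (seq[0], seq[-1], max(seq), min(seq))

def lb_distance(fx, fq):
    # Never larger than a time warping distance that sums absolute
    # local differences, so it can filter candidates without false dismissal.
    return max(abs(a - b) for a, b in zip(fx, fq))

# Illustrative query and data sequences (not from the paper's experiments).
query = [20, 20, 21, 20, 23]
data = [[20, 21, 21, 20, 20, 23, 23, 23], [5, 7, 9, 8], [19, 22, 24, 21]]
fq = feature_4tuple(query)
candidates = [s for s in data if lb_distance(feature_4tuple(s), fq) <= 2.0]
```

Used as a filter, the cheap bound prunes most of the database; the expensive exact time warping distance then only needs to be computed for the surviving candidates.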

337 citations


Book ChapterDOI
03 Sep 2001
TL;DR: It is argued that many time-series classification problems can be solved by detecting and combining local properties or patterns in time series, and a technique is proposed to find patterns which are useful for classification.
Abstract: In this paper, we propose some new tools to allow machine learning classifiers to cope with time series data. We first argue that many time-series classification problems can be solved by detecting and combining local properties or patterns in time series. Then, a technique is proposed to find patterns which are useful for classification. These patterns are combined to build interpretable classification rules. Experiments, carried out on several artificial and real problems, highlight the value of the approach in terms of both the interpretability and the accuracy of the induced classifiers.

337 citations


Book ChapterDOI
02 Jul 2001
TL;DR: Four different fingerprint matching algorithms are combined using the proposed scheme to improve the accuracy of a fingerprint verification system and it is shown that a combination of multiple impressions or multiple fingers improves the verification performance by more than 4% and 5%, respectively.
Abstract: A scheme is proposed for classifier combination at decision level which stresses the importance of classifier selection during combination. The proposed scheme is optimal (in the Neyman-Pearson sense) when sufficient data are available to obtain reasonable estimates of the joint densities of classifier outputs. Four different fingerprint matching algorithms are combined using the proposed scheme to improve the accuracy of a fingerprint verification system. Experiments conducted on a large fingerprint database (∼ 2,700 fingerprints) confirm the effectiveness of the proposed integration scheme. An overall matching performance increase of ∼ 3% is achieved. We further show that a combination of multiple impressions or multiple fingers improves the verification performance by more than 4% and 5%, respectively. Analysis of the results provides some insight into the various decision-level classifier combination strategies.

246 citations


Proceedings ArticleDOI
01 Dec 2001
TL;DR: This paper demonstrates gait recognition using only the trajectories of lower body joint angles projected into the walking plane and uses the expected confusion metric as a means to estimate how well joint-angle signals will perform in a larger population.
Abstract: This paper demonstrates gait recognition using only the trajectories of lower body joint angles projected into the walking plane. For this work, we begin with the position of 3D markers as projected into the sagittal or walking plane. We show a simple method for estimating the planar offsets between the markers and the underlying skeleton and joints; given these offsets we compute the joint angle trajectories. To compensate for systematic temporal variations from one instance to the next (predominantly distance and speed of walk), we fix the number of footsteps and time-normalize the trajectories by a variance-compensated time warping. We perform recognition on two walking databases of 18 people (over 150 walk instances) using a simple nearest-neighbor algorithm with Euclidean distance as the measurement criterion. We also use the expected confusion metric as a means to estimate how well joint-angle signals will perform in a larger population.
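Once the trajectories have been time-normalized to a common length, the recognition step described above reduces to a nearest-neighbour search under Euclidean distance. The sketch below illustrates only that step; the flattened joint-angle representation, the variable names, and the use of NumPy are assumptions for illustration.

```python
import numpy as np

def nearest_neighbor_identity(query, gallery):
    """query: 1-D array of concatenated, time-normalised joint angles.
    gallery: list of (subject_id, trajectory) pairs of the same length."""
    q = np.asarray(query, dtype=float)
    best_id, best_dist = None, np.inf
    for subject_id, trajectory in gallery:
        dist = np.linalg.norm(q - np.asarray(trajectory, dtype=float))  # Euclidean distance
        if dist < best_dist:
            best_id, best_dist = subject_id, dist
    return best_id
```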

216 citations


Proceedings ArticleDOI
13 Jul 2001
TL;DR: This work focuses on the visual sensory information to recognize human activity in the form of hand-arm movements from a small, predefined vocabulary, using a matching technique that determines the distance between the unknown input and a set of previously defined templates.

Abstract: We focus on the visual sensory information to recognize human activity in the form of hand-arm movements from a small, predefined vocabulary. We accomplish this task by means of a matching technique that determines the distance between the unknown input and a set of previously defined templates. A dynamic time warping algorithm is used to perform the time alignment and normalization by computing a temporal transformation allowing the two signals to be matched. The system is trained with finite video sequences of single gesture performances whose start and end points are accurately known. Preliminary experiments are accomplished off-line and result in a recognition accuracy of up to 92%.
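The matching step described above amounts to assigning the unknown gesture the label of the stored template with the smallest DTW distance. A minimal sketch, assuming templates stored as (label, sequence) pairs and any DTW routine such as the dynamic-programming sketch shown for the first paper in this listing:

```python
def recognize_gesture(unknown, templates, dtw_distance):
    """templates: list of (label, reference_sequence) pairs.
    dtw_distance: any DTW routine, e.g. the DP sketch shown earlier."""
    label, _ = min(templates, key=lambda t: dtw_distance(unknown, t[1]))
    return label
```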

184 citations


Proceedings Article
01 Sep 2001
TL;DR: A new methodology for automatic alignment based on dynamic time warping is proposed, in which the spectral peak structure, enhanced by a model of attacks and of silence, is used to compute the local distance.
Abstract: Music alignment is the association of events in a score with points in the time axis of an audio signal. The signal is thus segmented according to the events in the score. We propose a new methodology for automatic alignment based on dynamic time warping, where the spectral peak structure is used to compute the local distance, enhanced by a model of attacks and of silence. The methodology can cope with performances considered difficult to align, like polyphonic music, trills, fast sequences, or multi-instrument music. An optimisation of the representation of the alignment path makes the method applicable to long sound files, so that unit databases can be fully automatically segmented and labeled. On 708 sequences of synthesised music, we achieved an average offset of 25 ms and an error rate of 2.5%.

101 citations


Proceedings ArticleDOI
01 Mar 2001
TL;DR: Time warping, as mentioned in this paper, is a transformation that allows any sequence element to replicate itself as many times as needed without extra cost; the time warping distance is defined as the smallest distance between two sequences transformed by time warping.
Abstract: The sequence database is a set of data sequences, each of which is an ordered list of elements [1]. Sequences of stock prices, money exchange rates, temperature data, product sales data, and company growth rates are the typical examples of sequence databases [2, 8]. Similarity search is an operation that finds sequences or subsequences whose changing patterns are similar to that of a given query sequence [1, 2, 8]. Similarity search is of growing importance in many new applications such as data mining and data warehousing [6, 17]. There have been many research efforts [1, 7, 8, 10, 17] for efficient similarity searches in sequence databases using the Euclidean distance as a similarity measure. However, recent techniques [13–15, 18] tend to favor the time warping distance for its higher accuracy and wider applicability at the expense of high computation cost. Time warping is a transformation that allows any sequence element to replicate itself as many times as needed without extra costs [18]. For example, two sequences X = 〈20, 21, 21, 20, 20, 23, 23, 23〉 and Q = 〈20, 20, 21, 20, 23〉 can be identically transformed into 〈20, 20, 21, 21, 20, 20, 23, 23, 23〉 by time warping. The time warping distance is defined as the smallest distance between two sequences transformed by time warping. While the Euclidean distance can be used only when two sequences compared are of the same length, the time warping distance can be applied to any two sequences of arbitrary lengths. Therefore, the time warping distance fits well with the databases where sequences are of different lengths. The time warping distance can be applied to both whole sequence and subsequence searches. Let us first consider the
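As a worked check of the example in this abstract, the replication view of time warping can be verified directly: both sequences expand into the same nine-element sequence, so their time warping distance is zero. The repeat counts below are one valid choice, shown for illustration.

```python
X = [20, 21, 21, 20, 20, 23, 23, 23]
Q = [20, 20, 21, 20, 23]

def expand(seq, repeats):
    """Replicate seq[i] repeats[i] times (one particular time warping)."""
    out = []
    for value, r in zip(seq, repeats):
        out.extend([value] * r)
    return out

target = [20, 20, 21, 21, 20, 20, 23, 23, 23]
assert expand(X, [2, 1, 1, 1, 1, 1, 1, 1]) == target
assert expand(Q, [1, 1, 2, 2, 3]) == target   # hence the time warping distance is 0
```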

85 citations


Book ChapterDOI
01 Jan 2001
TL;DR: A hybrid time series clustering algorithm that uses Dynamic Time Warping (DTW) and Hidden Markov Model (HMM) induction is presented; it worked well in experiments with artificial data.
Abstract: Given a source of time series data, such as the stock market or the monitors in an intensive care unit, there is often utility in determining whether there are qualitatively different regimes in the data and in characterizing those regimes. For example, one might like to know whether the various indicators of a patient’s health measured over time are being produced by a patient who is likely to live or one that is likely to die. In this case, there is a priori knowledge of the number of regimes that exist in the data (two), and the regime to which any given time series belongs can be determined post hoc (by simply noting whether the patient lived or died). However, these two pieces of information are not always present.
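The TL;DR names the two ingredients of the hybrid: DTW and HMM induction. The sketch below shows only a DTW-based clustering stage of the kind such a hybrid could start from, using a generic agglomerative clustering from SciPy. Treating this as the first stage is an assumption about the pipeline, the HMM-induction stage is omitted, and dtw_distance is assumed to be a routine like the dynamic-programming sketch shown earlier in this listing.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def cluster_by_dtw(series, dtw_distance, n_clusters=2):
    """Group time series by pairwise DTW distance with average-link clustering."""
    n = len(series)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = dtw_distance(series[i], series[j])
    labels = fcluster(linkage(squareform(dist), method="average"),
                      t=n_clusters, criterion="maxclust")
    return labels   # e.g. one candidate regime per cluster
```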

82 citations


Proceedings ArticleDOI
01 May 2001
TL;DR: A survey of time series similarity and indexing techniques can be found in this article, where the authors examine a variety of similarity measures, including Lp norms, time warping, longest common subsequence measures, baselines, moving averaging, or deformable Markov model templates.
Abstract: Time series is the simplest form of temporal data. A time series is a sequence of real numbers collected regularly in time, where each number represents a value. Time series data come up in a variety of domains, including stock market analysis, environmental data, telecommunications data, medical data and financial data. Web data that count the number of clicks on given sites, or model the usage of different pages, are also modeled as time series. Therefore time series account for a large fraction of the data stored in commercial databases. There is increasing recognition of this fact, and support for time series as a distinct data type in commercial database management systems is growing. IBM DB2, for example, implements support for time series using data-blades. The pervasiveness and importance of time series data has sparked a lot of research work on the topic. While the statistics literature on time series is vast, it has not studied methods that would be appropriate for the time series similarity and indexing problems we discuss here; much of the relevant work on these problems has been done by the computer science community. One interesting problem with time series data is finding whether different time series display similar behavior. More formally, the problem can be stated as: given two time series X and Y, determine whether they are similar or not (in other words, define and compute a distance function dist(X, Y)). Typically each time series describes the evolution of an object, for example the price of a stock, or the levels of pollution as a function of time at a given data collection station. The objective can be to cluster the different objects into similar groups, or to classify an object based on a set of known object examples. The problem is hard because the similarity model should allow for imprecise matches. One interesting variation is the subsequence similarity problem, where given two time series X and Y, we have to determine those subsequences of X that are similar to pattern Y. To answer these problems, different notions of similarity between time series have been proposed in data mining research. In the tutorial we examine the different time series similarity models that have been proposed, in terms of efficiency and accuracy. The solutions encompass techniques from a wide variety of disciplines, such as databases, signal processing, speech recognition, pattern matching, combinatorics and statistics. We survey proposed similarity techniques, including the Lp norms, time warping, longest common subsequence measures, baselines, moving averaging, or deformable Markov model templates. Another problem that comes up in applications is the indexing problem: given a time series X, and a set of time series S = {Y1,…,YN}, find the time series in S that are most similar to the query X. A variation is the subsequence indexing problem, where given a set of sequences S, and a query sequence (pattern) X, find the sequences in S that contain subsequences that are similar to X. To solve these problems efficiently, appropriate indexing techniques have to be used.
Typically, the similarity problem is related to the indexing problem: simple (and possibly inaccurate) similarity measures are usually easy to build indexes for, while more sophisticated similarity measures make the indexing problem hard and interesting. We examine the indexing techniques that can be used for different models, and the dimensionality reduction techniques that have been proposed to improve indexing performance. A time series of length n can be considered as a tuple in an n-dimensional space. Indexing this space directly is inefficient because of the very high dimensionality. The main idea to improve on it is to use a dimensionality reduction technique that takes the n item long time series, and maps it to a lower dimensional space with k dimensions (hopefully, k ≪ n).
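As an illustration of the dimensionality reduction step sketched above, the example below uses piecewise aggregate approximation, one common choice in this literature: the n-point series is mapped to k segment means with k much smaller than n. The tutorial surveys several reduction techniques, so this particular one is an assumption rather than the method it emphasises.

```python
import numpy as np

def paa(series, k):
    """Piecewise aggregate approximation: reduce a length-n series to k segment means."""
    segments = np.array_split(np.asarray(series, dtype=float), k)
    return np.array([seg.mean() for seg in segments])

print(paa([1, 2, 3, 4, 5, 6, 7, 8], k=4))   # -> [1.5 3.5 5.5 7.5]
```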

Book ChapterDOI
13 Sep 2001
TL;DR: An unsupervised algorithm for segmenting categorical time series exploits two statistical characteristics of meaningful episodes and successfully segments text into words in three languages.
Abstract: This paper describes an unsupervised algorithm for segmenting categorical time series. The algorithm first collects statistics about the frequency and boundary entropy of ngrams, then passes a window over the series and has two "expert methods" decide where in the window boundaries should be drawn. The algorithm segments text into words successfully in three languages. We claim that the algorithm finds meaningful episodes in categorical time series, because it exploits two statistical characteristics of meaningful episodes.
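Both statistics named in the abstract can be computed directly from the series. The sketch below shows one of them, boundary entropy (the entropy of the symbol that follows an ngram), plus a per-position profile of that score. The fixed ngram length and the reading that boundaries tend to fall at peaks of the profile are simplifications; the paper's two-expert voting scheme is not reproduced here.

```python
from collections import Counter, defaultdict
from math import log2

def boundary_entropy(text, n=3):
    """Entropy of the symbol following each ngram of length n."""
    follow = defaultdict(Counter)
    for i in range(len(text) - n):
        follow[text[i:i + n]][text[i + n]] += 1
    entropy = {}
    for gram, counts in follow.items():
        total = sum(counts.values())
        entropy[gram] = -sum((c / total) * log2(c / total) for c in counts.values())
    return entropy

def entropy_profile(text, n=3):
    """Score after each position; word boundaries tend to fall where this peaks."""
    ent = boundary_entropy(text, n)
    return [ent.get(text[max(0, i - n):i], 0.0) for i in range(1, len(text) + 1)]
```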

Journal ArticleDOI
TL;DR: An adaptive recognition system for isolated handwritten characters is described, together with the experiments carried out with it; the adaptation turns a writer-independent system into a writer-dependent one and increases recognition performance.
Abstract: This paper describes an adaptive recognition system for isolated handwritten characters and the experiments carried out with it. The characters used in our experiments are alphanumeric characters, including both the upper- and lower-case versions of the Latin alphabets and three Scandinavian diacriticals. The writers are allowed to use their own natural style of writing. The recognition system is based on the k-nearest neighbor rule. The six character similarity measures applied by the system are all based on dynamic time warping. The aim of the first experiments is to choose the best combination of the simple preprocessing and normalization operations and the dissimilarity measure for a multi-writer system. However, the main focus of the work is on online adaptation. The purpose of the adaptations is to turn a writer-independent system into a writer-dependent one and increase recognition performance. The adaptation is carried out by modifying the prototype set of the classifier according to its recognition performance and the user's writing style. The ways of adaptation include: (1) adding new prototypes; (2) inactivating confusing prototypes; and (3) reshaping existing prototypes. The reshaping algorithm is based on the Learning Vector Quantization. Four different adaptation strategies, according to which the modifications of the prototype set are performed, have been studied both offline and online. Adaptation is carried out in a self-supervised fashion during normal use and thus remains unnoticed by the user.
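The third form of adaptation above, reshaping, follows the Learning Vector Quantization idea: the nearest prototype is pulled toward a correctly recognised sample and pushed away from a misrecognised one. A minimal sketch, assuming a fixed-length vector representation of each character and a hypothetical learning rate; the actual system reshapes DTW prototypes, which additionally requires a point-to-point alignment first.

```python
import numpy as np

def lvq_reshape(prototype, sample, recognised_correctly, learning_rate=0.05):
    """Pull the prototype toward the sample if the label was right, push it away if wrong."""
    p = np.asarray(prototype, dtype=float)
    s = np.asarray(sample, dtype=float)
    direction = 1.0 if recognised_correctly else -1.0
    return p + direction * learning_rate * (s - p)
```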

Book ChapterDOI
T. Mori, K. Uehara
18 Sep 2001
TL;DR: Tagging motion data amounts to expressing motion with association rules, which consist of symbols that uniquely represent basic patterns in the motion data.
Abstract: In the past several years, human motion data has been used in domains such as SFX movies, CGs and so on. Motion data is captured by a motion capture system that records the positions of sensors on body joints, so we obtain motion data as 3-D time series data. Since motion data consists of multiple time series streams for 17 body parts, the amount of data is huge. Furthermore, motion data is expensive. A motion database can help creators produce motion data at lower cost. The database, however, requires a content-based retrieval method because it is difficult to identify a motion using a keyword approach. Consequently, we introduce association rules which represent dependency between body parts. Association rules represent the motion of body parts and can be used as visual tags. We introduce a method to discover dependency between body parts as association rules. Association rules consist of symbols uniquely representing basic patterns. We call these basic patterns primitive motions. Primitive motions are extracted from motion data by segmentation and clustering processes. Finally, we discuss some experiments to discover association rules from multi-streams of motion data.

Journal ArticleDOI
TL;DR: The new implementation of dynamic time warping can be used to align the major components of the event-related potential across repeated single trials and shows significant improvement over some commonly used methods.

Proceedings ArticleDOI
01 Jan 2001
TL;DR: The effectiveness of piecewise linear 2D warping, a dynamic programming-based elastic image matching technique, in handwritten character recognition is investigated in this paper; the technique is capable of compensating for most variations in character patterns with tractable computation.
Abstract: The effectiveness of piecewise linear 2D warping, a dynamic programming-based elastic image matching technique, in handwritten character recognition is investigated. The technique presented is capable of providing compensation for most variations in character patterns with tractable computation. The superiority of the present technique over several conventional 2D warping techniques in variation compensation is experimentally justified. Another comparison with monotonic and continuous 2D warping, a more flexible matching technique, reveals that the method presented takes far less computation than the latter, yet provides almost the same recognition accuracy for most categories.

PatentDOI
TL;DR: In this article, it is proposed that speech recognition be implemented in the form of a predefined sequence of states, such that upon recognition of an appropriate voice command, the system changes from one state to another state, and this change takes place in dependence on at least one speech recognition parameter.
Abstract: To control an arbitrary system by speech recognition, it is proposed that speech recognition be implemented in the form of a predefined sequence of states, such that, upon recognition of an appropriate voice command, the system changes from one state to another state, and this change takes place in dependence on at least one speech recognition parameter. The speech recognition parameters can influence, for example, the so-called “false acceptance rate” (FAR) and/or the “false rejection rate” (FRR), which thus are set to state-specific values for the individual states, in order to achieve improved recognition accuracy.
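To illustrate the control idea in the abstract, the sketch below models a few recognition states, each carrying its own state-specific recognition parameters (here FAR and FRR thresholds), with a recognised voice command driving the transition to the next state. All state names, threshold values and transitions are hypothetical; the patent does not prescribe any of them.

```python
# Hypothetical states, thresholds and transitions -- for illustration only.
STATES = {
    "idle":      {"far": 0.01, "frr": 0.10, "on": {"wake up": "listening"}},
    "listening": {"far": 0.05, "frr": 0.02, "on": {"stop": "idle", "dial": "dialing"}},
    "dialing":   {"far": 0.02, "frr": 0.05, "on": {"cancel": "listening"}},
}

def next_state(current, recognized_command):
    """Change state only when the recognised command is valid in the current state."""
    return STATES[current]["on"].get(recognized_command, current)

def thresholds(state):
    """State-specific recognition parameters used to tune the recogniser."""
    return STATES[state]["far"], STATES[state]["frr"]
```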

Patent
09 Jan 2001
TL;DR: In this article, a method and system for speech recognition combines different types of engines in order to recognize user-defined digits and control words, predefined digits and words, and nametags.
Abstract: A method and system for speech recognition combines different types of engines in order to recognize user-defined digits and control words, predefined digits and control words, and nametags. Speaker-independent engines are combined with speaker-dependent engines. A Hidden Markov Model (HMM) engine is combined with Dynamic Time Warping (DTW) engines.

Proceedings ArticleDOI
05 Oct 2001
TL;DR: The prefix-querying approach based on sliding windows is incorporated, which provides effective and scalable subsequence matching even for large databases and achieves significant speedup with real-world S&P 500 stock data and with very large synthetic data.
Abstract: This paper discusses an index-based subsequence matching method that supports time warping in large sequence databases. Time warping enables finding sequences with similar patterns even when they are of different lengths. In our earlier work, we suggested an efficient method for whole matching under time warping. This method constructs a multi-dimensional index on a set of feature vectors, which are invariant to time warping, from data sequences. For filtering at feature space, it also applies a lower-bound function, which consistently underestimates the time warping distance as well as satisfies the triangular inequality. In this paper, we incorporate the prefix-querying approach based on sliding windows into the earlier approach. For indexing, we extract a feature vector from every subsequence inside a sliding window and construct a multi-dimensional index using a feature vector as indexing attributes. For query processing, we perform a series of index searches using the feature vectors of qualifying query prefixes. Our approach provides effective and scalable subsequence matching even for a large database. We also prove that our approach does not incur false dismissal. To verify the superiority of our method, we perform extensive experiments. The results reveal that our method achieves significant speedup with real-world S&P 500 stock data and with very large synthetic data.
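A sketch of the indexing side described above: a window slides over every data sequence and a time-warping-invariant feature vector is stored for each window position. The (first, last, max, min) feature and the plain list standing in for the multi-dimensional index are illustrative assumptions; the paper builds a real multi-dimensional index over its own feature vectors.

```python
def build_window_index(sequences, window):
    """Return (feature, sequence_id, offset) entries for every sliding window."""
    index = []
    for seq_id, seq in enumerate(sequences):
        for off in range(len(seq) - window + 1):
            w = seq[off:off + window]
            feature = (w[0], w[-1], max(w), min(w))   # unchanged by time warping
            index.append((feature, seq_id, off))
    return index
```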

Patent
Bi Ning
11 Jul 2001
TL;DR: In this article, a method and apparatus for constructing voice templates for a speaker-independent voice recognition system includes segmenting a training utterance to generate time-clustered segments, each segment being represented by a mean.
Abstract: A method and apparatus for constructing voice templates for a speaker-independent voice recognition system includes segmenting a training utterance to generate time-clustered segments, each segment being represented by a mean. The means for all utterances of a given word are quantized to generate template vectors. Each template vector is compared with testing utterances to generate a comparison result. The comparison is typically a dynamic time warping computation. The training utterances are matched with the template vectors if the comparison result exceeds at least one predefined threshold value, to generate an optimal path result, and the training utterances are partitioned in accordance with the optimal path result. The partitioning is typically a K-means segmentation computation. The partitioned utterances may then be re-quantized and re-compared with the testing utterances until the at least one predefined threshold value is not exceeded.

PatentDOI
Horst-Udo Hain
TL;DR: In this article, the assignment of phonemes to graphemes producing them in a lexicon having words (grapheme sequences) and their associated phonetic transcription (phoneme sequences) for the preparation of patterns for training neural networks for the purpose of grapheme-phonemic conversion is carried out with the aid of a variant of dynamic programming which is known as dynamic time warping (DTW).
Abstract: To prepare patterns for training neural networks for grapheme-phoneme conversion, phonemes are assigned to the graphemes that produce them in a lexicon containing words (grapheme sequences) and their associated phonetic transcriptions (phoneme sequences); this assignment is carried out with the aid of a variant of dynamic programming known as dynamic time warping (DTW).

Proceedings ArticleDOI
25 Jul 2001
TL;DR: Fuzzy clustering and dynamic time warping methods are used to deal with fuzzy groupings of data attributes as well as with degrees of distance between time series patterned attributes, respectively.
Abstract: Data mining, as an active field, discovers useful knowledge from large data sets. This paper focuses on continuous time series data that are often encountered in real applications (e.g., sales records, economic data and stock transactions) and discusses how to discover the hidden relationships among time series patterns in terms of their similarities. Fuzzy clustering and dynamic time warping (DTW) methods are used to deal with fuzzy groupings of data attributes and with degrees of distance between time series patterned attributes, respectively. An economic time series example is provided to help illustrate the ideas.

Patent
13 Nov 2001
TL;DR: In this article, a dynamic time warping algorithm is used to normalize length of the input utterance to match the length of a model utterance previously stored for the person.
Abstract: A biometric identification method of identifying a person combines facial identification steps with audio identification steps. In order to reduce vulnerability of a recognition system to deception using photographs or even three-dimensional masks or replicas, the system uses a sequence of images to verify that lips and chin are moving as a predetermined sequence of sounds is uttered by a person who desires to be identified. In order to compensate for variations in speed of making the utterance, a dynamic time warping algorithm is used to normalize length of the input utterance to match the length of a model utterance previously stored for the person. In order to prevent deception based on two-dimensional images, preferably two cameras pointed in different directions are used for facial recognition.

Proceedings ArticleDOI
13 Nov 2001
TL;DR: The main motivation for these methods is two- and higher-dimensional point-pattern matching, and therefore these methods are generalized to the 2D case, and it is shown that this generalization leads to an NP-complete problem.
Abstract: Edit distance is a powerful measure of similarity in string matching, measuring the minimum amount of insertions, deletions, and substitutions to convert a string into another string. This measure is often contrasted with time warping in speech processing, which measures how close two trajectories are by allowing compression and expansion operations on the time scale. Time warping can be easily generalized to measure the similarity between 1D point-patterns (ascending lists of real values), as the difference between the ith and (i-1)th points in a point-pattern can be considered as the value of a trajectory at time i. However, we show that edit distance is a more natural choice, and derive a measure by calculating the minimum amount of space needed to insert and delete between points to convert a point-pattern into another. We show that this measure defines a metric. We also define a substitution operation such that the distance calculation automatically separates the points into matching and mismatching points. The algorithms are based on dynamic programming. The main motivation for these methods is two- and higher-dimensional point-pattern matching, and therefore we generalize these methods to the 2D case, and show that this generalization leads to an NP-complete problem. There are also applications for the 1D case; we briefly discuss the matching of tree ring sequences in dendrochronology.

Proceedings ArticleDOI
01 Jan 2001
TL;DR: A new approach towards feature representation for speech recognition, named the state transition matrix (STM), is proposed to address the temporal variation problem in speech recognition using only a single-layer perceptron neural network.

Abstract: A high performance neural-network-based speech recognition system is presented. A new approach towards feature representation for speech recognition, named the state transition matrix (STM), is proposed to address the temporal variation problem in speech recognition. Using STM, we need only a single-layer perceptron neural network to perform speech recognition. Experimental results show that overall accuracies of 95% and 87% were achieved for speaker-dependent isolated word recognition and multi-speaker-dependent isolated word recognition, respectively.

Proceedings Article
01 Jan 2001
TL;DR: Improved automatic speaker verification performance is demonstrated, and a hybrid system embedding DTW into a GMM is presented, showing an equal error improvement over a standard GMM.
Abstract: Standard Gaussian mixture modelling does not possess time sequence information (TSI) other than that which might be embedded in the acoustic features. Dynamic time warping relates directly to TSI, time-warping two sequences of features into alignment. Here, a hybrid system embedding DTW into a GMM is presented. Improved automatic speaker verification performance is demonstrated. Testing 1000 speakers in a fully text independent, world-model-adapted mode shows an equal error improvement over a standard GMM from 4.1% to 3.8%.

Patent
05 Sep 2001
TL;DR: In this paper, a method and system that combines voice recognition engines (104, 108, 112, 114, 114) and resolves differences between the results of individual voice recognition engine using a mapping function is presented.
Abstract: A method and system that combines voice recognition engines (104, 108, 112, 114) and resolves differences between the results of individual voice recognition engines (104, 106, 108, 112, 114) using a mapping function. Speaker independent voice recognition engine (104) and speaker-dependent voice recognition engine (106) are combined. Hidden Markov Model (HMM) engines (108, 114) and Dynamic Time Warping (DTW) engines (104, 106, 112) are combined.

Proceedings ArticleDOI
01 Jan 2001
TL;DR: The paper investigates the use of multilayer perceptron (MLP) and dynamic time warping (DTW) in recognizing these Malay syllable sounds that are quite similar to each other.
Abstract: Attempts at creating and using the first database for Malay syllables are presented. The speech vocabulary consists of 16 Malay syllables which are initialized with plosives and followed by succeeding vowels. The paper investigates the use of multilayer perceptron (MLP) and dynamic time warping (DTW) in recognizing these Malay syllable sounds, which are quite similar to each other. From the experimental results, DTW and MLP achieved overall average recognition rates of 77.09% and 90.82%, respectively.

Patent
Ajit V. Rao
29 Jun 2001
TL;DR: In this paper, the warp contours are modeled as points on a polynomial trace and the optimum warp contour is calculated by maximizing the modeling function, using only a subset of possible contours contained within a sub-range of the range of contours.
Abstract: A signal modification technique facilitates compact voice coding by employing a continuous, rather than piece-wise continuous, time warp contour to modify an original residual signal to match an idealized contour, avoiding edge effects caused by prior art techniques. Warping is executed using a continuous warp contour lacking spatial discontinuities which does not invert or overly distend the positions of adjacent end points in adjacent frames. The linear shift implemented by the warp contour is derived via quadratic approximation or other method, to reduce the complexity of coding to allow for practical and economical implementation. In particular, the algorithm for determining the warp contour uses only a subset of possible contours contained within a sub-range of the range of possible contours. The relative correlation strengths from these contours are modeled as points on a polynomial trace and the optimum warp contour is calculated by maximizing the modeling function.

Patent
17 Jul 2001
TL;DR: In this paper, a method and system that combines voice recognition engines and resolves any differences between the results of individual voice recognition engine is presented, where a speaker independent (SI) Hidden Markov Model (HMM) engine and a speaker dependent Dynamic Time Warping (DTW-SD) engine are combined.
Abstract: A method and system that combines voice recognition engines and resolves any differences between the results of individual voice recognition engines. A speaker independent (SI) Hidden Markov Model (HMM) engine, a speaker independent Dynamic Time Warping (DTW-SI) engine and a speaker dependent Dynamic Time Warping (DTW-SD) engine are combined. Combining and resolving the results of these engines results in a system with better recognition accuracy and lower rejection rates than using the results of only one engine.