Showing papers on "Dynamic time warping" published in 1994

Proceedings Article•

Using dynamic time warping to find patterns in time series

Donald J. Berndt¹, James Clifford¹•Institutions (1)

31 Jul 1994

TL;DR: Preliminary experiments with a dynamic programming approach to pattern detection in databases, based on the dynamic time warping technique used in the speech recognition field, are described.

Abstract: Knowledge discovery in databases presents many interesting challenges within the context of providing computer tools for exploring large data archives. Electronic data repositories are growing quickly and contain data from commercial, scientific, and other domains. Much of this data is inherently temporal, such as stock prices or NASA telemetry data. Detecting patterns in such data streams or time series is an important knowledge discovery task. This paper describes some preliminary experiments with a dynamic programming approach to the problem. The pattern detection algorithm is based on the dynamic time warping technique used in the speech recognition field.

3,229 citations
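
The dynamic programming recurrence behind dynamic time warping, which the abstract above names as the basis of its pattern detection algorithm, can be sketched in a few lines of Python. This is a minimal textbook formulation and not the authors' implementation: the function name `dtw_distance`, the absolute-difference local cost, and the three-way step pattern are all assumptions chosen for illustration.

```python
def dtw_distance(a, b):
    """Cumulative DTW cost between two numeric sequences.

    Assumptions (not taken from the paper): absolute difference as the
    local cost, and the basic step pattern (match, insertion, deletion).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # advance in both sequences
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
            )
    return cost[n][m]
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` evaluates to 0.0, because the repeated 2 is absorbed by the warping rather than counted as a mismatch.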

Using Dynamic Time Warping to Find Patterns in Time Series

James Clifford

01 Jan 1994

TL;DR: In this paper, a dynamic time warping technique used in the speech recognition field is used to detect patterns in data streams or time series, such as stock prices or NASA telemetry data.

Abstract: Knowledge discovery in databases presents many interesting challenges within the context of providing computer tools for exploring large data archives. Electronic data repositories are growing quickly and contain data from commercial, scientific, and other domains. Much of this data is inherently temporal, such as stock prices or NASA telemetry data. Detecting patterns in such data streams or time series is an important knowledge discovery task. This paper describes some preliminary experiments with a dynamic programming approach to the problem. The pattern detection algorithm is based on the dynamic time warping technique used in the speech recognition field.

161 citations

Posted Content•

Aligning Noisy Parallel Corpora Across Language Groups : Word Pair Feature Matching by Dynamic Time Warping

Pascale Fung¹, Kathleen R. McKeown¹•Institutions (1)

Columbia University¹

22 Sep 1994-arXiv: Computation and Language

TL;DR: The authors proposed a new algorithm called DK-vec for aligning pairs of Asian/Indo-European noisy parallel texts without sentence boundaries, which uses frequency, position and recency information as features for pattern matching.

Abstract: We propose a new algorithm called DK-vec for aligning pairs of Asian/Indo-European noisy parallel texts without sentence boundaries. DK-vec improves on previous alignment algorithms in that it handles better the non-linear nature of noisy corpora. The algorithm uses frequency, position and recency information as features for pattern matching. Dynamic Time Warping is used as the matching technique between word pairs. This algorithm produces a small bilingual lexicon which provides anchor points for alignment.

81 citations

Journal Article•DOI•

Prototype-based minimum classification error/generalized probabilistic descent training for various speech units

Erik McDermott, Shigeru Katagiri

01 Oct 1994-Computer Speech & Language

TL;DR: This work extends LVQ into a prototype-based minimum error classifier appropriate for the classification of various speech units which the original LVQ was unable to treat, and discusses the issue of smoothing the loss function from the perspective of increasing classifier robustness.

52 citations

Proceedings Article•

Aligning Noisy Parallel Corpora Across Language Groups : Word Pair Feature Matching by Dynamic Time Warping

Pascale Fung¹, Kathleen R. McKeown•Institutions (1)

Columbia University¹

05 Oct 1994

TL;DR: A new algorithm called DK-vec is proposed for aligning pairs of Asian/Indo-European noisy parallel texts without sentence boundaries that handles better the non-linear nature of noisy corpora.

Abstract: We propose a new algorithm, DK-vec, for aligning pairs of Asian/Indo-European noisy parallel texts without sentence boundaries. The algorithm uses frequency, position and recency information as features for pattern matching. Dynamic Time Warping is used as the matching technique between word pairs. This algorithm produces a small bilingual lexicon which provides anchor points for alignment.

42 citations

Book Chapter•DOI•

Algorithms for Signature Verification

Giuseppe Pirlo¹•Institutions (1)

University of Bari¹

01 Jan 1994

TL;DR: The need to ensure that only the right people are authorized for high-security access has led to the development of systems for automatic personal verification.

Abstract: The need to ensure that only the right people are authorized for high-security access has led to the development of systems for automatic personal verification.

39 citations

Proceedings Article•DOI•

A connectionist recognizer for on-line cursive handwriting recognition

S. Manke¹, U. Bodenhausen¹•Institutions (1)

Karlsruhe Institute of Technology¹

19 Apr 1994

TL;DR: The MS-TDNN integrates the high accuracy single character recognition capabilities of a TDNN with a non-linear time alignment procedure (dynamic time warping algorithm) for finding stroke and character boundaries in isolated, handwritten characters and words.

Abstract: Shows how the multi-state time delay neural network (MS-TDNN), which is already used successfully in continuous speech recognition tasks, can be applied both to online single character and cursive (continuous) handwriting recognition. The MS-TDNN integrates the high accuracy single character recognition capabilities of a TDNN with a non-linear time alignment procedure (dynamic time warping algorithm) for finding stroke and character boundaries in isolated, handwritten characters and words. In this approach each character is modelled by up to 3 different states and words are represented as a sequence of these characters. The authors describe the basic MS-TDNN architecture and the input features used in the paper, and present results (up to 97.7% word recognition rate) both on writer dependent/independent, single character recognition tasks and writer dependent, cursive handwriting tasks with varying vocabulary sizes up to 20000 words.

37 citations

Patent•DOI•

Speaker verification, speech recognition and channel normalization through dynamic time/frequency warping

Randy G. Goldberg¹, Gerard P. Lynch¹, Richard R. Rosinski¹•Institutions (1)

Alcatel-Lucent¹

21 Sep 1994-Journal of the Acoustical Society of America

TL;DR: In this paper, a Dynamic Time/Frequency Warping (DTFW) technique was proposed for speaker verification, speech recognition and channel normalization, among other uses. The DTFW technique utilizes best-path dynamic programming methods using a 3-dimensional time frequency array representing the spectral differences between a test utterance and a reference utterance (template).

Abstract: A Dynamic Time/Frequency Warping (DTFW) technique is disclosed for speaker verification, speech recognition and channel normalization, among other uses. The DTFW technique utilizes best-path dynamic programming methods using a 3-dimensional time frequency array representing the spectral differences between a test utterance (the utterance being analyzed) and a reference utterance (template). The array is created by summing the squares of the differences of each feature in each frame of the template with each feature in each frame of the utterance in question. Dynamic programming techniques are then used to find the minimal distance path matching the test utterance and the template so as to optimize the time and frequency warping paths.

32 citations
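
The two steps described in the abstract above (building a cost array from summed squared feature differences, then finding a minimal-distance path by dynamic programming) can be illustrated with a deliberately simplified Python sketch. It warps along the time axis only and does not attempt the frequency warping or the 3-dimensional array of the disclosed technique. The names `frame_cost` and `align_frames`, and the representation of an utterance as a list of equal-length feature lists, are hypothetical choices for illustration.

```python
def frame_cost(x, y):
    """Sum of squared differences between two equal-length feature frames."""
    return sum((p - q) ** 2 for p, q in zip(x, y))


def align_frames(test, template):
    """Time-only alignment of two non-empty frame sequences.

    Returns (total_cost, path), where path is a list of
    (test_index, template_index) pairs. Simplifying assumption: no
    frequency warping is performed, unlike the technique in the abstract.
    """
    n, m = len(test), len(template)
    inf = float("inf")
    # acc[i][j] is the minimal accumulated cost of aligning test[:i]
    # with template[:j].
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i][j] = frame_cost(test[i - 1], template[j - 1]) + min(
                acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1]
            )
    # Backtrack from the end to recover one minimal-cost path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        )
    path.reverse()
    return acc[n][m], path
```

In this sketch a test utterance that merely repeats one frame of the template aligns at zero cost, with the repeated frames mapped to the same template index.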

Book•DOI•

Advances in pattern recognition systems using neural network technologies

I. Guyon, Patrick S.-P. Wang

01 Jan 1994

TL;DR: The collected chapters include a connectionist approach to speech recognition, signature verification with a Siamese TDNN, and an integrated architecture for recognition of totally unconstrained handwritten numerals.

Abstract (partial contents): A connectionist approach to speech recognition (Y. Bengio); Signature verification with a Siamese TDNN (J. Bromley et al.); Boosting performance in neural networks (H. Drucker et al.); An integrated architecture for recognition of totally unconstrained hand-written numerals (A. Gupta et al.); Time warping network: a neural approach to hidden Markov model-based speech recognition (E. Levin et al.); Computing optical flow with a recurrent neural network (H. Li and J. Wang); Integrated segmentation and recognition through exhaustive scans or learned saccadic jumps (G. Martin et al.); Experimental comparison of the effect of order in recurrent neural networks (C.B. Miller and C.L. Giles); Adaptive classification by neural net based prototype populations (K. Peleg and U. Ben Hanan); A neural system for the recognition of partially occluded objects in cluttered scenes: a pilot study (L. Wiskott and C. von der Malsburg).

24 citations

Proceedings Article•DOI•

Dynamic time warping with path control and non-local cost

Y. Stettiner¹, David Malah, D. Chazan•Institutions (1)

Technion – Israel Institute of Technology¹

09 Oct 1994

TL;DR: This work proposes a multidimensional dynamic-programming technique which can efficiently solve time-warping optimization problems involving colored noise, and allows control over the warping function curvature.

Abstract: Dynamic time warping (DTW) is a dynamic programming technique widely used for solving time-alignment problems. The classical DTW constrains only the first derivative of the warping function, hence allowing no direct control over the warping function curvature. Moreover, it implicitly assumes, inappropriately for some applications, that the noise is white. We propose a multidimensional dynamic-programming technique which can efficiently solve time-warping optimization problems involving colored noise, and allows control over the warping function curvature. The technique is demonstrated for the co-channel speech separation problem. Applications employing DTW can benefit from the new technique, which offers improved accuracy and robustness in the presence of colored noise and competing speech.

12 citations

Journal Article•DOI•

State-dependent time warping in the trended hidden Markov model

D. X. Sun¹, Li Deng², Chien-Fu Wu³•Institutions (3)

Stony Brook University¹, University of Waterloo², University of Michigan³

02 Sep 1994-Signal Processing

TL;DR: An algorithm for estimating state-dependent polynomial coefficients in the nonstationary-state hidden Markov model (or the trended HMM) which allows for the flexibility of linear time warping or scaling in individual model states is presented.

Journal Article•

On the Application of Automatic Waveform Editing for Time Warping Digital and Analog Recordings

Guy Spleesters, Werner Verhelst¹, Aron Wahl•Institutions (1)

Vrije Universiteit Brussel¹

01 Feb 1994-Journal of The Audio Engineering Society

TL;DR: In this article, an automatic time-warping method is proposed which is based on a criterion of maximal local similarity between original and time-warped waveforms. Tests on time domain and subband domain representations of audio are discussed and show the practicality of the approach.

Abstract: The paper addresses the problem of modifying the playback speed of audio recordings while maintaining high signal quality and naturalness (i.e., time scaling while preserving frequency domain characteristics). An automatic time-warping method is proposed which is based on a criterion of maximal local similarity between original and time-warped waveforms. Tests on time domain and subband domain representations of audio are discussed and show the practicality of the approach taken.

Proceedings Article•DOI•

Word recognition using a neural network and a phonetically based DTW

Y. Matsuura¹, H. Miyazawa¹, T.E. Skinner•Institutions (1)

Meidensha¹

06 Sep 1994

TL;DR: The authors have developed a speaker-independent, isolated-word recognition system using a neural network to recognize the underlying sequence of phonemes and a dynamic time warping technique to time-align the recognized sequence of phonemes with corresponding lexical sequences of phonemes.

Abstract: The authors have developed a speaker-independent, isolated-word recognition system using a neural network to recognize the underlying sequence of phonemes and a dynamic time warping (DTW) technique to time-align the recognized sequence of phonemes with corresponding lexical sequences of phonemes. A significant feature of this system is the ability to easily change the vocabulary, since the lexical entries are simply derived from their phoneme sequences.
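
The idea of time-aligning a recognized phoneme sequence against lexical phoneme sequences can be sketched as DTW over symbols. This is only an illustration under stated assumptions, not the system described above: the 0/1 mismatch cost, the helper names `symbol_dtw` and `best_word`, and the toy lexicon with its phoneme labels are all hypothetical.

```python
def symbol_dtw(recognized, reference):
    """DTW cost between two symbol sequences with a 0/1 local cost.

    Assumption: a mismatch costs 1 and a match costs 0. The abstract
    does not specify the local distance used in the actual system.
    """
    n, m = len(recognized), len(reference)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = 0 if recognized[i - 1] == reference[j - 1] else 1
            cost[i][j] = local + min(
                cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1]
            )
    return cost[n][m]


def best_word(recognized, lexicon):
    """Return the lexicon entry whose phoneme sequence aligns most cheaply."""
    return min(lexicon, key=lambda word: symbol_dtw(recognized, lexicon[word]))


# Hypothetical toy lexicon: each word maps to its phoneme sequence.
LEXICON = {
    "cat": ["k", "ae", "t"],
    "cut": ["k", "ah", "t"],
    "dog": ["d", "ao", "g"],
}
```

Because each entry is just a phoneme list, changing the vocabulary in this sketch amounts to editing `LEXICON`, which mirrors the vocabulary flexibility the abstract highlights.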

Proceedings Article•DOI•

A warped time-frequency expansion for speech signal representation

P.L. Silsbee¹, Stephen A. Zahorian¹, Zaki B. Nossair¹•Institutions (1)

Old Dominion University¹

25 Oct 1994

TL;DR: A novel representation for speech signals is proposed, in which the time-varying frequency content of a speech segment is represented as a weighted sum of two-dimensional basis vectors which incorporate both frequency warping and frequency-dependent time warping.

Abstract: A novel representation for speech signals is proposed. The time-varying frequency content of a speech segment is represented as a weighted sum of two-dimensional basis vectors; these incorporate both frequency warping and frequency-dependent time warping. This is quite flexible; for example, any arbitrary time or frequency warping function can easily be implemented, and any time-frequency representation can be used as the starting point. Examples are presented which demonstrate desirable characteristics of the representation: (1) explicit quantification of parameter trajectories, (2) time resolution which varies with respect to time and frequency, and (3) the ability to reconstruct a time-frequency plot which reflects the resolution characteristics of the representation.

Proceedings Article•

Nonlinear time alignment in stochastic trajectory models for speech recognition.

Mohamed Afify, Yifan Gong, Jean-Paul Haton

01 Jan 1994

Proceedings Article•

Accent Phrase Segmentation by Finding N-Best Sequences of Pitch Pattern Templates

Mitsuru Nakai, Hiroshi Shimodaira

01 Sep 1994

TL;DR: A prosodic method for segmenting continuous speech into accent phrases by using dynamic time warping between F0 contours of input speech and reference accent patterns called pitch pattern templates is described.

Abstract: This paper describes a prosodic method for segmenting continuous speech into accent phrases. Optimum sequences are obtained on the basis of a least-squared-error criterion by using dynamic time warping between F0 contours of input speech and reference accent patterns called ‘pitch pattern templates’. However, the optimum sequence does not always agree well with phrase boundaries labeled by hand, while the second or the third optimum candidate sequence does. Therefore, we extend our system to find multiple candidates by using an N-best algorithm. Evaluation tests were carried out using the ATR continuous speech database of 10 speakers. The results showed that about 97% of phrase boundaries were correctly detected when we took the 30-best candidates, and this accuracy is 7.5% higher than that of the conventional method without the N-best search algorithm.

Proceedings Article•DOI•

Text-dependent speaker verification using data fusion and channel detection

Khaled Assaleh¹, Kevin R. Farrell¹, M.S. Zilovic¹, Manish Sharma¹, Devang Naik¹, Richard J. Mammone¹•Institutions (1)

Rutgers University¹

25 Oct 1994

TL;DR: A new system is presented for text-dependent speaker verification that uses data fusion concepts to combine the results of distortion-based and discriminant-based classifiers and is found to perform exceptionally well.

Abstract: A new system is presented for text-dependent speaker verification. The system uses data fusion concepts to combine the results of distortion-based and discriminant-based classifiers. Hence, both intraspeaker and interspeaker information are utilized in the final decision. The distortion and discriminant-based classifiers used are dynamic time warping and the neural tree network, respectively. The system is evaluated with several hundred one-word utterances collected over a telephone channel. All handsets considered in this experiment use electret microphones. The new system is found to perform exceptionally well for this task. A second experiment uses handsets having both electret and carbon button microphones. Here, a channel detection scheme is proposed that improves performance under these conditions. © 1994 SPIE, The International Society for Optical Engineering.

Issues in acoustic modeling of speech for automatic speech recognition

Yifan Gong¹, Jean-Paul Haton¹, Jean-François Mari¹•Institutions (1)

French Institute for Research in Computer Science and Automation¹

01 Jan 1994

TL;DR: This paper deals with the improvement of stochastic techniques, especially for a better representation of time varying phenomena.

Abstract: Stochastic modeling is a flexible method for handling the large variability in speech for recognition applications. In contrast to dynamic time warping where heuristic training methods for estimating word templates are used, stochastic modeling allows a probabilistic and automatic training for estimating models. This paper deals with the improvement of stochastic techniques, especially for a better representation of time varying phenomena.

Proceedings Article•

Nonstationary-state hidden Markov model with state-dependent time warping: application to speech recognition.

Don X. Sun, Li Deng

01 Jan 1994

Proceedings Article•DOI•

Speech recognition by extended loop neural network

Miao Zhenjiang¹, Yuan Baozong¹•Institutions (1)

Beijing Jiaotong University¹

13 Apr 1994

TL;DR: This speech recognition approach has the features of great adaptivity and fault tolerance to carry out recognition, and can not only perform the recognition task but also, at the same time, restore the correct information from incomplete and even, to some extent, incorrect information.

Abstract: Presents an extended loop neural network approach to speech recognition. This speech recognition approach is characterized by the following important properties due to the associative memory neural network. (1) It has the features of great adaptivity and fault tolerance to carry out recognition. (2) A recognition system can be constructed that allows for the formation of arbitrary nonlinear decision surfaces. (3) The recognition system can not only perform the recognition task but also, at the same time, restore the correct information from incomplete and even, to some extent, incorrect information. Experiments are also conducted and the results show that this speech recognition approach has great application potential.

Corrections and Extensions to T1A1.5/93-152

Stephen Wolf

17 Jan 1994

TL;DR: This contribution presents one minor correction to the recommended value for the fraction above threshold in contribution T1A1.5/93-152, a method for estimating the video delay uncertainty of the automated time alignment algorithm, and an improved motion spike detector that could be used for computing parameters p10 and p11 in T1

Abstract: Contribution T1A1.5/93-152 summarized the methods of measurement for objective video quality parameters based on the Sobel-filtered image and the motion difference image that were submitted prior to conducting the T1A1 subjective experiment (this experiment collected 625 mean opinion scores, i.e., 25 test scenes passed through 25 different video transmission systems that ranged in bit rate from 64 kb/sec to 45 Mb/sec). This contribution presents (1) one minor correction to the recommended value for the fraction above threshold in contribution T1A1.5/93-152, (2) a method for estimating the video delay uncertainty of the automated time alignment algorithm presented in section 3 of contribution T1A1.5/93-152 (non-zero video delay uncertainty may result when dynamic time warping, or variable video delay, is present in the video transmission system, or when there is a substantial number of dropped video frames), (3) a method for using this video delay uncertainty in the computation of the parameters presented in T1A1.5/93-152, and (4) an improved motion spike detector that could be used for computing parameters p10 and p11 in T1A1.5/93-152.

Book Chapter•DOI•

Speech Recognition Using ANNs

Hervé Bourlard, Nelson Morgan¹•Institutions (1)

International Computer Science Institute¹

01 Jan 1994

TL;DR: It has also become clear that the use of higher level knowledge during the recognition process (or more generally, the efficient interaction between multiple knowledge sources) is required to overcome the limitations of current ASR systems.

Abstract: Given all the difficulties presented in Chapter 1, Automatic Speech Recognition (ASR) remains a challenging problem in pattern recognition. After half a century of research, the performance currently achieved by state of the art systems is not yet at the level of a mature technology. Over the years, many technological innovations have boosted the level of performance for more and more difficult tasks. Some of the most significant of these innovations include: (1) pattern matching approaches (e.g., DTW), (2) statistical pattern recognition (e.g., HMMs), (3) better use of a priori phonological knowledge, and (4) integration of syntactic constraints in Continuous Speech Recognition (CSR) algorithms. However, despite impressive improvements, performance on realistic (i.e., fairly unconstrained) tasks is still far too low for effective use. It seems likely that new technological breakthroughs will be needed to achieve the required major performance improvement. Even if one assumes infinite computational power, infinite storage and corresponding memory bandwidth, and an infinite amount of training data, it is still not certain that one could solve the ASR problem in a satisfactory way. It has also become clear that the use of higher level knowledge during the recognition process (or more generally, the efficient interaction between multiple knowledge sources) is required to overcome the limitations of current ASR systems.
