Showing papers on "Dynamic time warping" published in 1992

Proceedings Article•DOI•

Dynamic planar warping for optical character recognition

Esther Levin¹, Roberto Pieraccini¹•Institutions (1)

23 Mar 1992

TL;DR: The authors extend the dynamic time warping algorithm, widely used in automatic speech recognition (ASR), to a dynamic plane warping (DPW) algorithm, for application in the field of optical character recognition (OCR) or similar applications.

Abstract: The authors extend the dynamic time warping (DTW) algorithm, widely used in automatic speech recognition (ASR), to a dynamic plane warping (DPW) algorithm, for application in the field of optical character recognition (OCR) or similar applications. Although direct application of the optimality principle reduces the computational complexity somewhat, the DPW (or image alignment) problem is exponential in the dimensions of the image. It is shown that by applying constraints to the image alignment problem, e.g., limiting the class of possible distortions, one can reduce the computational complexity dramatically, and find the optimal solution to the constrained problem in linear time. A statistical model, the planar hidden Markov model (PHMM), describing statistical properties of images is proposed. The PHMM approach was evaluated using a set of isolated handwritten digits. An overall digit recognition accuracy of 95% was achieved. It is expected that the advantage of this approach will be even more significant for harder tasks, such as cursive-writing recognition and spotting.

162 citations

Journal Article•DOI•

Dimensionality reduction of the enhanced feature set for the HMM-based speech recognizer

Kuldip K. Paliwal¹•Institutions (1)

Bell Labs¹

01 Jul 1992-Digital Signal Processing

TL;DR: In [2,3], Furui investigated the use of temporal derivatives of cepstral coefficients and energy as recognition features in a dynamic time warping-based isolated word recognizer and showed how the recognition performance improves with the inclusion of first derivatives in the feature set.

46 citations

Journal Article•DOI•

Time normalization in voice analysis.

Yingyong Qi¹•Institutions (1)

University of Arizona¹

01 Nov 1992-Journal of the Acoustical Society of America

TL;DR: In this study, a new method was developed for analyzing waveform perturbations of voice, and noise components of voice were calculated from the discrepancies between wavelets after they had been optimally aligned in time.

Abstract: The harmonics‐to‐noise ratio (HNR) has been widely accepted for quantifying the irregular or noise component of voice. HNR, however, is usually inflated by cycle‐to‐cycle variations of fundamental frequency period because zero padding is used for time normalization of the wavelet. In this study, a new method was developed for analyzing waveform perturbations of voice. In this method, noise components of voice were calculated from the discrepancies between wavelets after they had been optimally aligned in time. The optimal time normalization of wavelets was accomplished using procedures of dynamic time warping (DTW). This method was evaluated using both synthetic and natural voices, and significant reductions in noise were obtained. The harmonics‐to‐noise ratio obtained using DTW for time normalization was also shown to be independent of fundamental frequency perturbations.
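
Note: the alignment step described in this abstract can be pictured with a minimal sketch of classic DTW followed by a point-by-point discrepancy along the warping path. The absolute-difference local cost, the unconstrained step pattern, and the names `dtw_align` and `residual` are assumptions made for illustration; the abstract does not state the paper's actual cost function or path constraints.

```python
def dtw_align(a, b):
    """Align sequences a and b with classic DTW.

    Returns (total_cost, path), where path is a list of (i, j) index pairs.
    Local cost is |a[i] - b[j]| (an illustrative assumption).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end of both sequences to recover the warping path.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


def residual(a, b, path):
    """Sample-wise discrepancy between a and b along the warping path."""
    return [a[i] - b[j] for i, j in path]
```

For two cycles that differ only in duration, such as `[0, 1, 2, 1, 0]` and `[0, 1, 1, 2, 1, 0]`, the residual along the path is zero everywhere, whereas zero padding the shorter cycle would leave a nonzero difference.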

41 citations

Patent•DOI•

Speech recognition apparatus and methods

Ian Bickerton

07 Jul 1992-Journal of the Acoustical Society of America

TL;DR: Speech recognition is carried out by performing a first analysis of a speech signal using a Hidden Semi Markov Model and an asymmetric time warping algorithm and a second analysis using Multi-Layer Perceptron techniques in conjunction with a neural net.

Abstract: Speech recognition is carried out by performing a first analysis of a speech signal using a Hidden Semi Markov Model and an asymmetric time warping algorithm. A second analysis is also performed using Multi-Layer Perceptron techniques in conjunction with a neural net. The first analysis is used by the second to identify word boundaries. Where the first analysis provides an indication of the word spoken above a certain level of confidence, an output representative of the word spoken may be generated solely in response to the first analysis, the second analysis being utilized when the level of confidence falls. The output controls a function of an aircraft and provides feedback to the speaker of the words spoken.

36 citations

Proceedings Article•DOI•

Prototype-based discriminative training for various speech units

Erik McDermott, Shigeru Katagiri

23 Mar 1992

TL;DR: The authors extend LVQ into a prototype-based classifier appropriate for the classification of various long speech units, and their results reveal clear gains in performance as a result of using PBMEC.

Abstract: It has since been shown that learning vector quantisation (LVQ) is a special case of a more general method, generalized probabilistic descent (GPD), for gradient descent on a rigorously defined classification loss measure that closely reflects the misclassification rate. The authors extend LVQ into a prototype-based classifier appropriate for the classification of various long speech units. For word recognition, a dynamic time warping procedure is integrated into the GPD learning procedure. The resulting minimum error classifier (MEC) is no longer a purely LVQ-like method, and it is called the prototype-based minimum error classifier (PBMEC). Results for the difficult Bell Labs E-set task as well as for speaker-dependent isolated word recognition for a vocabulary of 5240 words are presented. They reveal clear gains in performance as a result of using PBMEC.

35 citations

Proceedings Article•DOI•

A hybrid neural network, dynamic programming word spotter

T. Zeppenfeld¹, Alex Waibel¹•Institutions (1)

Carnegie Mellon University¹

23 Mar 1992

TL;DR: A novel keyword-spotting system that combines both neural network and dynamic programming techniques is presented, which makes use of the strengths of time delay neural networks (TDNNs), which include strong generalization ability, potential for parallel implementations, robustness to noise, and time shift invariant learning.

Abstract: A novel keyword-spotting system that combines both neural network and dynamic programming techniques is presented. This system makes use of the strengths of time delay neural networks (TDNNs), which include strong generalization ability, potential for parallel implementations, robustness to noise, and time shift invariant learning. Dynamic programming models are used by this system because they have the useful capability of time warping input speech patterns. This system was trained and tested on the Stonehenge Road Rally database, which is a 20-keyword-vocabulary, speaker-independent, continuous-speech corpus. Currently, this system performs at a figure of merit (FOM) rate of 82.5%. FOM is the detection rate averaged from 0 to 10 false alarms per keyword hour. This measure is explained in detail.
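
Note: the FOM definition quoted in this abstract can be approximated by the sketch below, which ranks putative detections by score and averages the detection rate observed just before each successive false alarm, up to 10 false alarms per hour. This is a simplified reading; the function name `figure_of_merit`, its arguments, and the rounding of the number of operating points are assumptions, and the measure the paper explains in detail may interpolate differently.

```python
def figure_of_merit(hits, n_true, hours, max_fa_per_hour=10):
    """Simplified word-spotting figure of merit for one keyword.

    hits:    list of (score, is_true) putative detections
    n_true:  number of actual keyword occurrences in the test speech
    hours:   duration of the test speech in hours
    """
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)
    # Number of false-alarm operating points to average over (assumed rounding).
    n_levels = max(1, round(max_fa_per_hour * hours))
    rates = []
    detected = 0
    for _, is_true in ranked:
        if is_true:
            detected += 1
        else:
            # Detection rate achieved just before this false alarm is accepted.
            rates.append(detected / n_true)
            if len(rates) == n_levels:
                break
    # Fewer false alarms than operating points: the remaining points
    # see every detection that was made.
    while len(rates) < n_levels:
        rates.append(detected / n_true)
    return sum(rates) / n_levels
```

A usage example under these assumptions: with detections `[(0.9, True), (0.8, False), (0.7, True), (0.6, False)]`, two true occurrences, and 0.2 hours of speech, the two operating points have detection rates 0.5 and 1.0, giving an FOM of 0.75.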

35 citations

Proceedings Article•DOI•

A whole word recurrent neural network for keyword spotting

K.P. Li, J.A. Naylor, M.L. Rossen

23 Mar 1992

TL;DR: The authors present a neural network which is trained on word examples to perform the wordspotting task and has multiple recurrent connections with time delay to account for temporal dynamics.

Abstract: The authors present a neural network which is trained on word examples to perform the wordspotting task. This network has multiple recurrent connections with time delay to account for temporal dynamics. A single network may be trained to recognize one word or many words. A hybrid wordspotter is evaluated in which a conventional wordspotter (based on dynamic time warping word matching) is used to screen incoming speech for potential keywords which are then passed to the network for the final accept/reject decision. Initial tests on a standard wordspotting test corpus resulted in improved keyword recognition at false alarm rates above zero.

31 citations

Journal Article•DOI•

GPD training of dynamic programming-based speech recognizers

Takashi Komori, Shigeru Katagiri

01 Nov 1992-The Journal of The Acoustical Society of Japan (e)

TL;DR: Experimental evaluation results in tasks of classifying syllables and phonemes clearly demonstrate GPD's superiority, and it is shown that the design algorithm appraised in this paper can be considered a new version of learning vector quantization, which is incorporated with the dynamic programming.

Abstract: Although many pattern classifiers based on artificial neural networks have been vigorously studied, they are still inadequate from a viewpoint of classifying dynamic (variable- and unspecified-duration) speech patterns. To cope with this problem, the generalized probabilistic descent method (GPD) has recently been proposed. GPD not only allows one to train a discriminative system to classify dynamic patterns, but also possesses a remarkable advantage, namely guaranteeing the learning optimality (in the sense of a probabilistic descent search). A practical implementation of this theory, however, remains to be evaluated. In this light, we particularly focus on evaluating GPD in designing a widely-used speech recognizer based on dynamic time warping distance-measurement. We also show that the design algorithm appraised in this paper can be considered a new version of learning vector quantization, which is incorporated with the dynamic programming. Experimental evaluation results in tasks of classifying syllables and phonemes clearly demonstrate GPD's superiority.

22 citations

Proceedings Article•DOI•

Application of a generalized probabilistic descent method to dynamic time warping-based speech recognition

Takashi Komori, Shigeru Katagiri

23 Mar 1992

TL;DR: A generalized probabilistic descent method (GPD) is evaluated in designing a speech recognizer incorporated with the dynamic time warping methodology, and results clearly demonstrate that GPD can be a viable candidate for a method to realize a high-performance speech recognizer.

Abstract: Although several kinds of discriminative training methods based on artificial neural networks have been vigorously tested, the pursuit of highly capable classification of variable-duration speech patterns has been unsatisfactory. In this light, the authors evaluate a generalized probabilistic descent method (GPD) in designing a speech recognizer incorporated with the dynamic time warping methodology. The algorithm can be viewed as generalized learning vector quantization suited to the dynamic programming-based time warping. Experiments were conducted on two tasks: English syllable classification and Japanese phoneme classification. Results clearly demonstrate that GPD can be a viable candidate for a method to realize a high-performance speech recognizer.

22 citations

Proceedings Article•

Time Warping Invariant Neural Networks

Guo-Zheng Sun¹, Hsing-Hen Chen¹, Y. C. Lee¹•Institutions (1)

University of Maryland, College Park¹

30 Nov 1992

TL;DR: Analysis has shown that TWINN completely removes time warping and is able to handle difficult classification problems, and that it has certain advantages over the currently available sequential processing schemes.

Abstract: We proposed a model of Time Warping Invariant Neural Networks (TWINN) to handle time-warped continuous signals. Although TWINN is a simple modification of the well-known recurrent neural network, analysis has shown that TWINN completely removes time warping and is able to handle difficult classification problems. It is also shown that TWINN has certain advantages over the currently available sequential processing schemes: Dynamic Programming (DP) [1], Hidden Markov Model (HMM) [2], Time Delayed Neural Networks (TDNN) [3] and Neural Network Finite Automata (NNFA) [4]. We also analyzed the time continuity employed in TWINN and pointed out that this kind of structure can memorize longer input history compared with Neural Network Finite Automata (NNFA). This may help to understand the well accepted fact that for learning grammatical reference with NNFA one had to start with very short strings in the training set. The numerical example we used is a trajectory classification problem. This problem, which features variable sampling rates, internal states, continuous dynamics, heavily time-warped data and deformed phase space trajectories, is shown to be difficult for other schemes. With TWINN this problem has been learned in 100 iterations. As a benchmark we also trained a TDNN on exactly the same problem, and it failed completely, as expected.

21 citations

Proceedings Article•DOI•

Inference of letter-phoneme correspondences by delimiting and dynamic time warping techniques

Robert W. P. Luk¹, Robert I. Damper¹•Institutions (1)

University of Southampton¹

23 Mar 1992

TL;DR: An algorithm for inferring correspondences between letters and phonemes from a large set of word spellings and their associated phonemic forms is described, which uses delimiting and dynamic time warping to derive correspondences.

Abstract: An algorithm for inferring correspondences between letters and phonemes from a large set of word spellings and their associated phonemic forms is described. The algorithm uses two techniques to infer correspondences: delimiting and dynamic time warping (DTW). The first technique delimits the part of the word spelling and pronunciation that cannot be aligned with the existing set of correspondences. The second technique derives correspondences from the delimited part of that word. The inferred correspondences are evaluated in terms of translation performance tested with unseen words, proper names and novel words. The translation performance is compared with that obtained using the manually derived correspondences as the benchmark. Nonparametric statistical tests are used to establish whether the performances of inferred correspondences are significantly different from the manually derived correspondences.

Proceedings Article•

A dynamical model of priming and repetition blindness

Daphne Bavelier¹, Michael I. Jordan²•Institutions (2)

Salk Institute for Biological Studies¹, Massachusetts Institute of Technology²

30 Nov 1992

TL;DR: A model of visual word recognition that accounts for several aspects of the temporal processing of sequences of briefly presented words, based on dynamic time warping and multidimensional scaling is described.

Abstract: We describe a model of visual word recognition that accounts for several aspects of the temporal processing of sequences of briefly presented words. The model utilizes a new representation for written words, based on dynamic time warping and multidimensional scaling. The visual input passes through cascaded perceptual, comparison, and detection stages. We describe how these dynamical processes can account for several aspects of word recognition, including repetition priming and repetition blindness.

Proceedings Article•DOI•

Time warping recurrent neural networks and trajectory classification

Guo-Zheng Sun¹, H. H. Chen¹, Y. C. Lee¹, Y.D. Liu¹•Institutions (1)

University of Maryland, College Park¹

07 Jun 1992

TL;DR: The TWRNN has several advantages over such schemes as dynamic programming, hidden Markov models, time-delayed neural networks, and neural network finite automata for trajectory classification, and is shown to have built-in time warping ability.

Abstract: The authors propose a model of a time warping recurrent neural network (TWRNN) to handle temporal pattern classification where severely time warped and deformed data may occur. This model is shown to have built-in time warping ability. The authors analyze the properties of TWRNN and show that for trajectory classification it has several advantages over such schemes as dynamic programming, hidden Markov models, time-delayed neural networks, and neural network finite automata. A numerical example of trajectory classification is presented. This problem, which features variable sampling rates, internal states, continuous dynamics, heavily time-warped data, and deformed phase space trajectories, is shown to be difficult for the other schemes. The TWRNN has learned it easily. The authors also trained a TDNN on the same problem, and it failed.

Proceedings Article•DOI•

Speech recognition using dynamic time warping with neural network trained templates

Yen-Chen Liu¹, Y. C. Lee¹, H. H. Chen¹, Guo-Zheng Sun¹•Institutions (1)

University of Maryland, College Park¹

07 Jun 1992

TL;DR: A dynamic time warping based speech recognition system with neural network trained templates is proposed and it is demonstrated through experiments that the discriminative training algorithm is far superior to the nondiscriminative one, providing both smaller recognition error rate and greater discrimination power.

Abstract: A dynamic time warping based speech recognition system with neural network trained templates is proposed. The algorithm for training the templates is derived based on minimizing classification error of the speech classifier. A speaker-independent isolated digit recognition experiment is conducted and achieves a 0.89% average recognition error rate with only one template for each digit, indicating that the derived templates are able to capture the speaker-invariant features of speech signals. Both nondiscriminative and discriminative versions of the neural net template training algorithm are considered. The former is based on maximum likelihood estimation. The latter is based on minimizing classification error. It is demonstrated through experiments that the discriminative training algorithm is far superior to the nondiscriminative one, providing both smaller recognition error rate and greater discrimination power. Experiments using different feature representation schemes are considered. It is demonstrated that the combination of the feature vector and the delta feature vector yields the best recognition result.
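
Note: the combined feature-plus-delta representation mentioned at the end of this abstract can be sketched as follows. The central-difference delta with clamped edges and the name `add_delta` are assumptions for illustration; the abstract does not say how the paper computes its delta features.

```python
def add_delta(frames):
    """Append first-order delta features to each frame.

    frames: list of equal-length feature vectors, one per time step.
    The delta at time t is (frame[t+1] - frame[t-1]) / 2, with the first
    and last frames reused at the edges (an illustrative choice).
    """
    n = len(frames)
    combined = []
    for t in range(n):
        prev_frame = frames[max(t - 1, 0)]
        next_frame = frames[min(t + 1, n - 1)]
        delta = [(b - a) / 2.0 for a, b in zip(prev_frame, next_frame)]
        combined.append(list(frames[t]) + delta)
    return combined
```

Each output vector is twice as long as the input vector: the static features followed by their estimated rate of change.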

Journal Article•DOI•

A curve interpretation and diagnostic technique for industrial processes

Steven B. Dolins¹, J.D. Reese•Institutions (1)

University of Wisconsin-Madison¹

01 Jan 1992-IEEE Transactions on Industry Applications

TL;DR: A diagnostic technique has been developed to analyze process parameters and observables that change over time; it uses dynamic time warping (DTW) to transform the input signal into symbolic data, and knowledge-based diagnosis is then performed on the symbolic data to determine malfunctions.

Abstract: Detecting manufacturing problems as soon as they occur is important for efficient manufacturing in today's factories. Many of these problems could be minimized by installing diagnostic systems to monitor manufacturing steps. A diagnostic technique has been developed to analyze process parameters and observables that change over time. Process parameters control the operation of equipment, and observables are attributes of a partially completed product. The technique uses a specified digital signal processing algorithm known as dynamic time warping (DTW) to transform the input signal into symbolic data. Knowledge-based diagnosis is performed on the symbolic data to determine malfunctions. A detailed description of the DTW algorithm and knowledge-based analysis is presented. Two different applications, one in the glass industry and another in the semiconductor industry, are discussed to illustrate the general use of this technique.

Journal Article•DOI•

Text-independent speaker verification based on broad phonetic segmentation of speech

Sunil K. Gupta¹, Michael Savic²•Institutions (2)

Bell Labs¹, Rensselaer Polytechnic Institute²

01 Apr 1992-Digital Signal Processing

TL;DR: This paper investigates text-independent speaker verification, which involves the determination of whether or not a test utterance belongs to a specific reference speaker and the required information stored in the templates is different in this case.

Proceedings Article•DOI•

Speech recognition using dynamic neural networks

N.M. Botros¹, S. Premnath¹•Institutions (1)

Southern Illinois University Carbondale¹

07 Jun 1992

TL;DR: The authors present an algorithm for isolated-word recognition that takes into consideration the duration variability of the different utterances of the same word and shows that all these words could be recognized.

Abstract: The authors present an algorithm for isolated-word recognition that takes into consideration the duration variability of the different utterances of the same word. The algorithm is based on extracting acoustical features from the speech signal and using them as the input to a sequence of multilayer perceptron neural networks. The networks were implemented as predictors for the speech samples for a certain duration of time. The networks were trained by a combination of the back-propagation and the dynamic time warping (DTW) techniques. The DTW technique was implemented to normalize the duration variability. The networks were trained to recognize the correct words and to reject the wrong words. The training set consisted of ten words, each uttered seven times by three different speakers. The test set consisted of three utterances of each of the ten words. The results show that all these words could be recognized.

Journal Article•DOI•

An FFT-based speech recognition system

Greg Hopper, Reza R. Adhami¹•Institutions (1)

University of Alabama in Huntsville¹

01 May 1992-Journal of The Franklin Institute-engineering and Applied Mathematics

TL;DR: A speaker-dependent, isolated-word speech recognition system is presented that uses the fast Fourier transform to extract features from the speech input and compares them against previously stored word templates using dynamic time warping to identify the uttered word.

Abstract: A speaker-dependent, isolated-word speech recognition system is presented which is based on the use of the fast Fourier transform for extracting features from the speech input. The algorithm then normalizes those features and compares them against previously stored word templates using dynamic time warping in order to identify the uttered word. The system has been successfully implemented and provided good results when tested using a small dictionary.

Proceedings Article•DOI•

A VLSI hardware accelerator for dynamic time warping

V.K. Sundaresan¹, S. Nichani¹, Nagarajan Ranganathan¹, Ravi Sankar¹•Institutions (1)

University of South Florida¹

30 Aug 1992

TL;DR: The special purpose architecture is used to perform the band matrix multiplication in order to compute the local distance metric based on Itakura's log likelihood distance.

Abstract: Describes an area and time efficient systolic array architecture for computations in Dynamic Time Warping (DTW). The special purpose architecture is used to perform the band matrix multiplication in order to compute the local distance metric based on Itakura's log likelihood distance. The time complexity of the algorithm is O(nk), where n and k are the numbers of elements in a row of the first and second input matrices, respectively. The number of processors is equal to the bandwidth w of the output band matrix. The speedup of the parallel algorithm compared to the sequential algorithm is wz, where z is the number of multiplier stages within a PE. The parallel algorithm can be implemented as a single VLSI chip.

Journal Article•DOI•

Phoneme recognition using time-warping neural networks

Kiyoaki Aikawa

01 Nov 1992-The Journal of The Acoustical Society of Japan (e)

TL;DR: The proposed Time-Warping Neural Network (TWNN) demonstrates a higher phoneme recognition accuracy than a baseline recognizer composed of time-delay neural networks with a linear time alignment mechanism.

Abstract: This paper proposes a novel neural network architecture for phoneme-based speech recognition. The new architecture is composed of five time-warping sub-networks and an output layer which integrates the sub-networks. Each time-warping sub-network has a different time-warping function embedded between the input layer and the first hidden layer. A time-warping sub-network recognizes the input speech warping the time axis using its time-warping function. The network is called the Time-Warping Neural Network (TWNN). The purpose of this network is to cope with the temporal variability of acoustic-phonetic features. The TWNN demonstrates a higher phoneme recognition accuracy than a baseline recognizer composed of time-delay neural networks with a linear time alignment mechanism.

Book Chapter•DOI•

Multiple Template Modeling of Sublexical Units

Pablo Aibar¹, María José Castro¹, Francisco Casacuberta¹, Enrique Vidal¹•Institutions (1)

Polytechnic University of Valencia¹

01 Jan 1992

TL;DR: Automatic training procedures are developed to obtain the model or models for a certain type of linguistic unit, under the framework of a Distance-based approach, and some preliminary experimental results for single-speaker and multi-speaker tasks are reported.

Abstract: Automatic training procedures are developed to obtain the model or models for a certain type of linguistic unit, under the framework of a Distance-based approach. The chosen units are phonetic-units and the models are templates. In a first approach, one prototype (centroid) per phonetic-unit is obtained through an iterative process and by using Dynamic Time Warping techniques. A refinement is performed through a Clustering procedure that obtains several prototypes per phonetic-unit. Another refinement, which is based on Multiedit Condensing techniques, is also proposed. Some preliminary experimental results for single-speaker and multi-speaker tasks are reported.

Proceedings Article•DOI•

Short term memory structures for dynamic neural networks

B. de Vries¹, Jose C. Principe•Institutions (1)

Sarnoff Corporation¹

26 Oct 1992

TL;DR: The gamma memory, a recursive linear structure, is presented as a generalization of the tapped delay line or the context memory units; its principle can be enhanced to construct nonuniform time warping scales that may be useful in speech recognition.

Abstract: A framework for designing and characterizing short-term memory structures for neural networks is presented. The gamma memory, a recursive linear structure, is presented as a generalization of the tapped delay line or the context memory units. The gamma memory principle can be enhanced to construct nonuniform time warping scales that may be useful in speech recognition.
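
Note: the sense in which a recursive linear structure generalizes a tapped delay line can be illustrated with the sketch below. The update rule used here, x_k(t) = (1 - mu) * x_k(t-1) + mu * x_{k-1}(t-1), is the commonly cited form of the gamma memory and is not stated in this abstract, so it should be read as an assumption; the name `gamma_memory` and its arguments are likewise illustrative.

```python
def gamma_memory(signal, order, mu):
    """Run a cascade of `order` leaky taps over `signal`.

    Tap 0 holds the current input. Each later tap k mixes its own previous
    value with the previous value of tap k-1 (assumed update rule):
        x_k(t) = (1 - mu) * x_k(t-1) + mu * x_{k-1}(t-1)
    Returns the list of tap vectors, one per time step.
    """
    state = [0.0] * (order + 1)
    history = []
    for x in signal:
        prev = state[:]          # tap values at time t-1
        state[0] = x
        for k in range(1, order + 1):
            state[k] = (1.0 - mu) * prev[k] + mu * prev[k - 1]
        history.append(state[:])
    return history
```

With `mu = 1.0` every tap simply copies its predecessor's previous value, which is an ordinary tapped delay line; values of `mu` below 1 trade temporal resolution for a longer memory depth.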

Book Chapter•DOI•

A New Method for Dynamic Time Alignment of Speech Waveforms

J. Kittler¹, A. E. Lucas¹•Institutions (1)

University of Surrey¹

01 Jan 1992

TL;DR: The method attempts to address the shortcomings of traditional time alignment approaches, commonly based on dynamic programming algorithms, by employing the branch and bound search algorithm coupled with the Mahalanobis distance measure as the matching criterion.

Abstract: In this paper, a new method for dynamic time alignment of speech waveforms is introduced. The method attempts to address the shortcomings of traditional time alignment approaches, commonly based on dynamic programming algorithms. Such methods, usually called dynamic time warping (DTW) algorithms, make the assumption that the samples of the speech waveform under consideration are statistically independent. The proposed method makes no such assumption. Instead, the method is based on models of speech entities with Gaussian distributions and general covariance matrices. These ideas are implemented by employing the branch and bound search algorithm [1] coupled with the Mahalanobis distance measure as the matching criterion. Hence, the new method attempts to utilise more discriminatory information than is presently incorporated. Preliminary results on a spoken letter recognition problem are reported validating the approach.
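
Note: the Mahalanobis matching criterion named in this abstract measures distance relative to a full covariance matrix, d² = (x - m)ᵀ Σ⁻¹ (x - m). A minimal sketch follows; it solves the linear system by Gaussian elimination rather than forming the inverse, and the function name and argument layout are assumptions. It does not reproduce the paper's branch and bound search.

```python
def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance of x from mean under covariance cov.

    cov is a list of rows (a symmetric positive-definite matrix).
    Solves cov * y = (x - mean) by Gaussian elimination with partial
    pivoting, then returns (x - mean) . y.
    """
    n = len(x)
    diff = [xi - mi for xi, mi in zip(x, mean)]
    # Augmented matrix [cov | diff].
    aug = [[float(v) for v in cov[r]] + [diff[r]] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * y[c] for c in range(r + 1, n))
        y[r] = s / aug[r][r]
    return sum(d * yi for d, yi in zip(diff, y))
```

With an identity covariance this reduces to the squared Euclidean distance; off-diagonal terms are what let the measure account for correlated components, which is the independence assumption the abstract says the method avoids.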

Proceedings Article•DOI•

Dynamic time warping using an artificial neural network

F.A. Unal¹, N. Tepedelenlioglu²•Institutions (2)

Siemens¹, Florida Institute of Technology²

07 Jun 1992

TL;DR: A dynamic time warping algorithm using the Hopfield neural network to achieve an optimum match between a reference and a test signal is described.

Abstract: A dynamic time warping (DTW) algorithm using the Hopfield neural network is described. A DTW energy function is constructed to achieve an optimum match between a reference and a test signal and mapped to the network's Lyapunov function to determine the connection weights and the biases for the neurons. The experimental results verify that the Hopfield network can be effectively used to solve this optimization problem.

Proceedings Article•DOI•

Iterative speaker adaptation for speech recognition

F.J. Scholtz, J.A. du Preez

11 Sep 1992

TL;DR: A new, unsupervised speaker adaptation scheme which requires no prior training phase is proposed, which improves the recognition rate as more speech data becomes available, making it most suitable for real-time implementation.

Abstract: A speaker-independent speech recognition system is desirable in many applications where speaker-specific data does not exist. If speaker-independent data is available, the system could be adapted to the specific speaker, thereby reducing the recognition error rate. A new, unsupervised speaker adaptation scheme which requires no prior training phase is proposed. The algorithm improves the recognition rate as more speech data becomes available, making it most suitable for real-time implementation. In the tests conducted this algorithm yields an improvement of almost 50% on the recognition error rate.

Book Chapter•DOI•

Automatic Transformation of Speech Databases for Continuous Speech Recognition

S. Rieck¹, Ernst Günter Schukat-Talamazzini¹, Thomas Kuhn¹, S. Kunzmann¹, Elmar Nöth¹, +1 more•Institutions (1)

University of Erlangen-Nuremberg¹

01 Jan 1992

TL;DR: A dynamic time warping algorithm is used to match the original and the resampled speech signals and the results showed only a slight decrease in performance when using the new labelings.

Abstract: In this paper a method is described to generate automatically the labels for a new speech database from an existing manually labeled speech database. This becomes necessary when new standards are introduced and the speech signals have to be resampled. A dynamic time warping algorithm is used to match the original and the resampled speech signals. The comparison is carried out on mel based features. To improve computation time the search space for the DTW algorithm is restricted. Several experiments were carried out with a normal density Bayes classifier to check the quality of the new labelings. The results showed only a slight decrease in performance when using the new labelings.
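
Note: one common way to restrict the DTW search space, as this abstract describes doing to save computation, is to evaluate only cells within a fixed band around the (length-scaled) diagonal. The sketch below shows that idea; whether the paper uses such a band, and with what width, is not stated, so the band constraint, the absolute-difference local cost, and the name `banded_dtw` are all assumptions.

```python
def banded_dtw(a, b, band):
    """DTW cost between a and b, evaluating only cells near the diagonal.

    For row i, only columns within `band` of round(i * len(b) / len(a))
    are filled in. Returns float('inf') if the band is too narrow for
    any warping path to reach the final cell.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        centre = round(i * m / n)      # diagonal position for this row
        lo = max(1, centre - band)
        hi = min(m, centre + band)
        for j in range(lo, hi + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    return cost[n][m]
```

The number of cells evaluated drops from roughly n * m to roughly n * (2 * band + 1), at the price of excluding alignments that stray far from the diagonal.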

Book Chapter•DOI•

A new technique of data over voice based on time-warping

Dov Wulich¹, Moshe Bukris¹•Institutions (1)

Ben-Gurion University of the Negev¹

01 Jan 1992