Proceedings Article

Speech recognition using dynamic features of acoustic subword spectra

Abstract
A novel approach for speech signal analysis has been developed that incorporates both steady-state and dynamic spectral features into a unified model. This model has been successfully applied in automatic speech recognition contexts and does not require frame-based optimal search algorithms. The model decomposes an utterance into a chain of acoustic subwords and simultaneously generates a mathematical description of instantaneous acoustic-phonetic features and dynamic transitions. The algorithm was tested using a speaker-dependent limited vocabulary recognition task and achieved higher recognition rates than both vector quantization and hidden Markov models.

Citations

Real-time recognition of spoken words

TL;DR: In this article, a real-time word recognition system using only a small computer (8K memory) and a few analog peripherals is described, in which spectral analysis is carried out by a bank of 17 one-third-octave bandpass filters.
Proceedings Article

Speech coding by the efficient transformation of the spectral envelope of subwords

TL;DR: A signal-dependent representation which captures, with a few KL vectors and transform coefficients, the perceptually and phonetically important structure of the spectral envelope has been applied to the analysis, synthesis and coding of speech with promising results in the 5-kb/s range.

A Survey of Temporal Techniques Applied Toward Neural Network Based Continuous Speech Recognition

Chris D. Love
TL;DR: Neural network architectures for the recognition of continuous speech are reviewed, and hierarchic structures that recognize events of increasing temporal scale seem to provide the most promising path toward effective recognition of continuous speech.
Proceedings Article

Dynamic recognition of vowels by machine using trajectories in a two dimensional feature space

TL;DR: Using a k-nearest neighbor rule with 2300 training vowels and as many test vowels, taken from continuous speech samples of the same group of 33 male speakers, this article achieved an average success rate of 72% in six-way classification.
References
Book

Phoneme recognition using time-delay neural networks

TL;DR: The authors present a time-delay neural network (TDNN) approach to phoneme recognition. Using a three-layer arrangement of simple computing units, a hierarchy can be constructed that allows for the formation of arbitrary nonlinear decision surfaces, which the TDNN learns automatically using error backpropagation.
Journal Article

Phoneme recognition using time-delay neural networks

TL;DR: In this article, the authors present a time-delay neural network (TDNN) approach to phoneme recognition, which is characterized by two important properties: (1) using a three-layer arrangement of simple computing units, a hierarchy can be constructed that allows for the formation of arbitrary nonlinear decision surfaces, which the TDNN learns automatically using error backpropagation; and (2) the time-delay arrangement enables the network to discover acoustic-phonetic features and the temporal relationships between them independently of position in time, so that they are not blurred by temporal shifts in the input.
Proceedings Article

A database for speaker-independent digit recognition

TL;DR: A large speech database has been collected for use in designing and evaluating algorithms for speaker-independent recognition of connected digit sequences, and formal human listening tests on this database provided certification of the labelling of the digit sequences.
Journal Article

Dynamic specification of coarticulated vowels

TL;DR: Experiments summarized herein support the view that the most important source of information for speaker-invariant vowel identity is carried in dynamic specification of vowel onset and offset spectral patterns, with vowel duration also playing a role.
Journal Article

Computers: Speech recognition: Turning theory to practice: New ICs have brought the requisite computer power to speech technology; an evaluation of equipment shows where it stands today

TL;DR: The article evaluates the equipment now available for turning the theory of electronic speech recognition into practice; the fulfilment of this goal seems much closer than it did because of the pace of advance in IC technology.