Proceedings Article
Speech recognition using dynamic features of acoustic subword spectra
K.L. Brown,V.R. Algazi +1 more
- pp 293-296
Abstract:
A novel approach for speech signal analysis has been developed that incorporates both steady-state and dynamic spectral features into a unified model. This model has been successfully applied in automatic speech recognition contexts and does not require frame-based optimal search algorithms. The model decomposes an utterance into a chain of acoustic subwords and simultaneously generates a mathematical description of instantaneous acoustic-phonetic features and dynamic transitions. The algorithm was tested using a speaker-dependent limited vocabulary recognition task and achieved higher recognition rates than both vector quantization and hidden Markov models.
Citations
Real-time recognition of spoken words
TL;DR: This article describes a real-time word recognition system that uses only a small computer (8K memory) and a few analog peripherals; spectral analysis is carried out by a bank of 17 1/3-octave bandpass filters.
Proceedings Article
Speech coding by the efficient transformation of the spectral envelope of subwords
TL;DR: A signal-dependent representation which captures, with a few KL vectors and transform coefficients, the perceptually and phonetically important structure of the spectral envelope has been applied to the analysis, synthesis and coding of speech with promising results in the 5-kb/s range.
A Survey of Temporal Techniques Applied Toward Neural Network Based Continuous Speech Recognition
TL;DR: Neural network architectures for the recognition of continuous speech are reviewed; hierarchic structures that recognize events of increasing temporal scale seem to provide the most promising path toward effective recognition of continuous speech.
Proceedings Article
Dynamic recognition of vowels by machine using trajectories in a two dimensional feature space
TL;DR: Using a k-nearest neighbor rule with 2300 training vowels and as many test vowels, taken from continuous speech samples of the same group of 33 male speakers, this article achieved an average success rate of 72% in six-way classification.
References
Book
Phoneme recognition using time-delay neural networks
TL;DR: The authors present a time-delay neural network (TDNN) approach to phoneme recognition which is characterized by two important properties: using a three-layer arrangement of simple computing units, a hierarchy can be constructed that allows for the formation of arbitrary nonlinear decision surfaces, which the TDNN learns automatically using error backpropagation.
Journal Article
Phoneme recognition using time-delay neural networks
TL;DR: In this article, the authors present a time-delay neural network (TDNN) approach to phoneme recognition, which is characterized by two important properties: (1) using a three-layer arrangement of simple computing units, a hierarchy can be constructed that allows for the formation of arbitrary nonlinear decision surfaces, which the TDNN learns automatically using error backpropagation; and (2) the time-delay arrangement enables the network to discover acoustic-phonetic features and the temporal relationships between them independently of position in time and therefore not blurred by temporal shifts in the input.
Proceedings Article
A database for speaker-independent digit recognition
TL;DR: A large speech database has been collected for use in designing and evaluating algorithms for speaker-independent recognition of connected digit sequences; formal human listening tests on this database certified the labelling of the digit sequences.
Journal Article
Dynamic specification of coarticulated vowels
TL;DR: Experiments summarized herein support the view that the most important source of information for speaker-invariant vowel identity is carried in dynamic specification of vowel onset and offset spectral patterns, with vowel duration also playing a role.
Journal Article
Computers: Speech recognition: Turning theory to practice: New ICs have brought the requisite computer power to speech technology; an evaluation of equipment shows where it stands today
G. R. Doddington,T. B. Schalk +1 more
TL;DR: An evaluation of the equipment now available for turning the theory of electronic speech recognition into practice; the fulfilment of this goal seems much closer than it did because of the pace of advance in IC technology.
Related Papers (5)
Automatic speech recognition using acoustic sub-words and no time alignment
V.R. Algazi,K.L. Brown +1 more