Open Access Proceedings Article
Fast back-propagation learning methods for large phonemic neural networks.
Patrick Haffner, Alex Waibel, Hidefumi Sawai, Kiyohiro Shikano +3 more
pp. 2553–2556
TL;DR: Several improvements in the Back-Propagation procedure are proposed to increase training speed, and their limitations with respect to generalization are discussed.
Abstract: Several improvements in the Back-Propagation procedure are proposed to increase training speed, and we discuss their limitations with respect to generalization performance. The error surface is modeled to avoid local minima and flat areas. The synaptic weights are updated as often as possible. Both the step size and the momentum are dynamically scaled to the largest possible values that do not result in overshooting. Training for the speaker-dependent recognition of the phonemes /b/, /d/ and /g/ has been reduced from 2 days to 1 minute on an Alliant parallel computer, delivering the same 98.6% recognition performance. With a 55,000-connection TDNN, the same algorithm needs 1 hour and 5,000 training tokens to recognize the 18 Japanese consonants with 96.7% accuracy.
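The abstract's key idea is to scale the step size and momentum dynamically to the largest values that do not overshoot. A minimal sketch of one such accept/reject scheme is shown below; this is not the authors' exact algorithm, and the toy network, data, and growth/shrink factors (1.1 and 0.5) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer network on a small synthetic classification task.
X = rng.normal(size=(30, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(float).reshape(-1, 1)

W1 = rng.normal(scale=0.5, size=(4, 8))
W2 = rng.normal(scale=0.5, size=(8, 1))

def forward(W1, W2):
    h = np.tanh(X @ W1)
    out = 1.0 / (1.0 + np.exp(-(h @ W2)))  # sigmoid output
    return h, out

def loss(out):
    return float(np.mean((out - y) ** 2))

lr, momentum = 0.1, 0.5
v1 = np.zeros_like(W1)
v2 = np.zeros_like(W2)
prev = loss(forward(W1, W2)[1])

for step in range(200):
    h, out = forward(W1, W2)
    # Back-propagate the squared-error gradient through both layers.
    d_out = (out - y) * out * (1.0 - out)
    g2 = h.T @ d_out / len(X)
    d_h = (d_out @ W2.T) * (1.0 - h ** 2)
    g1 = X.T @ d_h / len(X)

    # Tentative momentum update with the current step size.
    v1 = momentum * v1 - lr * g1
    v2 = momentum * v2 - lr * g2
    W1_new, W2_new = W1 + v1, W2 + v2

    cur = loss(forward(W1_new, W2_new)[1])
    if cur < prev:
        # No overshoot: accept the step and grow the step size.
        W1, W2, prev = W1_new, W2_new, cur
        lr *= 1.1
    else:
        # Overshoot: reject the step, shrink the step size, reset momentum.
        lr *= 0.5
        v1[:] = 0.0
        v2[:] = 0.0

print(f"final loss: {prev:.4f}")
```

Because updates are accepted only when the error decreases, the loss is non-increasing by construction, while the step size grows whenever the surface allows it, which is the same intuition as scaling to "the largest possible values that do not result in overshooting."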
Citations
Journal Article
A survey of hybrid ANN/HMM models for automatic speech recognition
Edmondo Trentin, Marco Gori +1 more
TL;DR: A number of significant hybrid models for ASR are reviewed, bringing together approaches and techniques from a highly specialized and heterogeneous literature and allowing for tangible improvements in recognition performance over standard HMMs on difficult and significant benchmark tasks.
Proceedings Article
Integrating time alignment and neural networks for high performance continuous speech recognition
TL;DR: The authors describe two systems in which neural network classifiers are merged with dynamic programming (DP) time alignment methods to produce high-performance continuous speech recognizers.
Proceedings Article
Optimisation of neural models for speaker identification
J. Oglesby, John Mason +1 more
TL;DR: An approach to speaker recognition based on feedforward neural models is investigated, and recognition performance is shown to be comparable to that of a vector quantization approach based on personalized codebooks.
Book
Soft Computing and Human-Centered Machines
TL;DR: Fuzzy set theory, analysis and extensions; methods in hard and fuzzy clustering; soft-competitive learning paradigms; aggregation operations for fusing fuzzy information; fuzzy gated neural networks in pattern recognition; soft computing techniques in kansei (emotional) information processing.
Journal Article
Connected recognition with a recurrent network
TL;DR: This work attempted multi-talker, connected recognition of the spoken American English letter names b, d, e and v, using a recurrent neural network as the speech recognizer.
Related Papers (5)
Modularity and scaling in large phonemic neural networks
Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
S. Davis, Paul Mermelstein +1 more