
Showing papers by "Yoshua Bengio published in 1989"


Journal ArticleDOI
TL;DR: A set of Multi-Layered Networks integrates information extracted at variable resolution in the time and frequency domains while keeping the number of links between nodes small, enabling good generalization from a reasonable training set.
Abstract: A set of Multi-Layered Networks allows the integration of information extracted with variable resolution in the time and frequency domains, while keeping the number of links between nodes of the networks small enough for significant generalization during learning with a reasonable training set size.

29 citations


Proceedings Article
01 Jan 1989
TL;DR: This work combines neural networks with knowledge from speech science to build a speaker-independent speech recognition system, handling the temporal aspect of speech with delays, copies of hidden- and output-unit activations at the input level, and Back-Propagation for Sequences (BPS), a learning algorithm for networks with local self-loops.
Abstract: We attempt to combine neural networks with knowledge from speech science to build a speaker independent speech recognition system. This knowledge is utilized in designing the preprocessing, input coding, output coding, output supervision and architectural constraints. To handle the temporal aspect of speech we combine delays, copies of activations of hidden and output units at the input level, and Back-Propagation for Sequences (BPS), a learning algorithm for networks with local self-loops. This strategy is demonstrated in several experiments, in particular a nasal discrimination task for which the application of a speech theory hypothesis dramatically improved generalization.
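The abstract names Back-Propagation for Sequences (BPS), a learning algorithm for networks whose only recurrent connections are local self-loops. Because the recurrence is local, the derivative of a unit's activation with respect to each of its weights can be accumulated forward in time, step by step. The sketch below illustrates this idea for a single sigmoid unit with one input weight and one self-loop weight; the function name `bps_unit` and the hyperparameters are illustrative assumptions, not details from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bps_unit(inputs, w_in, w_self, lr, targets):
    """Sketch of BPS-style learning for one unit with a local self-loop:
        y(t) = f(w_self * y(t-1) + w_in * x(t)).
    The gradient of y(t) w.r.t. each weight is accumulated forward in
    time, which is tractable because the only recurrence is the self-loop."""
    y_prev = 0.0
    dy_dwin = 0.0    # running derivative of y w.r.t. w_in
    dy_dwself = 0.0  # running derivative of y w.r.t. w_self
    for x, target in zip(inputs, targets):
        net = w_self * y_prev + w_in * x
        y = sigmoid(net)
        fprime = y * (1.0 - y)
        # Forward accumulation: each derivative depends only on its own
        # previous value, thanks to the locality of the recurrence.
        dy_dwin = fprime * (x + w_self * dy_dwin)
        dy_dwself = fprime * (y_prev + w_self * dy_dwself)
        err = y - target  # squared-error gradient on this time step
        w_in -= lr * err * dy_dwin
        w_self -= lr * err * dy_dwself
        y_prev = y
    return w_in, w_self, y_prev
```

Unlike full back-propagation through time, nothing here needs to be stored per time step, which is what makes such a scheme attractive for long speech sequences.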

15 citations


Proceedings Article
20 Aug 1989
TL;DR: Experiments are performed on 10 English vowels showing a recognition rate higher than 95% for new speakers, suggesting that MLNs suitably fed by the data computed by an ear model have good generalization capabilities over new speakers and new sounds.
Abstract: The paper describes a speech coding system based on an ear model followed by a set of MultiLayer Networks (MLN). MLNs are trained to learn how to recognize articulatory features like the place and manner of articulation. Experiments are performed on 10 English vowels showing a recognition rate higher than 95% for new speakers. When features are used for recognition, comparable results are obtained for vowels and diphthongs not used for training and pronounced by new speakers. This suggests that MLNs suitably fed by the data computed by an ear model have good generalization capabilities over new speakers and new sounds.

6 citations


Journal ArticleDOI
01 Dec 1989
TL;DR: These experiments are part of an attempt to construct a data-driven speech recognition system with multiple neural networks specialized to different tasks, such as one trained on the E-set consonants.
Abstract: Artificial neural networks capable of doing hard learning offer a new way to undertake automatic speech recognition. The Boltzmann machine algorithm and the error back-propagation algorithm have been used to perform speaker normalization. Spectral segments are represented by spectral lines. Speaker-independent recognition of place of articulation for vowels is performed on lines. Performance of the networks is shown to depend on the coding of the input data. Samples were extracted from continuous speech of 38 speakers. The error rate obtained (4.2% error on test set of 72 samples with the Boltzmann machine and 6.9% error with error back-propagation) is better than that of previous experiments, using the same data, with continuous Hidden Markov Models (7.3% error on test set and 3% error on training set). These experiments are part of an attempt to construct a data-driven speech recognition system with multiple neural networks specialized to different tasks. Results are also reported on the recognition performance of other trained networks, such as one trained on the E-set consonants.

5 citations


Book ChapterDOI
23 May 1989
TL;DR: In this article, the authors combine a structural or knowledge-based approach for describing speech units with neural networks capable of automatically learning relations between acoustic properties and speech units for speaker-independent recognition of vowels using an ear model for preprocessing.
Abstract: The authors investigate combining a structural, knowledge-based approach for describing speech units with neural networks capable of automatically learning relations between acoustic properties and speech units. They show how speech coding can be performed by sets of multilayer neural networks whose execution is decided by a data-driven strategy. Coding is based on phonetic properties characterizing a large population of speakers. Results on speaker-independent recognition of vowels using an ear model for preprocessing are reported.

5 citations


01 Jan 1989
TL;DR: The authors show how speech coding can be performed by sets of multilayer neural networks whose execution is decided by a data-driven strategy.

3 citations


Proceedings Article
01 Jan 1989
TL;DR: A system based on a neural network with one hidden layer, trained with back-propagation, is designed to efficiently identify proteins exhibiting immunoglobulin (Ig) domains, which are characterized by a few localized conserved regions and low overall homology.
Abstract: In order to detect the presence and location of immunoglobulin (Ig) domains from amino acid sequences we built a system based on a neural network with one hidden layer trained with back propagation. The program was designed to efficiently identify proteins exhibiting such domains, characterized by a few localized conserved regions and a low overall homology. When the National Biomedical Research Foundation (NBRF) NEW protein sequence database was scanned to evaluate the program's performance, we obtained very low rates of false negatives coupled with a moderate rate of false positives.
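Detecting both the presence and the location of a domain in a sequence of amino acids suggests sliding a fixed-size window along the sequence and scoring each position with the network. The abstract does not give the window size, input coding, or architecture details, so the sketch below is a hypothetical illustration: one-hot residue coding, a window of 3 residues, and a one-hidden-layer network; `scan_sequence` and all weight names are assumptions, not the authors' implementation.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def one_hot(residue):
    """Encode one residue as a 20-dimensional one-hot vector."""
    v = [0.0] * len(AMINO_ACIDS)
    v[AMINO_ACIDS.index(residue)] = 1.0
    return v

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def scan_sequence(seq, w_hidden, b_hidden, w_out, b_out, window):
    """Slide a window over the sequence; each window is one-hot encoded
    and fed to a one-hidden-layer network that scores it for the presence
    of a domain. Returns one score in (0, 1) per window position, so a
    run of high scores localizes a putative domain."""
    scores = []
    for start in range(len(seq) - window + 1):
        # Flatten the window into a single input vector of size 20*window.
        x = [u for r in seq[start:start + window] for u in one_hot(r)]
        hidden = [sigmoid(sum(wi * xi for wi, xi in zip(row, x)) + b)
                  for row, b in zip(w_hidden, b_hidden)]
        out = sigmoid(sum(wo * h for wo, h in zip(w_out, hidden)) + b_out)
        scores.append(out)
    return scores
```

Thresholding the per-position scores would then trade false negatives against false positives, the balance the abstract reports evaluating on the NBRF NEW database.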

1 citation