
Showing papers by "Geoffrey E. Hinton published in 1988"


Journal ArticleDOI
01 Jan 1988-Nature
TL;DR: Back-propagation repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output vector of the net and the desired output vector; as a result, the internal hidden units come to represent important features of the task domain.
Abstract: We describe a new learning procedure, back-propagation, for networks of neurone-like units. The procedure repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output vector of the net and the desired output vector. As a result of the weight adjustments, internal ‘hidden’ units which are not part of the input or output come to represent important features of the task domain, and the regularities in the task are captured by the interactions of these units. The ability to create useful new features distinguishes back-propagation from earlier, simpler methods such as the perceptron-convergence procedure1.
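The procedure the abstract outlines can be pictured in a few lines: a forward pass computes the actual output vector through a layer of hidden units, and the weights are repeatedly adjusted in proportion to the derivatives of the error measure. The sketch below is a minimal illustration only; the layer sizes, learning rate, and toy XOR task are assumptions for the example, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task (assumed for illustration): XOR, which the perceptron-convergence
# procedure cannot solve because it requires an internal feature layer.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(scale=0.5, size=(2, 3))   # input -> hidden connections
b1 = np.zeros(3)
W2 = rng.normal(scale=0.5, size=(3, 1))   # hidden -> output connections
b2 = np.zeros(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

lr = 0.5                                   # learning rate (assumed)
for epoch in range(5000):
    # Forward pass: hidden activities, then the actual output vector.
    H = sigmoid(X @ W1 + b1)
    Y = sigmoid(H @ W2 + b2)

    # Error measure: half the squared difference between actual and desired outputs.
    E = 0.5 * np.sum((Y - T) ** 2)

    # Backward pass: error derivatives at the output units, then at the hidden units.
    dY = (Y - T) * Y * (1 - Y)
    dH = (dY @ W2.T) * H * (1 - H)

    # Repeatedly adjust the weights in proportion to the negative error gradient.
    W2 -= lr * H.T @ dY
    b2 -= lr * dY.sum(axis=0)
    W1 -= lr * X.T @ dH
    b1 -= lr * dH.sum(axis=0)

print("final error:", float(E))
```

After training, the hidden units encode features (here, combinations of the two inputs) that no single input or output unit represents directly, which is the point the abstract emphasizes.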

23,814 citations


Book ChapterDOI
01 Jan 1988
TL;DR: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion.
Abstract: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion.

17,604 citations


Journal ArticleDOI
TL;DR: The simulation of DCPS is intended as a detailed demonstration of the feasibility of certain ideas and should not be viewed as a full implementation of production systems.

287 citations


Book ChapterDOI
01 Jan 1988
TL;DR: A general parallel search method is described, based on statistical mechanics, and it is shown how it leads to a general learning rule for modifying the connection strengths so as to incorporate knowledge about a task domain in an efficient way.
Abstract: The computational power of massively parallel networks of simple processing elements resides in the communication bandwidth provided by the hardware connections between elements. These connections can allow a significant fraction of the knowledge of the system to be applied to an instance of a problem in a very short time. One kind of computation for which massively parallel networks appear to be well suited is large constraint satisfaction searches, but to use the connections efficiently two conditions must be met: First, a search technique that is suitable for parallel networks must be found. Second, there must be some way of choosing internal representations which allow the preexisting hardware connections to be used efficiently for encoding the constraints in the domain being searched. We describe a general parallel search method, based on statistical mechanics, and we show how it leads to a general learning rule for modifying the connection strengths so as to incorporate knowledge about a task domain in an efficient way. We describe some simple examples in which the learning algorithm creates internal representations that are demonstrably the most efficient way of using the preexisting connectivity structure.
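A rough sketch of the scheme the abstract describes: stochastic binary units settle toward low-energy states by simulated annealing, and connection strengths are then adjusted from the difference between co-occurrence statistics gathered with the visible units clamped to environment patterns and statistics gathered when the network runs freely (the Boltzmann-machine learning rule). The unit counts, annealing schedule, sampling lengths, and toy patterns below are illustrative assumptions, not values from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

n_visible, n_hidden = 4, 2            # assumed sizes
n = n_visible + n_hidden
W = np.zeros((n, n))                  # symmetric connection strengths, zero diagonal
temps = [4.0, 2.0, 1.0, 0.5]          # assumed annealing schedule

def settle(state, clamped, temps):
    """Parallel-search phase: stochastic binary units settle toward low energy."""
    for T in temps:
        for i in rng.permutation(n):
            if clamped[i]:
                continue
            gap = W[i] @ state                        # energy gap for turning unit i on
            p_on = 1.0 / (1.0 + np.exp(-gap / T))
            state[i] = 1.0 if rng.random() < p_on else 0.0
    return state

def cooccurrences(patterns, clamp_visible, n_samples=20):
    """Average s_i * s_j over settled states, with or without the inputs clamped."""
    stats = np.zeros((n, n))
    for _ in range(n_samples):
        state = rng.integers(0, 2, n).astype(float)
        clamped = np.zeros(n, dtype=bool)
        if clamp_visible:
            state[:n_visible] = patterns[rng.integers(len(patterns))]
            clamped[:n_visible] = True
        state = settle(state, clamped, temps)
        stats += np.outer(state, state)
    return stats / n_samples

# Toy task domain (assumed): two binary patterns the network should come to model.
patterns = np.array([[1, 1, 0, 0], [0, 0, 1, 1]], dtype=float)

lr = 0.1
for step in range(50):
    p_clamped = cooccurrences(patterns, clamp_visible=True)    # environment present
    p_free = cooccurrences(patterns, clamp_visible=False)      # network runs freely
    dW = lr * (p_clamped - p_free)       # learning rule: make the two statistics agree
    np.fill_diagonal(dW, 0.0)
    W += dW                              # stays symmetric: both statistics are symmetric
```

The two sampling phases are what lets the rule use only locally available information: each connection needs only the co-occurrence of the two units it joins.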

179 citations


Proceedings ArticleDOI
11 Apr 1988
TL;DR: A time-delay neural network for phoneme recognition that invents, without human intervention, meaningful linguistic abstractions in time and frequency (such as formant tracking and segmentation) and does not rely on precise alignment or segmentation of the input.
Abstract: A time-delay neural network (TDNN) for phoneme recognition is discussed. By the use of two hidden layers in addition to an input and output layer, it is capable of representing complex nonlinear decision surfaces. Three important properties of the TDNN have been observed. First, it was able to invent, without human intervention, meaningful linguistic abstractions in time and frequency such as formant tracking and segmentation. Second, it has learned to form alternate representations linking different acoustic events with the same higher-level concept. In this fashion it can implement trading relations between lower-level acoustic events, leading to robust recognition performance despite considerable variability in the input speech. Third, the network is translation-invariant and does not rely on precise alignment or segmentation of the input. The TDNN's performance is compared with the best of hidden Markov models (HMMs) on a speaker-dependent phoneme-recognition task. The TDNN achieved a recognition rate of 98.5% compared to 93.7% for the HMM, i.e., a fourfold reduction in error.
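The time-delay arrangement can be pictured as applying the same weights to every window of consecutive input frames and letting the output integrate evidence over all time positions, which is what yields the translation invariance mentioned above. The forward-pass sketch below is illustrative only; the layer sizes, window lengths, number of classes, and 16 spectral coefficients per frame are assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def time_delay_layer(x, W, b):
    """Apply one shared weight matrix to every window of consecutive frames.

    x: (T, d_in) sequence of feature frames
    W: (window * d_in, d_out) weights shared across all time positions
    """
    window = W.shape[0] // x.shape[1]
    steps = x.shape[0] - window + 1
    out = np.empty((steps, W.shape[1]))
    for t in range(steps):
        out[t] = sigmoid(x[t:t + window].reshape(-1) @ W + b)
    return out

# Toy input (assumed sizes): 15 frames of 16 spectral coefficients.
x = rng.normal(size=(15, 16))

W1, b1 = rng.normal(scale=0.1, size=(3 * 16, 8)), np.zeros(8)   # 3-frame delays
W2, b2 = rng.normal(scale=0.1, size=(5 * 8, 3)), np.zeros(3)    # 5-frame delays
h1 = time_delay_layer(x, W1, b1)     # first hidden layer, one vector per position
h2 = time_delay_layer(h1, W2, b2)    # second hidden layer

# Output units integrate the per-position evidence over time, so a temporal
# shift of the input leaves the decision essentially unchanged.
scores = h2.sum(axis=0)              # one score per phoneme class (3 assumed)
print("phoneme scores:", scores)
```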

166 citations



Journal ArticleDOI
TL;DR: It is suggested that when comparing shapes in this kind of task people rely more on scene-based representations than on viewer-centered representations.

107 citations


Proceedings Article
01 Jan 1988
TL;DR: GEMINI is a hybrid procedure for multilayer networks, which shares many of the implementation advantages of correlational reinforcement procedures but is more efficient.
Abstract: Learning procedures that measure how random perturbations of unit activities correlate with changes in reinforcement are inefficient but simple to implement in hardware. Procedures like back-propagation (Rumelhart, Hinton and Williams, 1986) which compute how changes in activities affect the output error are much more efficient, but require more complex hardware. GEMINI is a hybrid procedure for multilayer networks, which shares many of the implementation advantages of correlational reinforcement procedures but is more efficient. GEMINI injects noise only at the first hidden layer and measures the resultant effect on the output error. A linear network associated with each hidden layer iteratively inverts the matrix which relates the noise to the error change, thereby obtaining the error-derivatives. No back-propagation is involved, thus allowing unknown non-linearities in the system. Two simulations demonstrate the effectiveness of GEMINI.
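The core of the procedure can be sketched as follows: small noise is injected at the first hidden layer, the resulting change in output error is observed, and the error derivatives with respect to the hidden activities are recovered by iteratively fitting the linear relation between the noise and the error change. The sketch below uses a plain LMS-style estimator as a stand-in for the paper's inverting linear network, a tiny assumed architecture, and a toy task; it is an illustration of the idea, not the published procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Tiny assumed network: 2 inputs -> 4 hidden -> 1 output. Everything above the
# first hidden layer is treated as a black box; only its output error is observed.
W1 = rng.normal(scale=0.5, size=(2, 4))
W2 = rng.normal(scale=0.5, size=(4, 1))

def output_error(h, t):
    """Error of the (possibly unknown, nonlinear) system above the hidden layer."""
    y = sigmoid(h @ W2)
    return 0.5 * np.sum((y - t) ** 2)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

lr, noise_scale = 0.5, 1e-2            # assumed values
for step in range(2000):
    for x, t in zip(X, T):
        h = sigmoid(x @ W1)            # first hidden layer activities
        E0 = output_error(h, t)

        # Inject noise at the hidden layer and iteratively fit the linear
        # relation dE ~ g . dh to recover the error derivatives g = dE/dh
        # (an LMS-style stand-in for the paper's inverting linear network).
        g = np.zeros_like(h)
        for _ in range(20):
            dh = noise_scale * rng.normal(size=h.shape)
            dE = output_error(h + dh, t) - E0
            g += 0.5 * (dE - g @ dh) * dh / (dh @ dh)

        # Use the estimated derivatives to adjust the input-to-hidden weights;
        # no derivatives are propagated through the layers above the noise.
        dnet = g * h * (1 - h)
        W1 -= lr * np.outer(x, dnet)
```

Because the error change is only observed, never differentiated, the layers above the noise injection may contain unknown nonlinearities, which is the implementation advantage the abstract highlights.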

27 citations


Journal ArticleDOI
TL;DR: A time‐delay neural network approach to speech recognition that “invented” well‐known acoustic‐phonetic features as useful abstractions and recognizes voiced stops extracted from varying phonetic contexts at an error rate four times lower than the best of the authors' HMMs.
Abstract: A time‐delay neural network (TDNN) approach to speech recognition is presented that is characterized by two important properties: (1) Using multilayer arrangements of simple computing units, a TDNN can represent arbitrary nonlinear classification decision surfaces that are learned automatically using error back propagation. (2) The time‐delay arrangement enables the network to discover acoustic‐phonetic features and the temporal relationships between them independent of position in time and, hence, not blurred by temporal shifts in the input. The TDNNs are compared with the currently most popular technique in speech recognition, hidden Markov models (HMM). Extensive performance evaluation shows that the TDNN recognizes voiced stops extracted from varying phonetic contexts at an error rate four times lower (1.5% vs 6.3%) than the best of our HMMs. To perform this task, the TDNN “invented” well‐known acoustic‐phonetic features (e.g., F2 rise, F2 fall, vowel onset) as useful abstractions. It also developed alternate internal representations to link different acoustic realizations to the same concept. The TDNNs trained for other phonetic classes achieve similarly high levels of performance. The integration of such smaller networks into larger phonetic nets is discussed, and strategies for the design of neural-network-based large-vocabulary speech recognition systems are proposed.

8 citations