Home
/
Authors
/
Jeffrey L. Elman

Author

Jeffrey L. Elman

Other affiliations: University of Texas at Austin, University of California, Berkeley

Bio: Jeffrey L. Elman is an academic researcher from University of California, San Diego. The author has contributed to research in topics: Sentence & Verb. The author has an hindex of 50, co-authored 109 publications receiving 21408 citations. Previous affiliations of Jeffrey L. Elman include University of Texas at Austin & University of California, Berkeley.

Topics: Sentence, Verb, Connectionism, N400, Vocabulary ...read more

Papers published on a yearly basis

2021
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2004
2003
2001
1999
1998
1997
1996
1995
1994
1993
1991
1990
1989
1988
1986
1985
1984
1983
1981
1980
1979
1978
1977

Papers

PDF

Open Access

More filters

Journal Article•DOI•

Finding Structure in Time

[...]

Jeffrey L. Elman¹•Institutions (1)

University of California, San Diego¹

01 Mar 1990-Cognitive Science

TL;DR: A proposal along these lines first described by Jordan (1986) which involves the use of recurrent links in order to provide networks with a dynamic memory and suggests a method for representing lexical categories and the type/token distinction is developed.

...read moreread less

10,264 citations

Journal Article•DOI•

The TRACE model of speech perception.

[...]

James L. McClelland¹, Jeffrey L. Elman²•Institutions (2)

Carnegie Mellon University¹, University of California, San Diego²

01 Jan 1986-Cognitive Psychology

TL;DR: The TRACE model, described in detail elsewhere, deals with short segments of real speech, and suggests a mechanism for coping with the fact that the cues to the identity of phonemes vary as a function of context.

...read moreread less

2,663 citations

Journal Article•DOI•

Learning and development in neural networks: the importance of starting small

[...]

Jeffrey L. Elman¹•Institutions (1)

University of California, San Diego¹

01 Jul 1993-Cognition

TL;DR: Possible synergistic interactions between maturational change and the ability to learn a complex domain (language) as investigated in connectionist networks suggest that developmental restrictions on resources may constitute a necessary prerequisite for mastering certain complex domains.

...read moreread less

1,766 citations

Journal Article•DOI•

Distributed Representations, Simple Recurrent Networks, And Grammatical Structure

[...]

Jeffrey L. Elman¹•Institutions (1)

University of California, San Diego¹

01 Sep 1991-Machine Learning

TL;DR: Using a prediction task, a simple recurrent network is trained on multiclausal sentences which contain multiply-embedded relative clauses and principal component analysis of the hidden unit activation patterns reveals that the network solves the task by developing complex distributed representations which encode the relevant grammatical relations and hierarchical constituent structure.

...read moreread less

Abstract: In this paper three problems for a connectionist account of language are considered: 1. What is the nature of linguistic representations? 2. How can complex structural relationships such as constituent structure be represented? 3. How can the apparently open-ended nature of language be accommodated by a fixed-resource system? Using a prediction task, a simple recurrent network (SRN) is trained on multiclausal sentences which contain multiply-embedded relative clauses. Principal component analysis of the hidden unit activation patterns reveals that the network solves the task by developing complex distributed representations which encode the relevant grammatical relations and hierarchical constituent structure. Differences between the SRN state representations and the more traditional pushdown store are discussed in the final section.

...read moreread less

1,163 citations

Journal Article•DOI•

Learning and evolution in neural networks

[...]

Stefano Nolfi, Domenico Parisi, Jeffrey L. Elman¹•Institutions (1)

University of California, San Diego¹

20 Jun 1994-Adaptive Behavior

TL;DR: Simulation on populations of neural networks that both evolve at the population level and learn at the individual level finds both individuals that have high fitness and individuals that, although they do not have high Fitness at birth, end up with high fitness because they learn to predict.

...read moreread less

Abstract: This article describes simulations on populations of neural networks that both evolve at the population level and learn at the individual level. Unlike other simulations, the evolutionary task (finding food in the environment) and the learning task (predicting the next position of food on the basis of present position and planned network's movement) are different tasks. In these conditions, learning influences evolution (without Lamarckian inheritance of learned weight changes) and evolution influences learning. Average but not peak fitness has a better evolutionary growth with learning than without learning. After the initial generations, individuals that learn to predict during life also improve their food-finding ability during life. Furthermore, individuals that inherit an innate capacity to find food also inherit an innate predisposition to learn to predict the sensory consequences of their movements. They do not predict better at birth, but they do learn to predict better than individuals of the ini...

...read moreread less

349 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22

Collapse

Cited by

PDF

Open Access

More filters

Journal Article•DOI•

Long short-term memory

[...]

Sepp Hochreiter¹, Jürgen Schmidhuber²•Institutions (2)

Technische Universität München¹, Dalle Molle Institute for Artificial Intelligence Research²

01 Nov 1997-Neural Computation

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

...read moreread less

Abstract: Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, LSTM can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units. Multiplicative gate units learn to open and close access to the constant error flow. LSTM is local in space and time; its computational complexity per time step and weight is O. 1. Our experiments with artificial data involve local, distributed, real-valued, and noisy pattern representations. In comparisons with real-time recurrent learning, back propagation through time, recurrent cascade correlation, Elman nets, and neural sequence chunking, LSTM leads to many more successful runs, and learns much faster. LSTM also solves complex, artificial long-time-lag tasks that have never been solved by previous recurrent network algorithms.

...read moreread less

72,897 citations

Book•

Deep Learning

[...]

Ian Goodfellow¹, Yoshua Bengio², Aaron Courville²•Institutions (2)

Google¹, Université de Montréal²

18 Nov 2016

TL;DR: Deep learning as mentioned in this paper is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts, and it is used in many applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames.

...read moreread less

Abstract: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

...read moreread less

38,208 citations

Posted Content•

Efficient Estimation of Word Representations in Vector Space

[...]

Tomas Mikolov¹, Kai Chen², Greg S. Corrado³, Jeffrey Dean³•Institutions (3)

Brno University of Technology¹, Beijing University of Posts and Telecommunications², Google³

16 Jan 2013-arXiv: Computation and Language

TL;DR: This paper proposed two novel model architectures for computing continuous vector representations of words from very large data sets, and the quality of these representations is measured in a word similarity task and the results are compared to the previously best performing techniques based on different types of neural networks.

...read moreread less

Abstract: We propose two novel model architectures for computing continuous vector representations of words from very large data sets. The quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks. We observe large improvements in accuracy at much lower computational cost, i.e. it takes less than a day to learn high quality word vectors from a 1.6 billion words data set. Furthermore, we show that these vectors provide state-of-the-art performance on our test set for measuring syntactic and semantic word similarities.

...read moreread less

20,077 citations

Book Chapter•DOI•

Learning internal representations by error propagation

[...]

David E. Rumelhart, Geoffrey E. Hinton, Ronald J. Williams

01 Jan 1988

TL;DR: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion.

...read moreread less

Abstract: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion

...read moreread less

17,604 citations

Journal Article•DOI•

Deep learning in neural networks

[...]

Jürgen Schmidhuber¹•Institutions (1)

University of Lugano¹

01 Jan 2015-Neural Networks

TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, review deep supervised learning, unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.

...read moreread less

14,635 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse