Emergence of simple-cell receptive field properties by learning a sparse code for natural images

doi:10.1038/381607A0

Home
/
Papers
/
Emergence of simple-cell receptive field properties by learning a sparse code for natural images

Journal Article•DOI•

Emergence of simple-cell receptive field properties by learning a sparse code for natural images

Bruno A. Olshausen¹, Bruno A. Olshausen², David J. Field²•Institutions (2)

University of California, Davis¹, Cornell University²

13 Jun 1996-Nature (Nature Publishing Group)-Vol. 381, Iss: 6583, pp 607-609

TL;DR: It is shown that a learning algorithm that attempts to find sparse linear codes for natural scenes will develop a complete family of localized, oriented, bandpass receptive fields, similar to those found in the primary visual cortex.

read less

Abstract: The receptive fields of simple cells in mammalian primary visual cortex can be characterized as being spatially localized, oriented and bandpass (selective to structure at different spatial scales), comparable to the basis functions of wavelet transforms. One approach to understanding such response properties of visual neurons has been to consider their relationship to the statistical structure of natural images in terms of efficient coding. Along these lines, a number of studies have attempted to train unsupervised learning algorithms on natural images in the hope of developing receptive fields with similar properties, but none has succeeded in producing a full set that spans the image space and contains all three of the above properties. Here we investigate the proposal that a coding strategy that maximizes sparseness is sufficient to account for these properties. We show that a learning algorithm that attempts to find sparse linear codes for natural scenes will develop a complete family of localized, oriented, bandpass receptive fields, similar to those found in the primary visual cortex. The resulting sparse image code provides a more efficient representation for later stages of processing because it possesses a higher degree of statistical independence among its outputs.

...read moreread less

Content maybe subject to copyright Report

Citations

PDF

Open Access

More filters

Book•

Deep Learning

[...]

Ian Goodfellow¹, Yoshua Bengio², Aaron Courville²•Institutions (2)

Google¹, Université de Montréal²

18 Nov 2016

TL;DR: Deep learning as mentioned in this paper is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts, and it is used in many applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames.

...read moreread less

Abstract: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

...read moreread less

38,208 citations

Journal Article•DOI•

Deep learning in neural networks

[...]

Jürgen Schmidhuber¹•Institutions (1)

University of Lugano¹

01 Jan 2015-Neural Networks

TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, review deep supervised learning, unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.

...read moreread less

14,635 citations

Additional excerpts

...…1997; Hochreiter & Schmidhuber, 1999; Hyvärinen, Hoyer, & Oja, 1999; Lewicki & Olshausen, 1998) throughwell-known feature detectors (e.g., Olshausen & Field, 1996; Schmidhuber, Eldracher, & Foltin, 1996), such as off-center-on-surround-like structures, aswell as orientation sensitive…...
[...]

Journal Article•DOI•

Learning the parts of objects by non-negative matrix factorization

[...]

Daniel D. Lee¹, H. Sebastian Seung², H. Sebastian Seung¹•Institutions (2)

Alcatel-Lucent¹, Massachusetts Institute of Technology²

21 Oct 1999-Nature

TL;DR: An algorithm for non-negative matrix factorization is demonstrated that is able to learn parts of faces and semantic features of text and is in contrast to other methods that learn holistic, not parts-based, representations.

...read moreread less

Abstract: Is perception of the whole based on perception of its parts? There is psychological and physiological evidence for parts-based representations in the brain, and certain computational theories of object recognition rely on such representations. But little is known about how brains or computers might learn the parts of objects. Here we demonstrate an algorithm for non-negative matrix factorization that is able to learn parts of faces and semantic features of text. This is in contrast to other methods, such as principal components analysis and vector quantization, that learn holistic, not parts-based, representations. Non-negative matrix factorization is distinguished from the other methods by its use of non-negativity constraints. These constraints lead to a parts-based representation because they allow only additive, not subtractive, combinations. When non-negative matrix factorization is implemented as a neural network, parts-based representations emerge by virtue of two properties: the firing rates of neurons are never negative and synaptic strengths do not change sign.

...read moreread less

11,500 citations

Journal Article•DOI•

Representation Learning: A Review and New Perspectives

[...]

Yoshua Bengio, Aaron Courville, Pascal Vincent

01 Aug 2013-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: Recent work in the area of unsupervised feature learning and deep learning is reviewed, covering advances in probabilistic models, autoencoders, manifold learning, and deep networks.

...read moreread less

Abstract: The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide more or less the different explanatory factors of variation behind the data. Although specific domain knowledge can be used to help design representations, learning with generic priors can also be used, and the quest for AI is motivating the design of more powerful representation-learning algorithms implementing such priors. This paper reviews recent work in the area of unsupervised feature learning and deep learning, covering advances in probabilistic models, autoencoders, manifold learning, and deep networks. This motivates longer term unanswered questions about the appropriate objectives for learning good representations, for computing representations (i.e., inference), and the geometrical connections between representation learning, density estimation, and manifold learning.

...read moreread less

11,201 citations

Journal Article•DOI•

A tutorial on support vector regression

[...]

Alexander J. Smola¹, Bernhard Schölkopf²•Institutions (2)

Australian National University¹, Max Planck Society²

01 Aug 2004-Statistics and Computing

TL;DR: This tutorial gives an overview of the basic ideas underlying Support Vector (SV) machines for function estimation, and includes a summary of currently used algorithms for training SV machines, covering both the quadratic programming part and advanced methods for dealing with large datasets.

...read moreread less

Abstract: In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing with large datasets. Finally, we mention some modifications and extensions that have been applied to the standard SV algorithm, and discuss the aspect of regularization from a SV perspective.

...read moreread less

10,696 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse

References

PDF

Open Access

More filters

Journal Article•DOI•

An information-maximization approach to blind separation and blind deconvolution

[...]

Anthony J. Bell¹, Terrence J. Sejnowski¹•Institutions (1)

University of California, San Diego¹

01 Nov 1995-Neural Computation

TL;DR: It is suggested that information maximization provides a unifying framework for problems in "blind" signal processing and dependencies of information transfer on time delays are derived.

...read moreread less

Abstract: We derive a new self-organizing learning algorithm that maximizes the information transferred in a network of nonlinear units. The algorithm does not assume any knowledge of the input distributions, and is defined here for the zero-noise limit. Under these conditions, information maximization has extra properties not found in the linear case (Linsker 1989). The nonlinearities in the transfer function are able to pick up higher-order moments of the input distributions and perform something akin to true redundancy reduction between units in the output representation. This enables the network to separate statistically independent components in the inputs: a higher-order generalization of principal components analysis. We apply the network to the source separation (or cocktail party) problem, successfully separating unknown mixtures of up to 10 speakers. We also show that a variant on the network architecture is able to perform blind deconvolution (cancellation of unknown echoes and reverberation in a speech signal). Finally, we derive dependencies of information transfer on time delays. We suggest that information maximization provides a unifying framework for problems in "blind" signal processing.

...read moreread less

9,157 citations

Journal Article•DOI•

Receptive fields and functional architecture of monkey striate cortex

[...]

David H. Hubel, Torsten N. Wiesel

01 Mar 1968-The Journal of Physiology

TL;DR: The striate cortex was studied in lightly anaesthetized macaque and spider monkeys by recording extracellularly from single units and stimulating the retinas with spots or patterns of light, with response properties very similar to those previously described in the cat.

...read moreread less

Abstract: 1. The striate cortex was studied in lightly anaesthetized macaque and spider monkeys by recording extracellularly from single units and stimulating the retinas with spots or patterns of light. Most cells can be categorized as simple, complex, or hypercomplex, with response properties very similar to those previously described in the cat. On the average, however, receptive fields are smaller, and there is a greater sensitivity to changes in stimulus orientation. A small proportion of the cells are colour coded. 2. Evidence is presented for at least two independent systems of columns extending vertically from surface to white matter. Columns of the first type contain cells with common receptive-field orientations. They are similar to the orientation columns described in the cat, but are probably smaller in cross-sectional area. In the second system cells are aggregated into columns according to eye preference. The ocular dominance columns are larger than the orientation columns, and the two sets of boundaries seem to be independent. 3. There is a tendency for cells to be grouped according to symmetry of responses to movement; in some regions the cells respond equally well to the two opposite directions of movement of a line, but other regions contain a mixture of cells favouring one direction and cells favouring the other. 4. A horizontal organization corresponding to the cortical layering can also be discerned. The upper layers (II and the upper two-thirds of III) contain complex and hypercomplex cells, but simple cells are virtually absent. The cells are mostly binocularly driven. Simple cells are found deep in layer III, and in IV A and IV B. In layer IV B they form a large proportion of the population, whereas complex cells are rare. In layers IV A and IV B one finds units lacking orientation specificity; it is not clear whether these are cell bodies or axons of geniculate cells. In layer IV most cells are driven by one eye only; this layer consists of a mosaic with cells of some regions responding to one eye only, those of other regions responding to the other eye. Layers V and VI contain mostly complex and hypercomplex cells, binocularly driven. 5. The cortex is seen as a system organized vertically and horizontally in entirely different ways. In the vertical system (in which cells lying along a vertical line in the cortex have common features) stimulus dimensions such as retinal position, line orientation, ocular dominance, and perhaps directionality of movement, are mapped in sets of superimposed but independent mosaics. The horizontal system segregates cells in layers by hierarchical orders, the lowest orders (simple cells monocularly driven) located in and near layer IV, the higher orders in the upper and lower layers.

...read moreread less

6,388 citations

Journal Article•DOI•

Relations between the statistics of natural images and the response properties of cortical cells.

[...]

David J. Field¹•Institutions (1)

University of Cambridge¹

01 Dec 1987-Journal of The Optical Society of America A-optics Image Science and Vision

TL;DR: The results obtained with six natural images suggest that the orientation and the spatial-frequency tuning of mammalian simple cells are well suited for coding the information in such images if the goal of the code is to convert higher-order redundancy into first- order redundancy.

...read moreread less

Abstract: The relative efficiency of any particular image-coding scheme should be defined only in relation to the class of images that the code is likely to encounter. To understand the representation of images by the mammalian visual system, it might therefore be useful to consider the statistics of images from the natural environment (i.e., images with trees, rocks, bushes, etc). In this study, various coding schemes are compared in relation to how they represent the information in such natural images. The coefficients of such codes are represented by arrays of mechanisms that respond to local regions of space, spatial frequency, and orientation (Gabor-like transforms). For many classes of image, such codes will not be an efficient means of representing information. However, the results obtained with six natural images suggest that the orientation and the spatial-frequency tuning of mammalian simple cells are well suited for coding the information in such images if the goal of the code is to convert higher-order redundancy (e.g., correlation between the intensities of neighboring pixels) into first-order redundancy (i.e., the response distribution of the coefficients). Such coding produces a relatively high signal-to-noise ratio and permits information to be transmitted with only a subset of the total number of cells. These results support Barlow's theory that the goal of natural vision is to represent the information in the natural environment with minimal redundancy.

...read moreread less

3,077 citations

Journal Article•DOI•

An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex

[...]

J. P. Jones¹, L. A. Palmer•Institutions (1)

University of Pennsylvania¹

01 Dec 1987-Journal of Neurophysiology

TL;DR: It seems that an optimal strategy has evolved for sampling images simultaneously in the 2D spatial and spatial frequency domains and the Gabor function provides a useful and reasonably accurate description of most spatial aspects of simple receptive fields.

...read moreread less

Abstract: 1. Using the two-dimensional (2D) spatial and spectral response profiles described in the previous two reports, we test Daugman's generalization of Marcelja's hypothesis that simple receptive fields belong to a class of linear spatial filters analogous to those described by Gabor and referred to here as 2D Gabor filters. 2. In the space domain, we found 2D Gabor filters that fit the 2D spatial response profile of each simple cell in the least-squared error sense (with a simplex algorithm), and we show that the residual error is devoid of spatial structure and statistically indistinguishable from random error. 3. Although a rigorous statistical approach was not possible with our spectral data, we also found a Gabor function that fit the 2D spectral response profile of each simple cell and observed that the residual errors are everywhere small and unstructured. 4. As an assay of spatial linearity in two dimensions, on which the applicability of Gabor theory is dependent, we compare the filter parameters estimated from the independent 2D spatial and spectral measurements described above. Estimates of most parameters from the two domains are highly correlated, indicating that assumptions about spatial linearity are valid. 5. Finally, we show that the functional form of the 2D Gabor filter provides a concise mathematical expression, which incorporates the important spatial characteristics of simple receptive fields demonstrated in the previous two reports. Prominent here are 1) Cartesian separable spatial response profiles, 2) spatial receptive fields with staggered subregion placement, 3) Cartesian separable spectral response profiles, 4) spectral response profiles with axes of symmetry not including the origin, and 5) the uniform distribution of spatial phase angles. 6. We conclude that the Gabor function provides a useful and reasonably accurate description of most spatial aspects of simple receptive fields. Thus it seems that an optimal strategy has evolved for sampling images simultaneously in the 2D spatial and spatial frequency domains.

...read moreread less

1,723 citations

Journal Article•DOI•

Self-organization in a perceptual network

[...]

Ralph Linsker¹•Institutions (1)

IBM¹

01 Mar 1988-IEEE Computer

TL;DR: It is shown that even a single developing cell of a layered network exhibits a remarkable set of optimization properties that are closely related to issues in statistics, theoretical physics, adaptive signal processing, the formation of knowledge representation in artificial intelligence, and information theory.

...read moreread less

Abstract: The emergence of a feature-analyzing function from the development rules of simple, multilayered networks is explored. It is shown that even a single developing cell of a layered network exhibits a remarkable set of optimization properties that are closely related to issues in statistics, theoretical physics, adaptive signal processing, the formation of knowledge representation in artificial intelligence, and information theory. The network studied is based on the visual system. These results are used to infer an information-theoretic principle that can be applied to the network as a whole, rather than a single cell. The organizing principle proposed is that the network connections develop in such a way as to maximize the amount of information that is preserved when signals are transformed at each processing stage, subject to certain constraints. The operation of this principle is illustrated for some simple cases. >

...read moreread less

1,469 citations