Journal ArticleDOI

Relations between the statistics of natural images and the response properties of cortical cells.

01 Dec 1987-Journal of The Optical Society of America A-optics Image Science and Vision (J Opt Soc Am A)-Vol. 4, Iss: 12, pp 2379-2394
TL;DR: The results obtained with six natural images suggest that the orientation and the spatial-frequency tuning of mammalian simple cells are well suited for coding the information in such images if the goal of the code is to convert higher-order redundancy into first-order redundancy.
Abstract: The relative efficiency of any particular image-coding scheme should be defined only in relation to the class of images that the code is likely to encounter. To understand the representation of images by the mammalian visual system, it might therefore be useful to consider the statistics of images from the natural environment (i.e., images with trees, rocks, bushes, etc.). In this study, various coding schemes are compared in relation to how they represent the information in such natural images. The coefficients of such codes are represented by arrays of mechanisms that respond to local regions of space, spatial frequency, and orientation (Gabor-like transforms). For many classes of image, such codes will not be an efficient means of representing information. However, the results obtained with six natural images suggest that the orientation and the spatial-frequency tuning of mammalian simple cells are well suited for coding the information in such images if the goal of the code is to convert higher-order redundancy (e.g., correlation between the intensities of neighboring pixels) into first-order redundancy (i.e., the response distribution of the coefficients). Such coding produces a relatively high signal-to-noise ratio and permits information to be transmitted with only a subset of the total number of cells. These results support Barlow's theory that the goal of natural vision is to represent the information in the natural environment with minimal redundancy.
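
As a rough illustration of the "Gabor-like" mechanisms the abstract mentions, the sketch below builds one even-symmetric Gabor kernel (a Gaussian envelope multiplied by a cosine carrier) and compares its response to a grating at its preferred orientation with its response to an orthogonal grating. The function names, the 21 x 21 patch size, and the wavelength and sigma values are illustrative assumptions, not parameters taken from the paper.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma):
    """Even-symmetric Gabor kernel (size must be odd), returned as nested lists.

    All parameter values used below are illustrative assumptions.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Coordinate along the direction in which the carrier oscillates.
            xr = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * xr / wavelength))
        kernel.append(row)
    # Remove the mean so the kernel does not respond to uniform luminance.
    mean = sum(map(sum, kernel)) / (size * size)
    return [[v - mean for v in row] for row in kernel]

def grating(size, wavelength, theta):
    """Cosine grating patch with the same coordinate convention as the kernel."""
    half = size // 2
    return [[math.cos(2.0 * math.pi * (x * math.cos(theta) + y * math.sin(theta)) / wavelength)
             for x in range(-half, half + 1)]
            for y in range(-half, half + 1)]

def response(kernel, patch):
    """Inner product of kernel and patch (the filter output at one location)."""
    return sum(k * p for krow, prow in zip(kernel, patch) for k, p in zip(krow, prow))

kernel = gabor_kernel(size=21, wavelength=8.0, theta=0.0, sigma=4.0)
matched = response(kernel, grating(21, 8.0, 0.0))
orthogonal = response(kernel, grating(21, 8.0, math.pi / 2))
print(matched, orthogonal)
```

The matched grating drives the kernel far more strongly than the orthogonal one, which is the sense in which such a mechanism is tuned for orientation and spatial frequency.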


Citations
Journal ArticleDOI
TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, reviewing deep supervised learning, unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.

14,635 citations


Additional excerpts

  • ...Many papers of the previous millennium, however, were about unsupervised learning (UL) without a teacher (e.g., Atick, Li, & Redlich, 1992; Baldi & Hornik, 1989; Barlow, Kaushal, & Mitchison, 1989; Barrow, 1987; Deco & Parra, 1997; Field, 1987; Földiák, 1990; Földiák & Young, 1995; Grossberg, 1976a, 1976b; Hebb, 1949; Kohonen, 1972, 1982, 1988; Kosko, 1990; Martinetz, Ritter, & Schulten, 1990; Miller, 1994; Mozer, 1991; Oja, 1989; Palm, 1992; Pearlmutter & Hinton, 1986; Ritter & Kohonen, 1989; Rubner & Schulten, 1990; Sanger, 1989; Saund, 1994; von der Malsburg, 1973; Watanabe, 1985; Willshaw & von der Malsburg, 1976); see also post-2000 work (e....


Journal ArticleDOI
TL;DR: The performance of the spatial envelope model shows that specific information about object shape or identity is not a requirement for scene categorization and that modeling a holistic representation of the scene informs about its probable semantic category.
Abstract: In this paper, we propose a computational model of the recognition of real world scenes that bypasses the segmentation and the processing of individual objects or regions. The procedure is based on a very low dimensional representation of the scene, that we term the Spatial Envelope. We propose a set of perceptual dimensions (naturalness, openness, roughness, expansion, ruggedness) that represent the dominant spatial structure of a scene. Then, we show that these dimensions may be reliably estimated using spectral and coarsely localized information. The model generates a multidimensional space in which scenes sharing membership in semantic categories (e.g., streets, highways, coasts) are projected close together. The performance of the spatial envelope model shows that specific information about object shape or identity is not a requirement for scene categorization and that modeling a holistic representation of the scene informs about its probable semantic category.

6,882 citations


Cites background from "Relations between the statistics of..."

  • ...This transformation provides the same detail of description at each spatial scale in agreement with the property of scale invariance of the second order statistics of real world images (e.g. Field, 1987)....


  • ...The average of the energy spectrum provides a description of the correlation found in natural images (Field, 1987, 1994; van der Schaaf and van Hateren, 1996), and it has several implications for explaining the processing carried out by the first stages of the visual system (Field, 1987; Atick and Redlich, 1992)....


Journal ArticleDOI
13 Jun 1996-Nature
TL;DR: It is shown that a learning algorithm that attempts to find sparse linear codes for natural scenes will develop a complete family of localized, oriented, bandpass receptive fields, similar to those found in the primary visual cortex.
Abstract: The receptive fields of simple cells in mammalian primary visual cortex can be characterized as being spatially localized, oriented and bandpass (selective to structure at different spatial scales), comparable to the basis functions of wavelet transforms. One approach to understanding such response properties of visual neurons has been to consider their relationship to the statistical structure of natural images in terms of efficient coding. Along these lines, a number of studies have attempted to train unsupervised learning algorithms on natural images in the hope of developing receptive fields with similar properties, but none has succeeded in producing a full set that spans the image space and contains all three of the above properties. Here we investigate the proposal that a coding strategy that maximizes sparseness is sufficient to account for these properties. We show that a learning algorithm that attempts to find sparse linear codes for natural scenes will develop a complete family of localized, oriented, bandpass receptive fields, similar to those found in the primary visual cortex. The resulting sparse image code provides a more efficient representation for later stages of processing because it possesses a higher degree of statistical independence among its outputs.

5,947 citations

Journal ArticleDOI
TL;DR: This paper investigates the properties of a metric between two distributions, the Earth Mover's Distance (EMD), for content-based image retrieval, and compares the retrieval performance of the EMD with that of other distances.
Abstract: We investigate the properties of a metric between two distributions, the Earth Mover's Distance (EMD), for content-based image retrieval. The EMD is based on the minimal cost that must be paid to transform one distribution into the other, in a precise sense, and was first proposed for certain vision problems by Peleg, Werman, and Rom. For image retrieval, we combine this idea with a representation scheme for distributions that is based on vector quantization. This combination leads to an image comparison framework that often accounts for perceptual similarity better than other previously proposed methods. The EMD is based on a solution to the transportation problem from linear optimization, for which efficient algorithms are available, and also allows naturally for partial matching. It is more robust than histogram matching techniques, in that it can operate on variable-length representations of the distributions that avoid quantization and other binning problems typical of histograms. When used to compare distributions with the same overall mass, the EMD is a true metric. In this paper we focus on applications to color and texture, and we compare the retrieval performance of the EMD with that of other distances.

4,593 citations
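
To make the "minimal cost to transform one distribution into the other" idea from the EMD abstract concrete, here is a minimal sketch restricted to a special case: two histograms with equal total mass on the same evenly spaced one-dimensional bins, with ground distance |i - j| between bins. In that case the optimal transport cost equals the sum of absolute differences between the two cumulative sums. The general method described in the abstract solves a transportation problem over variable-length representations; this sketch does not attempt that, and the function name `emd_1d` is an assumption made for illustration.

```python
from itertools import accumulate

def emd_1d(p, q):
    """EMD for equal-mass histograms on shared 1-D bins, ground distance |i - j|.

    Special case only: the general EMD requires a transportation-problem solver.
    """
    if len(p) != len(q):
        raise ValueError("histograms must use the same bins")
    if abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("this sketch assumes equal total mass")
    # Mass that must flow across each bin boundary is the running difference
    # between the two cumulative sums.
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))

left = [1.0, 0.0, 0.0]
middle = [0.0, 1.0, 0.0]
right = [0.0, 0.0, 1.0]
print(emd_1d(left, middle), emd_1d(left, right))
```

Unlike a bin-by-bin histogram comparison, which would score `left` as equally far from `middle` and `right`, this distance grows with how far the mass has to move.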

Journal ArticleDOI
TL;DR: Results suggest that rather than being exclusively feedforward phenomena, nonclassical surround effects in the visual cortex may also result from cortico-cortical feedback as a consequence of the visual system using an efficient hierarchical strategy for encoding natural images.
Abstract: We describe a model of visual processing in which feedback connections from a higher- to a lower-order visual cortical area carry predictions of lower-level neural activities, whereas the feedforward connections carry the residual errors between the predictions and the actual lower-level activities. When exposed to natural images, a hierarchical network of model neurons implementing such a model developed simple-cell-like receptive fields. A subset of neurons responsible for carrying the residual errors showed endstopping and other extra-classical receptive-field effects. These results suggest that rather than being exclusively feedforward phenomena, nonclassical surround effects in the visual cortex may also result from cortico-cortical feedback as a consequence of the visual system using an efficient hierarchical strategy for encoding natural images.

4,149 citations


Cites background from "Relations between the statistics of..."

  • ...2a), the motivation being that the response properties of visual neurons might be largely determined by the statistics of natural image...


References
Journal ArticleDOI
TL;DR: This final installment of the paper considers the case where the signals or the messages or both are continuously variable, in contrast with the discrete nature assumed until now.
Abstract: In this final installment of the paper we consider the case where the signals or the messages or both are continuously variable, in contrast with the discrete nature assumed until now. To a considerable extent the continuous case can be obtained through a limiting process from the discrete case by dividing the continuum of messages and signals into a large but finite number of small regions and calculating the various parameters involved on a discrete basis. As the size of the regions is decreased these parameters in general approach as limits the proper values for the continuous case. There are, however, a few new effects that appear and also a general change of emphasis in the direction of specialization of the general results to particular cases.

65,425 citations

Journal ArticleDOI
TL;DR: This method is used to examine receptive fields of a more complex type and to make additional observations on binocular interaction and this approach is necessary in order to understand the behaviour of individual cells, but it fails to deal with the problem of the relationship of one cell to its neighbours.
Abstract: What chiefly distinguishes cerebral cortex from other parts of the central nervous system is the great diversity of its cell types and interconnexions. It would be astonishing if such a structure did not profoundly modify the response patterns of fibres coming into it. In the cat's visual cortex, the receptive field arrangements of single cells suggest that there is indeed a degree of complexity far exceeding anything yet seen at lower levels in the visual system. In a previous paper we described receptive fields of single cortical cells, observing responses to spots of light shone on one or both retinas (Hubel & Wiesel, 1959). In the present work this method is used to examine receptive fields of a more complex type (Part I) and to make additional observations on binocular interaction (Part II). This approach is necessary in order to understand the behaviour of individual cells, but it fails to deal with the problem of the relationship of one cell to its neighbours. In the past, the technique of recording evoked slow waves has been used with great success in studies of functional anatomy. It was employed by Talbot & Marshall (1941) and by Thompson, Woolsey & Talbot (1950) for mapping out the visual cortex in the rabbit, cat, and monkey. Daniel & Whitteridge (1959) have recently extended this work in the primate. Most of our present knowledge of retinotopic projections, binocular overlap, and the second visual area is based on these investigations. Yet the method of evoked potentials is valuable mainly for detecting behaviour common to large populations of neighbouring cells; it cannot differentiate functionally between areas of cortex smaller than about 1 mm². To overcome this difficulty a method has in recent years been developed for studying cells separately or in small groups during long micro-electrode penetrations through nervous tissue. Responses are correlated with cell location by reconstructing the electrode tracks from histological material.
These techniques have been applied to

12,923 citations

Journal ArticleDOI
TL;DR: The theory of edge detection explains several basic psychophysical findings, and the operation of forming oriented zero-crossing segments from the output of centre-surround ∇²G filters acting on the image forms the basis for a physiological model of simple cells.
Abstract: A theory of edge detection is presented. The analysis proceeds in two parts. (1) Intensity changes, which occur in a natural image over a wide range of scales, are detected separately at different scales. An appropriate filter for this purpose at a given scale is found to be the second derivative of a Gaussian, and it is shown that, provided some simple conditions are satisfied, these primary filters need not be orientation-dependent. Thus, intensity changes at a given scale are best detected by finding the zero values of ∇²G(x,y) * I(x,y) for image I, where G(x,y) is a two-dimensional Gaussian distribution and ∇² is the Laplacian. The intensity changes thus discovered in each of the channels are then represented by oriented primitives called zero-crossing segments, and evidence is given that this representation is complete. (2) Intensity changes in images arise from surface discontinuities or from reflectance or illumination boundaries, and these all have the property that they are spatially localized. Because of this, the zero-crossing segments from the different channels are not independent, and rules are deduced for combining them into a description of the image. This description is called the raw primal sketch. The theory explains several basic psychophysical findings, and the operation of forming oriented zero-crossing segments from the output of centre-surround ∇²G filters acting on the image forms the basis for a physiological model of simple cells (see Marr & Ullman 1979).

6,893 citations
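
The zero-crossing idea in the edge-detection abstract above can be sketched in one dimension: filter a step edge with the second derivative of a Gaussian and look for the sign change in the output. This is a simplified 1-D analogue rather than the two-dimensional Laplacian-of-Gaussian operator the abstract describes; the function names, the sigma of 2 samples, the filter radius of 8, and the edge-replicating border handling are all assumptions made for illustration.

```python
import math

def d2_gaussian(x, sigma):
    """Second derivative of the unnormalised Gaussian exp(-x^2 / (2 sigma^2))."""
    s2 = sigma * sigma
    return (x * x / (s2 * s2) - 1.0 / s2) * math.exp(-x * x / (2.0 * s2))

def filter_signal(signal, sigma, radius):
    """Apply the sampled filter, replicating the end samples at the borders."""
    offsets = range(-radius, radius + 1)
    taps = [d2_gaussian(k, sigma) for k in offsets]
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in zip(offsets, taps):
            j = min(max(i + k, 0), n - 1)  # clamp the index to the signal range
            acc += w * signal[j]
        out.append(acc)
    return out

def zero_crossings(values):
    """Indices i where the filtered signal changes sign between i and i + 1."""
    return [i for i in range(len(values) - 1) if values[i] * values[i + 1] < 0]

# A step edge: dark for samples 0..19, bright for samples 20..39.
step = [0.0] * 20 + [1.0] * 20
filtered = filter_signal(step, sigma=2.0, radius=8)
crossings = zero_crossings(filtered)
print(crossings)
```

The filtered signal is positive just before the step and negative just after it, so the only sign change sits at the boundary between the dark and bright samples. A larger sigma would detect the same edge at a coarser scale, which is the role of the separate channels in the abstract.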

01 Jan 1946

5,910 citations

Book
01 Jan 1965
TL;DR: This book provides a broad overview of the Fourier transform and its relation to the FFT and the Hartley transform, as well as the Laplace transform.
Abstract: 1. Introduction; 2. Groundwork; 3. Convolution; 4. Notation for Some Useful Functions; 5. The Impulse Symbol; 6. The Basic Theorems; 7. Obtaining Transforms; 8. The Two Domains; 9. Waveforms, Spectra, Filters and Linearity; 10. Sampling and Series; 11. The Discrete Fourier Transform and the FFT; 12. The Discrete Hartley Transform; 13. Relatives of the Fourier Transform; 14. The Laplace Transform; 15. Antennas and Optics; 16. Applications in Statistics; 17. Random Waveforms and Noise; 18. Heat Conduction and Diffusion; 19. Dynamic Power Spectra; 20. Tables of sinc x, sinc² x, and exp(-πx²); 21. Solutions to Selected Problems; 22. Pictorial Dictionary of Fourier Transforms; 23. The Life of Joseph Fourier

5,714 citations