Author

Eero P. Simoncelli

Bio: Eero P. Simoncelli is an academic researcher at the Center for Neural Science. He has contributed to research on topics including wavelets and image processing. He has an h-index of 81 and has co-authored 260 publications receiving 68,623 citations. Previous affiliations of Eero P. Simoncelli include New York University and Stanford University.


Papers
Journal Article
TL;DR: A population model for mid-ventral processing is developed, in which nonlinear combinations of V1 responses are averaged in receptive fields that grow with eccentricity, providing a quantitative framework for assessing the capabilities and limitations of everyday vision.
Abstract: Receptive fields of visual neurons get bigger along the ventral visual pathway and, in each area, they grow with distance from the fovea. The authors exploit these properties to build a model for visual representation in the ventral stream, using 'metameric' visual stimuli (which appear perceptually identical, but are actually different) to test the model predictions. The model can also explain deficits in peripheral recognition known as visual crowding.
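To make the pooling idea concrete, here is a minimal numpy sketch, not the authors' published model: nonlinear V1-like responses are averaged within windows whose diameter grows in proportion to eccentricity. The function name, the 20-window layout, and the default scaling factor of 0.5 are illustrative assumptions.

import numpy as np

def pooled_responses(v1_responses, eccentricity, scaling=0.5):
    """Average nonlinear V1-like responses in windows that grow with eccentricity.

    v1_responses : 1D array of (already nonlinear) responses ordered by eccentricity
    eccentricity : 1D array, same length, eccentricity (deg) of each response
    scaling      : pooling-window diameter as a fraction of eccentricity (assumed value)
    """
    pooled = []
    centers = np.linspace(eccentricity.min(), eccentricity.max(), 20)
    for c in centers:
        radius = max(scaling * c / 2, 1e-3)               # window grows with eccentricity
        in_window = np.abs(eccentricity - c) < radius
        if in_window.any():
            pooled.append(v1_responses[in_window].mean())  # local average: the "summary statistic"
    return np.array(pooled)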

576 citations

Journal Article
TL;DR: In this article, a probability model for natural images is proposed based on empirical observation of their statistics in the wavelet transform domain, and an image coder called EPWIC is constructed, in which subband coefficients are encoded one bitplane at a time using a nonadaptive arithmetic encoder.
Abstract: We develop a probability model for natural images, based on empirical observation of their statistics in the wavelet transform domain. Pairs of wavelet coefficients, corresponding to basis functions at adjacent spatial locations, orientations, and scales, are found to be non-Gaussian in both their marginal and joint statistical properties. Specifically, their marginals are heavy-tailed, and although they are typically decorrelated, their magnitudes are highly correlated. We propose a Markov model that explains these dependencies using a linear predictor for magnitude coupled with both multiplicative and additive uncertainties, and show that it accounts for the statistics of a wide variety of images including photographic images, graphical images, and medical images. In order to directly demonstrate the power of the model, we construct an image coder called EPWIC (embedded predictive wavelet image coder), in which subband coefficients are encoded one bitplane at a time using a nonadaptive arithmetic encoder that utilizes conditional probabilities calculated from the model. Bitplanes are ordered using a greedy algorithm that considers the MSE reduction per encoded bit. The decoder uses the statistical model to predict coefficient values based on the bits it has received. Despite the simplicity of the model, the rate-distortion performance of the coder is roughly comparable to the best image coders in the literature.
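As an illustration of the empirical observation driving the model, the following sketch (an assumption-laden example, not the EPWIC code) measures how horizontally adjacent wavelet coefficients of an image are roughly decorrelated in sign yet strongly correlated in magnitude, and how heavy-tailed their marginals are. It assumes the PyWavelets package is available; the wavelet choice and the kurtosis estimate are illustrative.

import numpy as np
import pywt  # PyWavelets; any orthonormal wavelet will do for this illustration

def adjacent_coefficient_stats(image, wavelet='db2'):
    """Statistics of horizontally adjacent wavelet coefficients of one subband."""
    _, (cH, cV, cD) = pywt.dwt2(image.astype(float), wavelet)
    a, b = cH[:, :-1].ravel(), cH[:, 1:].ravel()               # horizontally adjacent coefficients
    corr_signed = np.corrcoef(a, b)[0, 1]                      # typically near zero (decorrelated)
    corr_magnitude = np.corrcoef(np.abs(a), np.abs(b))[0, 1]   # typically large and positive
    kurtosis = np.mean(a**4) / np.mean(a**2)**2                # heavy tails: well above 3 (assumes ~zero mean)
    return corr_signed, corr_magnitude, kurtosis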

576 citations

Journal Article
TL;DR: Recent results in statistical modeling of natural images are reviewed, focusing on models that attempt to explain the scale invariance and non-Gaussian behavior of image statistics, i.e., high kurtosis, heavy tails, and sharp central cusps.
Abstract: Statistical analysis of images reveals two interesting properties: (i) invariance of image statistics to scaling of images, and (ii) non-Gaussian behavior of image statistics, i.e. high kurtosis, heavy tails, and sharp central cusps. In this paper we review some recent results in statistical modeling of natural images that attempt to explain these patterns. Two categories of results are considered: (i) studies of probability models of images or image decompositions (such as Fourier or wavelet decompositions), and (ii) discoveries of underlying image manifolds while restricting to natural images. Applications of these models in areas such as texture analysis, image classification, compression, and denoising are also considered.
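A small sketch of how the scale-invariance property mentioned above can be checked on a single image: the radially averaged power spectrum of a natural image typically falls off roughly as 1/f^2. The function name and binning choices are illustrative assumptions, not taken from the reviewed papers.

import numpy as np

def radial_power_spectrum(image, n_bins=30):
    """Radially averaged power spectrum; scale invariance predicts power ~ 1/f^alpha with alpha near 2."""
    f = np.fft.fftshift(np.fft.fft2(image - image.mean()))
    power = np.abs(f) ** 2
    ny, nx = image.shape
    y, x = np.indices((ny, nx))
    r = np.hypot(y - ny // 2, x - nx // 2)
    edges = np.linspace(1.0, r.max(), n_bins + 1)
    freqs, radial = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (r >= lo) & (r < hi)
        if mask.any():
            freqs.append(0.5 * (lo + hi))
            radial.append(power[mask].mean())
    # slope of the log-log fit; values near -2 correspond to the scale-invariant 1/f^2 spectrum
    alpha = np.polyfit(np.log(freqs), np.log(radial), 1)[0]
    return np.array(freqs), np.array(radial), alpha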

561 citations

Journal Article
TL;DR: This work shows that the responses of MT cells can be captured by a linear-nonlinear model that operates not on the visual stimulus, but on the afferent responses of a population of nonlinear V1 cells, and robustly predicts the separately measured responses to gratings and plaids.
Abstract: Neurons in area MT (V5) are selective for the direction of visual motion. In addition, many are selective for the motion of complex patterns independent of the orientation of their components, a behavior not seen in earlier visual areas. We show that the responses of MT cells can be captured by a linear-nonlinear model that operates not on the visual stimulus, but on the afferent responses of a population of nonlinear V1 cells. We fit this cascade model to responses of individual MT neurons and show that it robustly predicts the separately measured responses to gratings and plaids. The model captures the full range of pattern motion selectivity found in MT. Cells that signal pattern motion are distinguished by having convergent excitatory input from V1 cells with a wide range of preferred directions, strong motion opponent suppression and a tuned normalization that may reflect suppressive input from the surround of V1 cells.
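A schematic of the two-stage cascade described above, written as a minimal numpy sketch rather than the fitted model from the paper: a direction-tuned V1 population is rectified, passed through a simplified untuned divisive normalization, and then weighted and rectified again at the MT stage. The exponent, normalization constant, and function signature are illustrative assumptions.

import numpy as np

def mt_cascade(stimulus_drive, v1_preferred, mt_weights, exponent=2.0, sigma=0.1):
    """Two-stage LN cascade: V1 stage (tuning, rectification, normalization) -> MT stage.

    stimulus_drive : function mapping a preferred direction (deg) to stimulus drive
    v1_preferred   : array of V1 preferred directions (deg)
    mt_weights     : weights on the V1 population (positive = excitatory)
    """
    drive = np.array([stimulus_drive(d) for d in v1_preferred])
    v1 = np.maximum(drive, 0.0) ** exponent      # rectification + expansive nonlinearity
    v1 = v1 / (sigma + v1.mean())                # simplified untuned divisive normalization
    mt_linear = mt_weights @ v1                  # MT linear stage operates on V1 afferents
    return np.maximum(mt_linear, 0.0)            # MT output rectification

In this picture, a pattern cell corresponds to broad, convergent mt_weights spanning many preferred directions, while a component cell corresponds to weights concentrated near a single direction.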

534 citations

Journal Article
TL;DR: This work estimated observers' internal models for orientation and found that they matched the local orientation distribution measured in photographs, and determined how a neural population could embed probabilistic information responsible for such biases.
Abstract: Orientation judgments are more accurate at the horizontal and vertical orientations, possibly reflecting a statistical inference. Here the authors provide evidence for this idea, finding that observers' internal models for orientation match the local orientation distribution measured in photographs, and suggest how such information could be encoded in a neural population.
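A rough sketch of how a local orientation distribution can be measured from photographs, of the kind compared with observers' internal models in this study: orientations are taken from image gradients and histogrammed over 0-180 degrees. The gradient estimator, magnitude weighting, and threshold are illustrative assumptions.

import numpy as np

def orientation_histogram(image, n_bins=36, grad_threshold=1e-3):
    """Histogram of local edge orientations (0-180 deg) estimated from image gradients."""
    gy, gx = np.gradient(image.astype(float))
    magnitude = np.hypot(gx, gy)
    # local edge orientation is perpendicular to the gradient direction, folded into [0, 180)
    theta = (np.degrees(np.arctan2(gy, gx)) + 90.0) % 180.0
    keep = magnitude > grad_threshold                      # ignore flat, near-constant regions
    hist, edges = np.histogram(theta[keep], bins=n_bins, range=(0, 180),
                               weights=magnitude[keep])
    return hist / hist.sum(), edges                        # natural photographs peak near 0 and 90 deg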

528 citations


Cited by
Journal Article
TL;DR: In this article, a structural similarity index is proposed for image quality assessment based on the degradation of structural information, and its performance is compared with subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000.
Abstract: Objective methods for assessing perceptual image quality traditionally attempted to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a structural similarity index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MATLAB implementation of the proposed algorithm is available online at http://www.cns.nyu.edu/~lcv/ssim/.
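For reference, the published SSIM index for two patches x and y is ((2*mu_x*mu_y + C1)(2*sigma_xy + C2)) / ((mu_x^2 + mu_y^2 + C1)(sigma_x^2 + sigma_y^2 + C2)). The sketch below evaluates it once over whole images; the released MATLAB code instead computes it in local Gaussian-weighted windows and averages the resulting map, so treat this as a simplified illustration.

import numpy as np

def ssim_global(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two images of the same shape."""
    x, y = x.astype(float), y.astype(float)
    c1, c2 = (k1 * data_range) ** 2, (k2 * data_range) ** 2   # stabilizing constants from the paper
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / \
           ((mu_x**2 + mu_y**2 + c1) * (var_x + var_y + c2))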

40,609 citations

Book
18 Nov 2016
TL;DR: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts; this book surveys its techniques and applications, including natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames.
Abstract: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

38,208 citations

Proceedings Article
Sergey Ioffe, Christian Szegedy
06 Jul 2015
TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Abstract: Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch. Batch Normalization allows us to use much higher learning rates and be less careful about initialization, and in some cases eliminates the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin. Using an ensemble of batch-normalized networks, we improve upon the best published result on ImageNet classification: reaching 4.82% top-5 test error, exceeding the accuracy of human raters.
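A minimal numpy sketch of the transform the abstract describes, for the training pass over one mini-batch; at inference the paper substitutes population (moving-average) statistics for the mini-batch mean and variance. The function name and the (batch, features) layout are illustrative assumptions.

import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Batch Normalization over a mini-batch (training mode).

    x     : (batch, features) pre-activations
    gamma : (features,) learned scale
    beta  : (features,) learned shift
    """
    mu = x.mean(axis=0)                       # per-feature mini-batch mean
    var = x.var(axis=0)                       # per-feature mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)     # normalize each feature
    return gamma * x_hat + beta               # learned scale and shift restore representational power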

30,843 citations

Journal Article
TL;DR: In this paper, it is shown that the difference of information between the approximation of a signal at the resolutions 2^(j+1) and 2^j (where j is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of L^2(R^n), the vector space of measurable, square-integrable n-dimensional functions.
Abstract: Multiresolution representations are effective for analyzing the information content of images. The properties of the operator which approximates a signal at a given resolution were studied. It is shown that the difference of information between the approximation of a signal at the resolutions 2^(j+1) and 2^j (where j is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of L^2(R^n), the vector space of measurable, square-integrable n-dimensional functions. In L^2(R), a wavelet orthonormal basis is a family of functions built by dilating and translating a unique function psi(x). This decomposition defines an orthogonal multiresolution representation called a wavelet representation. It is computed with a pyramidal algorithm based on convolutions with quadrature mirror filters. The wavelet representation lies between the spatial and Fourier domains. For images, the wavelet representation differentiates several spatial orientations. The application of this representation to data compression in image coding, texture discrimination and fractal analysis is discussed.
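A toy 1D illustration of the pyramidal algorithm, using the Haar quadrature mirror filter pair rather than the general filters of the paper: one analysis step splits a signal at resolution 2^(j+1) into its approximation at resolution 2^j plus the detail coefficients carrying the difference of information between the two, and the synthesis step reconstructs exactly.

import numpy as np

def haar_analysis_step(signal):
    """One level of the pyramidal algorithm with the Haar QMF pair (1D, even-length input)."""
    s = np.asarray(signal, dtype=float)
    lowpass = (s[0::2] + s[1::2]) / np.sqrt(2.0)     # approximation at the coarser resolution
    highpass = (s[0::2] - s[1::2]) / np.sqrt(2.0)    # detail (wavelet) coefficients
    return lowpass, highpass

def haar_synthesis_step(lowpass, highpass):
    """Perfect reconstruction from one analysis step."""
    s = np.empty(2 * lowpass.size)
    s[0::2] = (lowpass + highpass) / np.sqrt(2.0)
    s[1::2] = (lowpass - highpass) / np.sqrt(2.0)
    return s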

20,028 citations

Book
D. L. Donoho
01 Jan 2004
TL;DR: It is possible to design n = O(N log(m)) nonadaptive measurements allowing reconstruction with accuracy comparable to that attainable with direct knowledge of the N most important coefficients, and a good approximation to those N important coefficients is extracted from the n measurements by solving a linear program (Basis Pursuit in signal processing).
Abstract: Suppose x is an unknown vector in R^m (a digital image or signal); we plan to measure n general linear functionals of x and then reconstruct. If x is known to be compressible by transform coding with a known transform, and we reconstruct via the nonlinear procedure defined here, the number of measurements n can be dramatically smaller than the size m. Thus, certain natural classes of images with m pixels need only n = O(m^(1/4) log^(5/2)(m)) nonadaptive nonpixel samples for faithful recovery, as opposed to the usual m pixel samples. More specifically, suppose x has a sparse representation in some orthonormal basis (e.g., wavelet, Fourier) or tight frame (e.g., curvelet, Gabor), so the coefficients belong to an l_p ball for 0 < p ≤ 1.
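A small self-contained sketch of the recovery procedure the abstract describes, under the simplifying assumption that the signal is sparse in the standard basis: n random Gaussian measurements are taken, and the signal is recovered by l1 minimization (Basis Pursuit), posed as a linear program via scipy.optimize.linprog. The problem sizes and the basis_pursuit helper are illustrative.

import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, b):
    """Solve min ||x||_1 subject to A x = b by splitting x into positive and negative parts."""
    n_meas, m = A.shape
    c = np.ones(2 * m)                                  # objective: sum of x_plus and x_minus
    A_eq = np.hstack([A, -A])                           # enforce A (x_plus - x_minus) = b
    res = linprog(c, A_eq=A_eq, b_eq=b, bounds=(0, None), method="highs")
    return res.x[:m] - res.x[m:]

# Usage sketch: a length-m signal with k nonzeros recovered from n_meas << m random measurements.
rng = np.random.default_rng(0)
m, n_meas, k = 200, 60, 8
x_true = np.zeros(m)
x_true[rng.choice(m, k, replace=False)] = rng.standard_normal(k)
A = rng.standard_normal((n_meas, m)) / np.sqrt(n_meas)
x_hat = basis_pursuit(A, A @ x_true)
print(np.max(np.abs(x_hat - x_true)))                  # small when n_meas is large enough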

18,609 citations