Home
/
Authors
/
Edward H. Adelson

Author

Edward H. Adelson

Other affiliations: Sarnoff Corporation, New York University, University of Michigan ...read more

Bio: Edward H. Adelson is an academic researcher from Massachusetts Institute of Technology. The author has contributed to research in topics: Tactile sensor & Image processing. The author has an hindex of 74, co-authored 228 publications receiving 40686 citations. Previous affiliations of Edward H. Adelson include Sarnoff Corporation & New York University.

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2004
2003
2002
2001
2000
1999
1998
1996
1995
1994
1993
1992
1991
1990
1989
1988
1987
1986
1985
1984
1983
1982
1981
1980
1979
1978
1977
1965

Papers

PDF

Open Access

More filters

Journal Article•DOI•

The Laplacian Pyramid as a Compact Image Code

[...]

Peter J. Burt¹, Edward H. Adelson²•Institutions (2)

Rensselaer Polytechnic Institute¹, Sarnoff Corporation²

01 Apr 1983-IEEE Transactions on Communications

TL;DR: A technique for image encoding in which local operators of many scales but identical shape serve as the basis functions, which tends to enhance salient image features and is well suited for many image analysis tasks as well as for image compression.

...read moreread less

Abstract: We describe a technique for image encoding in which local operators of many scales but identical shape serve as the basis functions. The representation differs from established techniques in that the code elements are localized in spatial frequency as well as in space. Pixel-to-pixel correlations are first removed by subtracting a lowpass filtered copy of the image from the image itself. The result is a net data compression since the difference, or error, image has low variance and entropy, and the low-pass filtered image may represented at reduced sample density. Further data compression is achieved by quantizing the difference image. These steps are then repeated to compress the low-pass image. Iteration of the process at appropriately expanded scales generates a pyramid data structure. The encoding process is equivalent to sampling the image with Laplacian operators of many scales. Thus, the code tends to enhance salient image features. A further advantage of the present code is that it is well suited for many image analysis tasks as well as for image compression. Fast algorithms are described for coding and decoding.

...read moreread less

6,975 citations

Journal Article•DOI•

Spatiotemporal energy models for the perception of motion

[...]

Edward H. Adelson¹, James R. Bergen¹•Institutions (1)

Sarnoff Corporation¹

01 Feb 1985-Journal of The Optical Society of America A-optics Image Science and Vision

TL;DR: In this article, the first stage consists of linear filters that are oriented in space-time and tuned in spatial frequency, and the outputs of quadrature pairs of such filters are squared and summed to give a measure of motion energy.

...read moreread less

Abstract: A motion sequence may be represented as a single pattern in x–y–t space; a velocity of motion corresponds to a three-dimensional orientation in this space. Motion sinformation can be extracted by a system that responds to the oriented spatiotemporal energy. We discuss a class of models for human motion mechanisms in which the first stage consists of linear filters that are oriented in space-time and tuned in spatial frequency. The outputs of quadrature pairs of such filters are squared and summed to give a measure of motion energy. These responses are then fed into an opponent stage. Energy models can be built from elements that are consistent with known physiology and psychophysics, and they permit a qualitative understanding of a variety of motion phenomena.

...read moreread less

3,504 citations

Journal Article•DOI•

The design and use of steerable filters

[...]

William T. Freeman¹, Edward H. Adelson¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Sep 1991-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The authors present an efficient architecture to synthesize filters of arbitrary orientations from linear combinations of basis filters, allowing one to adaptively steer a filter to any orientation, and to determine analytically the filter output as a function of orientation.

...read moreread less

Abstract: The authors present an efficient architecture to synthesize filters of arbitrary orientations from linear combinations of basis filters, allowing one to adaptively steer a filter to any orientation, and to determine analytically the filter output as a function of orientation. Steerable filters may be designed in quadrature pairs to allow adaptive control over phase as well as orientation. The authors show how to design and steer the filters and present examples of their use in the analysis of orientation and phase, angularly adaptive filtering, edge detection, and shape from shading. One can also build a self-similar steerable pyramid representation. The same concepts can be generalized to the design of 3-D steerable filters. >

...read moreread less

3,365 citations

The Plenoptic Function and the Elements of Early Vision

[...]

Edward H. Adelson¹, James R. Bergen•Institutions (1)

Massachusetts Institute of Technology¹

01 Jan 1991

TL;DR: Early vision as discussed by the authors is defined as measuring the amounts of various kinds of visual substances present in the image (e.g., redness or rightward motion energy) rather than in how it labels "things".

...read moreread less

Abstract: What are the elements of early vision? This question might be taken to mean, What are the fundamental atoms of vision?—and might be variously answered in terms ofsuch candidate structures as edges, peaks, corners, and so on. In this chapter we adopt a rather different point of view and ask the question, What are the fundamentalsubstances of vision? This distinction is important becausewe wish to focus on the first steps in extraction of visualinformation. At this level it is premature to talk aboutdiscrete objects, even such simple ones as edges and corners.There is general agreement that early vision involvesmeasurements of a number of basic image properties in-cluding orientation, color, motion, and so on. Figure l.lshows a caricature (in the style of Neisser, 1976), of the sort of architecture that has become quite popular as a model for both human and machine vision. The first stageof processing involves a set of parallel pathways, eachdevoted to one particular-visual property. We propose that the measurements of these basic properties be con-sidered as the elements of early vision. We think of earlyvision as measuring the amounts of various kinds of vi-sual "substances" present in the image (e.g., redness orrightward motion energy). In other words, we are inter- ested in how early vision measures “stuff” rather than in how it labels “things.”What, then, are these elementary visual substances?Various lists have been compiled using a mixture of intui-tion and experiment. Electrophysiologists have describedneurons in striate cortex that are selectively sensitive tocertain visual properties; for reviews, see Hubel (1988) and DeValois and DeValois (1988). Psychophysicists haveinferred the existence of channels that are tuned for cer- tain visual properties; for reviews, see Graham (1989), Olzak and Thomas (1986), Pokorny and Smith (1986), and Watson (1986). Researchers in perception have foundaspects of visual stimuli that are processed pre-attentive- ly (Beck, 1966; Bergen & Julesz, 1983; Julesz & Bergen,

...read moreread less

1,576 citations

Journal Article•DOI•

Shiftable multiscale transforms

[...]

Eero P. Simoncelli¹, William T. Freeman¹, Edward H. Adelson¹, David J. Heeger²•Institutions (2)

Massachusetts Institute of Technology¹, Ames Research Center²

01 Mar 1992-IEEE Transactions on Information Theory

TL;DR: Two examples of jointly shiftable transforms that are simultaneously shiftable in more than one domain are explored and the usefulness of these image representations for scale-space analysis, stereo disparity measurement, and image enhancement is demonstrated.

...read moreread less

Abstract: One of the major drawbacks of orthogonal wavelet transforms is their lack of translation invariance: the content of wavelet subbands is unstable under translations of the input signal. Wavelet transforms are also unstable with respect to dilations of the input signal and, in two dimensions, rotations of the input signal. The authors formalize these problems by defining a type of translation invariance called shiftability. In the spatial domain, shiftability corresponds to a lack of aliasing; thus, the conditions under which the property holds are specified by the sampling theorem. Shiftability may also be applied in the context of other domains, particularly orientation and scale. Jointly shiftable transforms that are simultaneously shiftable in more than one domain are explored. Two examples of jointly shiftable transforms are designed and implemented: a 1-D transform that is jointly shiftable in position and scale, and a 2-D transform that is jointly shiftable in position and orientation. The usefulness of these image representations for scale-space analysis, stereo disparity measurement, and image enhancement is demonstrated. >

...read moreread less

1,448 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48

Collapse

Cited by

PDF

Open Access

More filters

Journal Article•DOI•

Image quality assessment: from error visibility to structural similarity

[...]

Zhou Wang¹, Alan C. Bovik², Hamid R. Sheikh², Eero P. Simoncelli³•Institutions (3)

Center for Neural Science¹, University of Texas at Austin², Howard Hughes Medical Institute³

01 Apr 2004-IEEE Transactions on Image Processing

TL;DR: In this article, a structural similarity index is proposed for image quality assessment based on the degradation of structural information, which can be applied to both subjective ratings and objective methods on a database of images compressed with JPEG and JPEG2000.

...read moreread less

Abstract: Objective methods for assessing perceptual image quality traditionally attempted to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a structural similarity index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MATLAB implementation of the proposed algorithm is available online at http://www.cns.nyu.edu//spl sim/lcv/ssim/.

...read moreread less

40,609 citations

Proceedings Article•DOI•

Going deeper with convolutions

[...]

Christian Szegedy¹, Wei Liu², Yangqing Jia¹, Pierre Sermanet¹, Scott Reed³, Dragomir Anguelov¹, Dumitru Erhan¹, Vincent Vanhoucke¹, Andrew Rabinovich - Show less +5 more•Institutions (3)

Google¹, University of North Carolina at Chapel Hill², University of Michigan³

07 Jun 2015

TL;DR: Inception as mentioned in this paper is a deep convolutional neural network architecture that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).

...read moreread less

Abstract: We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.

...read moreread less

40,257 citations

Journal Article•DOI•

A theory for multiresolution signal decomposition: the wavelet representation

[...]

Stéphane Mallat¹•Institutions (1)

New York University¹

01 Jul 1989-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: In this paper, it is shown that the difference of information between the approximation of a signal at the resolutions 2/sup j+1/ and 2 /sup j/ (where j is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of L/sup 2/(R/sup n/), the vector space of measurable, square-integrable n-dimensional functions.

...read moreread less

Abstract: Multiresolution representations are effective for analyzing the information content of images. The properties of the operator which approximates a signal at a given resolution were studied. It is shown that the difference of information between the approximation of a signal at the resolutions 2/sup j+1/ and 2/sup j/ (where j is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of L/sup 2/(R/sup n/), the vector space of measurable, square-integrable n-dimensional functions. In L/sup 2/(R), a wavelet orthonormal basis is a family of functions which is built by dilating and translating a unique function psi (x). This decomposition defines an orthogonal multiresolution representation called a wavelet representation. It is computed with a pyramidal algorithm based on convolutions with quadrature mirror filters. Wavelet representation lies between the spatial and Fourier domains. For images, the wavelet representation differentiates several spatial orientations. The application of this representation to data compression in image coding, texture discrimination and fractal analysis is discussed. >

...read moreread less

20,028 citations

Proceedings Article•DOI•

Rapid object detection using a boosted cascade of simple features

[...]

Paul A. Viola¹, Michael Jones•Institutions (1)

Mitsubishi¹

01 Dec 2001

TL;DR: A machine learning approach for visual object detection which is capable of processing images extremely rapidly and achieving high detection rates and the introduction of a new image representation called the "integral image" which allows the features used by the detector to be computed very quickly.

...read moreread less

Abstract: This paper describes a machine learning approach for visual object detection which is capable of processing images extremely rapidly and achieving high detection rates. This work is distinguished by three key contributions. The first is the introduction of a new image representation called the "integral image" which allows the features used by our detector to be computed very quickly. The second is a learning algorithm, based on AdaBoost, which selects a small number of critical visual features from a larger set and yields extremely efficient classifiers. The third contribution is a method for combining increasingly more complex classifiers in a "cascade" which allows background regions of the image to be quickly discarded while spending more computation on promising object-like regions. The cascade can be viewed as an object specific focus-of-attention mechanism which unlike previous approaches provides statistical guarantees that discarded regions are unlikely to contain the object of interest. In the domain of face detection the system yields detection rates comparable to the best previous systems. Used in real-time applications, the detector runs at 15 frames per second without resorting to image differencing or skin color detection.

...read moreread less

18,620 citations

Book•

A wavelet tour of signal processing

[...]

Stéphane Mallat

01 Jan 1998

TL;DR: An introduction to a Transient World and an Approximation Tour of Wavelet Packet and Local Cosine Bases.

...read moreread less

Abstract: Introduction to a Transient World. Fourier Kingdom. Discrete Revolution. Time Meets Frequency. Frames. Wavelet Zoom. Wavelet Bases. Wavelet Packet and Local Cosine Bases. An Approximation Tour. Estimations are Approximations. Transform Coding. Appendix A: Mathematical Complements. Appendix B: Software Toolboxes.

...read moreread less

17,693 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse