
Photographic Tone Reproduction for Digital Images
Erik Reinhard
University of Utah
Michael Stark
University of Utah
Peter Shirley
University of Utah
James Ferwerda
Cornell University
Abstract
A classic photographic task is the mapping of the potentially high
dynamic range of real world luminances to the low dynamic range
of the photographic print. This tone reproduction problem is also
faced by computer graphics practitioners who map digital images to
a low dynamic range print or screen. The work presented in this pa-
per leverages the time-tested techniques of photographic practice to
develop a new tone reproduction operator. In particular, we use and
extend the techniques developed by Ansel Adams to deal with dig-
ital images. The resulting algorithm is simple and produces good
results for a wide variety of images.
CR Categories: I.4.10 [Computing Methodologies]: Image Pro-
cessing and Computer Vision—Image Representation
Keywords: Tone reproduction, dynamic range, Zone System.
1 Introduction
The range of light we experience in the real world is vast, spanning
approximately ten orders of absolute range from star-lit scenes to
sun-lit snow, and over four orders of dynamic range from shad-
ows to highlights in a single scene. However, the range of light
we can reproduce on our print and screen display devices spans at
best about two orders of absolute dynamic range. This discrep-
ancy leads to the tone reproduction problem: how should we map
measured/simulated scene luminances to display luminances and
produce a satisfactory image?
A great deal of work has been done on the tone reproduction
problem [Matkovic et al. 1997; McNamara et al. 2000; McNamara
2001]. Most of this work has used an explicit perceptual model to
control the operator [Upstill 1985; Tumblin and Rushmeier 1993;
Ward 1994; Ferwerda et al. 1996; Ward et al. 1997; Tumblin et al.
1999]. Such methods have been extended to dynamic and interac-
tive settings [Ferwerda et al. 1996; Durand and Dorsey 2000; Pat-
tanaik et al. 2000; Scheel et al. 2000; Cohen et al. 2001]. Other
work has focused on the dynamic range compression problem by
spatially varying the mapping from scene luminances to display lu-
minances while preserving local contrast [Oppenheim et al. 1968;
Stockham 1972; Chiu et al. 1993; Schlick 1994; Tumblin and Turk
1999]. Finally, computational models of the human visual system
can also guide such spatially-varying maps [Rahman et al. 1996;
Rahman et al. 1997; Pattanaik et al. 1998].
Radiance map courtesy of Cornell Program of Computer Graphics
Figure 1: A high dynamic range image cannot be displayed directly
without losing visible detail using linear scaling (top). Our new
algorithm (bottom) is designed to overcome these problems.
Using perceptual models is a sound approach to the tone repro-
duction problem, and could lead to effective hands-off algorithms,
but there are two problems with current models. First, current mod-
els often introduce artifacts such as ringing or visible clamping (see
Section 4). Second, visual appearance depends on more than simply
matching contrast and/or brightness; scene content, image medium,
and viewing conditions must often be considered [Fairchild 1998].
To avoid these problems, we turn to photographic practices for in-
spiration. This has led us to develop a tone reproduction technique
designed for a wide variety of images, including those having a very
high dynamic range (e.g., Figure 1).
2 Background
The tone reproduction problem was first defined by photographers.
Often their goal is to produce realistic “renderings” of captured
scenes, and they have to produce such renderings while facing the
Copyright © 2002 by the Association for Computing Machinery, Inc.
Permission to make digital or hard copies of part or all of this work for personal or
classroom use is granted without fee provided that copies are not made or
distributed for commercial advantage and that copies bear this notice and the full
citation on the first page. Copyrights for components of this work owned by
others than ACM must be honored. Abstracting with credit is permitted. To copy
otherwise, to republish, to post on servers, or to redistribute to lists, requires prior
specific permission and/or a fee. Request permissions from Permissions Dept,
ACM Inc., fax +1 (212) 869-0481, or e-mail permissions@acm.org.
© 2002 ACM 1-58113-521-1/02/0007 $5.00

Figure 2: A photographer uses the Zone System to anticipate potential
print problems.
Figure 3: A normal-key map for a high-key scene (for example containing
snow) results in an unsatisfactory image (left). A high-key map solves
the problem (right).
limitations presented by slides or prints on photographic papers.
Many common practices were developed over the 150 years of pho-
tographic practice [London and Upton 1998]. At the same time
there were a host of quantitative measurements of media response
characteristics by developers [Stroebel et al. 2000]. However, there
was usually a disconnect between the artistic and technical aspects
of photographic practice, so it was very difficult to produce satis-
factory images without a great deal of experience.
Ansel Adams attempted to bridge this gap with an approach he
called the Zone System [Adams 1980; Adams 1981; Adams 1983]
which was first developed in the 1940s and later popularized by
Minor White [White et al. 1984]. It is a system of “practical sensit-
ometry”, where the photographer uses measured information in the
field to improve the chances of producing a good final print. The
Zone System is still widely used more than fifty years after its in-
ception [Woods 1993; Graves 1997; Johnson 1999]. Therefore, we
believe it is useful as a basis for addressing the tone reproduction
problem. Before discussing how the Zone System is applied, we
first summarize some relevant terminology.
Zone: A zone is defined as a Roman numeral associated with an
approximate luminance range in a scene as well as an approxi-
mate reflectance of a print. There are eleven print zones, rang-
ing from pure black (zone 0) to pure white (zone X), each
doubling in intensity, and a potentially much larger number of
scene zones (Figure 4).
[Figure 4 diagram: scene luminances $2^x L \ldots 2^{x+15} L$ span a
dynamic range of 15 scene zones, from darkest textured shadow to
brightest textured highlight, and map onto the print zones 0–X;
middle grey maps to zone V.]
Figure 4: The mapping from scene zones to print zones. Scene zones
at either extreme will map to pure black (zone 0) or white (zone X)
if the dynamic range of the scene is eleven zones or more.
Middle-grey: This is the subjective middle brightness region of
the scene, which is typically mapped to print zone V.
Dynamic range: In computer graphics the dynamic range of a
scene is expressed as the ratio of the highest scene luminance
to the lowest scene luminance. Photographers are more inter-
ested in the ratio of the highest and lowest luminance regions
where detail is visible. This can be viewed as a subjective
measure of dynamic range. Because zones relate logarithmi-
cally to scene luminances, dynamic range can be expressed
as the difference between highest and lowest distinguishable
scene zones (Figure 4).
Key: The key of a scene indicates whether it is subjectively light,
normal, or dark. A white-painted room would be high-key,
and a dim stable would be low-key.
Dodging-and-burning: This is a printing technique where some
light is withheld from a portion of the print during develop-
ment (dodging), or more light is added to that region (burn-
ing). This will lighten or darken that region in the final print
relative to what it would be if the same development were
used for all portions of the print. In traditional photography
this technique is applied using a small wand or a piece of pa-
per with a hole cut out.
A crucial part of the Zone System is its methodology for predicting
how scene luminances will map to a set of print zones. The pho-
tographer first takes a luminance reading of a surface he perceives
as a middle-grey (Figure 2 top). In a typical situation this will be
mapped to zone V, which corresponds to the 18% reflectance of the
print. For high-key scenes the middle-grey will be one of the darker
regions, whereas in low-key scenes this will be one of the lighter re-
gions. This choice is an artistic one, although an 18% grey-card is
often used to make this selection process more mechanical (Fig-
ure 3).
Next the photographer takes luminance readings of both light
and dark regions to determine the dynamic range of the scene (Fig-
ure 2 bottom). If the dynamic range of the scene does not exceed
nine zones, an appropriate choice of middle grey can ensure that all
textured detail is captured in the final print. For a dynamic range of
more than nine zones, some areas will be mapped to pure black or
white with a standard development process. Sometimes such loss
of detail is desirable, such as a very bright object being mapped to
pure white (see [Adams 1983], p. 51). For regions where loss of
detail is objectionable, the photographer can resort to dodging-and-
burning which will locally change the development process.
The above procedure indicates that the photographic process is
difficult to automate. For example, determining that an adobe build-
ing is high-key would be very difficult without some knowledge

about the adobe’s true reflectance. Only knowledge of the geometry
and light inter-reflections would allow one to know the difference
between luminance ratios of a dark-dyed adobe house and a normal
adobe house. However, the Zone System provides the photogra-
pher with a small set of subjective controls. These controls form
the basis for our tone reproduction algorithm described in the next
section.
The challenges faced in tone reproduction for rendered or cap-
tured digital images are largely the same as those faced in conven-
tional photography. The main difference is that digital images are in
a sense “perfect” negatives, so no luminance information has been
lost due to the limitations of the film process. This is a blessing in
that detail is available in all luminance regions. On the other hand,
this calls for a more extreme dynamic range reduction, which could
in principle be handled by an extension of the dodging-and-burning
process. We address this issue in the next section.
3 Algorithm
The Zone System summarized in the last section is used to develop
a new tone mapping algorithm for digital images, such as those cre-
ated by rendering algorithms (e.g., [Ward Larson and Shakespeare
1998]) or captured using high dynamic range photography [De-
bevec and Malik 1997]. We are not trying to closely mimic the
actual photographic process [Geigel and Musgrave 1997], but in-
stead use the basic conceptual framework of the Zone System to
manage choices in tone reproduction. We first apply a scaling that
is analogous to setting exposure in a camera. Then, if necessary,
we apply automatic dodging-and-burning to accomplish dynamic
range compression.
3.1 Initial luminance mapping
We first show how to set the tonal range of the output image based
on the scene’s key value. Like many tone reproduction meth-
ods [Tumblin and Rushmeier 1993; Ward 1994; Holm 1996], we
view the log-average luminance as a useful approximation to the
key of the scene. This quantity $\bar{L}_w$ is computed by:

$$\bar{L}_w = \exp\!\left(\frac{1}{N}\sum_{x,y}\log\bigl(\delta + L_w(x,y)\bigr)\right) \qquad (1)$$
where $L_w(x,y)$ is the “world” luminance for pixel $(x,y)$, $N$ is the
total number of pixels in the image, and $\delta$ is a small value to avoid
the singularity that occurs if black pixels are present in the image. If
the scene has a normal key, we would like to map this to the middle-grey
of the displayed image, or 0.18 on a scale from zero to one. This
suggests the equation:
$$L(x,y) = \frac{a}{\bar{L}_w}\,L_w(x,y) \qquad (2)$$
where $L(x,y)$ is a scaled luminance and $a = 0.18$. For low-key
or high-key images we allow the user to map the log average to
different values of $a$. We typically vary $a$ from 0.18 up to 0.36 and
0.72, and vary it down to 0.09 and 0.045. An example of varying $a$ is
given in Figure 5. In the remainder of this paper we call the value
of the parameter $a$ the “key value”, because it relates to the key of the
image after applying the above scaling.
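As a concrete illustration, Equations 1 and 2 can be sketched in a few lines (a hypothetical NumPy sketch, not the paper's C++ implementation; the function names and the particular choice of $\delta$ are ours):

```python
import numpy as np

def log_average_luminance(world_lum, delta=1e-6):
    # Equation 1: exponentiated mean of log luminances; delta avoids the
    # singularity caused by pure-black pixels.
    return float(np.exp(np.mean(np.log(delta + world_lum))))

def scale_to_key(world_lum, a=0.18):
    # Equation 2: map the scene's log-average luminance to the key value a
    # (0.18 for a normal-key scene; higher for high-key, lower for low-key).
    return (a / log_average_luminance(world_lum)) * world_lum
```

After this scaling, a pixel whose world luminance equals the log-average is mapped (up to the small $\delta$ offset) to the chosen key value $a$.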
The main problem with Equation 2 is that many scenes have pre-
dominantly a normal dynamic range, but have a few high luminance
regions near highlights or in the sky. In traditional photography
this issue is dealt with by compression of both high and low luminances.
However, modern photography has abandoned these “s”-shaped transfer
curves in favor of curves that compress mainly the high luminances
[Mitchell 1984; Stroebel et al. 2000]. A simple tone mapping operator
with these characteristics is given by:

$$L_d(x,y) = \frac{L(x,y)}{1 + L(x,y)} \qquad (3)$$

Radiance map courtesy of Paul Debevec
Figure 5: The linear scaling applied to the input luminance allows
the user to steer the final appearance of the tone-mapped image
(key values 0.09, 0.18, 0.36 and 0.72 shown). The dynamic range of
the image is 7 zones.
Note that high luminances are scaled by approximately $1/L$, while
low luminances are scaled by 1. The denominator causes a graceful
blend between these two scalings. This formulation is guaranteed
to bring all luminances within displayable range. However, as
mentioned in the previous section, this is not always desirable.
Equation 3 can be extended to allow high luminances to burn out in a
controllable fashion:
$$L_d(x,y) = \frac{L(x,y)\left(1 + \dfrac{L(x,y)}{L^2_{\text{white}}}\right)}{1 + L(x,y)} \qquad (4)$$
where $L_{\text{white}}$ is the smallest luminance that will be mapped to pure
white. This function is a blend between Equation 3 and a linear
mapping. It is shown for various values of $L_{\text{white}}$ in Figure 6. If
the $L_{\text{white}}$ value is set to the maximum luminance in the scene $L_{\max}$
or higher, no burn-out will occur. If it is set to infinity, then the
function reverts to Equation 3. By default we set $L_{\text{white}}$ to the
maximum luminance in the scene. If this default is applied to scenes
that have a low dynamic range (i.e., $L_{\max} < 1$), the effect is a subtle
contrast enhancement, as can be seen in Figure 7.
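The global operators of Equations 3 and 4 are straightforward to write down (a minimal sketch; `tonemap_simple` and `tonemap_white` are our illustrative names, operating on the scaled luminance $L$ of Equation 2):

```python
import numpy as np

def tonemap_simple(L):
    # Equation 3: high luminances are scaled by roughly 1/L, low luminances
    # by roughly 1; all outputs fall in [0, 1).
    return L / (1.0 + L)

def tonemap_white(L, L_white=None):
    # Equation 4: a blend of Equation 3 and a linear mapping. Luminances at
    # or above L_white map to pure white; by default L_white is the maximum
    # scaled luminance, so no burn-out occurs.
    if L_white is None:
        L_white = float(np.max(L))
    return L * (1.0 + L / (L_white * L_white)) / (1.0 + L)
```

Setting $L_{\text{white}} = \infty$ recovers Equation 3, while for images with $L_{\max} < 1$ the default produces the subtle contrast enhancement visible in Figure 7.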
The results of this function for higher dynamic range images are
shown in the left images of Figure 8. For many high dynamic range
images, the compression provided by this technique appears to be
sufficient to preserve detail in low contrast areas, while compressing
high luminances to a displayable range. However, for very high
dynamic range images important detail is still lost. For these images
a local tone reproduction algorithm that applies dodging-and-burning
is needed (right images of Figure 8).
3.2 Automatic dodging-and-burning
In traditional dodging-and-burning, all portions of the print poten-
tially receive a different exposure time from the negative, bringing
“up” selected dark regions or bringing “down” selected light re-
gions to avoid loss of detail [Adams 1983]. With digital images we
have the potential to extend this idea to deal with very high dynamic
range images. We can think of this as choosing a key value for ev-
ery pixel, which is equivalent to specifying a local a in Equation 2.

Figure 6: Display luminance $L_d$ as a function of world luminance $L$ for a
family of values for $L_{\text{white}}$ (0.5, 1.0, 1.5 and 3).
Figure 7: Left: low dynamic range input image (dynamic range
is 4 zones). Right: the result of applying the operator given by
Equation 4.
This serves a similar purpose to the local adaptation methods of the
perceptually-driven tone mapping operators [Pattanaik et al. 1998;
Tumblin et al. 1999].
Dodging-and-burning is typically applied over an entire region
bounded by large contrasts. For example, a local region might cor-
respond to a single dark tree on a light background [Adams 1983].
The size of a local region is estimated using a measure of local
contrast, which is computed at multiple spatial scales [Peli 1990].
Such contrast measures frequently use a center-surround function at
each spatial scale, often implemented by subtracting two Gaussian
blurred images. A variety of such functions have been proposed, in-
cluding [Land and McCann 1971; Marr and Hildreth 1980; Blom-
maert and Martens 1990; Peli 1990; Jernigan and McLean 1992;
Gove et al. 1995; Pessoa et al. 1995] and [Hansen et al. 2000]. After
testing many of these variants, we chose a center-surround function
derived from Blommaert’s model for brightness perception [Blom-
maert and Martens 1990] because it performed the best in our tests.
This function is constructed using circularly symmetric Gaussian
profiles of the form:
$$R_i(x,y,s) = \frac{1}{\pi(\alpha_i s)^2}\,\exp\!\left(-\frac{x^2 + y^2}{(\alpha_i s)^2}\right) \qquad (5)$$
These profiles operate at different scales $s$ and at different image
positions $(x,y)$. Analyzing an image using such profiles amounts
to convolving the image with these Gaussians, resulting in a response
$V_i$ as a function of image location, scale and luminance distribution $L$:

$$V_i(x,y,s) = L(x,y) \otimes R_i(x,y,s) \qquad (6)$$
This convolution can be computed directly in the spatial domain,
or for improved efficiency can be evaluated by multiplication in the
Fourier domain. The smallest Gaussian profile will be only slightly
larger than one pixel, and therefore the accuracy with which the
above equation is evaluated is important. We perform the integration
in terms of the error function to gain a high enough accuracy
without having to resort to super-sampling.
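The Gaussian responses of Equations 5 and 6 can be sketched with an off-the-shelf separable filter (a stand-in for the paper's FFT and error-function evaluation, not the authors' code; note that the profile $\exp(-(x^2+y^2)/(\alpha_i s)^2)$ corresponds to a per-axis standard deviation of $\alpha_i s/\sqrt{2}$):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_response(L, s, alpha):
    # Equations 5 and 6: convolve the luminance image with the circularly
    # symmetric profile R_i(x, y, s). The 1/(pi*(alpha*s)^2) factor makes
    # the kernel integrate to one, which gaussian_filter already guarantees.
    return gaussian_filter(L, sigma=alpha * s / np.sqrt(2.0))
```

Because the kernel is normalized, a constant image passes through unchanged, which is a quick sanity check on any implementation.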
The center-surround function we use is defined by:

$$V(x,y,s) = \frac{V_1(x,y,s) - V_2(x,y,s)}{2^\phi a / s^2 + V_1(x,y,s)} \qquad (7)$$
Radiance map courtesy of Greg Ward (top) and Cornell Program of Computer Graphics (bottom)
Figure 8: The simple operator of Equation 3 brings out sufficient
detail in the top image (dynamic range is 6 zones), although applying
dodging-and-burning does not introduce artifacts. For the
bottom image (dynamic range is 15 zones) dodging-and-burning is
required to make the book’s text visible.
where the center $V_1$ and surround $V_2$ responses are derived from
Equations 5 and 6. This constitutes a standard difference of Gaussians
approach, normalized by $2^\phi a/s^2 + V_1$ for reasons explained below.
The free parameters $a$ and $\phi$ are the key value and a sharpening
parameter, respectively.
For computational convenience, we set the center size of the next
higher scale to be the same as the surround of the current scale. Our
choice of center-surround ratio is 1.6, which results in a difference
of Gaussians model that closely resembles a Laplacian of Gaussian
filter [Marr 1982]. From our experiments, this ratio appears to pro-
duce slightly better results over a wide range of images than other
choices of center-surround ratio. However, this ratio can be altered
by a small amount to optimize the center-surround mechanism for
specific images.
Equation 7 is computed for the sole purpose of establishing a
measure of locality for each pixel, which amounts to finding a scale
$s_m$ of appropriate size. This scale may be different for each pixel,
and the procedure for its selection is the key to the success of our
dodging-and-burning technique. It is also a deviation from the original
Blommaert model [Blommaert and Martens 1990]. The area to
be considered local is in principle the largest area around a given
pixel where no large contrast changes occur. To compute the size
of this area, Equation 7 is evaluated at different scales $s$. Note that
$V_1(x,y,s)$ provides a local average of the luminance around $(x,y)$,
roughly in a disc of radius $s$. The same is true for $V_2(x,y,s)$,
although it operates over a larger area at the same scale $s$. The values
of $V_1$ and $V_2$ are expected to be very similar in areas of small
luminance gradients, but will differ in high contrast regions. To
choose the largest neighborhood around a pixel with fairly even
luminances, we threshold $V$ to select the corresponding scale $s_m$.
Starting at the lowest scale, we seek the first scale $s_m$ where:

$$|V(x,y,s_m)| < \epsilon \qquad (8)$$

is true. Here $\epsilon$ is the threshold. The $V_1$ in the denominator of
Equation 7 makes thresholding $V$ independent of absolute luminance
level, while the $2^\phi a/s^2$ term prevents $V$ from becoming too large
when $V_1$ approaches zero.

Radiance map courtesy of Paul Debevec
Figure 9: An example of scale selection. The top image shows center
and surround at different sizes: scale too small ($s_1$), right scale
($s_2$), scale too large ($s_3$). The lower images show the results of
particular choices of scale selection. If scales are chosen too small,
detail is lost. On the other hand, if scales are chosen too large, dark
rings around luminance steps will form.
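Putting Equations 7 and 8 together, the per-pixel scale selection can be sketched as below. This is our reading of the selection rule (grow the neighborhood scale per pixel while the center-surround difference $|V|$ stays below $\epsilon$, keeping the center response $V_1$ at the last accepted scale), and `gaussian_filter` again stands in for the paper's convolution:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def local_average(L, a=0.18, phi=8.0, eps=0.05,
                  alpha1=0.35, ratio=1.6, nscales=8):
    # Returns V1(x, y, s_m): the local luminance average at the largest
    # scale around each pixel whose center-surround contrast stays below eps.
    V1_sm, stopped, s = None, None, 1.0
    for _ in range(nscales):
        sigma1 = alpha1 * s / np.sqrt(2.0)
        V1 = gaussian_filter(L, sigma=sigma1)          # center response
        V2 = gaussian_filter(L, sigma=ratio * sigma1)  # surround response
        # Equation 7: normalized difference of Gaussians
        V = (V1 - V2) / (2.0 ** phi * a / (s * s) + V1)
        if V1_sm is None:
            V1_sm = V1.copy()                # smallest scale is the fallback
            stopped = np.abs(V) >= eps       # contrast edge hit immediately
        else:
            grow = (~stopped) & (np.abs(V) < eps)
            V1_sm[grow] = V1[grow]           # still even: enlarge the area
            stopped |= np.abs(V) >= eps
        s *= ratio
    return V1_sm
```

Pixels near a strong contrast boundary stop at a small $s_m$; pixels in smooth regions keep the response from the largest scale.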
Given a judiciously chosen scale for a given pixel, we observe
that $V_1(x,y,s_m)$ may serve as a local average for that pixel. Hence,
the global tone reproduction operator of Equation 3 can be converted
into a local operator by replacing $L$ with $V_1$ in the denominator:

$$L_d(x,y) = \frac{L(x,y)}{1 + V_1(x,y,s_m(x,y))} \qquad (9)$$
This function constitutes our local dodging-and-burning operator.
The luminance of a dark pixel in a relatively bright region will satisfy
$L < V_1$, so this operator will decrease the display luminance $L_d$,
thereby increasing the contrast at that pixel. This is akin to
photographic “dodging”. Similarly, a pixel in a relatively dark region
will be compressed less, and is thus “burned”. In either case the
pixel’s contrast relative to the surrounding area is increased. For
this reason, the above scale selection method is of crucial importance,
as illustrated in the example of Figure 9. If $s_m$ is too small,
then $V_1$ is close to the luminance $L$ and the local operator reduces
to our global operator ($s_1$ in Figure 9). On the other hand, choosing
$s_m$ too large causes dark rings to form around bright areas ($s_3$ in
the same figure), while choosing the scale as outlined above causes
the right amount of detail and contrast enhancement without
introducing unwanted artifacts ($s_2$ in Figure 9).
Using a larger scale $s_m$ tends to increase contrast and enhance
edges. The value of the threshold $\epsilon$ in Equation 8, as well as the
choice of $\phi$ in Equation 7, serve as edge enhancement parameters
and work by manipulating the scale that would be chosen for each
pixel. Decreasing $\epsilon$ forces the appropriate scale $s_m$ to be larger.
Increasing $\phi$ also tends to select a slightly larger scale $s_m$, but only
at small scales due to the division of $\phi$ by $s^2$. An example of the
effect of varying $\phi$ is given in Figure 10.
A further observation is that because $V_1$ tends to be smaller than
$L$ for very bright pixels, our local operator is not guaranteed to keep
the display luminance $L_d$ below 1. Thus, for extremely bright areas
some burn-out may occur, and this is the reason we clip the display
luminance to 1 afterwards. As noted in Section 2, a small amount
of burn-out may be desirable to make light sources such as the sun
look very bright.
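With $V_1(x,y,s_m)$ in hand, the local operator of Equation 9 plus the clipping step is one line (a sketch; `V1_sm` is the per-pixel local average, however it is estimated):

```python
import numpy as np

def tonemap_local(L, V1_sm):
    # Equation 9: Equation 3 with the pixel's local average V1(x, y, s_m)
    # in the denominator; display luminance is clipped to 1 afterwards,
    # so extremely bright areas may deliberately burn out.
    return np.minimum(L / (1.0 + V1_sm), 1.0)
```

A dark pixel in a bright surround ($L < V_1$) is pushed down relative to Equation 3 (dodging); a pixel in a dark surround is compressed less (burning).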
In summary, by automatically selecting an appropriate neigh-
borhood for each pixel we effectively implement a pixel-by-pixel
dodging and burning technique as applied in photography [Adams
1983]. These techniques locally change the exposure of a film, and
so darken or brighten certain areas in the final print.
4 Results
We implemented our algorithm in C++ and obtained the luminance
values from the input $R$, $G$ and $B$ triplets with $L = 0.27R +
0.67G + 0.06B$. The convolutions of Equation 5 were computed
using a Fast Fourier Transform (FFT). Because Gaussians are separable,
these convolutions can also be efficiently computed in image
space. This is easier to implement than an FFT, but it is somewhat
slower for large images. Because of the normalization by $V_1$, our
method is insensitive to edge artifacts normally associated with the
computation of an FFT.
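For reference, the luminance extraction step is trivial (a sketch using the weights quoted above):

```python
import numpy as np

def luminance(rgb):
    # L = 0.27 R + 0.67 G + 0.06 B, applied to an (..., 3) RGB array.
    return 0.27 * rgb[..., 0] + 0.67 * rgb[..., 1] + 0.06 * rgb[..., 2]
```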
The key value setting is determined on a per-image basis, while
unless noted otherwise the parameter $\phi$ is set to 8.0 for all the
images in this paper. Our new local operator uses Gaussian profiles
at 8 discrete scales $s$, increasing by a factor of 1.6 from 1 pixel
wide to 43 pixels wide. For practical purposes we would like the
Gaussian profile at the smallest scale to have 2 standard deviations
overlap with 1 pixel. This is achieved by setting the scaling parameter
$\alpha_1$ to $1/(2\sqrt{2}) \approx 0.35$. The parameter $\alpha_2$ is 1.6 times as large.
The threshold $\epsilon$ used for scale selection was set to 0.05.
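The scale bookkeeping can be checked numerically. Note that "1 pixel wide to 43 pixels wide" is consistent with the convention that each scale's surround is 1.6 times its center and the next center reuses the current surround; that interpretation is our reading, not stated explicitly in the text:

```python
ratio = 1.6
nscales = 8
centers = [ratio ** i for i in range(nscales)]  # center widths: 1 .. 1.6^7 px
surrounds = [ratio * c for c in centers]        # surround widths: up to ~43 px
# 2 standard deviations of the smallest profile overlap 1 pixel:
alpha1 = 1.0 / (2.0 * 2.0 ** 0.5)               # ~0.354
alpha2 = ratio * alpha1
```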
We use images with a variety of dynamic ranges as indicated
throughout this section. Note that we are using the photographic
definition of dynamic range as presented in Section 2. This results
in somewhat lower ranges than would be obtained if a conventional
computer graphics measure of dynamic range were used. However,
we believe the photographic definition is more predictive of how
challenging the tone reproduction of a given image is.
In the absence of well-tested quantitative methods to compare
tone mapping operators, we compare our results to a representative
set of tone reproduction techniques for digital images. In this sec-
tion we briefly introduce each of the operators and show images of
them in the next section. Specifically, we compare our new operator
of Equation 9 with the following.
Stockham’s homomorphic filtering Using the observation that
lighting variation occurs mainly in low frequencies and hu-
mans are more aware of albedo variations, this method op-
erates by downplaying low frequencies and enhancing high
frequencies [Oppenheim et al. 1968; Stockham 1972].
Tumblin-Rushmeier’s brightness matching operator . A model
of brightness perception is used to drive this global operator.
271

Citations
More filters
Journal ArticleDOI
TL;DR: This report describes, summarize, and analyzes the latest research in mapping general‐purpose computation to graphics hardware.
Abstract: The rapid increase in the performance of graphics hardware, coupled with recent improvements in its programmability, have made graphics hardware a compelling platform for computationally demanding tasks in a wide variety of application domains. In this report, we describe, summarize, and analyze the latest research in mapping general-purpose computation to graphics hardware. We begin with the technical motivations that underlie general-purpose computation on graphics processors (GPGPU) and describe the hardware and software developments that have led to the recent interest in this field. We then aim the main body of this report at two separate audiences. First, we describe the techniques used in mapping general-purpose computation to graphics hardware. We believe these techniques will be generally useful for researchers who plan to develop the next generation of GPGPU algorithms and techniques. Second, we survey and categorize the latest developments in general-purpose application development on graphics hardware. This survey should be of particular interest to researchers who are interested in using the latest GPGPU applications in their systems of interest.

1,998 citations

Journal ArticleDOI
01 Aug 2008
TL;DR: This paper advocates the use of an alternative edge-preserving smoothing operator, based on the weighted least squares optimization framework, which is particularly well suited for progressive coarsening of images and for multi-scale detail extraction.
Abstract: Many recent computational photography techniques decompose an image into a piecewise smooth base layer, containing large scale variations in intensity, and a residual detail layer capturing the smaller scale details in the image. In many of these applications, it is important to control the spatial scale of the extracted details, and it is often desirable to manipulate details at multiple scales, while avoiding visual artifacts. In this paper we introduce a new way to construct edge-preserving multi-scale image decompositions. We show that current base-detail decomposition techniques, based on the bilateral filter, are limited in their ability to extract detail at arbitrary scales. Instead, we advocate the use of an alternative edge-preserving smoothing operator, based on the weighted least squares optimization framework, which is particularly well suited for progressive coarsening of images and for multi-scale detail extraction. After describing this operator, we show how to use it to construct edge-preserving multi-scale decompositions, and compare it to the bilateral filter, as well as to other schemes. Finally, we demonstrate the effectiveness of our edge-preserving decompositions in the context of LDR and HDR tone mapping, detail enhancement, and other applications.

1,381 citations

Book
Richard Szeliski
31 Dec 2006
TL;DR: In this article, the basic motion models underlying alignment and stitching algorithms are described, along with effective direct (pixel-based) and feature-based alignment algorithms and the blending algorithms used to produce seamless mosaics.
Abstract: This tutorial reviews image alignment and image stitching algorithms. Image alignment algorithms can discover the correspondence relationships among images with varying degrees of overlap. They are ideally suited for applications such as video stabilization, summarization, and the creation of panoramic mosaics. Image stitching algorithms take the alignment estimates produced by such registration algorithms and blend the images in a seamless manner, taking care to deal with potential problems such as blurring or ghosting caused by parallax and scene movement as well as varying image exposures. This tutorial reviews the basic motion models underlying alignment and stitching algorithms, describes effective direct (pixel-based) and feature-based alignment algorithms, and describes blending algorithms used to produce seamless mosaics. It ends with a discussion of open research problems in the area.

1,226 citations

Journal ArticleDOI
01 Aug 2004
TL;DR: The framework makes use of two techniques primarily: graph-cut optimization, to choose good seams within the constituent images so that they can be combined as seamlessly as possible; and gradient-domain fusion, a process based on Poisson equations, to further reduce any remaining visible artifacts in the composite.
Abstract: We describe an interactive, computer-assisted framework for combining parts of a set of photographs into a single composite picture, a process we call "digital photomontage." Our framework makes use of two techniques primarily: graph-cut optimization, to choose good seams within the constituent images so that they can be combined as seamlessly as possible; and gradient-domain fusion, a process based on Poisson equations, to further reduce any remaining visible artifacts in the composite. Also central to the framework is a suite of interactive tools that allow the user to specify a variety of high-level image objectives, either globally across the image, or locally through a painting-style interface. Image objectives are applied independently at each pixel location and generally involve a function of the pixel values (such as "maximum contrast") drawn from that same location in the set of source images. Typically, a user applies a series of image objectives iteratively in order to create a finished composite. The power of this framework lies in its generality; we show how it can be used for a wide variety of applications, including "selective composites" (for instance, group photos in which everyone looks their best), relighting, extended depth of field, panoramic stitching, clean-plate production, stroboscopic visualization of movement, and time-lapse mosaics.

1,072 citations

Journal ArticleDOI
01 Sep 2003
TL;DR: A fast, high quality tone mapping technique to display high contrast images on devices with limited dynamic range of luminance values and taking into account user preference concerning brightness, contrast compression, and detail reproduction is proposed.
Abstract: We propose a fast, high quality tone mapping technique to display high contrast images on devices with a limited dynamic range of luminance values. The method is based on logarithmic compression of luminance values, imitating the human response to light. A bias power function is introduced to adaptively vary logarithmic bases, resulting in good preservation of details and contrast. To improve contrast in dark areas, changes to the gamma correction procedure are proposed. Our adaptive logarithmic mapping technique is capable of producing perceptually tuned images with high dynamic content and works at interactive speed. We demonstrate a successful application of our tone mapping technique with a high dynamic range video player that enables adjustment of viewing conditions for any kind of display while taking into account user preferences concerning brightness, contrast compression, and detail reproduction.
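The logarithmic compression with an adaptively varying base described above can be sketched as follows. The constants and the bias power term follow the published formulation as best understood; treat the exact form and defaults as assumptions:

```python
import math

def adaptive_log_tonemap(lum, lum_max, bias=0.85, display_max=100.0):
    """Logarithmic tone mapping with an adaptively varying base.
    The bias power term slides the effective log base between 2
    (for dark pixels, boosting contrast) and 10 (for bright pixels,
    compressing strongly)."""
    exponent = math.log(bias) / math.log(0.5)
    scale = (display_max * 0.01) / math.log10(lum_max + 1.0)
    base = 2.0 + 8.0 * (lum / lum_max) ** exponent
    return scale * math.log(lum + 1.0) / math.log(base)
```

By construction the brightest scene luminance maps to the top of the display range, and lowering `bias` brightens dark regions.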

793 citations

References
More filters
Journal ArticleDOI
TL;DR: The mathematics of a lightness scheme that generates lightness numbers, the biologic correlate of reflectance, independent of the flux from objects is described.
Abstract: Sensations of color show a strong correlation with reflectance, even though the amount of visible light reaching the eye depends on the product of reflectance and illumination. The visual system must achieve this remarkable result by a scheme that does not measure flux. Such a scheme is described as the basis of retinex theory. This theory assumes that there are three independent cone systems, each starting with a set of receptors peaking, respectively, in the long-, middle-, and short-wavelength regions of the visible spectrum. Each system forms a separate image of the world in terms of lightness that shows a strong correlation with reflectance within its particular band of wavelengths. These images are not mixed, but rather are compared to generate color sensations. The problem then becomes how the lightness of areas in these separate images can be independent of flux. This article describes the mathematics of a lightness scheme that generates lightness numbers, the biologic correlate of reflectance, independent of the flux from objects.

3,480 citations

Journal ArticleDOI
TL;DR: This work uses a simple statistical analysis to impose one image's color characteristics on another by choosing an appropriate source image and applying its characteristic to another image.
Abstract: We use a simple statistical analysis to impose one image's color characteristics on another. We achieve color correction by choosing an appropriate source image and applying its characteristics to the target image.

2,615 citations

Journal ArticleDOI
TL;DR: A definition of local band-limited contrast in images is proposed that assigns a contrast value to every point in the image as a function of the spatial frequency band and is helpful in understanding the effects of image-processing algorithms on the perceived contrast.
Abstract: The physical contrast of simple images such as sinusoidal gratings or a single patch of light on a uniform background is well defined and agrees with the perceived contrast, but this is not so for complex images. Most definitions assign a single contrast value to the whole image, but perceived contrast may vary greatly across the image. Human contrast sensitivity is a function of spatial frequency; therefore the spatial frequency content of an image should be considered in the definition of contrast. In this paper a definition of local band-limited contrast in images is proposed that assigns a contrast value to every point in the image as a function of the spatial frequency band. For each frequency band, the contrast is defined as the ratio of the bandpass-filtered image at the frequency to the low-pass image filtered to an octave below the same frequency (local luminance mean). This definition raises important implications regarding the perception of contrast in complex images and is helpful in understanding the effects of image-processing algorithms on the perceived contrast. A pyramidal image-contrast structure based on this definition is useful in simulating nonlinear, threshold characteristics of spatial vision in both normal observers and the visually impaired.
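Peli's definition can be sketched in one dimension. Crude box filters stand in here for proper octave-spaced bandpass filters, so the numbers are illustrative only; the structure (bandpass response divided by the local luminance mean one octave below) is the point:

```python
def box_lowpass(signal, radius):
    """Crude low-pass: centered box average (edge samples clamped)."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

def local_band_contrast(signal, radius):
    """Per-sample contrast for one band: the bandpass response divided
    by the local luminance mean (the low-pass an octave below)."""
    band_lp = box_lowpass(signal, radius)           # cutoff at the band
    octave_below = box_lowpass(signal, 2 * radius)  # one octave lower
    bandpass = [a - b for a, b in zip(band_lp, octave_below)]
    return [bp / max(mean, 1e-9) for bp, mean in zip(bandpass, octave_below)]
```

A uniform field has zero contrast at every point and every band, which matches the intuition the definition is built on.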

1,370 citations

Journal ArticleDOI
TL;DR: A multiresolution spline technique for combining two or more images into a larger image mosaic is defined; coarse features occurring near borders are blended gradually over a relatively large distance without blurring or otherwise degrading finer image details in the neighborhood of the border.
Abstract: We define a multiresolution spline technique for combining two or more images into a larger image mosaic. In this procedure, the images to be splined are first decomposed into a set of band-pass filtered component images. Next, the component images in each spatial frequency band are assembled into a corresponding bandpass mosaic. In this step, component images are joined using a weighted average within a transition zone which is proportional in size to the wavelengths represented in the band. Finally, these band-pass mosaic images are summed to obtain the desired image mosaic. In this way, the spline is matched to the scale of features within the images themselves. When coarse features occur near borders, these are blended gradually over a relatively large distance without blurring or otherwise degrading finer image details in the neighborhood of the border.
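A minimal one-dimensional sketch of the multiresolution spline idea follows. Pair-averaging and nearest-neighbour expansion stand in for the proper Gaussian pyramid filters, and all names are illustrative; the essential structure is blending each band-pass level separately so the transition width tracks the wavelengths in that band:

```python
def downsample(s):
    """Halve resolution by averaging pairs (assumes even length)."""
    return [(s[i] + s[i + 1]) / 2.0 for i in range(0, len(s), 2)]

def upsample(s, n):
    """Nearest-neighbour expand back to length n."""
    return [s[min(i // 2, len(s) - 1)] for i in range(n)]

def blend_multires(a, b, mask, levels=3):
    """Multiresolution spline: blend each band-pass (Laplacian) level
    with the mask, then recurse on the coarse residual."""
    if levels == 0 or len(a) < 2:
        return [m * x + (1 - m) * y for x, y, m in zip(a, b, mask)]
    a_lo, b_lo, m_lo = downsample(a), downsample(b), downsample(mask)
    # band-pass detail of each image at this level
    a_hi = [x - u for x, u in zip(a, upsample(a_lo, len(a)))]
    b_hi = [y - u for y, u in zip(b, upsample(b_lo, len(b)))]
    blended_hi = [m * x + (1 - m) * y for x, y, m in zip(a_hi, b_hi, mask)]
    blended_lo = blend_multires(a_lo, b_lo, m_lo, levels - 1)
    return [u + h for u, h in zip(upsample(blended_lo, len(a)), blended_hi)]
```

With a hard 0/1 mask, the coarse levels still mix over several samples, which is what hides the seam without blurring fine detail.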

1,246 citations

Journal ArticleDOI
TL;DR: A tone reproduction operator is presented that preserves visibility in high dynamic range scenes and introduces a new histogram adjustment technique, based on the population of local adaptation luminances in a scene, that incorporates models for human contrast sensitivity, glare, spatial acuity, and color sensitivity.
Abstract: We present a tone reproduction operator that preserves visibility in high dynamic range scenes. Our method introduces a new histogram adjustment technique, based on the population of local adaptation luminances in a scene. To match subjective viewing experience, the method incorporates models for human contrast sensitivity, glare, spatial acuity, and color sensitivity. We compare our results to previous work and present examples of our techniques applied to lighting simulation and electronic photography.

723 citations

Frequently Asked Questions (13)
Q1. What is the effect of a dodging-and-burning algorithm on high dynamic range?

For many high dynamic range images, the compression provided by this technique appears to be sufficient to preserve detail in low contrast areas, while compressing high luminances to a displayable range. 
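The compression this answer refers to has the form of the paper's Equation 9, L_d = L / (1 + V1). A minimal sketch, with a box average standing in for the Gaussian-weighted local average V1 at the chosen scale:

```python
def box_average(lum, radius):
    """Stand-in for the Gaussian-weighted local average V1."""
    n = len(lum)
    return [sum(lum[max(0, i - radius):min(n, i + radius + 1)])
            / (min(n, i + radius + 1) - max(0, i - radius))
            for i in range(n)]

def local_tonemap(lum, radius=1):
    """Eq. 9-style compression: L_d = L / (1 + V1).  Where the local
    average is large (bright surround), compression is strong; where it
    is small (dark surround), local contrast is largely preserved."""
    v1 = box_average(lum, radius)
    return [L / (1.0 + v) for L, v in zip(lum, v1)]
```

This is the dodging-and-burning analogy: each pixel is developed relative to its own neighbourhood rather than the global average.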

The range of light the authors can reproduce on their print and screen display devices spans at best about two orders of absolute dynamic range. 

Because of the normalization by V1, their method is insensitive to edge artifacts normally associated with the computation of an FFT. 

Their new local operator uses Gaussian profiles at 8 discrete scales, increasing by a factor of 1.6 from 1 pixel wide to 43 pixels wide. 
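Those scales form a geometric series. A trivial sketch (only the ratio is reproduced here; the quoted 43-pixel figure presumably refers to the kernel's overall pixel extent rather than the centre scale itself):

```python
def profile_scales(s1=1.0, ratio=1.6, count=8):
    """Centre scales s1, s1*1.6, ..., s1*1.6**7 for the scale search."""
    return [s1 * ratio ** i for i in range(count)]
```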

Equation 7 is computed for the sole purpose of establishing a measure of locality for each pixel, which amounts to finding a scale sm of appropriate size. 
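The per-pixel scale search can be sketched as follows, using the paper's normalized centre-surround measure V = (V1 − V2) / (2^φ a / s² + V1); the default constants here (a = 0.18, φ = 8, ε = 0.05) are assumptions chosen to match the paper's typical settings:

```python
def select_scale(v1_by_scale, v2_by_scale, scales,
                 a=0.18, phi=8.0, eps=0.05):
    """Pick the largest scale s_m at which the normalized
    centre-surround difference stays below the threshold eps
    (the per-pixel test built on the paper's Equation 7).
    v1_by_scale / v2_by_scale are the centre and surround
    Gaussian responses at each candidate scale."""
    s_m = scales[0]
    for v1, v2, s in zip(v1_by_scale, v2_by_scale, scales):
        activity = (v1 - v2) / (2.0 ** phi * a / (s * s) + v1)
        if abs(activity) >= eps:
            break  # a large contrast step lies inside this surround
        s_m = s
    return s_m
```

Intuitively, s_m grows until the surround starts to straddle a significant luminance edge, which is exactly the "measure of locality" the answer describes.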

The total time for a 512×512 image is 1.31 seconds for the local operator, which is close to interactive, while their global operator (Equation 3) performs at a rate of 20 frames per second, which the authors consider real-time. 

The authors implemented their algorithm in C++ and obtained the luminance values from the input R, G and B triplets with L = 0.27R + 0.67G + 0.06B. 
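That luminance conversion is a one-liner:

```python
def luminance(r, g, b):
    """Luminance from linear RGB, using the weights quoted above."""
    return 0.27 * r + 0.67 * g + 0.06 * b
```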

If the scene is normal-key, the authors would like to map this to the middle-grey of the displayed image, or 0.18 on a scale from zero to one. 
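Combined with the paper's log-average luminance (Equation 1) and global operator L_d = L / (1 + L) (Equation 3), the key mapping can be sketched as follows; the small offset `delta` guards against log of zero, and its value is an assumption:

```python
import math

def scale_to_key(lums, a=0.18, delta=1e-6):
    """Map the scene's log-average luminance to the key value a
    (0.18 = middle grey for a normal-key scene), then apply the
    global operator L_d = L / (1 + L)."""
    log_avg = math.exp(sum(math.log(delta + L) for L in lums) / len(lums))
    scaled = [a / log_avg * L for L in lums]
    return [L / (1.0 + L) for L in scaled]
```

A pixel at the log-average luminance lands near 0.18/1.18 on the display scale, and arbitrarily bright pixels approach but never reach 1.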

The area around the sun in the rendering of the landscape is problematic for any method that attempts to bring the maximum scene luminance within a displayable range without clamping. 

Ansel Adams attempted to bridge this gap with an approach he called the Zone System [Adams 1980; Adams 1981; Adams 1983] which was first developed in the 1940s and later popularized by Minor White [White et al. 1984]. 

Because zones relate logarithmically to scene luminances, dynamic range can be expressed as the difference between highest and lowest distinguishable scene zones (Figure 4). 
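Since one zone corresponds to a factor of two in luminance, the zone difference is simply a base-2 logarithm:

```python
import math

def dynamic_range_in_zones(l_max, l_min):
    """One zone is a factor of two in luminance, so the zone difference
    between the brightest and darkest distinguishable regions is the
    base-2 log of the luminance ratio."""
    return math.log2(l_max / l_min)
```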

For practical purposes the authors would like the Gaussian profile at the smallest scale to have 2 standard deviations overlap with 1 pixel. 

As such, the local FFT based implementation, the local spline based approximation and the global operator provide a useful trade-off between performance and quality, allowing any user to select the best operator given a specified maximum run-time.