Illuminating Image-based Objects
Tien-Tsin Wong (1), Pheng-Ann Heng (1), Siu-Hang Or (1), Wai-Yin Ng (2)
ttwong@acm.org  pheng@cse.cuhk.edu.hk  shor@cse.cuhk.edu.hk  wyng@ie.cuhk.edu.hk
(1) Department of Computer Science & Engineering, (2) Department of Information Engineering
The Chinese University of Hong Kong
Abstract
In this paper, we present a new scheme of data representation for image-based objects. It allows the illumination to be changed interactively without knowing any geometrical information (e.g. depth or surface normal) of the scene, yet the resulting images are physically correct. The scene is first sampled from different view points and under different illuminations. By treating each pixel on the image plane as a surface element, the sampled images are used to measure the apparent BRDF of each surface element. Two compression schemes, spherical harmonics and discrete cosine transform, are proposed to compress the tabular BRDF data. Whenever the user changes the illumination, a certain number of views are reconstructed. The correct user perspective view is then displayed using the standard texture mapping hardware. Hence, the intensity, the type and the number of the light sources can be manipulated interactively.
CR Categories: I.3.2 [Computer Graphics]: Picture/Image Generation - Digitizing and scanning, viewing algorithms.
Additional keywords: image-based rendering, spherical harmonics, light field, Lumigraph, holographic stereogram, BRDF, illumination.
1. Introduction
Although millions of textured micropolygons can be rendered within a second using state-of-the-art graphics workstations, rendering a realistic complex scene at interactive speed is still difficult. Unlimited complexity of the scene and expensive modeling cost are two major problems. Recently researchers have focused on a new approach to rendering, namely, image-based rendering. This approach breaks the dependency of rendering time on the scene complexity, since the rendering primitives are no longer geometrical entities, but images.
Previous work can be classified into two main streams. The first stream focuses on determining the correct perspective view. Foley et al. [8] developed a system which can rotate raytraced voxel data interactively by view interpolation. However, their interpolation method is not physically correct. Chen and Williams [6] interpolated views by modeling pixel movement, resulting in physically correct interpolation. Later, Chen [4] described an image-based rendering system, QuickTime VR, which generates perspective views from panoramic image data by warping [5]. McMillan and Bishop [14] observed that image-based rendering is a problem of finding and manipulating the plenoptic function [1]. They also proposed a method to sample and reconstruct this plenoptic function. Levoy and Hanrahan [13] and Gortler et al. [9] reduced the 5D plenoptic function to a 4D light field or Lumigraph. This simplification allows the view interpolation to be done by standard texture mapping techniques, which can be further accelerated by hardware. Recently, animated image-based objects have been developed by Live Picture [16].
The second stream of research focuses on re-rendering the scene under different illumination using the sampled images. Haeberli [10] re-rendered the scene using the simple superposition property. However, the direction, the type and the number of the light sources are limited to the lighting setup used when capturing the scene. Nimeroff et al. [15] efficiently re-rendered the scene under various natural illumination (overcast or clear skylight) with the knowledge of the empirical formulæ that model the skylight. Belhumeur and Kriegman [2] determined the basis images of an object with the assumptions that the object is convex and all surfaces are Lambertian. With these assumptions, only three basis images are enough to span the illumination cone of the object, i.e., three images are enough to reconstruct/re-render the scene under various illuminations.
In the first stream of previous work, the illumination of the scene was assumed to be fixed and carefully designed. On the other hand, the view point is assumed fixed in the work of the second stream. In this paper, we present an image-based rendering system which allows the change of view point as well as the change of illumination. All image-based objects can be described as a special form of the plenoptic function. Most previous work assumed that the time parameter t of the plenoptic function was fixed. The method described in this paper can be thought of as an attempt to allow t to vary.
There are two major motivations for this research. Firstly, the variability of the illumination allows the viewer to illuminate only interesting portions of the scene. This improves the viewer's recognition of it. Secondly, it is a step closer to realizing the use of image-based entities (plenoptic function, light field or Lumigraph) as basic rendering entities, just like the geometrical entities used currently in conventional graphics systems.
One major goal of image-based rendering is to minimize the use of geometrical information while generating physically correct images. With this goal in mind, the proposed image-based system allows the viewer to change the scene lighting interactively without knowing geometrical details (say, depth or surface normal) of the scene. The apparent BRDF [12, 19] of each pixel on the image plane is sampled. With these pixel BRDFs, physically correct views of the scene can be reconstructed under different illuminations by substituting different lighting parameter values and viewing directions. The BRDF data representation is described in Section 2. Section 3 describes how the light source can be manipulated once the pixel BRDFs are recorded. Two compression schemes, spherical harmonic transform and discrete cosine transform, are proposed to compress the huge amount of BRDF data. They are discussed in Section 4.
The proposed BRDF representation is general enough to be applied to a wide range of image-based objects, including panoramic images, the plenoptic function, light fields and Lumigraphs. In this paper, we demonstrate how to extend the light field and Lumigraph systems in order to allow illumination changes. The reason for choosing light fields and Lumigraphs is their simplicity and potential to utilize graphics hardware, but this does not imply the proposed BRDF representation is only valid for light-slab based objects. Further discussions and conclusions on the new data representation can be found in Sections 5 and 6 respectively.
2. BRDF Representation
2.1. BRDF of Pixel
The bidirectional reflectance distribution function (BRDF) [12] is the most general form of representing surface reflectivity. To calculate the radiance outgoing from a surface element in a specific direction, the BRDF of this surface element must first be determined. Methods for measuring and modeling the BRDF can be found in various sources [3, 19]. The most straightforward approach to include illumination variability in an image-based rendering system is to measure the BRDF of each object material visible in the image. However, this approach has several drawbacks. While the BRDFs of synthesized object surfaces may be assigned at will, measuring those of all objects in a real scene is tedious and often infeasible. Imagine a scene containing thousands of small stones, each with its own BRDF. The situation worsens when a single object exhibits spatial variability of surface properties. Furthermore, associating a BRDF with each object in the scene causes rendering time to depend on the scene complexity.
One might suggest, for each pixel in each view, to measure the BRDF of the object surface seen through that pixel window. This approach breaks the link to the scene complexity, but introduces an aliasing problem. Consider pixel A in Figure 1: multiple objects are visible through the pixel window. Note that this will frequently happen in images showing distant objects. Even if only one object is visible, there is still the problem of choosing the surface normal for measuring the BRDF when the object silhouette is curved (see pixel B in Figure 1).
Our solution is to treat each pixel on the image plane as a surface element with an apparent BRDF. Imagine the image plane as just a normal planar surface, where each pixel can be regarded as a surface element. Each surface element emits different amounts of radiant energy in different directions under different illuminations. In order to measure the (apparent) BRDF of each pixel, the location of the image plane must be specified (see Figure 2), not just the direction. By recording the BRDF of each pixel (Figure 2), we capture the aggregate reflectance of objects visible through that pixel window. The light vector L from the light source and the viewing vector V from the view point E define the two directions of the BRDF. This approach does not depend on the scene complexity, and removes the aliasing problems above. Moreover, it can be easily integrated in the light slab based data structure [13, 9]. It is also a unified approach for both virtual and real world scenes.

Figure 1. Aliasing problem of measuring object surfaces visible through the pixel windows.

Note that the apparent BRDF represents the response of the object(s) in a pixel to light in each direction, in the presence of the rest of the scene, not merely the surface reflectivity. If we work from views (natural or rendered) that include shadows, therefore, shadows appear in the reconstruction.

Figure 2. Measuring the BRDF of the pixel.
2.2. Measuring BRDF
To measure the BRDF, we have to capture the image of the virtual or real world scene under different illuminations. A directional light source is cast on the scene from different directions. Rendered images and photos of the virtual or real world scene are captured as usual. The algorithm is,

For each view point E
    For each directional light source direction (θ, φ)
        Render the virtual scene or take a photograph of the real world scene illuminated by this directional light source, and name the result I_{E,θ,φ}.

The parameter θ is the polar angle, and φ is the azimuth. The direction (0, φ) is orthogonal to the image plane. The parameters are localized to the image plane coordinate system, so transforming the image plane does not affect the BRDF parameterization. The reason for using a directional light source is that the incident light direction is identical at any 3D point. In real life, a directional light source can be approximated by placing a spotlight at a sufficient distance from the scene. The BRDF ρ̃ of each pixel inside a view can be sampled by the following algorithm,

For each view point E
    For each pixel (s, t)
        ρ̃(θ, φ) = (pixel value at (s, t) of I_{E,θ,φ}) / (intensity of light source)

One assumption is that there is no intervening medium which absorbs, scatters or emits any radiant energy.
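As a minimal sketch (not from the paper), the sampling loop above might be organized as follows in Python; the callback name capture_image, the dictionary layout, and the assumption of a single known light intensity are illustrative only.

```python
import numpy as np

def sample_pixel_urdfs(views, light_dirs, capture_image, light_intensity):
    """Sample the apparent reflectance of every pixel, per view and light direction.

    views           : view points E (hypothetical camera descriptions)
    light_dirs      : list of (theta, phi) directional-light directions
    capture_image   : renders/photographs the scene for a given view and light
                      direction, returning an HxW array of pixel values
    light_intensity : intensity of the directional source used during capture
    """
    urdf = {}  # urdf[E][(theta, phi)] is an HxW table of reflectances
    for E in views:
        urdf[E] = {}
        for (theta, phi) in light_dirs:
            image = capture_image(E, theta, phi)             # I_{E,theta,phi}
            urdf[E][(theta, phi)] = image / light_intensity  # rho(theta, phi)
    return urdf
```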
Note that instead of recording a single 2D array (image plane) of pixel BRDFs, we record a set of 2D arrays of pixel URDFs (described shortly) from multiple view points (E) in our current implementation. Since the viewing direction of each pixel within one specific view of the image plane is fixed, the BRDF ρ̃ is simplified to a unidirectional reflectance distribution function (URDF) which depends on the light vector only. Hence, the function ρ̃ is parameterized by two parameters (θ, φ) only. There are three reasons we store the partial URDF of each pixel in multiple fixed views, instead of a complete BRDF for each pixel of a single image plane. Firstly, with this organization, the compression methods (described in Section 4) are simplified. Secondly, the reconstruction from compressed data is performed only when the lighting changes. No reconstruction is needed when the user changes the view point. This is important for interactive display, since the user changes the view point more often than the illumination. Thirdly, pixels on the same image do not have the same viewing vector V. Resampling would be needed to sample the complete pixel BRDF on a uniform spherical grid. Hence this organization frees us from the resampling process, which may introduce error. From now on, the term view refers to an image of the image plane, viewed from a certain view point (E). When we refer to BRDF, we actually mean the set of partial URDFs in multiple fixed views.
Traditionally, the BRDF is sampled only on the upper hemisphere of the surface element, since the reflectance must be zero if the light source is behind the surface element. However, in our case, the reflectance may be nonzero even if the light source direction is from the back of the image plane. This is because the actual object surface may not align with the image plane (Figure 3). Instead, the whole sphere surrounding the pixel has to be sampled for recording its BRDF. Therefore, the range of θ should be [0, π]. Nevertheless, sampling only the upper hemispherical BRDF is usually sufficient, since the viewer seldom moves the light source to the back of objects.
Figure 3. The image plane may not be parallel
with the object surface.

3. Manipulating the Light Sources
Once the BRDFs are sampled and stored, they can be
manipulated. The final radiance (or simply value) of each
pixel in each view is determined by evaluating equation 1,
given the intensity and the direction of the light sources.
$$\text{value at pixel } (s, t) \text{ in a view } (E) = \sum_{i}^{n} \tilde{\rho}_{E,s,t}(\theta_i, \phi_i)\, I_i, \qquad (1)$$

where n is the total number of light sources, (θ_i, φ_i) specify the direction of the i-th light source L_i, and I_i is the intensity of the i-th light source.
Note that this equation gives a physically correct image, which can be shown as follows. Consider k unoccluded objects, visible through the pixel (s, t), viewed from view point E and illuminated by n light sources. The radiance passing through the pixel window in this view will be

$$\sum_{i}^{n} \rho^{0}_{i} I_i + \sum_{i}^{n} \rho^{1}_{i} I_i + \cdots + \sum_{i}^{n} \rho^{k}_{i} I_i
= \sum_{j}^{k} \rho^{j}_{0} I_0 + \sum_{j}^{k} \rho^{j}_{1} I_1 + \cdots + \sum_{j}^{k} \rho^{j}_{n} I_n
= \rho_0 I_0 + \rho_1 I_1 + \cdots + \rho_n I_n,$$

where ρ^j_i is the reflectance of the j-th object illuminated by light L_i, and

$$\rho_i = \sum_{j=1}^{k} \rho^{j}_{i}$$

is the aggregate reflectance we recorded when measuring the BRDF of the pixel.
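As a rough illustration of equation 1 (not the authors' implementation), one view can be relit with a few array operations. The URDF layout follows the sketch in Section 2.2, and looking up the nearest sampled light direction stands in for the reconstruction from compressed coefficients described in Section 4.

```python
import numpy as np

def relight_view(urdf_view, lights):
    """Evaluate equation 1 for every pixel of one view.

    urdf_view : dict mapping each sampled light direction (theta, phi)
                to an HxW array of apparent reflectances for this view
    lights    : list of (theta_i, phi_i, intensity_i) directional sources
    """
    sampled_dirs = list(urdf_view.keys())
    image = np.zeros_like(next(iter(urdf_view.values())))
    for (theta_i, phi_i, intensity_i) in lights:
        # Nearest sampled direction; the paper instead reconstructs the
        # reflectance from spherical harmonic or DCT coefficients.
        nearest = min(sampled_dirs,
                      key=lambda d: (d[0] - theta_i) ** 2 + (d[1] - phi_i) ** 2)
        image += urdf_view[nearest] * intensity_i
    return image
```

The intensity, direction and number of sources are simply the entries of the lights list, which is why they can be changed interactively.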
Light Direction  With equation 1, the light direction can be changed by substituting a different value of (θ, φ). Figures 12(a) and (b) show a teapot illuminated by a light source from the top and the right respectively.
Light Intensity  Another parameter to manipulate is the intensity of the light source. This can be done by changing the value of I_i for the i-th light source. Figure 13(a) shows the Beethoven statue illuminated by a blue light from the left.
Multiple Light Sources  We can arbitrarily add any number of light sources. The trade-off is the computational time. From equation 1, a new reflectance value ρ̃(θ_i, φ_i) has to be reconstructed from compressed data (described in Section 4) for each light source. Our current prototype can still run at an acceptable interactive speed using up to 3 directional light sources. In Figure 13(b), the Beethoven statue is illuminated by a blue light from the left and a red light from the right simultaneously.
Type of light sources  Up to now, we have made an implicit assumption that the light source for manipulation is directional. Directional light is very efficient when evaluating equation 1, because all pixels on the same image plane are illuminated by the light source from the same direction (θ_i, φ_i). However, the method is not restricted to directional light. It can be extended to point sources and spotlights also. However, it will be more expensive to evaluate equation 1 for other types of light sources, since (θ_i, φ_i) will need to be recalculated from pixel to pixel.
Since the image plane where the pixels are located is only a window in the 3D space (Figure 2), the intersected surface element that actually reflects the light may be located at any point on the ray V in Figure 4. To find the light vector L correctly for other types of light sources, the intersection point of the ray and the object surface has to be located first. Note that there is no such problem for a directional source, since the light vector is the same for all points in 3D space. One way to find L is to use the depth image. While this can be easily done for rendered images, real world scenes may be more difficult. Use of a range scanner may provide a solution. Figures 14(a) and (b) show a box on a plane illuminated by a point source and a directional source respectively. Note the difference in the shadows cast by these sources. However, just as we discussed in Section 2, there is an aliasing problem in finding the correct positions of intersecting objects. Imagine a scene of a furry teddy bear; thousands of objects may be visible through one pixel window.
Figure 4. Finding the correct light vector.
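For a point source or spotlight, the per-pixel light direction might be computed as in the following sketch. The depth image, the camera description, and the choice of the image-plane normal as the z axis are assumptions made for illustration; the paper only notes that a depth image or a range scanner can supply the intersection points.

```python
import numpy as np

def per_pixel_light_dirs(depth, view_point, ray_dirs, light_pos):
    """Per-pixel (theta, phi) toward a point light source.

    depth      : HxW distances along each viewing ray V
    view_point : 3-vector view point E
    ray_dirs   : HxWx3 unit viewing-ray directions (image-plane frame)
    light_pos  : 3-vector position of the point source
    """
    # Intersection point of each viewing ray with the object surface.
    points = np.asarray(view_point) + depth[..., None] * ray_dirs
    # Light vector L from the surface point toward the source.
    L = np.asarray(light_pos) - points
    L = L / np.linalg.norm(L, axis=-1, keepdims=True)
    # Polar angle from the image-plane normal (assumed z), azimuth in the plane.
    theta = np.arccos(np.clip(L[..., 2], -1.0, 1.0))
    phi = np.arctan2(L[..., 1], L[..., 0]) % (2.0 * np.pi)
    return theta, phi
```

For a directional source this per-pixel work is unnecessary, since L is constant over the image.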
4. Compression
Storing the whole BRDF tables requires an enormous amount of storage space. For a single pixel, if the URDF is sampled in the polar coordinate system with 20 samples along both the θ and φ coordinates, there will be 400 floating point numbers stored for each pixel. A single view of a 256 × 256 image plane will require 100 MB of storage.
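For concreteness, assuming 4-byte floating point numbers (the word size is not stated in the paper):

$$256 \times 256 \times 20 \times 20 \times 4~\text{bytes} = 104{,}857{,}600~\text{bytes} \approx 100~\text{MB}.$$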

To represent the BRDF more efficiently, the tabular data is transformed to the frequency domain and quantization is performed to reduce storage. We have tested two types of transforms, the spherical harmonic transform and the discrete cosine transform.
4.1. Spherical Harmonics
Spherical harmonics [7] are analogous to a Fourier series in the spherical domain. Cabral et al. [3] proposed the representation of BRDFs using spherical harmonics. The work was further extended by Sillion et al. [18] to model the entire range of incident angles. It is especially suitable for representing smooth spherical functions. In our approach, the viewing direction V for each pixel is actually fixed. Hence, the function ρ̃ can be transformed to the spherical harmonic domain directly using the following equations, without considering how to represent a bidirectional function as described in [18].
$$C_{l,m} = \int_{0}^{2\pi} \int_{0}^{\pi} \tilde{\rho}(\theta, \phi)\, Y_{l,m}(\theta, \phi) \sin\theta \, d\theta \, d\phi,$$

where

$$Y_{l,m}(\theta, \phi) =
\begin{cases}
N_{l,m}\, P_{l,m}(\cos\theta) \cos(m\phi) & \text{if } m > 0 \\
N_{l,0}\, P_{l,0}(\cos\theta) / \sqrt{2} & \text{if } m = 0 \\
N_{l,m}\, P_{l,|m|}(\cos\theta) \sin(|m|\phi) & \text{if } m < 0,
\end{cases}$$

$$N_{l,m} = \sqrt{\frac{2l+1}{2\pi}\,\frac{(l-|m|)!}{(l+|m|)!}},$$

and

$$P_{l,m}(x) =
\begin{cases}
(1-2m)\sqrt{1-x^{2}}\, P_{m-1,m-1}(x) & \text{if } l = m \\
x\,(2m+1)\, P_{m,m}(x) & \text{if } l = m+1 \\
x\,\frac{2l-1}{l-m}\, P_{l-1,m}(x) - \frac{l+m-1}{l-m}\, P_{l-2,m}(x) & \text{otherwise,}
\end{cases}$$

where the base case is P_{0,0}(x) = 1.
The C_{l,m}'s are the coefficients of the spherical harmonics which are going to be stored for each pixel. The more coefficients are used, the more accurate the spherical harmonic representation is. Accuracy also depends on the number of samples in the (θ, φ) space. We found 16 to 25 coefficients are sufficient in most of our tested examples.
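Under the same assumption of 4-byte floats as before, storing 25 coefficients instead of the 400 raw samples per pixel reduces a single 256 × 256 view from roughly 100 MB to

$$256 \times 256 \times 25 \times 4~\text{bytes} = 6{,}553{,}600~\text{bytes} \approx 6.25~\text{MB},$$

a 16:1 reduction before any quantization.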
To reconstruct the reflectance given the light vector (θ, φ), the following equation is evaluated for each pixel in each view:

$$\tilde{\rho}(\theta, \phi) = \sum_{l=0}^{l_{\max}} \sum_{m=-l}^{l} C_{l,m}\, Y_{l,m}(\theta, \phi), \qquad (2)$$

where (l_max)^2 is the number of spherical harmonic coefficients to be used.
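A small numerical sketch of this projection and of equation 2 follows; the uniform (θ, φ) grid, the plain Riemann-sum quadrature, and the recursive evaluation of the basis functions are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from math import factorial

def assoc_legendre(l, m, x):
    """Associated Legendre polynomial P_{l,m} via the recurrence given above."""
    if l == 0 and m == 0:
        return np.ones_like(x)
    if l == m:
        return (1 - 2 * m) * np.sqrt(1 - x ** 2) * assoc_legendre(m - 1, m - 1, x)
    if l == m + 1:
        return x * (2 * m + 1) * assoc_legendre(m, m, x)
    return (x * (2 * l - 1) * assoc_legendre(l - 1, m, x)
            - (l + m - 1) * assoc_legendre(l - 2, m, x)) / (l - m)

def Y(l, m, theta, phi):
    """Real spherical harmonic basis function Y_{l,m}(theta, phi)."""
    N = np.sqrt((2 * l + 1) / (2 * np.pi)
                * factorial(l - abs(m)) / factorial(l + abs(m)))
    if m > 0:
        return N * assoc_legendre(l, m, np.cos(theta)) * np.cos(m * phi)
    if m == 0:
        return N * assoc_legendre(l, 0, np.cos(theta)) / np.sqrt(2)
    return N * assoc_legendre(l, abs(m), np.cos(theta)) * np.sin(abs(m) * phi)

def project(urdf, thetas, phis, l_max):
    """C_{l,m} coefficients of one pixel's sampled URDF on a (theta, phi) grid."""
    d_theta, d_phi = np.pi / len(thetas), 2.0 * np.pi / len(phis)
    T, PHI = np.meshgrid(thetas, phis, indexing="ij")
    return {(l, m): np.sum(urdf * Y(l, m, T, PHI) * np.sin(T)) * d_theta * d_phi
            for l in range(l_max + 1) for m in range(-l, l + 1)}

def reconstruct(coeffs, theta, phi):
    """Equation 2: reflectance for a new light direction (theta, phi)."""
    return sum(c * Y(l, m, theta, phi) for (l, m), c in coeffs.items())
```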
Figure 5 shows the sampled reflectance distribution of a pixel on the left and its corresponding reconstructed distribution on the right. There are 900 samples (30 along φ in the range [0, 2π] and 60 along θ) in the original distribution on the left. The reconstructed distribution on the right is represented by 25 spherical harmonic coefficients only.

Figure 5. Original sampled (left) and reconstructed (right) distribution. Note the lower hemisphere of the reconstructed distribution is interpolated to prevent discontinuity.
4.2. Discrete Cosine Transform
Although spherical harmonics can efficiently represent smooth spherical functions, they are inferior at representing discontinuous functions, which are quite common if the scene contains shadows. This phenomenon motivates us to find another solution for data compression.
The second compression scheme we have tested is the discrete cosine transform (DCT). One reason to choose the DCT is that hardware DCT codecs are becoming widely available. As before, we do not compress the four-dimensional BRDFs. Instead, the two-dimensional URDFs are compressed. Since the URDF is a spherical function, it is first mapped to a 2D disc (Figure 6), before applying the standard 2D discrete cosine transform to the disc image.
Figure 6. Mapping a hemisphere to a disc.
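As a hedged sketch of this step, one hemisphere of the URDF can be projected onto the unit disc and compressed with a standard 2D DCT; the exact projection convention, the disc resampling (not shown), and the use of SciPy's DCT routines are assumptions for illustration.

```python
import numpy as np
from scipy.fft import dctn, idctn

def hemisphere_to_disc(theta, phi):
    """Stereographic projection of a hemisphere direction onto the unit disc.

    Projecting from the opposite pole maps the pole of the hemisphere to the
    disc centre and the rim (theta = pi/2) to the unit circle.
    """
    r = np.tan(theta / 2.0)
    return r * np.cos(phi), r * np.sin(phi)

def compress_disc_image(disc_image, keep):
    """Keep only the keep x keep lowest-frequency DCT coefficients."""
    coeffs = dctn(disc_image, norm="ortho")
    truncated = np.zeros_like(coeffs)
    truncated[:keep, :keep] = coeffs[:keep, :keep]
    return truncated

def decompress_disc_image(coeffs):
    """Invert the DCT to recover the (lossy) disc image."""
    return idctn(coeffs, norm="ortho")
```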
To map a spherical function to a plane, the mapping should be done in two passes, namely, one for the upper hemisphere and one for the lower half. The mapping from a hemisphere to a disc is done by stereographic projection [11]. To project the lower hemisphere, the point of projection C is first placed at the pole of the upper hemisphere and

References (partial list)
Methods of Mathematical Physics.
Light field rendering.
The lumigraph.
The Plenoptic Function and the Elements of Early Vision.