Illuminating Image-based Objects
Tien-Tsin Wong (1), Pheng-Ann Heng (1), Siu-Hang Or (1), Wai-Yin Ng (2)
ttwong@acm.org  pheng@cse.cuhk.edu.hk  shor@cse.cuhk.edu.hk  wyng@ie.cuhk.edu.hk
(1) Department of Computer Science & Engineering, (2) Department of Information Engineering
The Chinese University of Hong Kong
Abstract
In this paper, we present a new scheme of data representation for image-based objects. It allows the illumination to be changed interactively without knowing any geometrical information (e.g. depth or surface normal) of the scene, yet the resulting images are physically correct. The scene is first sampled from different view points and under different illuminations. By treating each pixel on the image plane as a surface element, the sampled images are used to measure the apparent BRDF of each surface element. Two compression schemes, spherical harmonics and discrete cosine transform, are proposed to compress the tabular BRDF data. Whenever the user changes the illumination, a certain number of views are reconstructed. The correct user perspective view is then displayed using the standard texture mapping hardware. Hence, the intensity, the type and the number of the light sources can be manipulated interactively.
CR Categories: I.3.2 [Computer Graphics]: Picture/Image Generation - Digitizing and scanning, viewing algorithms.
Additional keywords: image-based rendering, spherical harmonics, light field, Lumigraph, holographic stereogram, BRDF, illumination.
1. Introduction
Although millions of textured micropolygons can be rendered within a second using state-of-the-art graphics workstations, rendering a realistic complex scene at interactive speed is still difficult. Unlimited complexity of the scene and expensive modeling cost are two major problems. Recently researchers have focused on a new approach to rendering, namely, image-based rendering. This approach breaks the dependency of rendering time on the scene complexity, since the rendering primitives are no longer geometrical entities, but images.
Previous work can be classified into two main streams. The first stream focuses on determining the correct perspective view. Foley et al. [8] developed a system which can rotate raytraced voxel data interactively by view interpolation. However, their interpolation method is not physically correct. Chen and Williams [6] interpolated views by modeling pixel movement, resulting in physically correct interpolation. Later, Chen [4] described an image-based rendering system, QuickTime VR, which generates perspective views from panoramic image data by warping [5]. McMillan and Bishop [14] observed that image-based rendering is a problem of finding and manipulating the plenoptic function [1]. They also proposed a method to sample and reconstruct this plenoptic function. Levoy and Hanrahan [13] and Gortler et al. [9] reduced the 5D plenoptic function to a 4D light field or Lumigraph. This simplification allows the view interpolation to be done by standard texture mapping techniques, which can be further accelerated by hardware. Recently, animated image-based objects have been developed by Live Picture [16].
The second stream of research focuses on re-rendering the scene under different illumination using the sampled images. Haeberli [10] re-rendered the scene using the simple superposition property. However, the direction, the type and the number of the light sources are limited to the lighting setup used when capturing the scene. Nimeroff et al. [15] efficiently re-rendered the scene under various natural illumination (overcast or clear skylight) with the knowledge of the empirical formulæ that model the skylight. Belhumeur and Kriegman [2] determined the basis images of an object with the assumptions that the object is convex and all surfaces are Lambertian. With these assumptions, only three basis images are enough to span the illumination cone of the object, i.e., three images are enough to reconstruct/re-render the scene under various illuminations.
In the first stream of previous work, the illumination of the scene was assumed to be fixed and carefully designed. On the other hand, the view point is assumed fixed in the work of the second stream. In this paper, we present an image-based rendering system which allows the change of view point as well as the change of illumination. All image-based objects can be described as a special form of the plenoptic function. Most previous work assumed that the time parameter t of the plenoptic function was fixed. The method described in this paper can be thought of as an attempt to allow t to vary.
There are two major motivations for this research. Firstly, the variability of the illumination allows the viewer to illuminate only interesting portions of the scene. This improves the viewer's recognition of it. Secondly, it is a step closer to realizing the use of image-based entities (plenoptic function, light field or Lumigraph) as basic rendering entities, just like the geometrical entities used currently in conventional graphics systems.
One major goal of image-based rendering is to minimize the use of geometrical information while generating physically correct images. With this goal in mind, the proposed image-based system allows the viewer to change the scene lighting interactively without knowing geometrical details (say, depth or surface normal) of the scene. The apparent BRDF [12, 19] of each pixel on the image plane is sampled. With these pixel BRDFs, physically correct views of the scene can be reconstructed under different illuminations by substituting different lighting parameter values and viewing directions. The BRDF data representation is described in Section 2. Section 3 describes how the light source can be manipulated once the pixel BRDFs are recorded. Two compression schemes, spherical harmonic transform and discrete cosine transform, are proposed to compress the huge amount of BRDF data. They are discussed in Section 4.
The proposed BRDF representation is general enough to be applied to a wide range of image-based objects, including panoramic images, the plenoptic function, light fields and Lumigraphs. In this paper, we demonstrate how to extend the light field and Lumigraph systems in order to allow illumination changes. The reason for choosing light fields and Lumigraphs is their simplicity and potential to utilize graphics hardware, but this does not imply the proposed BRDF representation is only valid for light-slab based objects. Further discussions and conclusions on the new data representation can be found in Sections 5 and 6 respectively.
2. BRDF Representation
2.1. BRDF of Pixel
The bidirectional reflectance distribution function (BRDF) [12] is the most general form of representing surface reflectivity. To calculate the radiance outgoing from a surface element in a specific direction, the BRDF of this surface element must first be determined. Methods for measuring and modeling the BRDF can be found in various sources [3, 19]. The most straightforward approach to include illumination variability in an image-based rendering system is to measure the BRDF of each object material visible in the image. However, this approach has several drawbacks. While the BRDFs of synthesized object surfaces may be assigned at will, measuring those of all objects in a real scene is tedious and often infeasible. Imagine a scene containing thousands of small stones, each with its own BRDF. The situation worsens when a single object exhibits spatial variability of surface properties. Furthermore, associating a BRDF with each object in the scene causes rendering time to depend on the scene complexity.
One might suggest, for each pixel in each view, to measure the BRDF of the object surface seen through that pixel window. This approach breaks the link to the scene complexity, but introduces an aliasing problem. Consider pixel A in Figure 1: multiple objects are visible through the pixel window. Note that this will frequently happen in images showing distant objects. Even if only one object is visible, there is still the problem of choosing the surface normal for measuring the BRDF when the object silhouette is curved (see pixel B in Figure 1).
Our solution is to treat each pixel on the image plane as a surface element with an apparent BRDF. Imagine the image plane as just a normal planar surface, where each pixel can be regarded as a surface element. Each surface element emits different amounts of radiant energy in different directions under different illuminations. In order to measure the (apparent) BRDF of each pixel, the location of the image plane must be specified (see Figure 2), not just the direction. By recording the BRDF of each pixel (Figure 2), we capture the aggregate reflectance of objects visible through that pixel window. The light vector L from the light source and the viewing vector V from the view point E define the two directions of the BRDF. This approach does not depend on the scene complexity, and removes the aliasing problems above. Moreover, it can be easily integrated in the light slab based data structure [13, 9]. It is also a unified approach for both virtual and real world scenes.

Figure 1. Aliasing problem of measuring object surfaces visible through the pixel windows.

Note that the apparent BRDF represents the response of the object(s) in a pixel to light in each direction, in the presence of the rest of the scene, not merely the surface reflectivity. If we work from views (natural or rendered) that include shadows, therefore, shadows appear in the reconstruction.

Figure 2. Measuring the BRDF of the pixel.
2.2. Measuring BRDF
To measure the BRDF, we have to capture the image of the virtual or real world scene under different illuminations. A directional light source is cast on the scene from different directions. Rendered images and photos of the virtual or real world scene are captured as usual. The algorithm is,

For each view point E
    For each directional light source direction (θ, φ)
        Render the virtual scene or take a photograph of the real world scene illuminated by this directional light source, and name the result I_{E,θ,φ}.

The parameter θ is the polar angle, and φ is the azimuth. The direction (0, φ) is orthogonal to the image plane. The parameters are localized to the image plane coordinate system, so transforming the image plane does not affect the BRDF parameterization. The reason for using a directional light source is that the incident light direction is identical at any 3D point. In real life, a directional light source can be approximated by placing a spotlight at a sufficient distance from the scene. The BRDF ρ̃ of each pixel inside a view can be sampled by the following algorithm,

For each view point E
    For each pixel (s, t)
        ρ̃(θ, φ) = (pixel value at (s, t) of I_{E,θ,φ}) / (intensity of light source)

One assumption is that there is no intervening medium which absorbs, scatters or emits any radiant energy.
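As a minimal sketch (not from the paper), the sampling loop above might be organized as follows in Python; the callback name capture_image, the dictionary layout, and the assumption of a single known light intensity are illustrative only.

```python
import numpy as np

def sample_pixel_urdfs(views, light_dirs, capture_image, light_intensity):
    """Sample the apparent reflectance of every pixel, per view and light direction.

    views           : view points E (hypothetical camera descriptions)
    light_dirs      : list of (theta, phi) directional-light directions
    capture_image   : renders/photographs the scene for a given view and light
                      direction, returning an HxW array of pixel values
    light_intensity : intensity of the directional source used during capture
    """
    urdf = {}  # urdf[E][(theta, phi)] is an HxW table of reflectances
    for E in views:
        urdf[E] = {}
        for (theta, phi) in light_dirs:
            image = capture_image(E, theta, phi)             # I_{E,theta,phi}
            urdf[E][(theta, phi)] = image / light_intensity  # rho(theta, phi)
    return urdf
```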
Note that instead of recording a single 2D array (image plane) of pixel BRDFs, we record a set of 2D arrays of pixel URDFs (described shortly) from multiple view points (E) in our current implementation. Since the viewing direction of each pixel within one specific view of the image plane is fixed, the BRDF ρ̃ is simplified to a unidirectional reflectance distribution function (URDF) which depends on the light vector only. Hence, the function ρ̃ is parameterized by two parameters (θ, φ) only. There are three reasons we store the partial URDF of each pixel in multiple fixed views, instead of a complete BRDF for each pixel of a single image plane. Firstly, with this organization, the compression methods (described in Section 4) are simplified. Secondly, the reconstruction from compressed data is performed only when the lighting changes. No reconstruction is needed when the user changes the view point. This is important for interactive display, since the user changes the view point more often than the illumination. Thirdly, pixels on the same image do not have the same viewing vector V. Resampling would be needed to sample the complete pixel BRDF on a uniform spherical grid. Hence this organization frees us from the resampling process, which may introduce error. From now on, the term view refers to an image of the image plane, viewed from a certain view point (E). When we refer to BRDF, we actually mean the set of partial URDFs in multiple fixed views.
Traditionally, the BRDF is sampled only on the upper hemisphere of the surface element, since the reflectance must be zero if the light source is behind the surface element. However, in our case, the reflectance may be nonzero even if the light source direction is from the back of the image plane. This is because the actual object surface may not align with the image plane (Figure 3). Instead, the whole sphere surrounding the pixel has to be sampled for recording its BRDF. Therefore, the range of θ should be [0, π]. Nevertheless, sampling only the upper hemispherical BRDF is usually sufficient, since the viewer seldom moves the light source to the back of objects.
Figure 3. The image plane may not be parallel
with the object surface.

3. Manipulating the Light Sources
Once the BRDFs are sampled and stored, they can be
manipulated. The final radiance (or simply value) of each
pixel in each view is determined by evaluating equation 1,
given the intensity and the direction of the light sources.
$$\text{value at pixel } (s, t) \text{ in a view } (E) = \sum_{i}^{n} \tilde{\rho}_{E,s,t}(\theta_i, \phi_i)\, I_i, \qquad (1)$$

where n is the total number of light sources, (θ_i, φ_i) specify the direction of the i-th light source L_i, and I_i is the intensity of the i-th light source.
Note that this equation gives a physically correct image, which can be shown as follows. Consider k unoccluded objects, visible through the pixel (s, t), viewed from view point E and illuminated by n light sources. The radiance passing through the pixel window in this view will be

$$\sum_{i}^{n} \rho^{0}_{i} I_i + \sum_{i}^{n} \rho^{1}_{i} I_i + \cdots + \sum_{i}^{n} \rho^{k}_{i} I_i
= \sum_{j}^{k} \rho^{j}_{0} I_0 + \sum_{j}^{k} \rho^{j}_{1} I_1 + \cdots + \sum_{j}^{k} \rho^{j}_{n} I_n
= \rho_0 I_0 + \rho_1 I_1 + \cdots + \rho_n I_n,$$

where ρ^j_i is the reflectance of the j-th object illuminated by light L_i, and

$$\rho_i = \sum_{j=1}^{k} \rho^{j}_{i}$$

is the aggregate reflectance we recorded when measuring the BRDF of the pixel.
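As a rough illustration of equation 1 (not the authors' implementation), one view can be relit with a few array operations. The URDF layout follows the sketch in Section 2.2, and looking up the nearest sampled light direction stands in for the reconstruction from compressed coefficients described in Section 4.

```python
import numpy as np

def relight_view(urdf_view, lights):
    """Evaluate equation 1 for every pixel of one view.

    urdf_view : dict mapping each sampled light direction (theta, phi)
                to an HxW array of apparent reflectances for this view
    lights    : list of (theta_i, phi_i, intensity_i) directional sources
    """
    sampled_dirs = list(urdf_view.keys())
    image = np.zeros_like(next(iter(urdf_view.values())))
    for (theta_i, phi_i, intensity_i) in lights:
        # Nearest sampled direction; the paper instead reconstructs the
        # reflectance from spherical harmonic or DCT coefficients.
        nearest = min(sampled_dirs,
                      key=lambda d: (d[0] - theta_i) ** 2 + (d[1] - phi_i) ** 2)
        image += urdf_view[nearest] * intensity_i
    return image
```

The intensity, direction and number of sources are simply the entries of the lights list, which is why they can be changed interactively.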
Light Direction  With equation 1, the light direction can be changed by substituting a different value of (θ, φ). Figures 12(a) and (b) show a teapot illuminated by a light source from the top and the right respectively.
Light Intensity  Another parameter to manipulate is the intensity of the light source. This can be done by changing the value of I_i for the i-th light source. Figure 13(a) shows the Beethoven statue illuminated by a blue light from the left.
Multiple Light Sources  We can arbitrarily add any number of light sources. The trade-off is the computational time. From equation 1, a new reflectance value ρ̃(θ_i, φ_i) has to be reconstructed from compressed data (described in Section 4) for each light source. Our current prototype can still run at an acceptable interactive speed using up to 3 directional light sources. In Figure 13(b), the Beethoven statue is illuminated by a blue light from the left and a red light from the right simultaneously.
Type of light sources  Up to now, we have made an implicit assumption that the light source for manipulation is directional. Directional light is very efficient when evaluating equation 1, because all pixels on the same image plane are illuminated by the light source from the same direction (θ_i, φ_i). However, the method is not restricted to directional light. It can be extended to point sources and spotlights also. However, it will be more expensive to evaluate equation 1 for other types of light sources, since (θ_i, φ_i) will need to be recalculated from pixel to pixel.
Since the image plane where the pixels are located is only a window in the 3D space (Figure 2), the intersected surface element that actually reflects the light may be located at any point on the ray V in Figure 4. To find the light vector L correctly for other types of light sources, the intersection point of the ray and the object surface has to be located first. Note that there is no such problem for a directional source, since the light vector is the same for all points in 3D space. One way to find L is to use the depth image. While this can be easily done for rendered images, real world scenes may be more difficult. Use of a range scanner may provide a solution. Figures 14(a) and (b) show a box on a plane illuminated by a point source and a directional source respectively. Note the difference in the shadows cast by these sources. However, just as we discussed in Section 2, there is an aliasing problem in finding the correct positions of intersecting objects. Imagine a scene of a furry teddy bear; thousands of objects may be visible through one pixel window.
Figure 4. Finding the correct light vector.
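For a point source or spotlight, the per-pixel light direction might be computed as in the following sketch. The depth image, the camera description, and the choice of the image-plane normal as the z axis are assumptions made for illustration; the paper only notes that a depth image or a range scanner can supply the intersection points.

```python
import numpy as np

def per_pixel_light_dirs(depth, view_point, ray_dirs, light_pos):
    """Per-pixel (theta, phi) toward a point light source.

    depth      : HxW distances along each viewing ray V
    view_point : 3-vector view point E
    ray_dirs   : HxWx3 unit viewing-ray directions (image-plane frame)
    light_pos  : 3-vector position of the point source
    """
    # Intersection point of each viewing ray with the object surface.
    points = np.asarray(view_point) + depth[..., None] * ray_dirs
    # Light vector L from the surface point toward the source.
    L = np.asarray(light_pos) - points
    L = L / np.linalg.norm(L, axis=-1, keepdims=True)
    # Polar angle from the image-plane normal (assumed z), azimuth in the plane.
    theta = np.arccos(np.clip(L[..., 2], -1.0, 1.0))
    phi = np.arctan2(L[..., 1], L[..., 0]) % (2.0 * np.pi)
    return theta, phi
```

For a directional source this per-pixel work is unnecessary, since L is constant over the image.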
4. Compression
Storing the whole BRDF tables requires an enormous amount of storage space. For a single pixel, if the URDF is sampled in the polar coordinate system with 20 samples along both the θ and φ coordinates, there will be 400 floating point numbers stored for each pixel. A single view of a 256 × 256 image plane will require 100 MB of storage.
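For concreteness, assuming 4-byte floating point numbers (the word size is not stated in the paper):

$$256 \times 256 \times 20 \times 20 \times 4~\text{bytes} = 104{,}857{,}600~\text{bytes} \approx 100~\text{MB}.$$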

To represent the BRDF more efficiently, the tabular data is transformed to the frequency domain and quantization is performed to reduce storage. We have tested two types of transforms, the spherical harmonic transform and the discrete cosine transform.
4.1. Spherical Harmonics
Spherical harmonics [7] are analogous to a Fourier series in the spherical domain. Cabral et al. [3] proposed the representation of BRDFs using spherical harmonics. The work was further extended by Sillion et al. [18] to model the entire range of incident angles. It is especially suitable for representing smooth spherical functions. In our approach, the viewing direction V for each pixel is actually fixed. Hence, the function ρ̃ can be transformed to the spherical harmonic domain directly using the following equations, without considering how to represent a bidirectional function as described in [18].
$$C_{l,m} = \int_{0}^{2\pi} \int_{0}^{\pi} \tilde{\rho}(\theta, \phi)\, Y_{l,m}(\theta, \phi) \sin\theta \, d\theta \, d\phi,$$

where

$$Y_{l,m}(\theta, \phi) =
\begin{cases}
N_{l,m}\, P_{l,m}(\cos\theta) \cos(m\phi) & \text{if } m > 0 \\
N_{l,0}\, P_{l,0}(\cos\theta) / \sqrt{2} & \text{if } m = 0 \\
N_{l,m}\, P_{l,|m|}(\cos\theta) \sin(|m|\phi) & \text{if } m < 0,
\end{cases}$$

$$N_{l,m} = \sqrt{\frac{2l+1}{2\pi}\,\frac{(l-|m|)!}{(l+|m|)!}},$$

and

$$P_{l,m}(x) =
\begin{cases}
(1-2m)\sqrt{1-x^{2}}\, P_{m-1,m-1}(x) & \text{if } l = m \\
x\,(2m+1)\, P_{m,m}(x) & \text{if } l = m+1 \\
x\,\frac{2l-1}{l-m}\, P_{l-1,m}(x) - \frac{l+m-1}{l-m}\, P_{l-2,m}(x) & \text{otherwise,}
\end{cases}$$

where the base case is P_{0,0}(x) = 1.
The C_{l,m}'s are the coefficients of the spherical harmonics which are going to be stored for each pixel. The more coefficients are used, the more accurate the spherical harmonic representation is. Accuracy also depends on the number of samples in the (θ, φ) space. We found 16 to 25 coefficients are sufficient in most of our tested examples.
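Under the same assumption of 4-byte floats as before, storing 25 coefficients instead of the 400 raw samples per pixel reduces a single 256 × 256 view from roughly 100 MB to

$$256 \times 256 \times 25 \times 4~\text{bytes} = 6{,}553{,}600~\text{bytes} \approx 6.25~\text{MB},$$

a 16:1 reduction before any quantization.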
To reconstruct the reflectance given the light vector (θ, φ), the following equation is evaluated for each pixel in each view:

$$\tilde{\rho}(\theta, \phi) = \sum_{l=0}^{l_{\max}} \sum_{m=-l}^{l} C_{l,m}\, Y_{l,m}(\theta, \phi), \qquad (2)$$

where (l_max)^2 is the number of spherical harmonic coefficients to be used.
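A small numerical sketch of this projection and of equation 2 follows; the uniform (θ, φ) grid, the plain Riemann-sum quadrature, and the recursive evaluation of the basis functions are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from math import factorial

def assoc_legendre(l, m, x):
    """Associated Legendre polynomial P_{l,m} via the recurrence given above."""
    if l == 0 and m == 0:
        return np.ones_like(x)
    if l == m:
        return (1 - 2 * m) * np.sqrt(1 - x ** 2) * assoc_legendre(m - 1, m - 1, x)
    if l == m + 1:
        return x * (2 * m + 1) * assoc_legendre(m, m, x)
    return (x * (2 * l - 1) * assoc_legendre(l - 1, m, x)
            - (l + m - 1) * assoc_legendre(l - 2, m, x)) / (l - m)

def Y(l, m, theta, phi):
    """Real spherical harmonic basis function Y_{l,m}(theta, phi)."""
    N = np.sqrt((2 * l + 1) / (2 * np.pi)
                * factorial(l - abs(m)) / factorial(l + abs(m)))
    if m > 0:
        return N * assoc_legendre(l, m, np.cos(theta)) * np.cos(m * phi)
    if m == 0:
        return N * assoc_legendre(l, 0, np.cos(theta)) / np.sqrt(2)
    return N * assoc_legendre(l, abs(m), np.cos(theta)) * np.sin(abs(m) * phi)

def project(urdf, thetas, phis, l_max):
    """C_{l,m} coefficients of one pixel's sampled URDF on a (theta, phi) grid."""
    d_theta, d_phi = np.pi / len(thetas), 2.0 * np.pi / len(phis)
    T, PHI = np.meshgrid(thetas, phis, indexing="ij")
    return {(l, m): np.sum(urdf * Y(l, m, T, PHI) * np.sin(T)) * d_theta * d_phi
            for l in range(l_max + 1) for m in range(-l, l + 1)}

def reconstruct(coeffs, theta, phi):
    """Equation 2: reflectance for a new light direction (theta, phi)."""
    return sum(c * Y(l, m, theta, phi) for (l, m), c in coeffs.items())
```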
Figure 5 shows the sampled reflectance distribution of a pixel on the left and its corresponding reconstructed distribution on the right. There are 900 samples (30 along φ in the range [0, 2π] and 60 along θ) in the original distribution on the left. The reconstructed distribution on the right is represented by 25 spherical harmonic coefficients only.

Figure 5. Original sampled (left) and reconstructed (right) distribution. Note the lower hemisphere of the reconstructed distribution is interpolated to prevent discontinuity.
4.2. Discrete Cosine Transform
Although spherical harmonics can efficiently represent smooth spherical functions, they are inferior at representing discontinuous functions, which are quite common if the scene contains shadows. This phenomenon motivates us to find another solution for data compression.
The second compression scheme we have tested is the discrete cosine transform (DCT). One reason to choose the DCT is that hardware DCT codecs are becoming widely available. As before, we do not compress the four-dimensional BRDFs. Instead, the two-dimensional URDFs are compressed. Since the URDF is a spherical function, it is first mapped to a 2D disc (Figure 6), before applying the standard 2D discrete cosine transform to the disc image.
Figure 6. Mapping a hemisphere to a disc.
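As a hedged sketch of this step, one hemisphere of the URDF can be projected onto the unit disc and compressed with a standard 2D DCT; the exact projection convention, the disc resampling (not shown), and the use of SciPy's DCT routines are assumptions for illustration.

```python
import numpy as np
from scipy.fft import dctn, idctn

def hemisphere_to_disc(theta, phi):
    """Stereographic projection of a hemisphere direction onto the unit disc.

    Projecting from the opposite pole maps the pole of the hemisphere to the
    disc centre and the rim (theta = pi/2) to the unit circle.
    """
    r = np.tan(theta / 2.0)
    return r * np.cos(phi), r * np.sin(phi)

def compress_disc_image(disc_image, keep):
    """Keep only the keep x keep lowest-frequency DCT coefficients."""
    coeffs = dctn(disc_image, norm="ortho")
    truncated = np.zeros_like(coeffs)
    truncated[:keep, :keep] = coeffs[:keep, :keep]
    return truncated

def decompress_disc_image(coeffs):
    """Invert the DCT to recover the (lossy) disc image."""
    return idctn(coeffs, norm="ortho")
```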
To map a spherical function to a plane, the mapping should be done in two passes, namely, one for the upper hemisphere and one for the lower half. The mapping from a hemisphere to a disc is done by stereographic projection [11]. To project the lower hemisphere, the point of projection C is first placed at the pole of the upper hemisphere and

References (partial list)
Methods of Mathematical Physics.
Light field rendering.
The lumigraph.
The Plenoptic Function and the Elements of Early Vision.