
Structured light 3D scanning in the presence of global illumination

TL;DR: This paper analyzes the errors caused by global illumination in structured light-based shape recovery and designs structured light patterns that are resilient to individual global illumination effects using simple logical operations and tools from combinatorial mathematics.
Abstract: Global illumination effects such as inter-reflections, diffusion and sub-surface scattering severely degrade the performance of structured light-based 3D scanning. In this paper, we analyze the errors caused by global illumination in structured light-based shape recovery. Based on this analysis, we design structured light patterns that are resilient to individual global illumination effects using simple logical operations and tools from combinatorial mathematics. Scenes exhibiting multiple phenomena are handled by combining results from a small ensemble of such patterns. This combination also allows us to detect any residual errors that are corrected by acquiring a few additional images. Our techniques do not require explicit separation of the direct and global components of scene radiance and hence work even in scenarios where the separation fails or the direct component is too low. Our methods can be readily incorporated into existing scanning systems without significant overhead in terms of capture time or hardware. We show results on a variety of scenes with complex shape and material properties and challenging global illumination effects.

Summary (2 min read)

1. Introduction

  • Structured light triangulation has become the method of choice for shape measurement in several applications including industrial automation, graphics, human-computer interaction and surgery.
  • Most structured light techniques make an important assumption: scene points receive illumination only directly from the light source.
  • Imagine a robot trying to navigate an underground cave or an indoor scenario, a surgical instrument inside human body, a robotic arm sorting a heap of metallic machine parts, or a movie director wanting to image the face of an actor.
  • The authors show that the types and magnitude of errors depend on the region of influence of global illumination at any scene point.
  • A simple voting scheme computes the correct decoding at every pixel, without any prior knowledge about the types of effects in the scene (Figure 1d).

3. Errors due to Global Illumination

  • The type and magnitude of errors due to global illumination depends on the spatial frequencies of the patterns and the global illumination effect.
  • For a scene point Si, its irradiances Li (under the pattern) and L̄i (under the inverse pattern) are compared.
  • In the following, the authors analyze the errors in the binarization process due to various global illumination effects and defocus, leading to systematic errors2.
  • Such a situation can commonly arise due to long-range inter-reflections, when scenes are illuminated with low-frequency patterns.

4. Patterns for Error Prevention

  • Errors due to global illumination are systematic, scene-dependent errors that are hard to eliminate in post-processing. (Errors for the particular case of laser range scanning of translucent materials are analyzed in [7].)
  • The authors design patterns that modulate global illumination and prevent errors from happening at capture time itself.
  • Thus, if the authors use low frequency patterns for short-range effects, the global component actually helps in correct decoding even when the direct component is low (Figure 3).
  • For short-range effects, the authors want patterns with only low frequencies (high minimum stripe-widths).

4.1. Logical coding-decoding for long range effects

  • The authors introduce the concept of logical coding and decoding to design patterns with only high frequencies.
  • Thus, the authors can bypass explicitly computing the direct component.
  • If the authors use the last Gray code pattern (stripe width of 2) as the base pattern, all the projected patterns have a maximum width of 2.
  • Similarly, if the authors use the second-last pattern as the base-plane, they get the XOR-04 codes (Figure 5).

4.2. Maximizing the minimum stripe-widths for short-range effects

  • Short-range effects can severely blur the high-frequency base plane of the logical XOR codes.
  • The 10-bit Gray code with the maximum known min-SW (8) is given by Goddyn et al. [6]; by contrast, it is easy to generate codes with small maximum stripe-width (as low as 9, compared to 512 for conventional Gray codes) by performing a brute-force search.
  • In comparison, conventional Gray codes have a min-SW of 2.
  • Kim et al . [11] used a variant of Gray codes with large min-SW called the antipodal Gray codes to mitigate errors due to defocus.
  • Thus, these codes can be used in the presence of short-range effects as well.

4.3. Ensemble of codes for general scenes

  • Global illumination in most real world scenes is not limited to either short or long range effects.
  • For phase-shifting, the authors project 18 patterns (3 frequencies, 6 shifts for each frequency).
  • On the other hand, scene points where only the two Gray codes agree correspond to translucent materials (sub-surface scattering).
  • Conventional Gray codes and phase-shifting result in large errors.
  • For this scene, the logical codes (optimized for long-range interactions) alone are sufficient to achieve a nearly error-free reconstruction, without using the full ensemble.
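The voting idea can be sketched per pixel. A minimal illustration (the agreement tolerance and the column estimates below are hypothetical, not values from the paper):

```python
def vote(estimates, tol=1):
    """Per-pixel voting over an ensemble of projector-column estimates:
    return the first value on which at least two codes agree (within
    tol), or None if no two agree (the pixel is marked as an error)."""
    for i in range(len(estimates)):
        for j in range(i + 1, len(estimates)):
            if abs(estimates[i] - estimates[j]) <= tol:
                return estimates[i]
    return None

# Hypothetical estimates from four codes at one pixel: three agree.
assert vote([412, 413, 190, 412]) == 412
# No two estimates agree: flag the pixel for error correction.
assert vote([10, 50, 90, 130]) is None
```

Because erroneous decodings from different codes are unlikely to coincide, agreement between any two codes is taken as evidence of a correct value.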

5. Error detection and correction

  • The patterns presented in the previous section can successfully prevent a large fraction of errors.
  • For highly challenging scenes, however, some errors might still be made (for example, see Figure 8).
  • Conventional error-correcting schemes cannot handle the systematic errors caused by global illumination.
  • Error detection: camera pixels where no two codes in the ensemble agree are marked as error pixels.
  • If the direct component is low (for example, in the presence of sub-surface scattering), this technique may not converge.
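The consistency check used for error detection can be sketched over whole decoded maps. A minimal NumPy illustration (the array values are hypothetical):

```python
import numpy as np

def error_mask(decodings, tol=1):
    """Consistency check over an ensemble of decoded column maps
    (each an H x W integer array): a pixel is an error pixel if no
    two codes agree within tol."""
    stack = np.stack(decodings).astype(np.int64)
    agree = np.zeros(stack.shape[1:], dtype=bool)
    for i in range(len(decodings)):
        for j in range(i + 1, len(decodings)):
            agree |= np.abs(stack[i] - stack[j]) <= tol
    return ~agree

# Hypothetical 2x2 decodings from three codes.
a = np.array([[100, 100], [100, 100]])
b = np.array([[100, 200], [101, 300]])
c = np.array([[100, 250], [500, 400]])
print(error_mask([a, b, c]).tolist())   # → [[False, True], [False, True]]
```

Pixels flagged True would then be re-acquired with a few additional images, as the error-correction step describes.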

6. Limitations

  • The authors' methods assume a single dominant mode of light transport for every scene point.
  • The authors thank Jay Thornton, Joseph Katz, John Barnwell and Haruhisa Okuda (Mitsubishi Electric Japan) for their help and support.


Structured Light 3D Scanning in the Presence of Global Illumination
Mohit Gupta
, Amit Agrawal
, Ashok Veeraraghavan
and Srinivasa G. Narasimhan
Robotics Institute, Carnegie Mellon University, Pittsburgh, USA
Mitsubishi Electric Research Labs, Cambridge, USA
Abstract
Global illumination effects such as inter-reflections, diffusion and sub-surface scattering severely degrade the performance of structured light-based 3D scanning. In this paper, we analyze the errors caused by global illumination in structured light-based shape recovery. Based on this analysis, we design structured light patterns that are resilient to individual global illumination effects using simple logical operations and tools from combinatorial mathematics. Scenes exhibiting multiple phenomena are handled by combining results from a small ensemble of such patterns. This combination also allows us to detect any residual errors that are corrected by acquiring a few additional images.

Our techniques do not require explicit separation of the direct and global components of scene radiance and hence work even in scenarios where the separation fails or the direct component is too low. Our methods can be readily incorporated into existing scanning systems without significant overhead in terms of capture time or hardware. We show results on a variety of scenes with complex shape and material properties and challenging global illumination effects.
1. Introduction
Structured light triangulation has become the method of choice for shape measurement in several applications including industrial automation, graphics, human-computer interaction and surgery. Since the early work in the field about 40 years ago [18, 12], research has been driven by two factors: reducing the acquisition time and increasing the depth resolution. Significant progress has been made on both fronts (see the survey by Salvi et al. [16]), as demonstrated by systems which can recover shapes at close to 1000 Hz [21] and at a depth resolution better than 30 microns [5].

Despite these advances, most structured light techniques make an important assumption: scene points receive illumination only directly from the light source. For many real world scenarios, this is not true. Imagine a robot trying to navigate an underground cave or an indoor scenario, a surgical instrument inside the human body, a robotic arm sorting a heap of metallic machine parts, or a movie director wanting to image the face of an actor. In all these settings, scene points receive illumination indirectly in the form of inter-reflections, sub-surface or volumetric scattering. Such effects, collectively termed global or indirect illumination¹, often dominate the direct illumination and strongly depend on the shape and material properties of the scene. Not accounting for these effects results in large and systematic errors in the recovered shape (see Figure 1b).

The goal of this paper is to build an end-to-end system for structured light scanning under a broad range of global illumination effects. We begin by formally analyzing errors caused due to different global illumination effects. We show that the types and magnitude of errors depend on the region of influence of global illumination at any scene point. For instance, some scene points may receive global illumination only from a local neighborhood (sub-surface scattering). We call these short-range effects. Some points may receive global illumination from a larger region (inter-reflections or diffusion). We call these long range effects.

The key idea is to design patterns that modulate global illumination and prevent the errors at capture time itself. Short and long range effects place contrasting demands on the patterns. Whereas low spatial frequency patterns are best suited for short range effects, long range effects require the patterns to have high frequencies. Since most currently used patterns (e.g., binary and sinusoidal codes) contain a combination of both low and high spatial frequencies, they are ill-equipped to prevent errors. We show that such patterns can be converted to those with only high frequencies by applying simple logical operations, making them resilient to long range effects. Similarly, we use tools from combinatorial mathematics to design patterns consisting solely of frequencies that are low enough to make them resilient to short range effects.

But how do we handle scenes that exhibit more than one type of global illumination effect (such as the one in Figure 1a)? To answer this, we observe that it is highly unlikely for two different patterns to produce the same erroneous decoding. This observation allows us to project a small ensemble of patterns and use a simple voting scheme to compute the correct decoding at every pixel, without any prior knowledge about the types of effects in the scene (Figure 1d). For very challenging scenes, we present an error detection scheme based on a simple consistency check over the results of the individual codes in the ensemble. Finally, we present an error correction scheme by collecting a few additional images. We demonstrate accurate reconstructions on scenes with complex geometry and material properties, such as shiny brushed metal, translucent wax and marble, and thick plastic diffusers (like shower curtains).

Our techniques do not require explicit separation of the direct and global components of scene radiance and hence work even in scenarios where the separation fails (e.g., strong inter-reflections among metallic objects) or where the direct component is too low and noisy (e.g., translucent objects or in the presence of defocus). Our techniques consistently outperform many traditional coding schemes and techniques which require explicit separation of the global component, such as modulated phase-shifting [4]. Our methods are simple to implement and can be readily incorporated into existing systems without significant overhead in terms of acquisition time or hardware.

¹Global illumination should not be confused with the oft-used "ambient illumination" that is subtracted by capturing an image with the structured light source turned off.

Figure 1. Measuring shape for the 'bowl on marble-slab' scene: (a) bowl on a translucent marble slab; (b) conventional Gray codes (11 images); (c) modulated phase shifting [4] (162 images); (d) our ensemble codes (41 images); (e) error map for the codes. This scene is challenging because of strong inter-reflections inside the concave bowl and sub-surface scattering on the translucent marble slab. (b-d) Shape reconstructions; parentheses contain the number of input images. (b) Conventional Gray codes result in incorrect depths due to inter-reflections. (c) Modulated phase-shifting results in errors on the marble slab because of the low direct component. (d) Our technique uses an ensemble of codes optimized for individual light transport effects, and results in the best shape reconstruction. (e) By analyzing the errors made by the individual codes, we can infer qualitative information about light transport. Points marked in green correspond to translucent materials. Points marked in light-blue receive heavy inter-reflections. Maroon points do not receive much global illumination. For more results and detailed comparisons to existing techniques, please see the project web-page [1].
2. Related Work
In this section, we summarize the works that address the problem of shape recovery under global illumination. The seminal work of Nayar et al. [13] presented an iterative approach for reconstructing the shape of Lambertian objects in the presence of inter-reflections. Gupta et al. [8] presented methods for recovering depths using projector defocus [20] under global illumination effects. Chandraker et al. [2] use inter-reflections to resolve the bas-relief ambiguity inherent in shape-from-shading techniques. Holroyd et al. [10] proposed an active multi-view stereo technique where high-frequency illumination is used as scene texture that is invariant to global illumination. Park et al. [15] move the camera or the scene to mitigate the errors due to global illumination in a structured light setup. Hermans et al. [9] use a moving projector in a variant of structured light triangulation. The depth measure used in this technique (frequency of the intensity profile at each pixel) is invariant to global light transport effects. In this paper, our focus is on designing structured light systems while avoiding the overhead due to moving components.

Recently, it was shown that the direct and global components of scene radiance could be efficiently separated [14] using high-frequency illumination patterns. This has led to several attempts to perform structured light scanning under global illumination [3, 4]. All these techniques rely on subtracting or reducing the global component and apply conventional approaches on the residual direct component. While these approaches have shown promise, there are three issues that prevent them from being applicable broadly: (a) the direct component estimation may fail due to strong inter-reflections (as with shiny metallic parts), (b) the residual direct component may be too low and noisy (as with translucent surfaces, milk and murky water), and (c) they require a significantly higher number of images than traditional approaches, or rely on weak cues like polarization. In contrast, we explicitly design ensembles of illumination patterns that are resilient to a broader range of global illumination effects, using significantly fewer images.
3. Errors due to Global Illumination
The type and magnitude of errors due to global illumination depends on the spatial frequencies of the patterns and the global illumination effect. As shown in Figures 2 and 3, long range effects and short range effects result in incorrect decoding of low and high spatial frequency patterns, respectively. In this section, we formally analyze these errors. For ease of exposition, we have focused on binary patterns. The analysis and techniques are easily extended to N-ary codes.

Binary patterns are decoded by binarizing the captured images into projector-illuminated vs. non-illuminated pixels. A robust way to do this is to capture two images L and L̄, under the pattern P and the inverse pattern P̄, respectively. For a scene point S_i, its irradiances L_i and L̄_i are compared. If L_i > L̄_i, then the point is classified as directly lit. A fundamental assumption for correct binarization is that each scene point receives irradiance from only a single illumination element (light stripe or a projector pixel). However, due to global illumination effects and projector defocus, a scene point can receive irradiance from multiple projector pixels, resulting in incorrect binarization.

Figure 2. Errors due to inter-reflections. First row: conventional coding and decoding. (a) A concave V-groove; the center edge is concave. (b) Low frequency pattern. (c-d) Images captured with pattern (b) and its inverse, respectively. Point S is directly illuminated in (c); however, because of inter-reflections, its intensity is higher in (d), resulting in a decoding error. (e) Decoded bit plane; points decoded as one (directly illuminated) and zero (not illuminated) are marked in yellow and blue, respectively. In the correct decoding, only the points to the right of the concave edge should be one, and the rest zero. (k) Depth map computed with the conventional codes (mean absolute error = 28.8 mm); because of incorrect binarization of the low frequency patterns (higher-order bits), the depth map has large errors. Second row: logical coding and decoding (Section 4.1). (f-g) The pattern in (b) is decomposed into two high-frequency patterns. (h-i) Binarization of images captured with (f-g), respectively. (j) Binary decoding under (b) computed by taking the pixel-wise XOR of (h) and (i). (l) Depth map computed using logical coding-decoding (mean absolute error = 1.4 mm); the errors have been nearly completely removed. (m) Comparison with the ground truth along the dotted lines in (k-l); ground truth was computed by manually binarizing the captured images.
In the following, we derive the condition for correct binarization in the presence of global illumination and defocus. Suppose S_i is directly lit under a pattern P. The irradiances L_i and L̄_i are given as:

  L_i = L_i^d + β L_i^g,   (1)
  L̄_i = (1 − β) L_i^g,   (2)

where L_i^d and L_i^g are the direct and global components of the irradiance at S_i when the scene is fully lit, and β is the fraction of the global component received under the pattern P. In the presence of projector defocus, S_i receives fractions of the direct component both under the pattern and its inverse [8]:

  L_i = α L_i^d + β L_i^g,   (3)
  L̄_i = (1 − α) L_i^d + (1 − β) L_i^g.   (4)

The fractions (α and 1 − α) depend on the projected pattern and the amount of defocus. In the absence of defocus, α = 1. For correct binarization, it is required that L_i > L̄_i, i.e.,

  α L_i^d + β L_i^g > (1 − α) L_i^d + (1 − β) L_i^g.   (5)

This condition is satisfied in the absence of global illumination (L_i^g = 0) and defocus (α = 1). In the following, we analyze the errors in the binarization process due to various global illumination effects and defocus, leading to systematic errors².
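The binarization condition (Eqn. 5) is easy to probe numerically. A minimal sketch, with the function and its test values being our own illustration rather than code from the paper:

```python
def binarize(L_d, L_g, alpha=1.0, beta=0.0):
    """Classify a scene point as directly lit iff L_i > L̄_i, with
    L_i = α·L_d + β·L_g and L̄_i = (1−α)·L_d + (1−β)·L_g (Eqns. 3–5)."""
    L = alpha * L_d + beta * L_g
    L_bar = (1 - alpha) * L_d + (1 - beta) * L_g
    return L > L_bar

# No global illumination, no defocus: classification is correct.
assert binarize(L_d=1.0, L_g=0.0, alpha=1.0, beta=0.0)

# Long-range inter-reflections under a low-frequency pattern: the point
# is directly lit, but beta ≈ 0 and L_g > L_d flip the comparison,
# producing exactly the binarization error analyzed below.
assert not binarize(L_d=0.3, L_g=1.0, alpha=1.0, beta=0.0)
```

The second case reproduces the long-range failure mode: the global component arrives mostly under the inverse pattern, so the inverse image is brighter.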
Long range effects (diffuse and specular inter-reflections): Consider the scenario when S_i receives a major fraction of the global component when it is not directly lit (β ≈ 0), and the global component is larger than the direct component (L_i^d < L_i^g) as well. Substituting in the binarization condition (Eqn. 5), we get L_i < L̄_i, which results in a binarization error. Such a situation can commonly arise due to long-range inter-reflections, when scenes are illuminated with low-frequency patterns. For example, consider the v-groove concavity shown in Figure 2. Under a low frequency pattern, several scene points in the concavity are brighter when they are not directly lit, resulting in a binarization error. Since the low frequency patterns correspond to the higher-order bits, this results in a large error in the recovered shape.
Short-range effects (sub-surface scattering and defocus): Short range effects result in low-pass filtering of the incident illumination. In the context of structured light, these effects can severely blur the high-frequency patterns, making it hard to correctly binarize them. This can be explained in terms of the binarization condition in Eqn. 5. For high frequency patterns, β ≈ 0.5 [14]. If the difference in the direct terms |α L_i^d − (1 − α) L_i^d| is small, either because the direct component is low due to sub-surface scattering (L_i^d ≈ 0) or because of severe defocus (α ≈ 0.5), the pattern can not be robustly binarized due to low signal-to-noise-ratio (SNR). An example is shown in Figure 3. For conventional Gray codes, this results in a loss of depth resolution, as illustrated in Figure 4.
4. Patterns for Error Prevention
Errors due to global illumination are systematic, scene-dependent errors that are hard to eliminate in post-processing. In this section, we design patterns that modulate global illumination and prevent errors from happening at capture time itself. In the presence of only long range effects and no short-range effects, high-frequency binary patterns (with equal off and on pixels) are decoded correctly because β ≈ 0.5 [14], as shown in Figures 2(f-i). On the other hand, in the presence of short-range effects, most of the global illumination comes from a local neighborhood. Thus, for low frequency patterns, when a scene point is directly illuminated, most of its local neighborhood is directly illuminated as well. Hence, α ≈ 0.5 and β ≈ 0.5. Thus, if we use low frequency patterns for short-range effects, the global component actually helps in correct decoding even when the direct component is low (Figure 3).

Because of the contrasting requirements on spatial frequencies, it is clear that we need different codes for different effects. For long range effects, we want patterns with only high frequencies (low maximum stripe-widths). For short-range effects, we want patterns with only low frequencies (high minimum stripe-widths). But most currently used patterns contain a combination of both low and high spatial frequencies. How do we design patterns with only low or only high frequencies? We show that by performing simple logical operations, it is possible to design codes with only high frequency patterns. For short range effects, we draw on tools from the combinatorial maths literature to design patterns with large minimum stripe-widths.

²Errors for the particular case of laser range scanning of translucent materials are analyzed in [7]. Errors due to sensor noise and spatial mis-alignment of projector-camera pixels were analyzed in [17].

Figure 3. Errors due to sub-surface scattering: (a) This scene consists of a translucent slab of marble on the left and an opaque plane on the right. (b) A high frequency pattern is severely blurred on the marble, and can not be binarized correctly (c). The image captured (d) under a low-frequency pattern can be binarized correctly (e).

Figure 4. Depth computation under defocus: (a) A scene consisting of industrial parts. (b) Due to defocus, the high frequency patterns in the conventional Gray codes can not be decoded, resulting in a loss of depth resolution; notice the quantization artifacts. (c) Depth map computed using Gray codes with large minimum stripe-width (min-SW) does not suffer from loss of depth resolution. (d) Comparison of the depth profiles for conventional Gray codes and our large min-SW Gray codes.
4.1. Logical coding-decoding for long range effects
We introduce the concept of logical coding and decoding to design patterns with only high frequencies. An example of logical coding-decoding is given in Figure 2. The important observation is that for structured light decoding, the direct component is just an intermediate representation, with the eventual goal being the correct binarization of the captured image. Thus, we can bypass explicitly computing the direct component. Instead, we can model the binarization process as a scene-dependent function from the set of binary projected patterns (P) to the set of binary classifications of the captured image (B):

  f : P → B.   (6)

For a given pattern P ∈ P, this function returns a binarization of the captured image if the scene is illuminated by P. Under inter-reflections, this function can be computed robustly for high-frequency patterns but not for low-frequency patterns. For a low frequency pattern P_lf, we would like to decompose it into two high-frequency patterns P¹_hf and P²_hf using a pixel-wise binary operator ⊙ such that:

  f(P_lf) = f(P¹_hf ⊙ P²_hf) = f(P¹_hf) ⊙ f(P²_hf).   (7)

If we find such a decomposition, we can robustly compute the binarizations f(P¹_hf) and f(P²_hf) under the two high frequency patterns, and compose these to achieve the correct binarization f(P_lf) under the low frequency pattern. Two questions remain: (a) What binary operator can be used? (b) How can we decompose a low frequency pattern into two high frequency patterns? For the binary operator, we choose the logical XOR (⊕) because it has the following property:

  P²_hf ⊕ P¹_hf = P_lf  ⟺  P²_hf = P_lf ⊕ P¹_hf.   (8)

This choice of operator provides a simple means to decompose P_lf. We first choose a high-frequency pattern P¹_hf. The second pattern P²_hf is then computed by simply taking the pixel-wise logical XOR of P_lf and P¹_hf. We call the first high frequency pattern the base pattern. Instead of projecting the original low frequency patterns, we project the base pattern P¹_hf and the second high-frequency patterns P²_hf. For example, if we use the last Gray code pattern (stripe width of 2) as the base pattern, all the projected patterns have a maximum width of 2. We call these the XOR-02 codes. In contrast, the original Gray codes have a maximum stripe-width of 512. Note that there is no overhead introduced; the number of projected patterns remains the same as for the conventional codes. Similarly, if we use the second-last pattern as the base plane, we get the XOR-04 codes (Figure 5). The pattern images can be downloaded from the project web-page [1].

Figure 5. Different codes: The range of stripe-widths for conventional Gray codes is [2, 512]. For XOR-04 codes (optimized for long range effects) and Gray codes with maximized min-SW (optimized for short-range effects), the ranges are [2, 4] and [8, 32], respectively.
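The XOR decomposition of Eqn. 8 is straightforward to implement. A minimal NumPy sketch of the XOR-02 construction, assuming column-stripe Gray code patterns (the function names are ours, not from the paper):

```python
import numpy as np

def gray_code_patterns(n_bits=10):
    """Column-stripe Gray code patterns for 2**n_bits projector columns;
    patterns[k][x] is the k-th projected bit at column x (k=0 coarsest)."""
    cols = np.arange(2 ** n_bits)
    gray = cols ^ (cols >> 1)
    return [(gray >> (n_bits - 1 - k)) & 1 for k in range(n_bits)]

patterns = gray_code_patterns()
base = patterns[-1]                    # finest Gray pattern (stripe width 2)

# Projected XOR-02 set: every other pattern is XORed with the base
# pattern; the base itself is projected unchanged (Section 4.1).
xor02 = [p ^ base for p in patterns[:-1]] + [base]

# Decoding: XOR each captured binarization with the base binarization
# to recover the original low-frequency bit (Eqn. 8).
recovered = [q ^ base for q in xor02[:-1]] + [base]
assert all(np.array_equal(r, p) for r, p in zip(recovered, patterns))
```

In a real system the XOR at decode time is applied to the binarized camera images rather than to the patterns themselves; the identity (p ⊕ base) ⊕ base = p is what makes the recovery exact.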
4.2. Maximizing the minimum stripe-widths for short-range effects

Short-range effects can severely blur the high-frequency base plane of the logical XOR codes. The resulting binarization error will propagate to all the decoded patterns. Thus, for short-range effects, we need to design codes with large minimum stripe-width (min-SW). It is not feasible to find such codes with a brute-force search, as these codes are extremely rare³. Fortunately, this problem has been well studied in combinatorial mathematics, and there are constructions available to generate codes with large min-SW. For instance, the 10-bit Gray code with the maximum known min-SW (8) is given by Goddyn et al. [6].

³On the contrary, it is easy to generate codes with small maximum stripe-width (9), as compared to 512 for the conventional Gray codes, by performing a brute-force search.
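The stripe-width ranges quoted in Figure 5 for conventional Gray codes can be checked with a short script. A sketch (our own illustration; the boundary half-stripes at the pattern edges are excluded from the run-length count):

```python
import numpy as np

def stripe_widths(pattern):
    """Interior run lengths of a 1-D binary stripe pattern (the two
    boundary runs, which may be half-width, are excluded)."""
    change = np.flatnonzero(np.diff(pattern)) + 1
    return np.diff(change)

cols = np.arange(1024)                  # 10-bit projector columns
gray = cols ^ (cols >> 1)
widths = [w for k in range(10)
          for w in stripe_widths((gray >> (9 - k)) & 1)]
print(min(widths), max(widths))         # → 2 512
```

The same routine can score candidate codes in a search for large min-SW, which is exactly the quantity the constructions above maximize.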

Citations
Journal ArticleDOI
TL;DR: Methods for CAD model generation from digital data acquisition, motion capture, assembly modeling, human–computer interface, and data exchange between a CAD system and a VR/AR system are described.

143 citations

Proceedings ArticleDOI
16 Jun 2012
TL;DR: This work considers the problem of shape recovery for real world scenes, where a variety of global illumination and illumination defocus effects are present, and introduces a structured light technique called Micro Phase Shifting, which overcomes these problems.
Abstract: We consider the problem of shape recovery for real world scenes, where a variety of global illumination (interreflections, subsurface scattering, etc.) and illumination defocus effects are present. These effects introduce systematic and often significant errors in the recovered shape. We introduce a structured light technique called Micro Phase Shifting, which overcomes these problems. The key idea is to project sinusoidal patterns with frequencies limited to a narrow, high-frequency band. These patterns produce a set of images over which global illumination and defocus effects remain constant for each point in the scene. This enables high quality reconstructions of scenes which have traditionally been considered hard, using only a small number of images. We also derive theoretical lower bounds on the number of input images needed for phase shifting and show that Micro PS achieves the bound.

132 citations


Cites background or methods from "Structured light 3D scanning in the..."

  • ...lot of research activity in the last few years [7, 4, 8, 19, 5], for the first time, we present a technique which is fast, accurate and widely applicable....


  • ...However, most current methods require several tens or hundreds of images [8, 6, 19, 4], require moving cameras or light sources [11, 5, 15], and are often limited to scenarios where only one global illumination effect is dominant [8, 19]....


  • ...The phase error ∇φ = |φ(p) − φ(c)| results in systematic errors in the recovered shape [8, 7]....


Journal ArticleDOI
01 Nov 2013
TL;DR: This work introduces an optimization framework for measuring bulk scattering properties of homogeneous materials (phase function, scattering coefficient, and absorption coefficient) that is more accurate, and more applicable to a broad range of materials.
Abstract: Translucent materials are ubiquitous, and simulating their appearance requires accurate physical parameters. However, physically-accurate parameters for scattering materials are difficult to acquire. We introduce an optimization framework for measuring bulk scattering properties of homogeneous materials (phase function, scattering coefficient, and absorption coefficient) that is more accurate, and more applicable to a broad range of materials. The optimization combines stochastic gradient descent with Monte Carlo rendering and a material dictionary to invert the radiative transfer equation. It offers several advantages: (1) it does not require isolating single-scattering events; (2) it allows measuring solids and liquids that are hard to dilute; (3) it returns parameters in physically-meaningful units; and (4) it does not restrict the shape of the phase function using Henyey-Greenstein or any other low-parameter model. We evaluate our approach by creating an acquisition setup that collects images of a material slab under narrow-beam RGB illumination. We validate results by measuring prescribed nano-dispersions and showing that recovered parameters match those predicted by Lorenz-Mie theory. We also provide a table of RGB scattering parameters for some common liquids and solids, which are validated by simulating color images in novel geometric configurations that match the corresponding photographs with less than 5% error.

130 citations


Cites background from "Structured light 3D scanning in the..."

  • ...…liquids that have unknown dispersing media; or, (b) using structured lighting patterns [Mukaigawa et al. 2010], which provide only approximate isolation [Holroyd and Lawrence 2011; Gupta et al. 2011] and therefore induce errors in measured scattering parameters that are difficult to characterize....

    [...]

  • ...Such lighting-based isolations of single scattering are potentially quite useful, but as discussed in the context of 3D surface reconstruction [Holroyd and Lawrence 2011; Gupta et al. 2011], they provide only approximate isolation, and there is currently no analysis of how this affects the accuracy of inferred scattering parameters....

    [...]

Journal ArticleDOI
TL;DR: This paper designs binary structured light patterns that are resilient to individual indirect illumination effects using simple logical operations and tools from combinatorial mathematics, and presents a practical 3D scanning system which works in the presence of a broad range of indirect illumination.
Abstract: Global or indirect illumination effects such as interreflections and subsurface scattering severely degrade the performance of structured light-based 3D scanning. In this paper, we analyze the errors in structured light, caused by both long-range (interreflections) and short-range (subsurface scattering) indirect illumination. The errors depend on the frequency of the projected patterns, and the nature of indirect illumination. In particular, we show that long-range effects cause decoding errors for low-frequency patterns, whereas short-range effects affect high-frequency patterns. Based on this analysis, we present a practical 3D scanning system which works in the presence of a broad range of indirect illumination. First, we design binary structured light patterns that are resilient to individual indirect illumination effects using simple logical operations and tools from combinatorial mathematics. Scenes exhibiting multiple phenomena are handled by combining results from a small ensemble of such patterns. This combination also allows detecting any residual errors that are corrected by acquiring a few additional images. Our methods can be readily incorporated into existing scanning systems without significant overhead in terms of capture time or hardware. We show results for several scenes with complex shape and material properties.

119 citations

Journal ArticleDOI
01 Jul 2012
TL;DR: The primal-dual coding technique as discussed by the authors enables direct fine-grained control over which light paths contribute to a photo by projecting a sequence of patterns onto the scene while the sensor is exposed to light.
Abstract: We present primal-dual coding, a photography technique that enables direct fine-grain control over which light paths contribute to a photo. We achieve this by projecting a sequence of patterns onto the scene while the sensor is exposed to light. At the same time, a second sequence of patterns, derived from the first and applied in lockstep, modulates the light received at individual sensor pixels. We show that photography in this regime is equivalent to a matrix probing operation in which the elements of the scene's transport matrix are individually re-scaled and then mapped to the photo. This makes it possible to directly acquire photos in which specific light transport paths have been blocked, attenuated or enhanced. We show captured photos for several scenes with challenging light transport effects, including specular inter-reflections, caustics, diffuse inter-reflections and volumetric scattering. A key feature of primal-dual coding is that it operates almost exclusively in the optical domain: our results consist of directly-acquired, unprocessed RAW photos or differences between them.

95 citations


Cites background or methods from "Structured light 3D scanning in the..."

  • ...As such, use of the primal-dual coding framework in a non-coaxial arrangement (e.g., for 3D shape recovery [Gupta et al. 2011]) is not yet well understood....

    [...]

References
More filters
Journal ArticleDOI
TL;DR: An up-to-date review and a new classification of the existing shape reconstruction techniques using coded structured light, along with their potentials, are presented.

782 citations


"Structured light 3D scanning in the..." refers background in this paper

  • ...Significant progress has been made on both fronts (see the survey by Salvi et al [16]) as demonstrated by systems which can recover shapes at close to 1000 Hz....

    [...]

Journal ArticleDOI
01 Jul 2006
TL;DR: Fast methods for separating the direct and global illumination components of a scene measured by a camera and illuminated by a light source for scenes that include complex interreflections, subsurface scattering and volumetric scattering are presented.
Abstract: We present fast methods for separating the direct and global illumination components of a scene measured by a camera and illuminated by a light source. In theory, the separation can be done with just two images taken with a high frequency binary illumination pattern and its complement. In practice, a larger number of images are used to overcome the optical and resolution limitations of the camera and the source. The approach does not require the material properties of objects and media in the scene to be known. However, we require that the illumination frequency is high enough to adequately sample the global components received by scene points. We present separation results for scenes that include complex interreflections, subsurface scattering and volumetric scattering. Several variants of the separation approach are also described. When a sinusoidal illumination pattern is used with different phase shifts, the separation can be done using just three images. When the computed images are of lower resolution than the source and the camera, smoothness constraints are used to perform the separation using a single image. Finally, in the case of a static scene that is lit by a simple point source, such as the sun, a moving occluder and a video camera can be used to do the separation. We also show several simple examples of how novel images of a scene can be computed from the separation results.

484 citations


"Structured light 3D scanning in the..." refers background in this paper

  • ...Recently, it was shown that the direct and global components of scene radiance could be efficiently separated [14] using high-frequency illumination patterns....

    [...]
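
The two-image separation described in this abstract can be sketched numerically. The snippet below simulates scene points with known direct and global components under a half-on high-frequency pattern and its complement, then recovers both components from the two measurements; all values and names are synthetic and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16
direct = rng.uniform(0.2, 1.0, n)  # assumed ground-truth direct components
glob = rng.uniform(0.0, 0.5, n)    # assumed ground-truth global components

# Under a high-frequency pattern with roughly half the source pixels on,
# a directly lit point measures its full direct term plus half its global
# term; under the complementary pattern it measures only half the global
# term (the high frequency ensures the global component is sampled evenly).
L_plus = direct + glob / 2.0
L_minus = glob / 2.0

# Separation from just the two images:
rec_direct = L_plus - L_minus
rec_global = 2.0 * L_minus

assert np.allclose(rec_direct, direct)
assert np.allclose(rec_global, glob)
```

In practice more images are used to overcome the optical and resolution limits of camera and projector, but the algebra per scene point stays exactly this simple.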

Journal ArticleDOI
04 Dec 1990
TL;DR: It is shown that, for Lambertian surfaces, the pseudo shape and reflectance are unique and can be mathematically related to the actual shape andreflectance of the surface.
Abstract: An iterative algorithm is presented that simultaneously recovers the actual shape and the actual reflectance from the pseudo estimates. The recovery algorithm works on Lambertian surfaces of arbitrary shape with possibly varying and unknown reflectance. The general behavior of the algorithm and its convergence properties are discussed. Both simulation and experimental results are included to demonstrate the accuracy and stability of the algorithm.

254 citations


"Structured light 3D scanning in the..." refers methods in this paper

  • ...[13] presented an iterative approach for reconstructing shape of Lambertian objects in the presence of inter-reflections....

    [...]

Journal ArticleDOI
TL;DR: This paper presents a system that realizes 3-D shape measurement by using DLP Discovery technology to switch binary structured patterns at very high frame rates, and achieves an unprecedented rate for 3-D shape measurement: 667 Hz.
Abstract: Recently introduced DLP Discovery technology allows for tens of kHz binary image switching, which has great potential for superfast 3-D shape measurement. This paper presents a system that realizes 3-D shape measurement by using a DLP Discovery technology to switch binary structured patterns at very high frame rates. The sinusoidal fringe patterns are generated by properly defocusing the projector. Combining this approach with a phase-shifting method, we achieve an unprecedented rate for 3-D shape measurement: 667 Hz. This technology can be applied to numerous applications including medical science, biometrics, and entertainment.

246 citations


"Structured light 3D scanning in the..." refers background in this paper

  • ...[21] and at a depth resolution better than 30 microns [5]....

    [...]

Proceedings ArticleDOI
TL;DR: This paper describes the implementation of a LCD stripe projection system, based on a new processing scheme called line shift processing, that allows an off-the-shelf multimedia projector to acquire dense 3D data of large objects in a matter of seconds.
Abstract: This paper describes the implementation of an LCD stripe projection system, based on a new processing scheme called line shift processing. One advantage of the method is that the projection device can, but is not required to, be stable over time, nor does it have to be calibrated. The new method therefore allows us to use an off-the-shelf multimedia projector to acquire dense 3D data of large objects in a matter of seconds. Since we are able to control our source of illumination, we can also acquire registered color information using standard monochrome cameras.

187 citations

Frequently Asked Questions (11)
Q1. What contributions have the authors mentioned in the paper "Structured light 3d scanning in the presence of global illumination" ?

In this paper, the authors analyze the errors caused by global illumination in structured light-based shape recovery. The authors show results on a variety of scenes with complex shape and material properties and challenging global illumination effects. 

For short range effects, the authors draw on tools from the combinatorial maths literature to design patterns with large minimum stripe-widths. 

In all these settings, scene points receive illumination indirectly in the form of inter-reflections, sub-surface or volumetric scattering. 

It is important to note that for this error correction strategy to be effective, the error prevention and detection stages are critical. 

The goal of this paper is to build an end-to-end system for structured light scanning under a broad range of global illumination effects. 

For conventional Gray codes, although short-range effects might result in incorrect binarization of the lower-order bits, the higher-order bits are decoded correctly.
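
This robustness follows from how Gray codewords are decoded: flipping a low-order Gray bit perturbs the recovered projector column only slightly, whereas a high-order error is catastrophic. A minimal sketch for 10-bit reflected binary Gray codes (the column value is arbitrary and illustrative):

```python
def gray_encode(i):
    # Standard reflected binary Gray code.
    return i ^ (i >> 1)

def gray_decode(g):
    # Invert the encoding by cumulatively XOR-ing the shifted codeword:
    # each binary bit is the XOR of all Gray bits at or above it.
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b

col = 613                  # an arbitrary projector column (10-bit codes)
g = gray_encode(col)
assert gray_decode(g) == col

# Corrupting the lowest-order Gray bit shifts the decoded column by 1...
low_err = abs(gray_decode(g ^ 0b1) - col)
# ...whereas corrupting the highest-order bit flips every binary bit
# below it, producing a large decoding error.
high_err = abs(gray_decode(g ^ (1 << 9)) - col)
print(low_err, high_err)   # → 1 203
```

So as long as the wide, high-order stripes binarize correctly, residual errors from blurred fine patterns stay local.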

Since the early work in the field about 40 years ago [18, 12], research has been driven by two factors: reducing the acquisition time and increasing the depth resolution. 

Repeat steps 1−5 to progressively reduce the residual errors (Section 5). Because most errors are already prevented or corrected in the first iteration itself, the authors require only a small number of extra iterations (typically 1-2) even for challenging scenes.

In the context of structured light, these effects can severely blur the high-frequency patterns, making it hard to correctly binarize them. 

While these approaches have shown promise, there are three issues that prevent them from being applicable broadly: (a) the direct component estimation may fail due to strong inter-reflections (as with shiny metallic parts), (b) the residual direct component may be too low and noisy (as with translucent surfaces, milk and murky water), and (c) they require significantly higher number of images than traditional approaches, or rely on weak cues like polarization. 

On the contrary, it is easy to generate codes with a small maximum stripe-width (9), as compared to 512 for the conventional Gray codes, by performing a brute-force search. A code achieving the maximum possible minimum stripe-width, min-SW (8), is given by Goddyn et al. [6].
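
The stripe-width figure of 512 for conventional Gray codes is easy to verify for 10 bit-plane patterns over 1024 projector columns; the sketch below measures the run lengths of each pattern (helper names are illustrative, not from the paper):

```python
def gray_encode(c):
    # Standard reflected binary Gray code.
    return c ^ (c >> 1)

def stripe_widths(plane, n_cols=1024):
    # Run lengths (stripe widths) of the binary pattern formed by
    # bit `plane` of the Gray codes across all projector columns.
    bits = [(gray_encode(c) >> plane) & 1 for c in range(n_cols)]
    widths, run = [], 1
    for prev, cur in zip(bits, bits[1:]):
        if cur == prev:
            run += 1
        else:
            widths.append(run)
            run = 1
    widths.append(run)
    return widths

# The coarsest pattern (bit 9) is a single half/half split of the columns:
print(stripe_widths(9))    # → [512, 512]

# The widest stripe over all ten conventional Gray-code patterns:
widest = max(max(stripe_widths(p)) for p in range(10))
print(widest)              # → 512
```

Such wide stripes are exactly what makes conventional Gray codes vulnerable to long-range effects like inter-reflections, motivating the paper's search for codes whose maximum stripe-width is small.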