What have the authors contributed in "Transfer of albedo and local depth variation to photo-textures" ?

In this paper, the authors present a material appearance transfer method, Transfer by Analogy, designed to infer surface detail and diffuse reflectance for textured surfaces like those present in building façades. Their approach allows super-resolution inference of albedo and displacement from information in the photo-texture. When transferring appearance from multiple exemplars to façades containing multiple materials, their approach also sidesteps the need for segmentation. The authors show how they use these methods to create relightable models with a high degree of texture detail, reproducing the visually rich self-shadowing effects that would normally be difficult to capture using just simple consumer equipment. The authors begin by acquiring small exemplars (displacement and albedo maps) in accessible areas, where capture conditions can be controlled.

What have the authors stated for future works in "Transfer of albedo and local depth variation to photo-textures" ?

As future work, the authors consider further investigation of transfer-oriented texture descriptors a promising direction for improving the performance of Transfer by Analogy. Finally, using a simple acquisition process and consumer SLR equipment, the authors demonstrate the efficacy of their processes to significantly enhance the detail recovered for building façades within a full image-based reconstruction and relighting pipeline.

What is the strength of Transfer by Analogy?

A particular strength of the Transfer by Analogy process is that the matching finds coordinates in an exemplar which the authors are able to exploit in order to boost resolution of the photo-texture.

How do the authors compute an approximate albedo map?

By subtracting the ambient image from the flash image, and dividing by a calibration image, the authors compute an approximate albedo map.

What is the real strength of Transfer by Analogy?

Transfer by Analogy naturally facilitates searching for the best match between several materials; the real strength of the approach is therefore that no segmentation is required for the transfer process.

How do the authors use pixel voting to avoid blocky effects?

The authors scale up the coordinate map that associates the photo-texture to the exemplar and use pixel voting, as described before, to avoid blocky effects.

What could be used to fill the unmatched areas?

In cases where the match between photo-texture and exemplar presents a large error, other techniques such as texture inpainting could be used to fill the unmatched areas.

Why is it important to keep the structure of the texture coherent?

This is important to keep the structure of the texture coherent, especially in the shading image where noise can produce undesirable high frequencies in the resulting geometry.

How can the detail of building façades be improved?

Using a simple acquisition process and consumer SLR equipment, the authors demonstrate the efficacy of their processes to significantly enhance the detail recovered for building façades within a full image-based reconstruction and relighting pipeline.

What is the way to estimate the reflectance properties of a large complex scene?

This data can be used within the inverse rendering framework [9, 4] to estimate the reflectance properties of a large complex scene.

(Open Access) Transfer of albedo and local depth variation to photo-textures (2012) | Francho Melendez

This item was submitted to Loughborough’s Institutional Repository

(https://dspace.lboro.ac.uk/) by the author and is made available under the

following Creative Commons Licence conditions.

For the full text of this licence, please go to:

http://creativecommons.org/licenses/by-nc-nd/2.5/

Transfer of Albedo and Local Depth Variation to

Photo-Textures

Francho Melendez

Loughborough University

F.A.Melendez@lboro.ac.uk

Mashhuda Glencross

Loughborough University

M.Glencross@lboro.ac.uk

Jonathan Starck

The Foundry Visionmongers

jon.starck@thefoundry.co.uk

Gregory J. Ward

Dolby Laboratories

gward@dolby.com

ABSTRACT

Acquisition of displacement and albedo maps for full building façades is a difficult problem and traditionally achieved through a labor-intensive artistic process.

In this paper, we present a material appearance transfer method, Transfer by Analogy, designed to infer surface detail and diffuse reflectance for textured surfaces like those present in building façades. We begin by acquiring small exemplars (displacement and albedo maps), in accessible areas, where capture conditions can be controlled. We then transfer these properties to a complete photo-texture constructed from reference images and captured under diffuse daylight illumination.

Our approach allows super-resolution inference of albedo and displacement from information in the photo-texture. When transferring appearance from multiple exemplars to façades containing multiple materials, our approach also sidesteps the need for segmentation.

We show how we use these methods to create relightable models with a high degree of texture detail, reproducing the visually rich self-shadowing effects that would normally be difficult to capture using just simple consumer equipment.

Categories and Subject Descriptors

I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism: Color, shading, shadowing, and texture

Keywords

Texture Transfer, Albedo, Displacement Map, 3D Reconstruction

1. INTRODUCTION

In this paper we present a semi-automatic method to produce displacement (meso-structure) and reflectance (albedo) maps, for visually rich textures, suitable for creating relightable 3D building models. Techniques to infer material appearance for use in making 3D assets are valuable for computer games, film post-production, archeology and architectural visualization. These applications need to be capable of rendering the recovered scenes under changing lighting as well as view point and require an increasing level of detail and realism. Material appearance is often conveyed by texture alone (recovered from images), but this appearance is only valid under the originally photographed viewing and lighting conditions. Realistic relighting requires a model that represents surface geometry and reflectance characteristics. A vast body of literature exists on methods for capturing reflectance and geometry, but capturing this information for large outdoor urban scenes remains a difficult problem. Such scenes in particular offer interesting challenges since controlling lighting and access to areas to obtain suitable views rapidly becomes impractical. A commonly employed approach is for artists to manually create displacement and albedo maps, using a photo-texture as a reference.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

CVMP ’12, December 5 – 6, 2012, London, United Kingdom

To address this problem, we present a technique designed for trans-

ferring visually high-quality material appearance, captured at close

range using standard digital SLR equipment, to a photo-texture.

Material appearance, in this case, is constrained to diffuse albedo

and shading information.

Employing a transfer approach rather than a synthesis approach

preserves the original structure and appearance contained in the

photo-texture. Our transfer approaches represent a good trade-off

between simplicity of data capture and the quality of the results

achieved for a range of broadly Lambertian materials used in con-

struction.

The main advantages of the method presented here are:

• A simple capture process reduced to minimum equipment

and calibration.

• The method is able to increase the resolution of the original

photo-texture using information captured at close range in

the exemplars.

• The algorithm automates the segmentation of materials present

in the photo-texture and the associations with exemplars.

• In this paper we also provide a practical solution for address-

ing differences in lighting conditions between exemplars and

the target photo-texture.

The remainder of this paper is organized as follows: a brief outline of related work, a detailed description of our material appearance transfer approach, an evaluation of its capabilities, results of the transfer applied to full building façades, future work and concluding remarks.

2. RELATED WORK

We begin with a brief overview of related work, structured in two sections: the state of the art in reflectance and shape estimation from images; and texture synthesis and transfer techniques.

2.1 Image Based Reﬂectance and Shape

Recovery

The reﬂectance properties of opaque surfaces have been captured

and modeled in a variety of different ways; we refer the reader to

Dorsey et al. [10] for a thorough treatment of the topic. The Bidi-

rectional Reﬂectance Distribution Function (BRDF) can be mea-

sured from images captured under a range of different viewing and

lighting conditions [31, 24, 23, 29]. Capturing similar data for

textured surfaces enables the creation of the Bidirectional Texture

Function (BTF) [7]. Researchers have built upon the idea of the

use of multiple light sources (photometric stereo) [36] to capture

material appearance from fewer view points, effectively recovering

albedo and local surface orientation [28, 12]. Paterson et al’s [26]

material capture approach combines photometric stereo with mul-

tiple view geometry captures to recover displacement maps and in-

homogeneous BRDFs over nearly planar samples. Ward and Glen-

cross [32] employ a similar approach to estimate albedo (diffuse

reﬂectance), based on single view multi-ﬂash captures in conjunc-

tion with shape from shading [16, 21, 39, 13].

For large surfaces, a possible solution often employed is to com-

bine laser-scans of a scene, together with sky captures using an

incident illumination measuring device and representative BRDF

samples [8]. This data can be used within the inverse rendering

framework [9, 4] to estimate the reﬂectance properties of a large

complex scene. However, this approach requires specialized equip-

ment and careful data collection. Without measuring the lighting

for every image captured in the scene, further assumptions have

to be made. Yu et al. estimated two pseudo-BRDFs for a polygo-

nal model by ﬁtting a small number of photographs, captured un-

der clear sky conditions, to a parameterized model of the sky [38].

This approach offers improved relightable textures compared to us-

ing the original images, however no surface detail is captured and

at least two lighting conditions per texture are required.

An interesting approach was also presented by Xu et al. [37] where

the ratio between the green channel of the capture images and the

reﬂected laser intensity is used to correct all color channels.

Rather than trying to explicitly solve the inverse rendering problem,

which is ill-posed without capturing lighting and accurate geome-

try, we approach this problem from the texture transfer perspective.

We propose capturing material appearance models of samples (ex-

emplars) where we can control lighting conditions and then extrap-

olate these properties to the rest of the surface. We focus on two

essential characteristics of the texture: albedo and meso-structure

(depth). Glencross et al. [13] showed through evaluation with hu-

man subjects that this provides enough information to produce per-

ceptually plausible relightable models of a great variety of textures.

Figure 1: The capture process for Surface Depth Hallucination.

We capture our stimuli exemplars using Surface Depth Hallucination, which is summarized in Figure 1. This technique requires capturing a photograph of opaque surfaces under natural diffuse lighting (ambient image), ideally on a cloudy day thus avoiding hard shadows, and another one firing a flash (flash image). By subtracting the ambient image from the flash image, and dividing by a calibration image, we compute an approximate albedo map. A shading image is calculated as the ratio of the ambient image and the albedo image, and used to create a depth map through a per-pixel dark-is-deep approach. A relightable 3D model can be created and rendered from the depth map and the albedo map. Due to flash guide distance limitations, exemplars captured using this method must be restricted to small surfaces (around one square meter), in order to keep detail in cracks and crevices sufficiently illuminated.
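The per-pixel arithmetic described above can be sketched in a few lines. This is a minimal illustration only, assuming single-channel images stored as nested lists of floats; the function names, the clamping of negative differences, the `eps` guard against division by zero, and the linear shading-to-depth mapping in `dark_is_deep` are assumptions made for the example and are not taken from the paper.

```python
def approximate_albedo(flash, ambient, calibration, eps=1e-6):
    """Albedo estimate: (flash - ambient) / calibration, per pixel.

    Negative differences are clamped to zero (an assumption of this sketch).
    """
    return [
        [max(f - a, 0.0) / max(c, eps) for f, a, c in zip(f_row, a_row, c_row)]
        for f_row, a_row, c_row in zip(flash, ambient, calibration)
    ]


def shading_image(ambient, albedo, eps=1e-6):
    """Shading estimate: ratio of the ambient image to the albedo map."""
    return [
        [a / max(r, eps) for a, r in zip(a_row, r_row)]
        for a_row, r_row in zip(ambient, albedo)
    ]


def dark_is_deep(shading, depth_scale=1.0, eps=1e-6):
    """Toy dark-is-deep mapping: the darker the shading, the deeper the pixel.

    The brightest pixel gets depth 0; the linear mapping is a placeholder.
    """
    peak = max(max(max(row) for row in shading), eps)
    return [[depth_scale * (1.0 - s / peak) for s in row] for row in shading]
```

With a flash image, an ambient image and a calibration image of the same size, calling the three functions in order yields an albedo map, a shading image and a relative depth map whose values lie between 0 (brightest) and `depth_scale` (darkest).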

2.2 Texture Synthesis and Transfer

Over the last decade texture synthesis by exemplar has been a very

active area of research. We refer the reader to state of the art reports

for a complete review [20, 33]. This idea has been demonstrated to

work effectively for synthesizing a wide variety of textures. New

pixels or patches are generated by choosing the best candidate from

a given exemplar, such that it is coherent with the already synthe-

sized texture. For globally-varying textures, a control map is often

used to drive this type of synthesis. Ashikhmin [1] synthesized an

output conditioned by a colored map, and similar ideas are used in

patch-based synthesis, deﬁning the concept of texture transfer [11].

An extension of this idea, is the notion of Image Analogies [15].

Histogram matching was explored for synthesizing stochastic tex-

tures [14] and detail in textures [17] by matching the histogram of

noise patterns to a texture sample. Glencross et al. [13] extended

this work to transfer albedo and shading to similar sized surface

samples. Melendez et al. [25] described a pipeline to apply this process to full building façades. Their method requires the exemplar and target image to be statistically similar. For example, if we have a brick wall with moss, the amount of moss should be similar in proportion to the amount of brick in both exemplar and target image. Our novel approach can be included in the pipeline of Melendez et al., but since it matches materials locally, it overcomes the necessity of a good match in global statistics. This also allows us to run the transfer against a set of exemplars, without the necessity of segmenting the photo-texture and manually providing associations.

Our method also allows for super-resolution to ensure scalability

of our results to larger surfaces. Normally a photo-texture for a

building is captured from far away, and therefore inference of high

quality surface detail is severely limited by the resolution of the

texture map.

We draw inspiration from the Image Analogies algorithm to de-

ﬁne both albedo estimation and depth inference as a multi-exemplar

based texture transfer problem.

3. MATERIAL TRANSFER

We consider the problem of acquiring albedo and depth maps for

a photo-texture as a material transfer problem from image-based

exemplars. In this context, we identify a material with a texture that

has speciﬁc characteristics in terms of its reﬂectance and geometric

structure, for example a type of brick wall. This would include the

brick, the mortar, and even the dirt and other texture variations.

Using this deﬁnition of a material, Surface Depth Hallucination

allows us to capture valid image-based exemplars in the form of

albedo and depth maps, for texture reﬂectance and meso-structure.

Our proposed transfer techniques facilitate the creation of the cor-

responding albedo and depth maps for large surfaces, such as a

building façade, from a photo-texture and the previously captured

exemplars. We capture a texture map under natural diffuse lighting

conditions (no ﬂash needed) for the complete façade and use this as

a guide to transfer albedo and depth from representative exemplars.

3.1 Problem Statement

To illustrate the problem and evaluate our transfer technique, we

use pairs of exemplars of the same texture as stimuli in which the

scale and lighting conditions are consistent. This allows us to quan-

tify the quality of the transfer process by comparing our results

with the captured maps considered as ground-truth for perceptually

plausible relightable models.

Figure 2: Material Transfer Problem: Using an exemplar A,

generate new albedo and shading maps for a new ambient im-

age B

The material transfer process is described in Figure 2. A material M is defined by an exemplar A which is in turn composed of three maps: an ambient map ($A_{amb}$), a shading map ($A_{shad}$), and an albedo map ($A_{alb}$). Now consider another sample B of the same material M, for which only the ambient map is available. The aim is to synthesize a new shading map ($B_{shad}$) and albedo map ($B_{alb}$), from $B_{amb}$ and the exemplar A.

The ambient capture contains shadowing due to the geometry of the

texture, and also color shifts due to natural lighting. From this in-

formation alone, resolving the ambiguity between albedo and shad-

ing is an ill-posed problem. We use an exemplar with similar char-

acteristics where this ambiguity has been solved to help us arrive

at a good approximation of both albedo and shading of the new

image.

3.2 Transfer by Analogy

We propose using a locally adaptable transfer approach inspired by the idea of Image Analogies [15]. This method takes as input an image and a filtered version of it, and a target image to which we want to apply the same filter. Applying this terminology to our case, given an unfiltered source image $A_{amb}$ and two filtered source images $A_{shad}$ and $A_{alb}$, along with an additional unfiltered target image $B_{amb}$, the aim is to synthesize two new filtered target images $B_{shad}$ and $B_{alb}$.

This idea is illustrated in Figure 3. By comparing $A_{amb}$ and $B_{amb}$, we find the patch in the exemplar $A_{amb}$ that best matches the appearance of every patch in the target image ($B_{amb}$). Taking the same patch from the albedo and shading maps ($A_{alb}$, $A_{shad}$), we can create a new albedo and shading map for the new image ($B_{alb}$, $B_{shad}$). We define a patch for every pixel, which can be understood as a descriptor for this pixel. The best match for a pixel will be the one with the most similar descriptor for a given metric.
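As a rough illustration of this matching-and-copying idea, the sketch below does an exhaustive search over a small single-channel exemplar. It is not the authors' implementation: the brute-force search, the plain sum of squared differences, the clamped borders, and all function and parameter names are assumptions chosen to keep the example short.

```python
def patch(img, y, x, r):
    """Flattened (2r+1) x (2r+1) patch around (y, x); borders are clamped."""
    h, w = len(img), len(img[0])
    return [
        img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
        for dy in range(-r, r + 1)
        for dx in range(-r, r + 1)
    ]


def ssd(p, q):
    """Sum of squared differences between two descriptors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def transfer_by_analogy(src_ambient, src_maps, tgt_ambient, r=1):
    """For every target pixel, find the best-matching exemplar pixel and
    copy the exemplar's filtered maps (e.g. albedo, shading) from there.

    Returns the coordinate map and one transferred map per entry of src_maps.
    """
    sh, sw = len(src_ambient), len(src_ambient[0])
    descriptors = {
        (y, x): patch(src_ambient, y, x, r) for y in range(sh) for x in range(sw)
    }
    coords = []
    for ty in range(len(tgt_ambient)):
        row = []
        for tx in range(len(tgt_ambient[0])):
            d = patch(tgt_ambient, ty, tx, r)
            row.append(min(descriptors, key=lambda k: ssd(descriptors[k], d)))
        coords.append(row)
    transferred = [[[m[y][x] for (y, x) in row] for row in coords] for m in src_maps]
    return coords, transferred
```

Returning the coordinate map alongside the transferred maps keeps the exemplar coordinates of every match available for later processing.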

Figure 3: Transfer by Analogy.

3.2.1 Pixel Descriptor

The use of patches as descriptors is a common approach for exemplar-

based texture synthesis [33]. In our experiments, we achieved the

best results with 7x7 patches using the RGB channels. This also

provides a good trade-off between complexity of the descriptor and

execution time. However, the optimal patch size can vary depend-

ing on the feature size of the texture, as it happens with most patch-

based texture synthesis algorithms.

We add an extra channel to the RGB containing a distance-to-feature mask similar to the one used by Lefebvre and Hoppe [22]. Feature masks, and distance-to-feature masks, are used in texture synthesis to help the new synthesized texture preserve the spatial structure of the exemplar. The mask is generated by computing the Signed Distance Field of a binary feature mask, which is computed automatically from the ambient images $A_{amb}$ and $B_{amb}$. The binary mask is the result of dividing the grey scale version of the image by a blurred version of itself, effectively extracting the high frequencies, and then thresholding it to create a binary image.
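The mask construction can be sketched as follows. The paper does not specify the blur kernel, the threshold value, or which side of the threshold counts as a feature, so the box blur, the default `threshold=0.9`, the choice that darker-than-surroundings pixels are features, and the brute-force distance computation are all assumptions of this example.

```python
import math


def box_blur(img, r):
    """Mean filter over a (2r+1) x (2r+1) window, truncated at the borders."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [
                img[j][i]
                for j in range(max(y - r, 0), min(y + r, h - 1) + 1)
                for i in range(max(x - r, 0), min(x + r, w - 1) + 1)
            ]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out


def feature_mask(gray, blur_radius=2, threshold=0.9, eps=1e-6):
    """Binary mask: grey image divided by its blurred version, then thresholded."""
    blurred = box_blur(gray, blur_radius)
    return [
        [g / max(b, eps) < threshold for g, b in zip(g_row, b_row)]
        for g_row, b_row in zip(gray, blurred)
    ]


def signed_distance(mask):
    """Brute-force signed distance field: negative inside features, positive outside."""
    h, w = len(mask), len(mask[0])
    pixels = [(y, x) for y in range(h) for x in range(w)]
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            dists = [
                math.hypot(y - j, x - i) for j, i in pixels if mask[j][i] != mask[y][x]
            ]
            d = min(dists) if dists else float(max(h, w))
            row.append(-d if mask[y][x] else d)
        out.append(row)
    return out
```

The brute-force distance field is quadratic in the number of pixels, which is only acceptable for small test images; a linear-time distance transform would be the natural replacement for full-resolution textures.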

Figure 4: Close-up detail for shading images. LBM and

SSD have gray areas where high-frequency detail is lost, while

LBM+Fd better preserves the detail.

We observed that the use of a mask with this characteristic im-

proves the transfer of high frequency detail, especially in the case

of the shading map. In Figure 4 we show this effect on a texture

with high frequency detail using different metrics to match descrip-

tors. Both sets of results based only on RGB channels, Sum of

square differences (SSD) and our novel Log-Based Metric (LBM)

(described in Section 3.2.2), contain areas with almost no high fre-

quency detail (inside the red rectangles). Adding our distance to

feature mask (LBM+Fd) clearly recovers more detail in the result-

ing shading image in the areas where the other metrics fail.

3.2.2 Appearance Metric

A second factor for the transfer is the metric or distance between

descriptors used. SSD is the most common metric used in patch-

based texture synthesis. Empirically we found that a novel LBM

presented in Equation 1 can provide a better structural coherence

than SSD for some textures.

$$\mathrm{LBM} = \sum_{x,y} \log\bigl(1 + |x - y|\bigr) \qquad (1)$$

The idea behind this metric is to limit the penalty for pixels that are

very different. Figure 5 shows the proﬁles of SSD and our LBM

for one pixel. The penalty increases rapidly but beyond a certain

point it tends towards levelling-off for large pixel differences.

(a) SSD (b) LBM

Figure 5: Proﬁle of penalty for difference on the compared

pixel.

It is important to note that there are 7 × 7 × 3 values to be compared and added. This metric also makes the penalty smaller per pixel. Consequently, more bad pixel matches are needed to penalize a good match. For example, a patch containing 49 pixels and one channel, where all the pixels match perfectly except for one which has the maximum difference $x = 255$, would have an SSD equal to $x^2 = 65025$ and an LBM of $\log(x + 1) = 5.799$. On the other hand, a patch in which every pixel has a difference of $x \approx 36$ would have an SSD equal to $\sum x^2 = 65024$ and an LBM of $\sum \log(1 + x) = 120.50$. Our new metric therefore selects good global matches and dismisses large local errors, in contrast to SSD. The local errors are compensated for when reconstructing the final image from the patches (see Equation 2), where pixels are averaged according to the local error.
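The qualitative behaviour of the two metrics can be checked with a short script. This is a sketch under stated assumptions: the natural logarithm is used because the text does not state the base, and the uniform difference is rounded to 36, so the absolute values it produces need not coincide with the figures quoted above. Only the ordering of the two patches under each metric is the point of the example.

```python
import math


def ssd(p, q):
    """Sum of squared differences over corresponding pixel values."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lbm(p, q):
    """Log-based metric: sum of log(1 + |a - b|), natural log assumed."""
    return sum(math.log(1 + abs(a - b)) for a, b in zip(p, q))


# A 7x7 single-channel reference patch and two candidate matches.
reference = [0] * 49
one_outlier = [255] + [0] * 48   # perfect match except one maximal error
uniform_offset = [36] * 49       # every pixel is off by 36

for name, candidate in (("one outlier", one_outlier), ("uniform offset", uniform_offset)):
    print(name, ssd(reference, candidate), round(lbm(reference, candidate), 3))
```

Under SSD the two candidates receive penalties of similar magnitude, while under the log-based metric the single-outlier patch is penalized far less than the uniformly offset one.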

We evaluate our metrics against the sum of squared differences by

computing the mean error between the transferred maps and the

exemplars. Also we compare the results with and without using

the feature mask, and the effect of giving more importance to the

feature mask (LBM+Fd2). Figure 6 shows the computed average

percentage error against the maximum possible error for a num-

ber of tested metrics. Our LBM shows consistently lower error.

Although the numerical improvement is limited, we observed that

when applied to several materials, it maintains better spatial con-

sistency, resulting in better association of the correct material. We

discuss this process in more detail in Section 5.

With this metric and descriptor, we have defined the best match between pixels of two images. In the next section, we present how to efficiently find the best match and how to create the final albedo and shading maps for the target image.

Figure 6: Comparison between metrics. SSD+Fd: Sum of Square Differences with Feature Distance Mask; SSD: Sum of Square Differences without Mask; LBM+Fd2: Log-Based Metric with Feature Distance Mask Squared; LBM+Fd: Log-Based Metric with Feature Distance Mask; LBM: Log-Based Metric without Mask.

3.2.3 Approximate Nearest Neighbor Search

We find the best match by performing a nearest neighbor search. The problem of finding the nearest neighboring patch in a medium or large sized image rapidly becomes computationally expensive. Since patch-based sampling methods have become popular for image and video synthesis, many researchers have studied optimizing this process [34, 35, 18, 19]. In our implementation, we employ a recent patch matching algorithm from Barnes et al. [2] that performs at interactive rates for their application. This algorithm begins with a random initialization, and then uses an iterative process consisting of two steps: a propagation that searches within the previously matched pixels, and a random search to avoid local minima. This process proceeds in scan order (from left to right, top to bottom) for odd iterations, and in the inverse order for even iterations. Since the algorithm converges quickly to a solution, it typically provides a good level of detail within 5-10 iterations. We perform our matching in a coarse-to-fine fashion, as described in Figure 7, to improve the chances of converging to a globally optimal solution, using 10 iterations per level.
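The random initialization, propagation and random search steps can be sketched as below. This is a simplified single-scale illustration in the spirit of Barnes et al. [2], not the authors' code: it works on single-channel nested lists, uses a plain sum of squared differences, omits the coarse-to-fine pyramid, and all names and defaults are assumptions.

```python
import random


def patch_dist(a, b, ay, ax, by, bx, r):
    """SSD between the patch at (ay, ax) in a and at (by, bx) in b (clamped borders)."""
    ha, wa, hb, wb = len(a), len(a[0]), len(b), len(b[0])
    total = 0.0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            pa = a[min(max(ay + dy, 0), ha - 1)][min(max(ax + dx, 0), wa - 1)]
            pb = b[min(max(by + dy, 0), hb - 1)][min(max(bx + dx, 0), wb - 1)]
            total += (pa - pb) ** 2
    return total


def patchmatch(target, source, r=1, iterations=5, seed=0):
    """Approximate nearest-neighbour field from target pixels to source pixels."""
    rng = random.Random(seed)
    th, tw, sh, sw = len(target), len(target[0]), len(source), len(source[0])
    # Random initialization of the nearest-neighbour field.
    nnf = [[(rng.randrange(sh), rng.randrange(sw)) for _ in range(tw)] for _ in range(th)]
    cost = [
        [patch_dist(target, source, y, x, nnf[y][x][0], nnf[y][x][1], r) for x in range(tw)]
        for y in range(th)
    ]

    def try_candidate(y, x, sy, sx):
        # Accept a candidate only if it is in bounds and strictly better.
        if 0 <= sy < sh and 0 <= sx < sw:
            d = patch_dist(target, source, y, x, sy, sx, r)
            if d < cost[y][x]:
                nnf[y][x], cost[y][x] = (sy, sx), d

    for it in range(iterations):
        step = 1 if it % 2 == 0 else -1  # scan order, then reverse scan order
        ys = range(th) if step == 1 else range(th - 1, -1, -1)
        xs = range(tw) if step == 1 else range(tw - 1, -1, -1)
        for y in ys:
            for x in xs:
                # Propagation: shift the match of an already-visited neighbour.
                for ny, nx in ((y - step, x), (y, x - step)):
                    if 0 <= ny < th and 0 <= nx < tw:
                        sy, sx = nnf[ny][nx]
                        try_candidate(y, x, sy + (y - ny), sx + (x - nx))
                # Random search around the current best, halving the radius each time.
                radius = max(sh, sw)
                while radius >= 1:
                    by, bx = nnf[y][x]
                    try_candidate(
                        y, x,
                        by + rng.randint(-radius, radius),
                        bx + rng.randint(-radius, radius),
                    )
                    radius //= 2
    return nnf, cost
```

Because a candidate is accepted only when it lowers the cost, the total matching error can never increase from one iteration to the next; a multi-resolution version would run this routine per pyramid level and upsample the field between levels.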

Figure 7: Fast Transfer by Analogy Algorithm.

In comparison with the original Image Analogies algorithm, we have removed the synthesis step, since we noticed in our experiments that this step can remove original features from the target texture, especially when such features are not present in the exem-


References

Fast approximate energy minimization via graph cuts

PatchMatch: a randomized correspondence algorithm for structural image editing

Image quilting for texture synthesis and transfer

Photometric Method For Determining Surface Orientation From Multiple Images

Shape-from-shading: a survey
