A Morphable Face Albedo Model
Abstract: In this paper, we bring together two divergent strands of research: photometric face capture and statistical 3D face appearance modelling. We propose a novel lightstage capture and processing pipeline for acquiring ear-to-ear, truly intrinsic diffuse and specular albedo maps that fully factor out the effects of illumination, camera and geometry. Using this pipeline, we capture a dataset of 50 scans and combine them with the only existing publicly available albedo dataset (3DRFE) of 23 scans. This allows us to build the first morphable face albedo model. We believe this is the first statistical analysis of the variability of facial specular albedo maps. This model can be used as a plug-in replacement for the texture model of the Basel Face Model, and we make our new albedo model publicly available. We ensure careful spectral calibration such that our model is built in a linear sRGB space, suitable for inverse rendering of images taken by typical cameras. We demonstrate our model in a state-of-the-art analysis-by-synthesis 3DMM fitting pipeline, are the first to integrate specular map estimation into such a pipeline, and outperform the Basel Face Model in albedo reconstruction.
Summary (3 min read)
- 3D Morphable Models (3DMMs) were proposed over 20 years ago as a dense statistical model of 3D face geometry and texture.
- Existing 3DMMs are built using ill-defined “textures” that bake in shading, shadowing, specularities, light source colour, camera spectral sensitivity and colour transformations.
- Capturing truly intrinsic face appearance parameters is a well-studied problem in graphics, but this work has been done largely independently of the computer vision and 3DMM communities.
- In this paper the authors present a novel capture setup and processing pipeline for measuring ear-to-ear diffuse and specular albedo maps.
- The authors capture their own dataset of 50 faces, combine this with the 3DRFE dataset and build a statistical albedo model that can be used as a drop-in replacement for existing texture models.
2. Data capture
- A lightstage exploits the phenomenon that specular reflection from a dielectric material preserves the plane of polarisation of linearly polarised incident light whereas subsurface diffuse reflection randomises it.
- A polarising filter on each light source is oriented such that a specular reflection towards the viewer has the same plane of polarisation; the first image, Ipar, is captured with the camera's filter parallel to this plane, transmitting both specular and diffuse reflection.
- The second image, Iperp, has the polarising filter oriented perpendicularly, blocking the specular reflection but still permitting transmission of the diffuse reflectance.
- The authors augment the photometric camera with additional cameras providing multiview, single-shot images captured in sync with the photometric images.
- The authors' participants range in age from 18 to 67 and cover skin types I-V of the Fitzpatrick scale.
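The polarisation-based separation above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the `separate_reflectance` helper and the factor-of-two convention (each polarised image receives roughly half of the randomly polarised diffuse reflection) are assumptions.

```python
import numpy as np

def separate_reflectance(i_par: np.ndarray, i_perp: np.ndarray):
    """Estimate (diffuse, specular) from a polarised lightstage image pair.

    i_perp (cross-polarised) contains roughly half of the diffuse reflection
    only; i_par (parallel-polarised) contains the specular reflection plus
    the other diffuse half, so their difference isolates the specular part.
    """
    diffuse = 2.0 * i_perp                          # recover both diffuse halves
    specular = np.clip(i_par - i_perp, 0.0, None)   # residual is specular
    return diffuse, specular

# Toy pixel: diffuse contribution 0.4, specular contribution 0.3.
i_perp = np.array([0.2])        # half the diffuse
i_par = np.array([0.5])         # other diffuse half + specular
d, s = separate_reflectance(i_par, i_perp)
```

With the stated convention, the pair above recovers the original diffuse and specular contributions exactly.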
3. Data processing
- The authors then warp the 3DMM template mesh to the scan geometry.
- As well as other sources of alignment error, since the three photometric views are not acquired simultaneously, there is likely to be non-rigid deformation of the face between these views.
- For this reason, in Section 3.3 the authors propose a robust algorithm for stitching the photometric views without blurring potentially misaligned features.
3.1. Multiview stereo
- The authors commence by applying uncalibrated structure-from-motion followed by dense multiview stereo to all 24 viewpoints (see Fig. 2, blue boxed images).
- Solving this uncalibrated multiview reconstruction problem provides both the base mesh (see Fig. 2, bottom left) to which the authors fit the 3DMM template and also intrinsic and extrinsic camera parameters for the three photometric views.
- These form the input to their stitching process.
3.2. Template fitting
- The authors use the Basel Face Pipeline, which uses smooth deformations based on Gaussian Processes.
- The authors adapted the threshold for excluding vertices from the optimisation at each resolution level (32mm, 16mm, 8mm, 4mm, 2mm, 1mm and 0.5mm, from coarse to fine) to better handle missing parts of the scans.
- Besides this minor change, the authors used the Basel Face Pipeline as-is, with between 25 and 45 manually annotated landmarks (eyes: 8, nose: 9, mouth: 6, eyebrows: 4, ears: 18).
- The authors used the template of the BFM 2017 for registration, which makes their model compatible with this model.
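The core idea of the Gaussian-process registration above, landmark displacements regressed into a smooth deformation field applied to every template vertex, can be sketched as follows. This is a minimal illustration: the RBF kernel, `sigma`, and `reg` values are assumptions, not the Basel pipeline's actual kernel or multi-resolution scheme.

```python
import numpy as np

def rbf_kernel(a, b, sigma=30.0):
    """Squared-exponential kernel between point sets a (n,3) and b (m,3)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def deform_template(vertices, lm_src, lm_dst, sigma=30.0, reg=1e-6):
    """Smoothly warp `vertices` so template landmarks move to scan landmarks."""
    K = rbf_kernel(lm_src, lm_src, sigma)
    # Solve for per-landmark weights reproducing the landmark displacements.
    weights = np.linalg.solve(K + reg * np.eye(len(lm_src)), lm_dst - lm_src)
    # Propagate the displacements smoothly to all template vertices.
    return vertices + rbf_kernel(vertices, lm_src, sigma) @ weights

# Two landmarks shifted by (5, 5, 5); a mid-way vertex moves part of the way.
lm_src = np.array([[0.0, 0.0, 0.0], [100.0, 0.0, 0.0]])
lm_dst = lm_src + 5.0
vertices = np.array([[50.0, 0.0, 0.0]])
warped = deform_template(vertices, lm_src, lm_dst)
```

Vertices near a landmark follow its displacement closely, while distant vertices barely move; the kernel width controls how far each landmark's influence reaches.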
3.3. Sampling and stitching
- The authors stitch the multiple photometric viewpoints into seamless diffuse and specular per-vertex albedo maps using Poisson blending.
- Blending in the gradient domain via solution of a Poisson equation was first proposed by Pérez et al. for 2D images.
- The approach allows us to avoid visible seams where texture or geometry from different views is inconsistent.
- Otherwise, the authors take the dot product between the surface normal and view vectors as the weight, giving preference to observations whose projected resolution is higher.
- The authors define an additional selection matrix Svk+1 that selects all triangles not selected in any view (i.e. those that have no nonzero weight).
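The per-triangle view weighting described above can be sketched as follows. This is an illustrative reconstruction (the `view_weights` helper and the toy normals are assumptions): each triangle is weighted in each view by the clamped dot product between its normal and the view direction, and triangles with no nonzero weight in any view are collected for the fallback selection.

```python
import numpy as np

def view_weights(normals, view_dirs):
    """normals: (T,3) unit triangle normals; view_dirs: (V,3) unit directions
    towards the cameras. Returns a (V,T) weight matrix, clamped at zero so
    back-facing triangles receive no weight."""
    return np.clip(view_dirs @ normals.T, 0.0, None)

normals = np.array([[0.0, 0.0, 1.0],     # faces camera 0
                    [1.0, 0.0, 0.0],     # faces camera 1
                    [0.0, 0.0, -1.0]])   # faces no camera
views = np.array([[0.0, 0.0, 1.0],
                  [1.0, 0.0, 0.0]])

W = view_weights(normals, views)
best_view = W.argmax(axis=0)                  # preferred view per triangle
unseen = np.where(W.max(axis=0) == 0.0)[0]    # triangles with no nonzero weight
```

Higher weights favour views in which the triangle's projected resolution is higher; the `unseen` set corresponds to the triangles selected by the additional matrix for separate handling.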
3.4. Calibrated colour transformation
- The authors' photometric camera captures RAW linear images.
- The authors transform these to linear sRGB space using a colour transformation matrix computed from light SPD and camera spectral sensitivity calibrations, discretised at D evenly spaced wavelengths.
- The authors measure the spectral power distribution of the LEDs used in their lightstage, e ∈ RD, using a B&W Tek BSR111E-VIS spectroradiometer.
- The first performs white balancing: Twb(C, e) = diag(Cᵀe)⁻¹ (4). The second converts from the camera-specific colour space to the standardised XYZ space: Traw2xyz(C) = C_CIE C⁺ (5), where C_CIE ∈ R^(D×3) contains the wavelength-discrete CIE 1931 2-degree colour matching functions and C⁺ is the pseudoinverse of C.
- To preserve white balance, the authors rescale each row such that Traw2xyz(C)·1 = 1.
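The calibrated transform can be sketched as below. This is an illustrative reconstruction under assumed conventions (sensitivities stored as columns of a D×3 matrix, so Cᵀe yields the per-channel light response); the toy calibration values are made up, and D is reduced to 3 wavelengths for brevity.

```python
import numpy as np

def colour_transform(C, e, C_cie):
    """C, C_cie: (D,3) camera / CIE sensitivities (columns = channels);
    e: (D,) light SPD. Returns the combined 3x3 RAW -> XYZ transform."""
    T_wb = np.diag(1.0 / (C.T @ e))          # white balance: diag(C^T e)^-1
    M = C_cie.T @ np.linalg.pinv(C.T)        # camera colour space -> XYZ
    M = M / M.sum(axis=1, keepdims=True)     # rescale rows so M @ 1 = 1
    return M @ T_wb                          # white balance first, then XYZ

# Toy calibration: an identity-like camera under a non-white light,
# with hypothetical CIE-style matching functions.
C = np.eye(3)
C_cie = np.array([[0.9, 0.1, 0.0],
                  [0.1, 0.8, 0.1],
                  [0.0, 0.1, 0.9]])
e = np.array([0.5, 1.0, 2.0])

T = colour_transform(C, e, C_cie)
white = T @ (C.T @ e)    # a perfect white reflector under the calibrated light
```

Because of the white-balance step and the row rescaling, a white surface under the calibrated light maps to (1, 1, 1), as the text requires.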
4. Integrating 3DRFE
- The authors augment their own dataset by additionally including the 23 scans from the 3DRFE dataset.
- This enables us to estimate geometric camera calibration parameters from the 3D vertex positions and corresponding 2D UV coordinates.
- The authors fit the BFM template to the meshes in the same way as for their own data (see Section 3.2).
- To account for variation in overall skin brightness, during capture the camera gain (ISO) was adjusted for each subject.
- The authors apply this transformation to all of the linearised, ISO-normalised albedo maps to give the final set of maps used in their model.
- The authors model diffuse and specular albedo using a linear statistical model learnt with PCA: x(b) = Pb + x̄ (6), where P ∈ R^(3n×d) contains the d principal components, x̄ ∈ R^(3n) is the vectorised average map and x : R^d → R^(3n) is the generator function that maps from the low-dimensional parameter vector b ∈ R^d.
- For triangles on the boundary between masked and non-masked regions the authors encourage zero gradient.
- For this reason, the authors replace specular albedo values in the eyeball region by a robust maximum (95th percentile) of the estimated specular albedo values in that region (see Fig. 3(e)).
- The authors use symmetry augmentation in their modelling.
- The final model is a combination of the proposed diffuse and specular albedo model to model facial appearance and the BFM 2017 to model face shape and expressions.
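The PCA model in Eq. (6) can be sketched as follows. This is a minimal illustration on random data: the dimensions, the standard-deviation scaling of the components, and the `generate` helper are assumptions, and the diffuse/specular split and symmetry augmentation are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 300))    # 60 vectorised albedo maps, 3n = 300

# Learn the PCA basis via SVD of the mean-centred data.
x_bar = X.mean(axis=0)                                  # average map
U, S, Vt = np.linalg.svd(X - x_bar, full_matrices=False)
d = 10
P = Vt[:d].T * (S[:d] / np.sqrt(len(X) - 1))            # (3n, d) components,
                                                        # scaled to std devs

def generate(b):
    """Generator function of Eq. (6): x(b) = P b + x_bar."""
    return P @ b + x_bar

mean_map = generate(np.zeros(d))    # b = 0 reproduces the average map
new_map = generate(rng.normal(size=d))
```

Sampling `b` from a standard normal produces plausible new maps under the model, and b = 0 recovers the average map exactly.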
- The authors adopted a publicly available model adaptation framework and compare it directly to model adaptation results based on the BFM in Fig. 7.
- The authors perform the experiment on the LFW dataset exactly as originally proposed, exchanging only the model (including applying gamma) and using statistical specular albedo maps during model adaptation.
- These images were captured by SLR cameras in auto mode with no polarisation, representing realistic images in approximately ambient light.
- The authors apply the inverse rendering framework with the same configuration, except that the illumination is limited to an ambient condition, and estimate albedo; the proposed model outperforms the BFM in albedo reconstruction in every single case.
- The authors built and make available the first statistical model of facial diffuse and specular albedo.
- The model at hand fills a gap in the 3DMM literature and may be beneficial in several directions.
- Besides applications for computer graphics and vision, the authors also see a benefit for studying human face perception.
- B. Egger and J. Tenenbaum are supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.
- The authors acknowledge Abhishek Dutta for the original design and construction of their light stage.