
Face Recognition Based on
Fitting a 3D Morphable Model
Volker Blanz and Thomas Vetter, Member, IEEE
Abstract—This paper presents a method for face recognition across variations in pose, ranging from frontal to profile views, and
across a wide range of illuminations, including cast shadows and specular reflections. To account for these variations, the algorithm
simulates the process of image formation in 3D space, using computer graphics, and it estimates 3D shape and texture of faces from
single images. The estimate is achieved by fitting a statistical, morphable model of 3D faces to images. The model is learned from a set
of textured 3D scans of heads. We describe the construction of the morphable model, an algorithm to fit the model to images, and a
framework for face identification. In this framework, faces are represented by model parameters for 3D shape and texture. We present
results obtained with 4,488 images from the publicly available CMU-PIE database and 1,940 images from the FERET database.
Index Terms—Face recognition, shape estimation, deformable model, 3D faces, pose invariance, illumination invariance.
1 INTRODUCTION
In face recognition from images, the gray-level or color
values provided to the recognition system depend not
only on the identity of the person, but also on parameters
such as head pose and illumination. Variations in pose and
illumination, which may produce changes larger than the
differences between different people’s images, are the main
challenge for face recognition [39]. The goal of recognition
algorithms is to separate the characteristics of a face, which
are determined by the intrinsic shape and color (texture) of
the facial surface, from the random conditions of image
generation. Unlike pixel noise, these conditions may be
described consistently across the entire image by a
relatively small set of extrinsic parameters, such as camera
and scene geometry, illumination direction and intensity.
Methods in face recognition fall into two fundamental
strategies: One approach is to treat these parameters as
separate variables and model their functional role explicitly.
The other approach does not formally distinguish between
intrinsic and extrinsic parameters, and the fact that extrinsic
parameters are not diagnostic for faces is only captured
statistically.
The latter strategy is taken in algorithms that analyze
intensity images directly using statistical methods or neural
networks (for an overview, see Section 3.2 in [39]).
To obtain a separate parameter for orientation, some
methods parameterize the manifold formed by different
views of an individual within the eigenspace of images [16],
or define separate view-based eigenspaces [28]. Another
way of capturing the viewpoint dependency is to represent
faces by eigen-lightfields [17].
Two-dimensional face models represent gray values
and their image locations independently [3], [4], [18], [23],
[13], [22]. These models, however, do not distinguish
between rotation angle and shape, and only some of them
separate illumination from texture [18]. Since large rota-
tions cannot be generated easily by the 2D warping used
in these algorithms due to occlusions, multiple view-based
2D models have to be combined [36], [11]. Another
approach that separates the image locations of facial
features from their appearance uses an approximation of
how features deform during rotations [26].
Complete separation of shape and orientation is
achieved by fitting a deformable 3D model to images. Some
algorithms match a small number of feature vertices to
image positions, and interpolate deformations of the surface
in between [21]. Others use restricted, but class-specific
deformations, which can be defined manually [24], or
learned from images [10], from nontextured [1] or textured
3D scans of heads [8].
In order to separate texture (albedo) from illumination
conditions, some algorithms, which are derived from shape-
from-shading, use models of illumination that explicitly
consider illumination direction and intensity for Lamber-
tian [15], [38] or non-Lambertian shading [34]. After
analyzing images with shape-from-shading, some algo-
rithms use a 3D head model to synthesize images at novel
orientations [15], [38].
The face recognition system presented in this paper
combines deformable 3D models with a computer graphics
simulation of projection and illumination. This makes
intrinsic shape and texture fully independent of extrinsic
parameters [8], [7]. Given a single image of a person, the
algorithm automatically estimates 3D shape, texture, and all
relevant 3D scene parameters. In our framework, rotations
in depth or changes of illumination are very simple
operations, and all poses and illuminations are covered by
a single model. Illumination is not restricted to Lambertian
reflection, but takes into account specular reflections and
. V. Blanz is with the Max-Planck-Institut für Informatik, Stuhlsatzenhausweg 85, 66123 Saarbrücken, Germany.
E-mail: blanz@mpi-sb.mpg.de.
. T. Vetter is with the University of Basel, Departement Informatik, Bernoullistrasse 16, 4057 Basel, Switzerland.
E-mail: thomas.vetter@unibas.ch.
Manuscript received 9 Aug. 2002; accepted 10 Mar. 2003.
Recommended for acceptance by P. Belhumeur.
For information on obtaining reprints of this article, please send e-mail to:
tpami@computer.org, and reference IEEECS Log Number 117108.
0162-8828/03/$17.00 © 2003 IEEE. Published by the IEEE Computer Society.

cast shadows, which have considerable influence on the
appearance of human skin.
Our approach is based on a morphable model of 3D faces
that captures the class-specific properties of faces. These
properties are learned automatically from a data set of
3D scans. The morphable model represents shapes and
textures of faces as vectors in a high-dimensional face space,
and involves a probability density function of natural faces
within face space.
Unlike previous systems [8], [7], the algorithm presented
in this paper estimates all 3D scene parameters automati-
cally, including head position and orientation, focal length
of the camera, and illumination direction. This is achieved
by a new initialization procedure that also increases
robustness and reliability of the system considerably. The
new initialization uses image coordinates of between six
and eight feature points. Currently, most face recognition
algorithms require either some initialization, or they are,
unlike our system, restricted to front views or to faces that
are cut out from images.
In this paper, we give a comprehensive description of the
algorithms involved in 1) constructing the morphable
model from 3D scans (Section 3), 2) fitting the model to
images for 3D shape reconstruction (Section 4), which
includes a novel algorithm for parameter optimization
(Appendix B), and 3) measuring similarity of faces for
recognition (Section 5). Recognition results for the image
databases of CMU-PIE [33] and FERET [29] are presented in
Section 5. We start in Section 2 by describing two general
strategies for face recognition with 3D morphable models.
2 PARADIGMS FOR MODEL-BASED RECOGNITION
In face recognition, the set of images that shows all
individuals who are known to the system is often referred
to as gallery [39], [30]. In this paper, one gallery image per
person is provided to the system. Recognition is then
performed on novel probe images. We consider two
particular recognition tasks: For identification, the system
reports which person from the gallery is shown on the
probe image. For verification, a person claims to be a
particular member of the gallery. The system decides if the
probe and the gallery image show the same person (cf. [30]).
Fitting the 3D morphable model to images can be used in
two ways for recognition across different viewing conditions:
Paradigm 1. After fitting the model, recognition can be
based on model coefficients, which represent intrinsic shape
and texture of faces, and are independent of the imaging
conditions. For identification, all gallery images are ana-
lyzed by the fitting algorithm, and the shape and texture
coefficients are stored (Fig. 1). Given a probe image, the
fitting algorithm computes coefficients which are then
compared with all gallery data in order to find the nearest
neighbor. Paradigm 1 is the approach taken in this paper
(Section 5).
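Paradigm 1 thus reduces identification to a nearest-neighbor search in coefficient space. The following minimal sketch illustrates only that comparison step, assuming the fitting algorithm has already produced one stacked coefficient vector per image; the function name and the plain Euclidean distance are illustrative choices, not the paper's actual similarity measures (those are discussed in Section 5).

```python
import numpy as np

def identify(probe_coeffs: np.ndarray, gallery_coeffs: np.ndarray) -> int:
    """Nearest-neighbor identification in model coefficient space.

    probe_coeffs:   (d,) stacked shape/texture coefficients of the probe
    gallery_coeffs: (g, d) one row of stored coefficients per gallery person
    Returns the index of the gallery person closest to the probe.
    """
    distances = np.linalg.norm(gallery_coeffs - probe_coeffs, axis=1)
    return int(np.argmin(distances))
```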
Paradigm 2. Three-dimensional face reconstruction can
also be employed to generate synthetic views from gallery
or probe images [3], [35], [15], [38]. The synthetic views are
then transferred to a second, viewpoint-dependent recogni-
tion system. This paradigm has been evaluated with 10 face
recognition systems in the Face Recognition Vendor Test
2002 [30]: For 9 out of 10 systems, our morphable model and
fitting procedure (Sections 3 and 4) improved performance
on nonfrontal faces substantially.
In many applications, synthetic views have to meet
standard imaging conditions, which may be defined by the
properties of the recognition algorithm, by the way the
gallery images are taken (mug shots), or by a fixed camera
setup for probe images. Standard conditions can be
estimated from an example image by our system (Fig. 2).
If more than one image is required for the second system or
no standard conditions are defined, it may be useful to
synthesize a set of different views of each person.
3 A MORPHABLE MODEL OF 3D FACES
The morphable face model is based on a vector space representation of faces [36] that is constructed such that any convex combination¹ of shape and texture vectors S_i and T_i of a set of examples describes a realistic human face:

$$S = \sum_{i=1}^{m} a_i S_i, \qquad T = \sum_{i=1}^{m} b_i T_i. \tag{1}$$

Continuous changes in the model parameters a_i generate
a smooth transition such that each point of the initial
surface moves toward a point on the final surface. Just as in
morphing, artifacts in intermediate states of the morph are
avoided only if the initial and final points are correspond-
ing structures in the face, such as the tip of the nose.
Therefore, dense point-to-point correspondence is crucial
for defining shape and texture vectors. We describe an
automated method to establish this correspondence in
Section 3.2, and give a definition of S and T in Section 3.3.
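Once dense correspondence makes all example vectors commensurate, generating a face from (1) is a single linear combination. A minimal numpy sketch, under the assumption that the m example shapes and textures are stacked as rows of two arrays:

```python
import numpy as np

def morph(shapes: np.ndarray, textures: np.ndarray,
          a: np.ndarray, b: np.ndarray):
    """Convex combination of example faces, cf. (1).

    shapes:   (m, 3n) shape vectors S_i, one example per row
    textures: (m, 3n) texture vectors T_i
    a, b:     (m,) coefficients; each sums to 1 (footnote 1) so that
              overall size and brightness are preserved
    """
    assert np.isclose(a.sum(), 1.0) and np.isclose(b.sum(), 1.0)
    return a @ shapes, b @ textures
```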
3.1 Database of Three-Dimensional Laser Scans
The morphable model was derived from 3D scans of
100 males and 100 females, aged between 18 and 45 years.
One person is Asian, all others are Caucasian. Applied to
image databases that cover a much larger ethnic variety
Fig. 1. Derived from a database of laser scans, the 3D morphable face model is used to encode gallery and probe images. For identification, the model coefficients α_i, β_i of the probe image are compared with the stored coefficients of all gallery images.
1. To avoid changes in overall size and brightness, a_i and b_i should sum to 1. The additional constraints a_i, b_i ∈ [0, 1] imposed on convex combinations will be replaced by a probabilistic criterion in Section 3.4.

(Section 5), the model seemed to generalize well beyond
ethnic boundaries. Still, a more diverse set of examples
would certainly improve performance.
Recorded with a Cyberware™ 3030PS laser scanner, the scans represent face shape in cylindrical coordinates relative to a vertical axis centered with respect to the head. In 512 angular steps φ covering 360° and 512 vertical steps h at a spacing of 0.615 mm, the device measures radius r, along with red, green, and blue components of surface texture R, G, B. We combine radius and texture data:

$$I(h, \phi) = \big(r(h,\phi),\, R(h,\phi),\, G(h,\phi),\, B(h,\phi)\big)^T, \quad h, \phi \in \{0, \ldots, 511\}. \tag{2}$$
Preprocessing of raw scans involves:
1. filling holes and removing spikes in the surface with
an interactive tool,
2. automated 3D alignment of the faces with the
method of 3D-3D Absolute Orientation [19],
3. semiautomatic trimming along the edge of a bathing
cap, and
4. a vertical, planar cut behind the ears and a
horizontal cut at the neck, to remove the back of
the head, and the shoulders.
3.2 Correspondence Based on Optic Flow
The core step of building a morphable face model is to establish dense point-to-point correspondence between each face and a reference face. The representation in cylindrical coordinates provides a parameterization of the two-dimensional manifold of the facial surface by parameters h and φ. Correspondence is given by a dense vector field v(h, φ) = (Δh(h, φ), Δφ(h, φ))^T such that each point I_1(h, φ) on the first scan corresponds to the point I_2(h + Δh, φ + Δφ) on the second scan. We employ a modified optic flow algorithm to determine this vector field. The following two sections describe the original algorithm and our modifications.
Optic Flow on Gray-Level Images. Many optic flow algorithms (e.g., [20], [25], [2]) are based on the assumption that objects in image sequences I(x, y, t) retain their brightnesses as they move across the image at a velocity (v_x, v_y)^T. This implies

$$\frac{dI}{dt} = v_x \frac{\partial I}{\partial x} + v_y \frac{\partial I}{\partial y} + \frac{\partial I}{\partial t} = 0. \tag{3}$$
For pairs of images I_1, I_2 taken at two discrete moments, the velocities v_x, v_y and the temporal derivative ∂I/∂t in (3) are approximated by finite differences Δx, Δy, and ΔI = I_2 − I_1. If the images are not from a temporal sequence, but show two different objects, corresponding points can no longer be assumed to have equal brightnesses. Still, optic flow algorithms may be applied successfully.
A unique solution for both components of v = (v_x, v_y)^T from (3) can be obtained if v is assumed to be constant on each neighborhood R(x_0, y_0), and the following expression [25], [2] is minimized in each point (x_0, y_0):

$$E(x_0, y_0) = \sum_{x,y \in R(x_0,y_0)} \left[ v_x \frac{\partial I(x,y)}{\partial x} + v_y \frac{\partial I(x,y)}{\partial y} + \Delta I(x,y) \right]^2. \tag{4}$$
We use a 5 × 5 pixel neighborhood R(x_0, y_0). In each point (x_0, y_0), v(x_0, y_0) can be found by solving a 2 × 2 linear system (Appendix A).
In order to deal with large displacements v, the
algorithm of Bergen and Hingorani [2] employs a coarse-
to-fine strategy using Gaussian pyramids of downsampled
images: With the gradient-based method described above,
the algorithm computes the flow field on the lowest level of
resolution and refines it on each subsequent level.
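In code, the per-point minimization of (4) amounts to assembling and solving those 2 × 2 normal equations. A single-level sketch, assuming grayscale float images; the coarse-to-fine pyramid of [2] would wrap this in a loop over downsampled copies, warping I_2 by the current flow estimate at each level.

```python
import numpy as np

def flow_at(I1: np.ndarray, I2: np.ndarray, x0: int, y0: int, r: int = 2):
    """Estimate v = (v_x, v_y) at (x0, y0) by minimizing (4) over a
    (2r+1) x (2r+1) neighborhood; r = 2 gives the paper's 5x5 window."""
    Iy, Ix = np.gradient(I1)                      # spatial derivatives
    win = np.s_[y0 - r:y0 + r + 1, x0 - r:x0 + r + 1]
    gx, gy = Ix[win].ravel(), Iy[win].ravel()
    dI = (I2 - I1)[win].ravel()                   # finite difference of dI/dt
    A = np.array([[gx @ gx, gx @ gy],             # 2x2 system (Appendix A)
                  [gx @ gy, gy @ gy]])
    rhs = -np.array([gx @ dI, gy @ dI])
    return np.linalg.solve(A, rhs)                # singular in textureless regions
```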
Generalization to three-dimensional surfaces. For processing 3D laser scans I(h, φ), (4) is replaced by
Fig. 2. In 3D model fitting, light direction and intensity are estimated automatically, and cast shadows are taken into account. The figure shows
original PIE images (top), reconstructions rendered into the originals (second row), and the same reconstructions rendered with standard illumination
(third row) taken from the top right image.

$$E = \sum_{h,\phi \in R} \left\| v_h \frac{\partial I(h,\phi)}{\partial h} + v_\phi \frac{\partial I(h,\phi)}{\partial \phi} + \Delta I \right\|^2, \tag{5}$$

with a norm

$$\|\Delta I\|^2 = w_r\, \Delta r^2 + w_R\, \Delta R^2 + w_G\, \Delta G^2 + w_B\, \Delta B^2. \tag{6}$$
Weights w_r, w_R, w_G, and w_B compensate for different variations within the radius data and the red, green, and blue texture components, and control the overall weighting of shape versus texture information. The weights are chosen heuristically. The minimum of (5) is again given by a 2 × 2 linear system (Appendix A).
Correspondence between scans of different individuals,
who may differ in overall brightness and size, is improved
by using Laplacian pyramids (band-pass filtering) rather
than Gaussian pyramids (low-pass filtering). Additional
quantities, such as Gaussian curvature, mean curvature, or
surface normals, may be incorporated in I(h, φ). To obtain
reliable results even in regions of the face with no salient
structures, a specifically designed smoothing and interpola-
tion algorithm (Appendix A.1) is added to the matching
procedure on each level of resolution.
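Under the norm (6), the only change to the 2 × 2 system above is that the sums additionally run over the four weighted channels. A sketch of that accumulation, assuming each channel of I(h, φ) is a 2D float array; the weight values passed in are placeholders, since the paper chooses them heuristically.

```python
import numpy as np

def flow_at_3d(ref_chans, tgt_chans, weights, h0, p0, r=2):
    """Weighted multichannel flow at (h0, p0), cf. (5) and (6).

    ref_chans, tgt_chans: sequences of 2D arrays (r, R, G, B) over (h, phi)
    weights:              (w_r, w_R, w_G, w_B)
    """
    A, rhs = np.zeros((2, 2)), np.zeros(2)
    win = np.s_[h0 - r:h0 + r + 1, p0 - r:p0 + r + 1]
    for C1, C2, w in zip(ref_chans, tgt_chans, weights):
        gh, gp = np.gradient(C1)                  # derivatives in h and phi
        gh, gp = gh[win].ravel(), gp[win].ravel()
        dC = (C2 - C1)[win].ravel()
        A += w * np.array([[gh @ gh, gh @ gp],
                           [gh @ gp, gp @ gp]])
        rhs -= w * np.array([gh @ dC, gp @ dC])
    return np.linalg.solve(A, rhs)                # (v_h, v_phi)
```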
3.3 Definition of Face Vectors
The definition of shape and texture vectors is based on a reference face I_0, which can be any three-dimensional face model. Our reference face is a triangular mesh with 75,972 vertices derived from a laser scan. Let the vertices k ∈ {1, ..., n} of this mesh be located at (h_k, φ_k, r(h_k, φ_k)) in cylindrical and at (x_k, y_k, z_k) in Cartesian coordinates and have colors (R_k, G_k, B_k). Reference shape and texture vectors are then defined by

$$S_0 = (x_1, y_1, z_1, x_2, \ldots, x_n, y_n, z_n)^T, \tag{7}$$
$$T_0 = (R_1, G_1, B_1, R_2, \ldots, R_n, G_n, B_n)^T. \tag{8}$$
To encode a novel scan I (Fig. 3, bottom), we compute the flow field from I_0 to I, and convert I(h′, φ′) to Cartesian coordinates x(h′, φ′), y(h′, φ′), z(h′, φ′). Coordinates (x_k, y_k, z_k) and color values (R_k, G_k, B_k) for the shape and texture vectors S and T are then sampled at h′_k = h_k + Δh(h_k, φ_k), φ′_k = φ_k + Δφ(h_k, φ_k).
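A sketch of this sampling step, assuming the scan has already been converted channel-wise to Cartesian maps x, y, z and colors R, G, B over the (h, φ) grid, and that dh, dp hold the flow field components from the reference to the scan. Nearest-grid-point lookup stands in here for interpolation.

```python
import numpy as np

def sample_face_vectors(maps: dict, dh: np.ndarray, dp: np.ndarray,
                        ref_h: np.ndarray, ref_p: np.ndarray):
    """Build S and T for a novel scan, cf. (7) and (8).

    maps:   2D arrays 'x','y','z','R','G','B' over the (h, phi) grid
    dh, dp: flow field components Delta-h and Delta-phi
    ref_h, ref_p: integer grid positions (h_k, phi_k) of the reference
                  mesh vertices
    """
    H, P = dh.shape
    hk = np.clip(np.rint(ref_h + dh[ref_h, ref_p]).astype(int), 0, H - 1)
    pk = np.clip(np.rint(ref_p + dp[ref_h, ref_p]).astype(int), 0, P - 1)
    S = np.stack([maps[c][hk, pk] for c in 'xyz'], axis=1).ravel()
    T = np.stack([maps[c][hk, pk] for c in 'RGB'], axis=1).ravel()
    return S, T
```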
3.4 Principal Component Analysis
We perform a Principal Component Analysis (PCA, see [12]) on the set of shape and texture vectors S_i and T_i of example faces i = 1, ..., m. Ignoring the correlation between
shape and texture data, we analyze shape and texture
separately.
For shape, we subtract the average s̄ = (1/m) Σ_{i=1}^m S_i from each shape vector, a_i = S_i − s̄, and define a data matrix A = (a_1, a_2, ..., a_m).
The essential step of PCA is to compute the eigenvectors s_1, s_2, ... of the covariance matrix C = (1/m) A A^T = (1/m) Σ_{i=1}^m a_i a_i^T, which can be achieved by a Singular Value Decomposition [31] of A. The eigenvalues of C, σ²_{S,1} ≥ σ²_{S,2} ≥ ..., are the variances of the data along each eigenvector. By the same procedure, we obtain texture eigenvectors t_i and variances σ²_{T,i}. Results are visualized in Fig. 4. The eigenvectors form an orthogonal basis,

$$S = \bar{s} + \sum_{i=1}^{m-1} \alpha_i\, s_i, \qquad T = \bar{t} + \sum_{i=1}^{m-1} \beta_i\, t_i, \tag{9}$$
and PCA provides an estimate of the probability density within face space:

$$p_S(S) \sim e^{-\frac{1}{2}\sum_i \alpha_i^2/\sigma_{S,i}^2}, \qquad p_T(T) \sim e^{-\frac{1}{2}\sum_i \beta_i^2/\sigma_{T,i}^2}. \tag{10}$$
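A compact sketch of this PCA via SVD, recovering the shape eigenvectors, the variances σ²_{S,i}, and the exponent of the prior (10); texture is handled identically. It assumes the example vectors arrive as columns, matching the data matrix A above.

```python
import numpy as np

def face_pca(S: np.ndarray):
    """PCA of shape vectors. S: (3n, m), one example face per column.

    Returns the average face, eigenvectors s_i (columns of U), and
    variances sigma^2_{S,i}; only the first m-1 variances are nonzero."""
    s_bar = S.mean(axis=1, keepdims=True)
    A = S - s_bar                                  # data matrix of deviations
    U, sv, _ = np.linalg.svd(A, full_matrices=False)
    return s_bar.ravel(), U, sv**2 / S.shape[1]    # eigenvalues of (1/m) A A^T

def log_prior(alpha: np.ndarray, var: np.ndarray) -> float:
    """Exponent of p_S in (10), up to a constant; pass only the
    components with nonzero variance."""
    return -0.5 * np.sum(alpha**2 / var)
```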
3.5 Segments
From a given set of examples, a larger variety of different
faces can be generated if linear combinations of shape and
texture are formed separately for different regions of the
face. In our system, these regions are the eyes, nose, mouth,
and the surrounding area [8]. Once manually defined on the
reference face, the segmentation applies to the entire
morphable model.
For continuous transitions between the segments, we apply a modification of the image blending technique of [9]: x, y, z coordinates and colors R, G, B are stored in arrays x(h, φ), ... based on the mapping i ↦ (h_i, φ_i) of the reference face. The blending technique interpolates x, y, z and R, G, B across an overlap in the (h, φ)-domain, which is large for low spatial frequencies and small for high frequencies.
Fig. 3. For 3D laser scans parameterized by cylindrical coordinates (h, φ), the flow field that maps each point of the reference face (top) to the corresponding point of the example (bottom) is used to form shape and texture vectors S and T.
Fig. 4. The average and the first two principal components of a data set of 200 3D face scans, visualized by adding ±3σ_{S,i} s_i and ±3σ_{T,i} t_i to the average face.

4 MODEL-BASED IMAGE ANALYSIS
The goal of model-based image analysis is to represent a novel face in an image by model coefficients α_i and β_i (9) and provide a reconstruction of 3D shape. Moreover, it automatically estimates all relevant parameters of the three-dimensional scene, such as pose, focal length of the camera, light intensity, color, and direction.
In an analysis-by-synthesis loop, the algorithm finds model parameters and scene parameters such that the model, rendered by computer graphics algorithms, produces an image as similar as possible to the input image I_input (Fig. 5).² The iterative optimization starts from the average face and standard rendering conditions (front view, frontal illumination, cf. Fig. 6).
For initialization, the system currently requires image coordinates of about seven facial feature points, such as the corners of the eyes or the tip of the nose (Fig. 6). With an interactive tool, the user defines these points j = 1, ..., 7 by alternately clicking on a point of the reference head to select a vertex k_j of the morphable model and on the corresponding point (q_{x,j}, q_{y,j}) in the image. Depending on what part of the face is visible in the image, different vertices k_j may be selected for each image. Some salient features in images, such as the contour line of the cheek, cannot be attributed to a single vertex of the model, but depend on the particular viewpoint and shape of the face. The user can define such points in the image and label them as contours. During the fitting procedure, the algorithm determines potential contour points of the 3D model based on the angle between surface normal and viewing direction and selects the closest contour point of the model as k_j in each iteration.
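The contour test itself reduces to a dot-product threshold on the visible surface. A sketch, assuming unit surface normals and viewing directions per vertex in camera coordinates; the threshold eps is an illustrative value, not taken from the paper.

```python
import numpy as np

def contour_candidates(normals: np.ndarray, view_dirs: np.ndarray,
                       eps: float = 0.05) -> np.ndarray:
    """Vertices whose normal is nearly perpendicular to the viewing
    direction lie on the occluding contour. Inputs: (n, 3) unit vectors."""
    return np.where(np.abs(np.sum(normals * view_dirs, axis=1)) < eps)[0]

def nearest_contour_vertex(candidates: np.ndarray, proj_xy: np.ndarray,
                           clicked_xy: np.ndarray) -> int:
    """Per iteration, pick the model contour point whose image-plane
    projection is closest to a user-labeled contour point."""
    d = np.linalg.norm(proj_xy[candidates] - clicked_xy, axis=1)
    return int(candidates[np.argmin(d)])
```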
The following section summarizes the image synthesis
from the model, and Section 4.2 describes the analysis-by-
synthesis loop for parameter estimation.
4.1 Image Synthesis
The three-dimensional positions and the color values of the model's vertices are given by the coefficients α_i and β_i and (9). Rendering an image includes the following steps.
4.1.1 Image Positions of Vertices
A rigid transformation maps the object-centered coordinates x_k = (x_k, y_k, z_k)^T of each vertex k to a position relative to the camera:

$$(w_{x,k}, w_{y,k}, w_{z,k})^T = R_\gamma R_\theta R_\phi\, \mathbf{x}_k + t_w. \tag{11}$$

The angles φ and θ control in-depth rotations around the vertical and horizontal axis, and γ defines a rotation around the camera axis. t_w is a spatial shift.
A perspective projection then maps vertex k to image plane coordinates (p_{x,k}, p_{y,k}):

$$p_{x,k} = P_x + f\, \frac{w_{x,k}}{w_{z,k}}, \qquad p_{y,k} = P_y - f\, \frac{w_{y,k}}{w_{z,k}}. \tag{12}$$

f is the focal length of the camera, which is located in the origin, and (P_x, P_y) defines the image-plane position of the optical axis (principal point).
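A direct transcription of (11) and (12), with one consistent choice of rotation axes (φ about the vertical axis, θ about the horizontal axis, γ about the camera axis); angles are in radians and vertices are stored as rows.

```python
import numpy as np

def project_vertices(X, phi, theta, gamma, t_w, f, Px, Py):
    """Rigid transformation (11) followed by perspective projection (12).

    X: (n, 3) object-centered vertex positions; returns (n, 2) image points."""
    c, s = np.cos, np.sin
    R_phi = np.array([[ c(phi), 0, s(phi)],          # about vertical axis
                      [ 0,      1, 0     ],
                      [-s(phi), 0, c(phi)]])
    R_theta = np.array([[1, 0,         0        ],   # about horizontal axis
                        [0, c(theta), -s(theta)],
                        [0, s(theta),  c(theta)]])
    R_gamma = np.array([[c(gamma), -s(gamma), 0],    # about camera axis
                        [s(gamma),  c(gamma), 0],
                        [0,         0,        1]])
    W = X @ (R_gamma @ R_theta @ R_phi).T + t_w      # camera coordinates, (11)
    px = Px + f * W[:, 0] / W[:, 2]                  # image coordinates, (12)
    py = Py - f * W[:, 1] / W[:, 2]
    return np.stack([px, py], axis=1)
```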
4.1.2 Illumination and Color
Shading of surfaces depends on the direction of the surface normals n. The normal vector to a triangle k_1 k_2 k_3 of the face mesh is given by a vector product of the edges, (x_{k_1} − x_{k_2}) × (x_{k_1} − x_{k_3}), which is normalized to unit length and rotated along with the head (11). For fitting the model to an image, it is sufficient to consider the centers of triangles only, most of which are about 0.2 mm² in size. The
2. Fig. 5 is illustrated with linear combinations of example faces
according to (1) rather than principal components (9) for visualization.
Fig. 5. The goal of the fitting process is to find shape and texture coefficients α_i and β_i describing a three-dimensional face model such that rendering R_ρ produces an image I_model that is as similar as possible to I_input.
Fig. 6. Face reconstruction from a single image (top, left) and a set of
feature points (top, center): Starting from standard pose and illumination
(top, right), the algorithm computes a rigid transformation and a slight
deformation to fit the features. Subsequently, illumination is estimated.
Shape, texture, transformation, and illumination are then optimized for
the entire face and refined for each segment (second row). From the
reconstructed face, novel views can be generated (bottom row).
