
3D Face Recognition under Expressions, Occlusions, and Pose Variations

01 Sep 2013-IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE Computer Society)-Vol. 35, Iss: 9, pp 2270-2283
TL;DR: A novel geometric framework for analyzing 3D faces, with the specific goals of comparing, matching, and averaging their shapes, which allows for formal statistical inferences, such as the estimation of missing facial parts using PCA on tangent spaces and computing average shapes.
Abstract: We propose a novel geometric framework for analyzing 3D faces, with the specific goals of comparing, matching, and averaging their shapes. Here we represent facial surfaces by radial curves emanating from the nose tips and use elastic shape analysis of these curves to develop a Riemannian framework for analyzing shapes of full facial surfaces. This representation, along with the elastic Riemannian metric, seems natural for measuring facial deformations and is robust to challenges such as large facial expressions (especially those with open mouths), large pose variations, missing parts, and partial occlusions due to glasses, hair, and so on. This framework is shown to be promising from both-empirical and theoretical-perspectives. In terms of the empirical evaluation, our results match or improve upon the state-of-the-art methods on three prominent databases: FRGCv2, GavabDB, and Bosphorus, each posing a different type of challenge. From a theoretical perspective, this framework allows for formal statistical inferences, such as the estimation of missing facial parts using PCA on tangent spaces and computing average shapes.

Summary (4 min read)

1 INTRODUCTION

  • Due to the natural, non-intrusive, and high throughput nature of face data acquisition, automatic face recognition has many benefits when compared to other biometrics.
  • Amongst different modalities available for face imaging, 3D scanning has a major advantage over 2D color imaging in that nuisance variables, such as illumination and small pose changes, have a relatively smaller influence on the observations.
  • 3D scans often suffer from the problem of missing parts due to self occlusions or external occlusions, or some imperfections in the scanning technology.
  • Additionally, the authors provide some basic tools for statistical shape analysis of facial surfaces.

1.1 Previous Work

  • The task of recognizing 3D face scans has been approached in many ways, leading to varying levels of success.
  • Similar approaches, but using manually annotated models, are presented in [31], [17].
  • To handle the open mouth problem, they first detect and remove the lip region, and then compute the surface distance in presence of a hole corresponding to the removed part [5].
  • Samir et al. [28] use the level curves of the surface distance function (from the tip of the nose) as features for face recognition.
  • Fig. 2 shows some facial expressions leading to a significant shrinking or stretching of the skin surface and, thus, causing both Euclidean and surface distances between these points to change.
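The contrast between Euclidean and surface distance can be made concrete with a small sketch. The toy example below (not from the paper) approximates surface distance by the shortest path along mesh edges via Dijkstra, and shows it exceeding the straight-line distance on a "bumped" strip; the mesh and values are illustrative assumptions.

```python
import heapq
import math

def euclidean(p, q):
    return math.dist(p, q)

def surface_distance(points, edges, src, dst):
    """Approximate surface (geodesic) distance by the shortest path
    along mesh edges (Dijkstra). `edges` is a list of index pairs."""
    adj = {i: [] for i in range(len(points))}
    for i, j in edges:
        w = euclidean(points[i], points[j])
        adj[i].append((j, w))
        adj[j].append((i, w))
    dist = {src: 0.0}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            return d
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return float("inf")

# Toy "bumped" strip of 4 vertices: the path over the bump is longer
# than the straight line between the endpoints.
pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5), (2.0, 0.0, 0.5), (3.0, 0.0, 0.0)]
eds = [(0, 1), (1, 2), (2, 3)]
d_surf = surface_distance(pts, eds, 0, 3)
d_eucl = euclidean(pts[0], pts[3])
print(d_surf > d_eucl)  # surface distance exceeds Euclidean
```

Under expressions, both quantities change, which is exactly the point made in Fig. 2: neither distance is invariant when the skin stretches.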

1.2 Overview of Our Approach

  • This paper presents a Riemannian framework for 3D facial shape analysis.
  • This framework is based on elastically matching and comparing radial curves emanating from the tip of the nose and it handles several of the problems described above.
  • To handle the missing data, it introduces a restoration step that uses statistical estimation on shape manifolds of curves.
  • This basic setup is evaluated on the FRGCv2 dataset following the standard protocol (see Section 4.2).
  • These steps include occlusion detection (Component I) and missing data restoration (Component II).

2.1 Motivation for Radial Curves

  • The changes in facial expressions affect different regions of a facial surface differently.
  • In the case of the missing parts and partial occlusion, at least some part of every radial curve is usually available.
  • Based on these arguments, the authors choose a novel geometrical representation of facial surfaces using radial curves that start from the nose tip.

2.2 Motivation for Elasticity

  • Consider the two parameterized curves shown in Fig. 5; call them β1 and β2.
  • The expression on the left has the mouth open whereas the expression on the right has the mouth closed.
  • In order to compare their shapes, the authors need to register points across those curves.
  • For curves, the problem of optimal registration is actually the same as that of optimal re-parameterization.
  • This optimization leads to a proper distance and an optimal deformation between the shapes of curves.
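For sampled curves, the search over re-parameterizations is carried out by dynamic programming. The paper's elastic matching operates on SRVFs; the sketch below uses a plain DTW-style monotone correspondence between raw sample points as an illustrative stand-in, not the authors' algorithm.

```python
import math

def dtw_register(c1, c2):
    """Discrete stand-in for optimal re-parameterization: dynamic
    programming finds a monotone correspondence between samples of
    two curves minimizing the summed point distances (DTW)."""
    n, m = len(c1), len(c2)
    INF = float("inf")
    D = [[INF] * m for _ in range(n)]
    D[0][0] = math.dist(c1[0], c2[0])
    for i in range(n):
        for j in range(m):
            if i == j == 0:
                continue
            best = min(
                D[i - 1][j] if i else INF,
                D[i][j - 1] if j else INF,
                D[i - 1][j - 1] if i and j else INF,
            )
            D[i][j] = best + math.dist(c1[i], c2[j])
    # Backtrack to recover the correspondence (the "registration").
    path, i, j = [(n - 1, m - 1)], n - 1, m - 1
    while (i, j) != (0, 0):
        cands = []
        if i:
            cands.append((D[i - 1][j], i - 1, j))
        if j:
            cands.append((D[i][j - 1], i, j - 1))
        if i and j:
            cands.append((D[i - 1][j - 1], i - 1, j - 1))
        _, i, j = min(cands)
        path.append((i, j))
    return D[n - 1][m - 1], path[::-1]

# Registering a curve with itself has zero cost and a diagonal path.
curve = [(math.cos(t / 10), math.sin(t / 10), 0.0) for t in range(11)]
cost, path = dtw_register(curve, curve)
print(cost == 0.0, path[0], path[-1])
```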

2.3 Automated Extraction of Radial Curves

  • Each facial surface is represented by an indexed collection of radial curves that are defined and extracted as follows.
  • Each radial curve is obtained by slicing the facial surface with a plane Pα that has the nose tip as its origin and makes an angle α with the plane containing the reference curve.
  • Using these curves, the authors will demonstrate that the elastic framework is well suited to modeling of deformations associated with changes in facial expressions and for handling missing data.
  • The gallery face in this example belongs to the same person under the same expression.
  • Since the curve extraction on the probe face is based on the gallery nose coordinates, which belong to another person, the curves may be shifted in this nose region.
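A minimal point-cloud sketch of this extraction step follows; the plane geometry, tolerance, and axis conventions are illustrative assumptions (the paper slices the mesh surface itself with planes Pα).

```python
import math

def extract_radial_curve(points, nose_tip, alpha, tol=0.05):
    """Keep the scan points lying within `tol` of the plane through
    the nose tip at angle `alpha` (rotated about the depth axis from
    a vertical reference plane), ordered by distance from the tip.
    Simplified stand-in for slicing the surface with plane P_alpha."""
    nx, ny = -math.sin(alpha), math.cos(alpha)   # plane normal in x-y
    curve = []
    for p in points:
        dx, dy, dz = (p[0] - nose_tip[0], p[1] - nose_tip[1],
                      p[2] - nose_tip[2])
        if abs(dx * nx + dy * ny) < tol:         # near the slicing plane
            r = math.sqrt(dx * dx + dy * dy + dz * dz)
            curve.append((r, p))
    curve.sort(key=lambda t: t[0])
    return [p for _, p in curve]

# Points along the y-axis form one radial curve; the off-plane point
# is excluded.
pts = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.1), (0.0, 2.0, 0.0), (1.0, 1.0, 0.0)]
c = extract_radial_curve(pts, (0.0, 0.0, 0.0), math.pi / 2)
print(len(c))
```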

2.4 Curve Quality Filter

  • In situations involving non-frontal 3D scans, some curves may be partially hidden due to self occlusion.
  • The use of these curves in face recognition can severely degrade the recognition performance and, therefore, they should be identified and discarded.
  • The authors introduce a quality filter that uses the continuity and the length of a curve to detect such curves.
  • The discontinuity or the shortness of a curve results either from missing data or large noise.
  • Recall that during the pre-processing step, there is a provision for filling holes.
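The filter itself reduces to two checks per curve: no large gap between consecutive samples, and a sufficient total length. A sketch with illustrative thresholds (the paper's actual values are not reproduced here):

```python
import math

def passes_quality_filter(curve, max_gap=0.02, min_length=0.08):
    """Reject a radial curve if it has a large gap between
    consecutive samples (discontinuity, e.g. from missing data) or
    is too short overall. Thresholds are illustrative."""
    if len(curve) < 2:
        return False
    gaps = [math.dist(curve[i], curve[i + 1])
            for i in range(len(curve) - 1)]
    return max(gaps) <= max_gap and sum(gaps) >= min_length

good = [(0.01 * i, 0.0, 0.0) for i in range(11)]   # continuous, long
broken = good[:3] + good[8:]                       # 0.05 gap inside
short = good[:4]                                   # too short
print(passes_quality_filter(good),
      passes_quality_filter(broken),
      passes_quality_filter(short))
```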

3.1 Background on the Shapes of Curves

  • More precisely, as shown in [30], an elastic metric for comparing shapes of curves becomes the simple L2-metric under the SRVF representation.
  • (A similar metric and representation for curves was also developed by Younes et al. [33] but it only applies to planar curves and not to facial curves).
  • Furthermore, under L2-metric, the re-parametrization group acts by isometries on the manifold of q functions, which is not the case for the original curve β.
  • By iterating between these two, the authors can reach a solution for the joint optimization problem.
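The SRVF of [30] can be computed directly from a sampled curve as q(t) = β′(t)/√‖β′(t)‖; under this representation the elastic metric reduces to the plain L2 metric. A finite-difference sketch (the full framework additionally optimizes over rotation and re-parameterization before taking this distance):

```python
import math

def srvf(curve):
    """Square-Root Velocity Function of a sampled curve:
    q(t) = beta'(t) / sqrt(|beta'(t)|), via finite differences."""
    q = []
    for i in range(len(curve) - 1):
        v = [b - a for a, b in zip(curve[i], curve[i + 1])]
        speed = math.sqrt(sum(x * x for x in v))
        q.append([x / math.sqrt(speed) for x in v] if speed > 1e-12
                 else [0.0] * len(v))
    return q

def l2_dist(q1, q2):
    """L2 distance between two SRVFs (the elastic metric of [30]
    under this representation)."""
    return math.sqrt(sum((a - b) ** 2
                         for u, v in zip(q1, q2) for a, b in zip(u, v)))

# Identical curves are at distance zero; a stretched copy is not,
# reflecting that the metric penalizes stretching.
c1 = [(t / 10, 0.0, 0.0) for t in range(11)]
c2 = [(t / 5, 0.0, 0.0) for t in range(11)]
print(l2_dist(srvf(c1), srvf(c1)), l2_dist(srvf(c1), srvf(c2)) > 0)
```

Because the SRVF is built from the derivative only, it is automatically invariant to translations of the curve.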

3.2 Shape Metric for Facial Surfaces

  • Now the authors extend the framework from radial curves to full facial surfaces.
  • The indexing provides a correspondence between curves across faces.
  • Since the authors have deformations (geodesic paths) between corresponding curves, they can combine these deformations to obtain deformations between full facial surfaces.
  • Algorithm 1 is used to calculate the geodesic path in the shape space.
  • The upper lips match the upper lips, for instance, and this helps produce a natural opening of the mouth as illustrated in the top row in Fig. 10.
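Combining per-curve distances into a surface metric can be sketched as a sum over the indexed collection; the dictionary indexing and the example distance function below are illustrative assumptions, and curves discarded by the quality filter on either face are simply skipped.

```python
import math

def face_distance(face1, face2, curve_dist):
    """A face is an indexed collection {alpha: curve}; the distance
    between two faces accumulates pairwise distances between
    corresponding curves that survived filtering on both faces."""
    total = 0.0
    for alpha in face1:
        if alpha in face2:          # both curves survived filtering
            total += curve_dist(face1[alpha], face2[alpha])
    return total

flat = {a: [(r, 0.0) for r in range(5)] for a in range(0, 360, 45)}
bent = {a: [(r, 0.1 * r) for r in range(5)] for a in range(0, 360, 45)}
d = lambda u, v: math.sqrt(sum(math.dist(p, q) ** 2
                               for p, q in zip(u, v)))
print(face_distance(flat, flat, d) == 0.0,
      face_distance(flat, bent, d) > 0.0)
```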

3.3 Computation of the Mean Shape

  • One can use the notion of Karcher mean [14] to define an average face that can serve as a representative face of a group of faces.
  • The Karcher mean is then defined by S̄ = argmin_{S ∈ Sⁿ} V(S).
  • The algorithm for computing Karcher mean is a standard one, see e.g. [8], and is not repeated here to save space.
  • This minimizer may not be unique and, in practice, one can pick any one of those solutions as the mean face.
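Since suitably normalized SRVFs lie on a unit hypersphere, the Karcher mean can be illustrated with the sphere's exponential and log maps; the gradient-descent loop below is a generic sketch on the unit sphere, not the exact algorithm of [8].

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def log_map(p, x):
    """Tangent vector at unit vector p pointing toward x."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(p, x))))
    theta = math.acos(dot)
    if theta < 1e-12:
        return [0.0] * len(p)
    return [theta / math.sin(theta) * (xi - dot * pi_)
            for xi, pi_ in zip(x, p)]

def exp_map(p, v):
    """Shoot from p along tangent vector v, staying on the sphere."""
    n = math.sqrt(sum(x * x for x in v))
    if n < 1e-12:
        return list(p)
    return [math.cos(n) * pi_ + math.sin(n) / n * vi
            for pi_, vi in zip(p, v)]

def karcher_mean(points, iters=50, step=0.5):
    """Iterate: average the log maps at the current estimate, then
    shoot along that average until the update vanishes."""
    mu = normalize(points[0])
    for _ in range(iters):
        logs = [log_map(mu, normalize(x)) for x in points]
        avg = [sum(col) / len(points) for col in zip(*logs)]
        mu = exp_map(mu, [step * a for a in avg])
    return mu

# Two unit vectors tilted symmetrically average back to the bisector.
pts = [(1.0, 0.1, 0.0), (1.0, -0.1, 0.0)]
mu = karcher_mean(pts)
print(abs(mu[1]) < 1e-9 and abs(mu[0] - 1.0) < 1e-9)
```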

3.4 Completion of Partially-Obscured Curves

  • Earlier the authors have introduced a filtering step that finds and removes curves with missing parts.
  • Once the authors detect points that belong to the face and points that belong to the occluding object, they first remove the occluding object and use a statistical model in the shape space of radial curves to complete the broken curves.
  • To keep the model simple, the authors use the PCA of the training data, in an appropriate vector space, to form an orthogonal basis representing training shapes.
  • In order to evaluate this reconstruction step, the authors have compared the restored surface (shown in the top row of Fig. 12) with the complete neutral face of that class, as shown in Fig. 13.
  • In the remainder of this paper, the authors will apply this comprehensive framework for 3D face recognition using a variety of well known and challenging datasets.
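A toy version of this completion step is sketched below with a single principal component obtained by power iteration; the training data are illustrative, and the paper performs PCA in the tangent space of the curve shape manifold rather than directly on coordinates.

```python
import math

def mean_vec(X):
    return [sum(col) / len(X) for col in zip(*X)]

def first_pc(X, iters=100):
    """Leading principal direction of training vectors via power
    iteration (toy stand-in for PCA on tangent spaces)."""
    mu = mean_vec(X)
    C = [[x - m for x, m in zip(row, mu)] for row in X]
    v = [1.0] * len(mu)
    for _ in range(iters):
        coeffs = [sum(ci * vi for ci, vi in zip(c, v)) for c in C]
        w = [sum(a * c[j] for a, c in zip(coeffs, C))
             for j in range(len(v))]
        n = math.sqrt(sum(x * x for x in w))
        v = [x / n for x in w]
    return mu, v

def complete_curve(partial, observed, mu, v):
    """Fit the PCA coefficient using observed coordinates only, then
    fill in the missing coordinates from the model."""
    a = (sum((partial[i] - mu[i]) * v[i] for i in observed)
         / sum(v[i] * v[i] for i in observed))
    return [partial[i] if i in observed else mu[i] + a * v[i]
            for i in range(len(mu))]

# Training curves vary along one mode; a half-observed curve is
# restored consistently with that mode of variation.
train = [[0, 0, 0, 0], [1, 2, 3, 4], [2, 4, 6, 8]]
mu, v = first_pc(train)
restored = complete_curve([3, 6, 0, 0], {0, 1}, mu, v)
print([round(x, 6) for x in restored])
```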

4.1 Data Preprocessing

  • Since the raw data contain a number of imperfections, such as holes and spikes, and include undesired parts, such as clothes, neck, ears and hair, the data pre-processing step is very important and nontrivial.
  • As illustrated in Fig. 14, this step includes the following operations.
  • The hole-filling filter identifies and fills holes in input meshes.
  • The holes are created either because of the absorption of laser in dark areas, such as eyebrows and mustaches, or self-occlusion or open mouths.
  • The nose tip is automatically detected for frontal scans and manually annotated for scans with occlusions and large pose variation.
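Two of these preprocessing steps can be sketched simply. For a roughly frontal, depth-aligned scan, a common baseline (not necessarily the authors' detector) takes the nose tip as the point of maximal depth protrusion, and face cropping then keeps points within a fixed radius of it; the radius is an illustrative assumption.

```python
import math

def detect_nose_tip(points):
    """Crude nose-tip heuristic for frontal, z-aligned scans: the
    point protruding most toward the sensor (largest z)."""
    return max(points, key=lambda p: p[2])

def crop_face(points, tip, radius=0.09):
    """Keep only points within `radius` of the nose tip, discarding
    neck, hair, clothes, etc. Radius in meters, illustrative."""
    return [p for p in points if math.dist(p, tip) <= radius]

scan = [(0.0, 0.0, 0.05), (0.01, 0.02, 0.02), (0.0, -0.2, 0.0)]
tip = detect_nose_tip(scan)
face = crop_face(scan, tip)
print(tip, len(face))   # the far "neck" point is cropped away
```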

4.2 Comparative Evaluation on the FRGCv2 Dataset

  • For the first evaluation the authors use the FRGCv2 dataset in which the scans have been manually clustered into three categories: neutral expression, small expression, and large expression.
  • Note that this method results in 97.7% rank-1 recognition rate in the case of neutral vs. all.
  • To that end, one would need a systematic evaluation on a dataset with missing-data issues, e.g. the GavabDB.
  • For the standard protocol tests (the ROC III mask of FRGCv2), the authors obtain verification rates of around 97%, which is comparable to the best published results.
  • Since scans in FRGCv2 are mostly frontal and have high quality, many methods are able to provide good performance.

4.3 Evaluation on the GavabDB Dataset

  • Since GavabDB [21] has many noisy 3D face scans under large facial expressions, the authors will use that database to help evaluate their framework.
  • Each subject was scanned nine times from different angles and under different facial expressions (six with a neutral expression and three with non-neutral expressions).
  • As noted, their approach provides the highest recognition rate for faces with non-neutral expressions (94.54%).
  • Fig. 17 illustrates examples of correct and incorrect matches for some probe faces.
  • The performance decreases for scans from the left or right sides because more parts are occluded in those scans.

4.4 3D Face Recognition on the Bosphorus Dataset: Recognition Under External Occlusion

  • In this section the authors will use Components I (occlusion detection and removal) and II (missing data restoration) in the algorithm.
  • In each iteration, the authors match the current face scan with the template using ICP and remove those points on the scan that are more than a certain threshold away from the corresponding points on the template.
  • The rank-1 recognition rate is reported in Fig. 20 for different approaches depending upon the type of occlusion.
  • The rank-1 recognition rate is 78.63% when the authors remove the occluded parts and apply the recognition algorithm using the remaining parts, as described in Section 2.4.
  • Even if the part added with restoration introduces some error, it still allows us to use the shapes of the partially observed curves.
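The removal loop can be sketched as repeated pruning of scan points that lie too far from the template; note that the paper re-runs rigid ICP alignment between passes, which is omitted here, so this sketch assumes the scan is already registered to the template, and the threshold is illustrative.

```python
import math

def remove_occlusions(scan, template, threshold, iters=3):
    """Recursively drop scan points farther than `threshold` from
    their nearest template point; stop when a pass removes nothing.
    (Rigid ICP re-alignment between passes is omitted.)"""
    for _ in range(iters):
        kept = []
        for p in scan:
            nearest = min(math.dist(p, q) for q in template)
            if nearest <= threshold:
                kept.append(p)
        if len(kept) == len(scan):
            break                      # converged: nothing removed
        scan = kept
    return scan

# A 5x5 facial patch plus two "glasses" points floating 5-6 cm above
# the surface; the floating points are pruned, the patch survives.
template = [(x * 0.01, y * 0.01, 0.0)
            for x in range(5) for y in range(5)]
scan = template + [(0.02, 0.02, 0.05), (0.03, 0.02, 0.06)]
cleaned = remove_occlusions(scan, template, threshold=0.01)
print(len(scan), len(cleaned))
```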

5 DISCUSSION

  • In order to study the performance of the proposed approach in the presence of different challenges, the authors have presented experimental results using three well-known 3D face databases.
  • The authors have obtained competitive results relative to the state of the art for 3D face recognition in the presence of large expressions, non-frontal views and occlusions.
  • Table 4 also reports the computational time of their approach and some state of the art methods on the FRGCv2 dataset.
  • For each approach, the authors report the time needed for preprocessing and/or feature extraction in the first column.
  • In the case of GavabDB and Bosphorus, the nose tip was manually annotated for non-frontal and occluded faces.

6 CONCLUSION

  • The authors have also presented results on 3D face recognition designed to handle variations in facial expression, pose and occlusion between gallery and probe scans.
  • This method has several properties that make it appropriate for 3D face recognition in non-cooperative scenarios.
  • Lastly, in the presence of occlusion, the authors have proposed to remove the occluded parts and then to recover only the missing data on the 3D scan using statistical shape models.
  • That is, the authors have constructed a low-dimensional shape subspace for each element of the indexed collection of curves, and then represented a curve (with missing data) as a linear combination of its basis elements.


HAL Id: halshs-00783066
https://halshs.archives-ouvertes.fr/halshs-00783066
Submitted on 31 Jan 2013
3D Face Recognition Under Expressions, Occlusions and Pose Variations
Hassen Drira, Boulbaba Ben Amor, Anuj Srivastava, Mohamed Daoudi, Rim Slama
To cite this version:
Hassen Drira, Boulbaba Ben Amor, Anuj Srivastava, Mohamed Daoudi, Rim Slama. 3D Face Recognition Under Expressions, Occlusions and Pose Variations. IEEE Transactions on Pattern Analysis and Machine Intelligence, Institute of Electrical and Electronics Engineers, 2013, pp. 2270-2283. halshs-00783066

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
3D Face Recognition Under Expressions, Occlusions and Pose Variations
Hassen Drira, Boulbaba Ben Amor, Anuj Srivastava, Mohamed Daoudi, and Rim Slama

Abstract—We propose a novel geometric framework for analyzing 3D faces, with the specific goals of comparing, matching, and averaging their shapes. Here we represent facial surfaces by radial curves emanating from the nose tips and use elastic shape analysis of these curves to develop a Riemannian framework for analyzing shapes of full facial surfaces. This representation, along with the elastic Riemannian metric, seems natural for measuring facial deformations and is robust to challenges such as large facial expressions (especially those with open mouths), large pose variations, missing parts, and partial occlusions due to glasses, hair, etc. This framework is shown to be promising from both empirical and theoretical perspectives. In terms of the empirical evaluation, our results match or improve the state-of-the-art methods on three prominent databases: FRGCv2, GavabDB, and Bosphorus, each posing a different type of challenge. From a theoretical perspective, this framework allows for formal statistical inferences, such as the estimation of missing facial parts using PCA on tangent spaces and computing average shapes.

Index Terms—3D face recognition, shape analysis, biometrics, quality control, data restoration.
1 INTRODUCTION

Due to the natural, non-intrusive, and high throughput nature of face data acquisition, automatic face recognition has many benefits when compared to other biometrics. Accordingly, automated face recognition has received growing attention within the computer vision community over the past three decades. Amongst different modalities available for face imaging, 3D scanning has a major advantage over 2D color imaging in that nuisance variables, such as illumination and small pose changes, have a relatively smaller influence on the observations. However, 3D scans often suffer from the problem of missing parts due to self occlusions or external occlusions, or some imperfections in the scanning technology. Additionally, variations in face scans due to changes in facial expressions can also degrade face recognition performance. In order to be useful in real-world applications, a 3D face recognition approach should be able to handle these challenges, i.e., it should recognize people despite large facial expressions, occlusions and large pose variations. Some examples of face scans highlighting these issues are illustrated in Fig. 1.
We note that most recent research on 3D face analysis has been directed towards tackling changes in facial expressions, while only a relatively modest effort has been spent on handling occlusions and missing parts. Although a few approaches and corresponding results dealing with missing parts have been presented, none, to our knowledge, has been applied systematically to a full real database containing scans with missing parts. In this paper, we present a comprehensive Riemannian framework for analyzing facial shapes, in the process dealing with large expressions, occlusions and missing parts. Additionally, we provide some basic tools for statistical shape analysis of facial surfaces. These tools help us to compute a typical or average shape and measure the intra-class variability of shapes, and will even lead to face atlases in the future.

This paper was presented in part at BMVC 2010 [7].
H. Drira, B. Ben Amor and M. Daoudi are with LIFL (UMR CNRS 8022), Institut Mines-Télécom/TELECOM Lille 1, France. E-mail: hassen.drira@telecom-lille1.eu
R. Slama is with LIFL (UMR CNRS 8022), University of Lille 1, France.
A. Srivastava is with the Department of Statistics, FSU, Tallahassee, FL, 32306, USA.

Fig. 1. Different challenges of 3D face recognition: expressions, missing data and occlusions.
1.1 Previous Work

The task of recognizing 3D face scans has been approached in many ways, leading to varying levels of success. We refer the reader to one of many extensive surveys on the topic, e.g. see Bowyer et al. [3]. Below we summarize a smaller subset that is more relevant to our paper.

1. Deformable template-based approaches: There have been several approaches in recent years that rely on deforming facial surfaces into one another, under some chosen criteria, and use quantifications of these deformations as metrics for face recognition. Among these, the ones using non-linear deformations facilitate the local stretching, compression, and bending of surfaces to match each other and are referred to as elastic methods. For instance, Kakadiaris et al. [13] utilize an annotated face model to study geometrical variability across faces. The annotated face model is deformed elastically to fit each face, thus matching different anatomical areas such as the nose, eyes and mouth. In [25], Passalis et al. use automatic landmarking to estimate the pose and to detect occluded areas. The facial symmetry is used to overcome the challenges of missing data here. Similar approaches, but using manually annotated models, are presented in [31], [17]. For example, [17] uses manual landmarks to develop a thin-plate-spline based matching of facial surfaces. A strong limitation of these approaches is that the extraction of fiducial landmarks needed during learning is either manual or semi-automated, except in [13] where it is fully automated.
2. Local regions/features approaches: Another common framework, especially for handling expression variability, is based on matching only parts or regions rather than matching full faces. Lee et al. [15] use ratios of distances and angles between eight fiducial points, followed by an SVM classifier. Similarly, Gupta et al. [11] use Euclidean/geodesic distances between anthropometric fiducial points, in conjunction with linear classifiers. As stated earlier, the problem of automated detection of fiducial points is non-trivial and hinders automation of these methods. Gordon [10] argues that curvature descriptors have the potential for higher accuracy in describing surface features and are better suited to describe the properties of faces in areas such as the cheeks, forehead, and chin. These descriptors are also invariant to viewing angles. Li et al. [16] design a feature pooling and ranking scheme in order to collect various types of low-level geometric features, such as curvatures, and rank them according to their sensitivity to facial expressions. Along similar lines, Wang et al. [32] use a signed shape-difference map between two aligned 3D faces as an intermediate representation for shape comparison. McKeon and Russ [19] use a region ensemble approach that is based on Fisherfaces, i.e., face representations are learned using Fisher's discriminant analysis.

In [12], Huang et al. use a multi-scale Local Binary Pattern (LBP) for a 3D face jointly with shape index. Similarly, Moorthy et al. [20] use Gabor features around automatically detected fiducial points.

To avoid passing over deformable parts of faces encompassing discriminative information, Faltemier et al. [9] use 38 face regions that densely cover the face, and fuse scores and decisions after performing ICP on each region. A similar idea is proposed in [29] that uses PCA-LDA for feature extraction, treating the likelihood ratio as a matching score and using majority voting for face identification. Queirolo et al. [26] use the Surface Inter-penetration Measure (SIM) as a similarity measure to match two face images. The authentication score is obtained by combining the SIM values corresponding to the matching of four different face regions: circular and elliptical areas around the nose, forehead, and the entire face region. In [1], the authors use Average Region Models (ARMs) locally to handle the challenges of missing data and expression-related deformations. They manually divide the facial area into several meaningful components and the registration of faces is carried out by separate dense alignments to the corresponding ARMs. A strong limitation of this approach is the need for manual segmentation of a face into parts that can then be analyzed separately.
3. Surface-distance based approaches: There are several papers that utilize distances between points on facial surfaces to define features that are eventually used in recognition. (Some papers call it geodesic distance but, in order to distinguish it from our later use of geodesics on shape spaces of curves and surfaces, we shall call it surface distance.) These papers assume that surface distances are relatively invariant to small changes in facial expressions and, therefore, help generate features that are robust to facial expressions. Bronstein et al. [4] provide a limited experimental illustration of this invariance by comparing changes in surface distances with the Euclidean distances between corresponding points on a canonical face surface. To handle the open mouth problem, they first detect and remove the lip region, and then compute the surface distance in the presence of a hole corresponding to the removed part [5]. The assumption of preservation of surface distances under facial expressions motivates several authors to define distance-based features for facial recognition. Samir et al. [28] use the level curves of the surface distance function (from the tip of the nose) as features for face recognition. Since an open mouth affects the shape of some level curves, this method is not able to handle the problem of missing data due to occlusion or pose variations. A similar polar parametrization of the facial surface is proposed in [24] where the authors study local geometric attributes under this parameterization. To deal with the open mouth problem, they modify the parametrization by disconnecting the top and bottom lips. The main limitation of this approach is the need for detecting the lips, as proposed in [5]. Berretti et al. [2] use surface distances to define facial stripes which, in turn, are used as nodes in a graph-based recognition algorithm.

The main limitation of these approaches, apart from the issues resulting from open mouths, is that they assume that surface distances between facial points are preserved within face classes. This is not valid in the case of large expressions. Facial expressions result from the stretching or shrinking of underlying muscles and, consequently, the facial skin is deformed in a non-isometric manner. In other words, facial surfaces are also stretched or compressed locally, beyond a simple bending of parts.

In order to demonstrate this assertion, we placed four markers on a face and tracked the changes in the surface and Euclidean (straight-line) distances between the markers under large expressions. Fig. 2 shows some facial expressions leading to a significant shrinking or stretching of the skin surface and, thus, causing both Euclidean and surface distances between these points to change. In one case these distances decrease (from 113 mm to 103 mm for the Euclidean distance, and from 115 mm to 106 mm for the surface distance) while in the other two cases they increase. This clearly shows that large expressions can cause stretching and shrinking of facial surfaces, i.e., the facial deformation is elastic in nature. Hence, the assumption of an isometric deformation of the shape of the face is not strictly valid, especially for large expressions. This also motivates the use of elastic shape analysis in 3D face recognition.
Fig. 2. Significant changes in both Euclidean and surface distances under large facial expressions. [Figure: neutral vs. expressive face pairs under stretching and shrinking, annotated with straight-line (Euclidean) and along-surface (geodesic) distances in mm.]
1.2 Overview of Our Approach

This paper presents a Riemannian framework for 3D facial shape analysis. This framework is based on elastically matching and comparing radial curves emanating from the tip of the nose, and it handles several of the problems described above. The main contributions of this paper are:
  • It extracts, analyzes, and compares the shapes of radial curves of facial surfaces.
  • It develops an elastic shape analysis of 3D faces by extending the elastic shape analysis of curves [30] to 3D facial surfaces.
  • To handle occlusions, it introduces an occlusion detection and removal step that is based on recursive-ICP.
  • To handle the missing data, it introduces a restoration step that uses statistical estimation on shape manifolds of curves. Specifically, it uses PCA on tangent spaces of the shape manifold to model the normal curves and uses that model to complete the partially-observed curves.

The different stages and components of our method are laid out in Fig. 3. While some basic steps are common to all application scenarios, there are also some specialized tools suitable only for specific situations. The basic steps that are common to all situations include 3D scan preprocessing (nose tip localization, filling holes, smoothing, face cropping), coarse and fine alignment, radial curve extraction, quality filtering, and elastic shape analysis of curves (Component III and the quality module in Component II). This basic setup is evaluated on the FRGCv2 dataset following the standard protocol (see Section 4.2). It is also tested on the GAVAB dataset where, for each subject, four probe images out of nine have large pose variations (see Section 4.3). Some steps are only useful where one anticipates some data occlusion and missing data. These steps include occlusion detection (Component I) and missing data restoration (Component II). In these situations, the full processing includes Components I+II+III to process the given probes. This approach has been evaluated on a subset of the Bosphorus dataset that involves occlusions (see Section 4.4). In the last two experiments, except for the manual detection of nose coordinates, the remaining processing is automatic.
2 RADIAL, ELASTIC CURVES: MOTIVATION

An important contribution of this paper is its novel use of radial facial curves studied using elastic shape analysis; this section motivates both of these choices.
2.1 Motivation for Radial Curves

Why should one use the radial curves emanating from the tip of the nose for representing facial shapes? Firstly, why curves and not other kinds of facial features? Recently, there has been significant progress in the analysis of shapes of curves and the resulting algorithms are very sophisticated and efficient [30], [33]. The changes in facial expressions affect different regions of a facial surface differently. For example, during a smile, the top half of the face is relatively unchanged while the lip area changes a lot, and when a person is surprised the effect is often the opposite. If chosen appropriately, curves have the potential to capture regional shapes and that is why their role becomes important. The locality of shapes represented by facial curves is an important reason for their selection.

Fig. 3. Overview of the proposed method. [Flowchart: 3D scan preprocessing, coarse and fine registration, occlusion detection and removal (Component I), curve quality filter and completion (Component II), and elastic matching of radial curves/surfaces (Component III), with examples of inter-class and intra-class geodesics.]

Fig. 4. A smile (middle) changes the shapes of the curves in the lower part of the face, while the act of surprise changes the shapes of curves in the upper part of the face (right).

The next question is: Which facial curves are suitable for recognizing people? Curves on a surface can, in general, be defined either as the level curves of a function or as the streamlines of a gradient field. Ideally, one would like curves that maximally separate inter-class variability from intra-class variability (typically due to expression changes). The past usage of the level curves (of the surface distance function) has the limitation that each curve goes through different facial regions, which makes it difficult to isolate local variability. Indeed, the previous work on shape analysis of facial curves for 3D face recognition was mostly based on level curves [27], [28].
In contrast, the radial curves with the nose tip as
origin have a tremendous potential. This is because:
(i) the nose is in many ways the focal point of a face.
It is relatively easy and efficient to detect the nose
tip (compared to other facial parts) and to extract
radial curves, with nose tip as the center, in a com-
pletely automated fashion. It is much more difficult
to automatically extract other types of curves, e.g.
those used by sketch artists ( cheek contours, fore-
head profiles, eye boundaries, etc). ( ii) Different radial
curves pass through different regions and, hence, can
be associated with different facial expressions. For
instance, differences in the shapes of radial curves in
the upper-half of the face can be loosely attributed
to the inter-class variability while those for curves
passing through the lips and cheeks can largely be
due to changes in expressions. This is illustrated in
Fig. 4 which shows a neutral face (left), a smiling
face (middle), and a surprised face (right). The main
difference in the middle face, relative to the left face,
lies in the lower part of the face, while for the right
face the main differences lie in the top half. (iii) Radial
curves have more universal applicability. The curves
used in the past have worked well for some specific
tasks, e.g., lip contours in detecting certain expres-
sions, but they have not been as efficient for some
other tasks, such as face recognition. In contrast, radial
curves capture the full geometry and are applicable to
a variety of applications, including facial expression
recognition. (iv) In the case of missing parts and
partial occlusions, at least some part of every radial
curve is usually available. It is rare to miss a full
radial curve. In contrast, it is more common to miss
an eye due to occlusion by glasses, the forehead due
to hair, or parts of cheeks due to a bad angle for
laser reflection. This issue is important in handling the
missing data via reconstruction, as shall be described
later in this paper. (v) Natural face deformations

Citations
Journal ArticleDOI
TL;DR: A comprehensive review of recent advances in dataset creation, algorithm development, and investigations of the effects of occlusion critical for robust performance in FEA systems is presented in this paper.
Abstract: Automatic machine-based Facial Expression Analysis (FEA) has made substantial progress in the past few decades driven by its importance for applications in psychology, security, health, entertainment, and human–computer interaction. The vast majority of completed FEA studies are based on nonoccluded faces collected in a controlled laboratory environment. Automatic expression recognition tolerant to partial occlusion remains less understood, particularly in real-world scenarios. In recent years, efforts investigating techniques to handle partial occlusion for FEA have seen an increase. The context is right for a comprehensive perspective of these developments and the state of the art from this perspective. This survey provides such a comprehensive review of recent advances in dataset creation, algorithm development, and investigations of the effects of occlusion critical for robust performance in FEA systems. It outlines existing challenges in overcoming partial occlusion and discusses possible opportunities in advancing the technology. To the best of our knowledge, it is the first FEA survey dedicated to occlusion and aimed at promoting better-informed and benchmarked future work.

416 citations

Journal ArticleDOI
TL;DR: This paper proposes a new framework to extract a compact representation of a human action captured through a depth sensor, and enable accurate action recognition, and results with state-of-the-art methods are reported.
Abstract: Recognizing human actions in 3-D video sequences is an important open problem that is currently at the heart of many research domains including surveillance, natural interfaces and rehabilitation. However, the design and development of models for action recognition that are both accurate and efficient is a challenging task due to the variability of the human pose, clothing and appearance. In this paper, we propose a new framework to extract a compact representation of a human action captured through a depth sensor, and enable accurate action recognition. The proposed solution develops on fitting a human skeleton model to acquired data so as to represent the 3-D coordinates of the joints and their change over time as a trajectory in a suitable action space. Thanks to such a 3-D joint-based framework, the proposed solution is capable to capture both the shape and the dynamics of the human body, simultaneously. The action recognition problem is then formulated as the problem of computing the similarity between the shape of trajectories in a Riemannian manifold. Classification using k-nearest neighbors is finally performed on this manifold taking advantage of Riemannian geometry in the open curve shape space. Experiments are carried out on four representative benchmarks to demonstrate the potential of the proposed solution in terms of accuracy/latency for a low-latency action recognition. Comparative results with state-of-the-art methods are reported.

329 citations

Journal ArticleDOI
28 Jul 2014
TL;DR: This paper presents the first publicly available face database based on the Kinect sensor, and conducts benchmark evaluations on the proposed database using standard face recognition methods, and demonstrates the gain in performance when integrating the depth data with the RGB data via score-level fusion.
Abstract: The recent success of emerging RGB-D cameras such as the Kinect sensor depicts a broad prospect of 3-D data-based computer applications. However, due to the lack of a standard testing database, it is difficult to evaluate how the face recognition technology can benefit from this up-to-date imaging sensor. In order to establish the connection between the Kinect and face recognition research, in this paper, we present the first publicly available face database (i.e., KinectFaceDB 1 ) based on the Kinect sensor. The database consists of different data modalities (well-aligned and processed 2-D, 2.5-D, 3-D, and video-based face data) and multiple facial variations. We conducted benchmark evaluations on the proposed database using standard face recognition methods, and demonstrated the gain in performance when integrating the depth data with the RGB data via score-level fusion. We also compared the 3-D images of Kinect (from the KinectFaceDB) with the traditional high-quality 3-D scans (from the FRGC database) in the context of face biometrics, which reveals the imperative needs of the proposed database for face recognition research. 1 Online at http://rgb-d.eurecom.fr

257 citations


Cites background from "3D Face Recognition under Expressio..."

  • ..., [35] and [59]) are less affected by the illumination changes than 2-D methods; however, facial expression is still a major challenge in 3-D face recognition [60], [61]....

    [...]


Book ChapterDOI
08 Oct 2016
TL;DR: The proposed method iteratively and alternately applies two sets of cascaded regressors, one for updating 2D landmarks and the other for updating reconstructed pose-expression-normalized (PEN) 3D face shape, to simultaneously solve the two problems of face alignment and3D face reconstruction from an input 2D face image of arbitrary poses and expressions.
Abstract: We present an approach to simultaneously solve the two problems of face alignment and 3D face reconstruction from an input 2D face image of arbitrary poses and expressions. The proposed method iteratively and alternately applies two sets of cascaded regressors, one for updating 2D landmarks and the other for updating reconstructed pose-expression-normalized (PEN) 3D face shape. The 3D face shape and the landmarks are correlated via a 3D-to-2D mapping matrix. In each iteration, adjustment to the landmarks is firstly estimated via a landmark regressor, and this landmark adjustment is also used to estimate 3D face shape adjustment via a shape regressor. The 3D-to-2D mapping is then computed based on the adjusted 3D face shape and 2D landmarks, and it further refines the 2D landmarks. An effective algorithm is devised to learn these regressors based on a training dataset of pairing annotated 3D face shapes and 2D face images. Compared with existing methods, the proposed method can fully automatically generate PEN 3D face shapes in real time from a single 2D face image and locate both visible and invisible 2D landmarks. Extensive experiments show that the proposed method can achieve the state-of-the-art accuracy in both face alignment and 3D face reconstruction, and benefit face recognition owing to its reconstructed PEN 3D face shapes.

156 citations


Cites methods from "3D Face Recognition under Expressio..."

  • ...Moreover, existing methods always generate 3D faces that have the same pose and expression as the input image, which may not be desired in face recognition due to the challenge of matching 3D faces with expressions [12]....

    [...]

References
Journal ArticleDOI
TL;DR: This survey focuses on recognition performed by matching models of the three-dimensional shape of the face, either alone or in combination with matching corresponding two-dimensional intensity images.

1,069 citations

Journal ArticleDOI
TL;DR: This paper introduces a square-root velocity (SRV) representation for analyzing shapes of curves in euclidean spaces under an elastic metric and demonstrates a wrapped probability distribution for capturing shapes of planar closed curves.
Abstract: This paper introduces a square-root velocity (SRV) representation for analyzing shapes of curves in euclidean spaces under an elastic metric. In this SRV representation, the elastic metric simplifies to the IL2 metric, the reparameterization group acts by isometries, and the space of unit length curves becomes the unit sphere. The shape space of closed curves is the quotient space of (a submanifold of) the unit sphere, modulo rotation, and reparameterization groups, and we find geodesics in that space using a path straightening approach. These geodesics and geodesic distances provide a framework for optimally matching, deforming, and comparing shapes. These ideas are demonstrated using: 1) shape analysis of cylindrical helices for studying protein structure, 2) shape analysis of facial curves for recognizing faces, 3) a wrapped probability distribution for capturing shapes of planar closed curves, and 4) parallel transport of deformations for predicting shapes from novel poses.

636 citations


"3D Face Recognition under Expressio..." refers background in this paper

  • ...Let q2*(t) = sqrt(γ*'(t)) O* q2(γ*(t)) be the optimal element of [q2] associated with the optimal rotation O* and reparameterization γ* of the second curve; then the geodesic distance between [q1] and [q2] in S is ds([q1], [q2]) := dc(q1, q2*) and the geodesic is given by (1), with q2 replaced by q2*....

    [...]

  • ...For this reason, we need the quality filter that will isolate and remove curves associated with those parts....

    [...]

  • ...An important contribution of this paper is its novel use of radial facial curves studied using elastic shape analysis....

    [...]

  • ...This also motivates the use of elastic shape analysis in 3D face recognition....

    [...]

  • ...Furthermore, the geodesic path between any two points q1, q2 ∈ C is given by the great circle, ψ : [0, 1] →...

    [...]
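The SRV construction summarized in the abstract above can be sketched numerically for discretized curves. The details below (uniform sampling on [0, 1], a mean-based approximation of the L2 inner product, and the two test curves) are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def l2_inner(q1, q2):
    """Approximate L2 inner product of curves sampled uniformly on [0, 1]."""
    return float(np.mean(np.sum(q1 * q2, axis=1)))

def srv(beta):
    """Square-root velocity representation q(t) = beta'(t) / sqrt(|beta'(t)|),
    rescaled to unit L2 norm so that unit-length curves lie on the unit sphere."""
    n = len(beta)
    vel = np.gradient(beta, 1.0 / (n - 1), axis=0)      # finite-difference beta'
    speed = np.linalg.norm(vel, axis=1)
    q = vel / np.sqrt(np.maximum(speed, 1e-12))[:, None]
    return q / np.sqrt(l2_inner(q, q))

def sphere_distance(q1, q2):
    """Geodesic (great-circle) distance between two SRVs on the unit sphere."""
    return float(np.arccos(np.clip(l2_inner(q1, q2), -1.0, 1.0)))

t = np.linspace(0.0, 1.0, 200)[:, None]
line = np.hstack([t, np.zeros_like(t)])                  # straight segment
arc = np.hstack([np.cos(np.pi * t), np.sin(np.pi * t)])  # half circle

d_same = sphere_distance(srv(line), srv(line))   # identical shapes -> 0
d_diff = sphere_distance(srv(line), srv(arc))    # different shapes -> > 0
```

A full elastic comparison would additionally optimize over rotations and re-parameterizations of one of the curves before evaluating this distance.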

Journal ArticleDOI
TL;DR: The result is an efficient and accurate face recognition algorithm, robust to facial expressions, that can distinguish between identical twins and compare its performance to classical face recognition methods.
Abstract: An expression-invariant 3D face recognition approach is presented. Our basic assumption is that facial expressions can be modelled as isometries of the facial surface. This allows to construct expression-invariant representations of faces using the bending-invariant canonical forms approach. The result is an efficient and accurate face recognition algorithm, robust to facial expressions, that can distinguish between identical twins (the first two authors). We demonstrate a prototype system based on the proposed algorithm and compare its performance to classical face recognition methods. The numerical methods employed by our approach do not require the facial surface explicitly. The surface gradients field, or the surface metric, are sufficient for constructing the expression-invariant representation of any given face. It allows us to perform the 3D face recognition task while avoiding the surface reconstruction stage.

569 citations


"3D Face Recognition under Expressio..." refers background in this paper

  • ...[4] provide a limited experimental illustration of this invariance by comparing changes in surface distances with the euclidean distances between corresponding points on a canonical face surface....

    [...]

Journal ArticleDOI
TL;DR: This paper presents the computational tools and a hardware prototype for 3D face recognition and presents the results on the largest known, and now publicly available, face recognition grand challenge 3D facial database consisting of several thousand scans.
Abstract: In this paper, we present the computational tools and a hardware prototype for 3D face recognition. Full automation is provided through the use of advanced multistage alignment algorithms, resilience to facial expressions by employing a deformable model framework, and invariance to 3D capture devices through suitable preprocessing steps. In addition, scalability in both time and space is achieved by converting 3D facial scans into compact metadata. We present our results on the largest known, and now publicly available, face recognition grand challenge 3D facial database consisting of several thousand scans. To the best of our knowledge, this is the highest performance reported on the FRGC v2 database for the 3D modality

496 citations


"3D Face Recognition under Expressio..." refers background in this paper

  • ...Additionally, variations in face scans due to changes in facial expressions can also degrade face recognition performance....

    [...]

  • ...R. Slama is with Laboratoire d’Informatique Fondamentale de Lille (LIFL), (UMR CNRS 8022), University of Lille 1, Télécom Lille 1, Cité Scientifique, Rue G. Marconi, BP 20145, Villeneuve d’Ascq 59653, France....

    [...]

Frequently Asked Questions (11)
Q1. What are the advantages of using parallel techniques?

Regarding computational efficiency, parallel techniques can also be exploited to improve the performance of their approach, since the computation of curve distances, preprocessing, etc., are independent tasks. 
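Since the per-curve computations are independent, they map directly onto a worker pool. The sketch below uses a thread pool with a plain L2 placeholder distance; the paper's elastic geodesic distance would be dropped in instead, and the curve counts and sizes are invented for illustration:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def curve_distance(pair):
    """Placeholder root-mean-square distance between two discretized curves."""
    c1, c2 = pair
    return float(np.sqrt(np.mean(np.sum((c1 - c2) ** 2, axis=1))))

rng = np.random.default_rng(0)
probe = [rng.standard_normal((50, 3)) for _ in range(40)]    # curves of a probe face
gallery = [rng.standard_normal((50, 3)) for _ in range(40)]  # curves of a gallery face

pairs = list(zip(probe, gallery))
with ThreadPoolExecutor(max_workers=4) as pool:    # independent per-curve tasks
    dists = list(pool.map(curve_distance, pairs))  # result order is preserved

score = sum(dists) / len(dists)                    # fused face-to-face score
```

For CPU-bound distances, a process pool (or a vectorized batch) follows the same pattern, since no task depends on another.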

To avoid passing over deformable parts of faces encompassing discriminative information, Faltemier et al. [9] use 38 face regions that densely cover the face, and fuse scores and decisions after performing ICP on each region. 
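The region-wise matching mentioned above relies on ICP; a bare point-to-point ICP can be sketched as below. This is a generic textbook version (SVD-based rigid fit, brute-force nearest neighbours, synthetic data), not Faltemier et al.'s implementation:

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping paired src onto dst
    (Kabsch/SVD solution, sign-corrected to avoid reflections)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    U, _, Vt = np.linalg.svd((src - cs).T @ (dst - cd))
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, cd - R @ cs

def icp(src, dst, iters=20):
    """Point-to-point ICP: alternate nearest-neighbour pairing and rigid fit.
    Returns the aligned copy of src and the final RMS residual (the score)."""
    cur = src.copy()
    for _ in range(iters):
        # brute-force nearest neighbours (a k-d tree would be used in practice)
        nn = dst[np.argmin(((cur[:, None, :] - dst[None, :, :]) ** 2).sum(-1), axis=1)]
        R, t = best_rigid_transform(cur, nn)
        cur = cur @ R.T + t
    return cur, float(np.sqrt(np.mean(np.sum((cur - nn) ** 2, axis=1))))

# synthetic region: the same cloud under a small rigid motion
rng = np.random.default_rng(1)
dst = rng.standard_normal((200, 3))
a = np.deg2rad(5.0)
Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a),  np.cos(a), 0.0],
               [0.0,        0.0,       1.0]])
src = dst @ Rz.T + np.array([0.05, -0.02, 0.03])
aligned, rms = icp(src, dst)
```

In a region-based scheme, one such residual per region would be computed and the per-region scores fused (e.g., by sum or voting) into a final decision.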

Since the raw data contain a number of imperfections, such as holes and spikes, and include some undesired parts, such as clothes, neck, ears, and hair, the data preprocessing step is very important and nontrivial. 
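As a toy illustration of one such cleaning step, the sketch below flags spike pixels in a synthetic range image by comparing each depth value to its 3x3 neighbourhood median and replaces them with that median. The threshold and test data are invented; this is not the paper's preprocessing pipeline:

```python
import numpy as np

def remove_spikes(depth, tau=5.0):
    """Replace pixels that deviate from their 3x3 neighbourhood median by more
    than `tau` with that median; returns the cleaned image and the spike mask."""
    p = np.pad(depth, 1, mode='edge')
    h, w = depth.shape
    # stack the nine shifted copies of the padded image to get each 3x3 window
    stack = np.stack([p[i:i + h, j:j + w] for i in range(3) for j in range(3)])
    med = np.median(stack, axis=0)
    spikes = np.abs(depth - med) > tau
    return np.where(spikes, med, depth), spikes

# smooth synthetic range image with two injected spikes
z = np.fromfunction(lambda i, j: 0.05 * (i + j), (64, 64))
z_noisy = z.copy()
z_noisy[10, 10] += 40.0
z_noisy[30, 45] -= 25.0
clean, mask = remove_spikes(z_noisy)
```

Hole filling and trimming of non-face regions (neck, hair, clothes) would follow similar local, data-driven rules before curve extraction.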

A strong limitation of this approach is the need for manual segmentation of a face into parts that can then be analyzed separately. 

For instance, differences in the shapes of radial curves in the upper-half of the face can be loosely attributed to the inter-class variability while those for curves passing through the lips and cheeks can largely be due to changes in expressions. 

The total number of face scans is 4652; at least 54 scans are available for most of the subjects, while only 31 scans each are available for 34 of them. 

In the difficult scenario of neutral vs. expressions, the rank-1 recognition rate is 96.8%, which represents a high performance, while in the simpler case of neutral vs. neutral the rate is 99.2%. 

Another common framework, especially for handling expression variability, is based on matching only parts or regions rather than matching full faces. 

The main limitation of these approaches, apart from the issues resulting from open mouths, is that they assume that surface distances between facial points are preserved within face classes. 

In order to study the shapes of curves, one identifies all rotations and re-parameterizations of a curve as a single equivalence class. 
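The rotation part of this equivalence has a closed-form solution: the rotation best aligning two discretized representatives comes from an SVD of their cross-covariance (orthogonal Procrustes). The sketch below is a generic illustration on synthetic data; the paper additionally optimizes over re-parameterizations, which is not shown here:

```python
import numpy as np

def optimal_rotation(q1, q2):
    """Rotation O* minimising the L2 distance between q1 and the rotated q2,
    via SVD of the cross-covariance A = q1^T q2 (sign-corrected so det = +1)."""
    U, _, Vt = np.linalg.svd(q1.T @ q2)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    return U @ D @ Vt

# same discretized curve, one copy rotated by 30 degrees about z
rng = np.random.default_rng(2)
q1 = rng.standard_normal((100, 3))
a = np.deg2rad(30.0)
R = np.array([[np.cos(a), -np.sin(a), 0.0],
              [np.sin(a),  np.cos(a), 0.0],
              [0.0,        0.0,       1.0]])
q2 = q1 @ R.T
O = optimal_rotation(q1, q2)
residual = float(np.linalg.norm(q1 - q2 @ O.T))   # ~0: same equivalence class
```

Because the two representatives differ only by a rotation, the residual after alignment is numerically zero, i.e., they are the same point in the quotient shape space.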

The faces in the two bottom rows are examples of faces incorrectly recognized by their algorithm without restoration (as described earlier); after the restoration step, they are correctly recognized.