
3D Face Recognition under Expressions, Occlusions, and Pose Variations

01 Sep 2013-IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE Computer Society)-Vol. 35, Iss: 9, pp 2270-2283
TL;DR: A novel geometric framework for analyzing 3D faces, with the specific goals of comparing, matching, and averaging their shapes, which allows for formal statistical inferences, such as the estimation of missing facial parts using PCA on tangent spaces and computing average shapes.
Abstract: We propose a novel geometric framework for analyzing 3D faces, with the specific goals of comparing, matching, and averaging their shapes. Here we represent facial surfaces by radial curves emanating from the nose tips and use elastic shape analysis of these curves to develop a Riemannian framework for analyzing shapes of full facial surfaces. This representation, along with the elastic Riemannian metric, seems natural for measuring facial deformations and is robust to challenges such as large facial expressions (especially those with open mouths), large pose variations, missing parts, and partial occlusions due to glasses, hair, and so on. This framework is shown to be promising from both-empirical and theoretical-perspectives. In terms of the empirical evaluation, our results match or improve upon the state-of-the-art methods on three prominent databases: FRGCv2, GavabDB, and Bosphorus, each posing a different type of challenge. From a theoretical perspective, this framework allows for formal statistical inferences, such as the estimation of missing facial parts using PCA on tangent spaces and computing average shapes.

Summary (4 min read)

1 INTRODUCTION

  • Due to the natural, non-intrusive, and high throughput nature of face data acquisition, automatic face recognition has many benefits when compared to other biometrics.
  • Amongst different modalities available for face imaging, 3D scanning has a major advantage over 2D color imaging in that nuisance variables, such as illumination and small pose changes, have a relatively smaller influence on the observations.
  • 3D scans often suffer from the problem of missing parts due to self occlusions or external occlusions, or some imperfections in the scanning technology.
  • Additionally, the authors provide some basic tools for statistical shape analysis of facial surfaces.

1.1 Previous Work

  • The task of recognizing 3D face scans has been approached in many ways, leading to varying levels of success.
  • Similar approaches, but using manually annotated models, are presented in [31], [17].
  • To handle the open mouth problem, they first detect and remove the lip region, and then compute the surface distance in presence of a hole corresponding to the removed part [5].
  • Samir et al. [28] use the level curves of the surface distance function (from the tip of the nose) as features for face recognition.
  • Fig. 2 shows some facial expressions leading to a significant shrinking or stretching of the skin surface and, thus, causing both Euclidean and surface distances between these points to change.
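The contrast between Euclidean and surface distance can be made concrete with a small sketch. The toy example below (not from the paper) approximates surface distance by the shortest path along mesh edges via Dijkstra, and shows it exceeding the straight-line distance on a "bumped" strip; the mesh and values are illustrative assumptions.

```python
import heapq
import math

def euclidean(p, q):
    return math.dist(p, q)

def surface_distance(points, edges, src, dst):
    """Approximate surface (geodesic) distance by the shortest path
    along mesh edges (Dijkstra). `edges` is a list of index pairs."""
    adj = {i: [] for i in range(len(points))}
    for i, j in edges:
        w = euclidean(points[i], points[j])
        adj[i].append((j, w))
        adj[j].append((i, w))
    dist = {src: 0.0}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            return d
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return float("inf")

# Toy "bumped" strip of 4 vertices: the path over the bump is longer
# than the straight line between the endpoints.
pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5), (2.0, 0.0, 0.5), (3.0, 0.0, 0.0)]
eds = [(0, 1), (1, 2), (2, 3)]
d_surf = surface_distance(pts, eds, 0, 3)
d_eucl = euclidean(pts[0], pts[3])
print(d_surf > d_eucl)  # surface distance exceeds Euclidean
```

Under expressions, both quantities change, which is exactly the point made in Fig. 2: neither distance is invariant when the skin stretches.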

1.2 Overview of Our Approach

  • This paper presents a Riemannian framework for 3D facial shape analysis.
  • This framework is based on elastically matching and comparing radial curves emanating from the tip of the nose and it handles several of the problems described above.
  • To handle the missing data, it introduces a restoration step that uses statistical estimation on shape manifolds of curves.
  • This basic setup is evaluated on the FRGCv2 dataset following the standard protocol (see Section 4.2).
  • These steps include occlusion detection (Component I) and missing data restoration (Component II).

2.1 Motivation for Radial Curves

  • The changes in facial expressions affect different regions of a facial surface differently.
  • In the case of the missing parts and partial occlusion, at least some part of every radial curve is usually available.
  • Based on these arguments, the authors choose a novel geometrical representation of facial surfaces using radial curves that start from the nose tip.

2.2 Motivation for Elasticity

  • Consider the two parameterized curves shown in Fig. 5; call them β1 and β2.
  • The expression on the left has the mouth open whereas the expression on the right has the mouth closed.
  • In order to compare their shapes, the authors need to register points across those curves.
  • For curves, the problem of optimal registration is actually the same as that of optimal re-parameterization.
  • This optimization leads to a proper distance and an optimal deformation between the shapes of curves.
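For sampled curves, the search over re-parameterizations is carried out by dynamic programming. The paper's elastic matching operates on SRVFs; the sketch below uses a plain DTW-style monotone correspondence between raw sample points as an illustrative stand-in, not the authors' algorithm.

```python
import math

def dtw_register(c1, c2):
    """Discrete stand-in for optimal re-parameterization: dynamic
    programming finds a monotone correspondence between samples of
    two curves minimizing the summed point distances (DTW)."""
    n, m = len(c1), len(c2)
    INF = float("inf")
    D = [[INF] * m for _ in range(n)]
    D[0][0] = math.dist(c1[0], c2[0])
    for i in range(n):
        for j in range(m):
            if i == j == 0:
                continue
            best = min(
                D[i - 1][j] if i else INF,
                D[i][j - 1] if j else INF,
                D[i - 1][j - 1] if i and j else INF,
            )
            D[i][j] = best + math.dist(c1[i], c2[j])
    # Backtrack to recover the correspondence (the "registration").
    path, i, j = [(n - 1, m - 1)], n - 1, m - 1
    while (i, j) != (0, 0):
        cands = []
        if i:
            cands.append((D[i - 1][j], i - 1, j))
        if j:
            cands.append((D[i][j - 1], i, j - 1))
        if i and j:
            cands.append((D[i - 1][j - 1], i - 1, j - 1))
        _, i, j = min(cands)
        path.append((i, j))
    return D[n - 1][m - 1], path[::-1]

# Registering a curve with itself has zero cost and a diagonal path.
curve = [(math.cos(t / 10), math.sin(t / 10), 0.0) for t in range(11)]
cost, path = dtw_register(curve, curve)
print(cost == 0.0, path[0], path[-1])
```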

2.3 Automated Extraction of Radial Curves

  • Each facial surface is represented by an indexed collection of radial curves that are defined and extracted as follows.
  • Each radial curve is obtained by slicing the facial surface with a plane Pα that has the nose tip as its origin and makes an angle α with the plane containing the reference curve.
  • Using these curves, the authors will demonstrate that the elastic framework is well suited to modeling of deformations associated with changes in facial expressions and for handling missing data.
  • The gallery face in this example belongs to the same person under the same expression.
  • Since the curve extraction on the probe face is based on the gallery nose coordinates, which belong to another person, the curves may be shifted in this nose region.
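A minimal point-cloud sketch of this extraction step follows; the plane geometry, tolerance, and axis conventions are illustrative assumptions (the paper slices the mesh surface itself with planes Pα).

```python
import math

def extract_radial_curve(points, nose_tip, alpha, tol=0.05):
    """Keep the scan points lying within `tol` of the plane through
    the nose tip at angle `alpha` (rotated about the depth axis from
    a vertical reference plane), ordered by distance from the tip.
    Simplified stand-in for slicing the surface with plane P_alpha."""
    nx, ny = -math.sin(alpha), math.cos(alpha)   # plane normal in x-y
    curve = []
    for p in points:
        dx, dy, dz = (p[0] - nose_tip[0], p[1] - nose_tip[1],
                      p[2] - nose_tip[2])
        if abs(dx * nx + dy * ny) < tol:         # near the slicing plane
            r = math.sqrt(dx * dx + dy * dy + dz * dz)
            curve.append((r, p))
    curve.sort(key=lambda t: t[0])
    return [p for _, p in curve]

# Points along the y-axis form one radial curve; the off-plane point
# is excluded.
pts = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.1), (0.0, 2.0, 0.0), (1.0, 1.0, 0.0)]
c = extract_radial_curve(pts, (0.0, 0.0, 0.0), math.pi / 2)
print(len(c))
```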

2.4 Curve Quality Filter

  • In situations involving non-frontal 3D scans, some curves may be partially hidden due to self occlusion.
  • The use of these curves in face recognition can severely degrade the recognition performance and, therefore, they should be identified and discarded.
  • The authors introduce a quality filter that uses the continuity and the length of a curve to detect such curves.
  • The discontinuity or the shortness of a curve results either from missing data or large noise.
  • Recall that during the pre-processing step, there is a provision for filling holes.
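The filter itself reduces to two checks per curve: no large gap between consecutive samples, and a sufficient total length. A sketch with illustrative thresholds (the paper's actual values are not reproduced here):

```python
import math

def passes_quality_filter(curve, max_gap=0.02, min_length=0.08):
    """Reject a radial curve if it has a large gap between
    consecutive samples (discontinuity, e.g. from missing data) or
    is too short overall. Thresholds are illustrative."""
    if len(curve) < 2:
        return False
    gaps = [math.dist(curve[i], curve[i + 1])
            for i in range(len(curve) - 1)]
    return max(gaps) <= max_gap and sum(gaps) >= min_length

good = [(0.01 * i, 0.0, 0.0) for i in range(11)]   # continuous, long
broken = good[:3] + good[8:]                       # 0.05 gap inside
short = good[:4]                                   # too short
print(passes_quality_filter(good),
      passes_quality_filter(broken),
      passes_quality_filter(short))
```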

3.1 Background on the Shapes of Curves

  • More precisely, as shown in [30], an elastic metric for comparing shapes of curves becomes the simple L2-metric under the SRVF representation.
  • (A similar metric and representation for curves was also developed by Younes et al. [33] but it only applies to planar curves and not to facial curves).
  • Furthermore, under L2-metric, the re-parametrization group acts by isometries on the manifold of q functions, which is not the case for the original curve β.
  • By iterating between these two, the authors can reach a solution for the joint optimization problem.
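The SRVF of [30] can be computed directly from a sampled curve as q(t) = β′(t)/√‖β′(t)‖; under this representation the elastic metric reduces to the plain L2 metric. A finite-difference sketch (the full framework additionally optimizes over rotation and re-parameterization before taking this distance):

```python
import math

def srvf(curve):
    """Square-Root Velocity Function of a sampled curve:
    q(t) = beta'(t) / sqrt(|beta'(t)|), via finite differences."""
    q = []
    for i in range(len(curve) - 1):
        v = [b - a for a, b in zip(curve[i], curve[i + 1])]
        speed = math.sqrt(sum(x * x for x in v))
        q.append([x / math.sqrt(speed) for x in v] if speed > 1e-12
                 else [0.0] * len(v))
    return q

def l2_dist(q1, q2):
    """L2 distance between two SRVFs (the elastic metric of [30]
    under this representation)."""
    return math.sqrt(sum((a - b) ** 2
                         for u, v in zip(q1, q2) for a, b in zip(u, v)))

# Identical curves are at distance zero; a stretched copy is not,
# reflecting that the metric penalizes stretching.
c1 = [(t / 10, 0.0, 0.0) for t in range(11)]
c2 = [(t / 5, 0.0, 0.0) for t in range(11)]
print(l2_dist(srvf(c1), srvf(c1)), l2_dist(srvf(c1), srvf(c2)) > 0)
```

Because the SRVF is built from the derivative only, it is automatically invariant to translations of the curve.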

3.2 Shape Metric for Facial Surfaces

  • Now the authors extend the framework from radial curves to full facial surfaces.
  • The indexing provides a correspondence between curves across faces.
  • Since the authors have deformations (geodesic paths) between corresponding curves, they can combine these deformations to obtain deformations between full facial surfaces.
  • Algorithm 1 is used to calculate the geodesic path in the shape space.
  • The upper lips match the upper lips, for instance, and this helps produce a natural opening of the mouth as illustrated in the top row in Fig. 10.
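Combining per-curve distances into a surface metric can be sketched as a sum over the indexed collection; the dictionary indexing and the example distance function below are illustrative assumptions, and curves discarded by the quality filter on either face are simply skipped.

```python
import math

def face_distance(face1, face2, curve_dist):
    """A face is an indexed collection {alpha: curve}; the distance
    between two faces accumulates pairwise distances between
    corresponding curves that survived filtering on both faces."""
    total = 0.0
    for alpha in face1:
        if alpha in face2:          # both curves survived filtering
            total += curve_dist(face1[alpha], face2[alpha])
    return total

flat = {a: [(r, 0.0) for r in range(5)] for a in range(0, 360, 45)}
bent = {a: [(r, 0.1 * r) for r in range(5)] for a in range(0, 360, 45)}
d = lambda u, v: math.sqrt(sum(math.dist(p, q) ** 2
                               for p, q in zip(u, v)))
print(face_distance(flat, flat, d) == 0.0,
      face_distance(flat, bent, d) > 0.0)
```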

3.3 Computation of the Mean Shape

  • One can use the notion of Karcher mean [14] to define an average face that can serve as a representative face of a group of faces.
  • The Karcher mean is then defined by S̄ = argmin_{S ∈ Sⁿ} V(S).
  • The algorithm for computing Karcher mean is a standard one, see e.g. [8], and is not repeated here to save space.
  • This minimizer may not be unique and, in practice, one can pick any one of those solutions as the mean face.
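Since suitably normalized SRVFs lie on a unit hypersphere, the Karcher mean can be illustrated with the sphere's exponential and log maps; the gradient-descent loop below is a generic sketch on the unit sphere, not the exact algorithm of [8].

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def log_map(p, x):
    """Tangent vector at unit vector p pointing toward x."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(p, x))))
    theta = math.acos(dot)
    if theta < 1e-12:
        return [0.0] * len(p)
    return [theta / math.sin(theta) * (xi - dot * pi_)
            for xi, pi_ in zip(x, p)]

def exp_map(p, v):
    """Shoot from p along tangent vector v, staying on the sphere."""
    n = math.sqrt(sum(x * x for x in v))
    if n < 1e-12:
        return list(p)
    return [math.cos(n) * pi_ + math.sin(n) / n * vi
            for pi_, vi in zip(p, v)]

def karcher_mean(points, iters=50, step=0.5):
    """Iterate: average the log maps at the current estimate, then
    shoot along that average until the update vanishes."""
    mu = normalize(points[0])
    for _ in range(iters):
        logs = [log_map(mu, normalize(x)) for x in points]
        avg = [sum(col) / len(points) for col in zip(*logs)]
        mu = exp_map(mu, [step * a for a in avg])
    return mu

# Two unit vectors tilted symmetrically average back to the bisector.
pts = [(1.0, 0.1, 0.0), (1.0, -0.1, 0.0)]
mu = karcher_mean(pts)
print(abs(mu[1]) < 1e-9 and abs(mu[0] - 1.0) < 1e-9)
```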

3.4 Completion of Partially-Obscured Curves

  • Earlier the authors have introduced a filtering step that finds and removes curves with missing parts.
  • Once the authors detect points that belong to the face and points that belong to the occluding object, they first remove the occluding object and use a statistical model in the shape space of radial curves to complete the broken curves.
  • To keep the model simple, the authors use the PCA of the training data, in an appropriate vector space, to form an orthogonal basis representing training shapes.
  • In order to evaluate this reconstruction step, the authors have compared the restored surface (shown in the top row of Fig. 12) with the complete neutral face of that class, as shown in Fig. 13.
  • In the remainder of this paper, the authors will apply this comprehensive framework for 3D face recognition using a variety of well known and challenging datasets.
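A toy version of this completion step is sketched below with a single principal component obtained by power iteration; the training data are illustrative, and the paper performs PCA in the tangent space of the curve shape manifold rather than directly on coordinates.

```python
import math

def mean_vec(X):
    return [sum(col) / len(X) for col in zip(*X)]

def first_pc(X, iters=100):
    """Leading principal direction of training vectors via power
    iteration (toy stand-in for PCA on tangent spaces)."""
    mu = mean_vec(X)
    C = [[x - m for x, m in zip(row, mu)] for row in X]
    v = [1.0] * len(mu)
    for _ in range(iters):
        coeffs = [sum(ci * vi for ci, vi in zip(c, v)) for c in C]
        w = [sum(a * c[j] for a, c in zip(coeffs, C))
             for j in range(len(v))]
        n = math.sqrt(sum(x * x for x in w))
        v = [x / n for x in w]
    return mu, v

def complete_curve(partial, observed, mu, v):
    """Fit the PCA coefficient using observed coordinates only, then
    fill in the missing coordinates from the model."""
    a = (sum((partial[i] - mu[i]) * v[i] for i in observed)
         / sum(v[i] * v[i] for i in observed))
    return [partial[i] if i in observed else mu[i] + a * v[i]
            for i in range(len(mu))]

# Training curves vary along one mode; a half-observed curve is
# restored consistently with that mode of variation.
train = [[0, 0, 0, 0], [1, 2, 3, 4], [2, 4, 6, 8]]
mu, v = first_pc(train)
restored = complete_curve([3, 6, 0, 0], {0, 1}, mu, v)
print([round(x, 6) for x in restored])
```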

4.1 Data Preprocessing

  • Since the raw data contain a number of imperfections, such as holes and spikes, and include undesired parts, such as clothes, neck, ears and hair, the data pre-processing step is very important and nontrivial.
  • As illustrated in Fig. 14, this step includes the following operations.
  • The hole-filling filter identifies and fills holes in input meshes.
  • The holes are created either because of the absorption of laser in dark areas, such as eyebrows and mustaches, or self-occlusion or open mouths.
  • The nose tip is automatically detected for frontal scans and manually annotated for scans with occlusions and large pose variation.
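Two of these preprocessing steps can be sketched simply. For a roughly frontal, depth-aligned scan, a common baseline (not necessarily the authors' detector) takes the nose tip as the point of maximal depth protrusion, and face cropping then keeps points within a fixed radius of it; the radius is an illustrative assumption.

```python
import math

def detect_nose_tip(points):
    """Crude nose-tip heuristic for frontal, z-aligned scans: the
    point protruding most toward the sensor (largest z)."""
    return max(points, key=lambda p: p[2])

def crop_face(points, tip, radius=0.09):
    """Keep only points within `radius` of the nose tip, discarding
    neck, hair, clothes, etc. Radius in meters, illustrative."""
    return [p for p in points if math.dist(p, tip) <= radius]

scan = [(0.0, 0.0, 0.05), (0.01, 0.02, 0.02), (0.0, -0.2, 0.0)]
tip = detect_nose_tip(scan)
face = crop_face(scan, tip)
print(tip, len(face))   # the far "neck" point is cropped away
```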

4.2 Comparative Evaluation on the FRGCv2 Dataset

  • For the first evaluation the authors use the FRGCv2 dataset in which the scans have been manually clustered into three categories: neutral expression, small expression, and large expression.
  • Note that this method results in 97.7% rank-1 recognition rate in the case of neutral vs. all.
  • To that end, one would need a systematic evaluation on a dataset with missing-data issues, e.g. the GavabDB.
  • For the standard protocol tests (the ROC III mask of FRGCv2), the authors obtain verification rates of around 97%, which is comparable to the best published results.
  • Since scans in FRGCv2 are mostly frontal and have high quality, many methods are able to provide good performance.

4.3 Evaluation on the GavabDB Dataset

  • Since GavabDB [21] has many noisy 3D face scans under large facial expressions, the authors will use that database to help evaluate their framework.
  • Each subject was scanned nine times from different angles and under different facial expressions (six with a neutral expression and three with non-neutral expressions).
  • As noted, their approach provides the highest recognition rate for faces with non-neutral expressions (94.54%).
  • Fig. 17 illustrates examples of correct and incorrect matches for some probe faces.
  • The performance decreases for scans from the left or right sides because more parts are occluded in those scans.

4.4 3D Face Recognition on the Bosphorus Dataset: Recognition Under External Occlusion

  • In this section the authors will use Components I (occlusion detection and removal) and II (missing data restoration) in the algorithm.
  • In each iteration, the authors match the current face scan with the template using ICP and remove those points on the scan that are more than a certain threshold away from the corresponding points on the template.
  • The rank-1 recognition rate is reported in Fig. 20 for different approaches depending upon the type of occlusion.
  • The rank-1 recognition rate is 78.63% when the authors remove the occluded parts and apply the recognition algorithm using the remaining parts, as described in Section 2.4.
  • Even if the part added with restoration introduces some error, it still allows us to use the shapes of the partially observed curves.
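The removal loop can be sketched as repeated pruning of scan points that lie too far from the template; note that the paper re-runs rigid ICP alignment between passes, which is omitted here, so this sketch assumes the scan is already registered to the template, and the threshold is illustrative.

```python
import math

def remove_occlusions(scan, template, threshold, iters=3):
    """Recursively drop scan points farther than `threshold` from
    their nearest template point; stop when a pass removes nothing.
    (Rigid ICP re-alignment between passes is omitted.)"""
    for _ in range(iters):
        kept = []
        for p in scan:
            nearest = min(math.dist(p, q) for q in template)
            if nearest <= threshold:
                kept.append(p)
        if len(kept) == len(scan):
            break                      # converged: nothing removed
        scan = kept
    return scan

# A 5x5 facial patch plus two "glasses" points floating 5-6 cm above
# the surface; the floating points are pruned, the patch survives.
template = [(x * 0.01, y * 0.01, 0.0)
            for x in range(5) for y in range(5)]
scan = template + [(0.02, 0.02, 0.05), (0.03, 0.02, 0.06)]
cleaned = remove_occlusions(scan, template, threshold=0.01)
print(len(scan), len(cleaned))
```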

5 DISCUSSION

  • In order to study the performance of the proposed approach in the presence of different challenges, the authors have presented experimental results using three well-known 3D face databases.
  • The authors have obtained competitive results relative to the state of the art for 3D face recognition in the presence of large expressions, non-frontal views and occlusions.
  • Table 4 also reports the computational time of their approach and some state of the art methods on the FRGCv2 dataset.
  • For each approach, the authors report the time needed for preprocessing and/or feature extraction in the first column.
  • In the case of GavabDB and Bosphorus, the nose tip was manually annotated for non-frontal and occluded faces.

6 CONCLUSION

  • The authors have also presented results on 3D face recognition designed to handle variations in facial expression, pose and occlusion between gallery and probe scans.
  • This method has several properties that make it appropriate for 3D face recognition in non-cooperative scenarios.
  • Lastly, in the presence of occlusion, the authors have proposed to remove the occluded parts and then to recover only the missing data on the 3D scan using statistical shape models.
  • That is, the authors have constructed a low-dimensional shape subspace for each element of the indexed collection of curves, and then represented a curve (with missing data) as a linear combination of its basis elements.


HAL Id: halshs-00783066
https://halshs.archives-ouvertes.fr/halshs-00783066
Submitted on 31 Jan 2013
3D Face Recognition Under Expressions, Occlusions and Pose Variations
Hassen Drira, Boulbaba Ben Amor, Anuj Srivastava, Mohamed Daoudi, Rim Slama
To cite this version:
Hassen Drira, Boulbaba Ben Amor, Anuj Srivastava, Mohamed Daoudi, Rim Slama. 3D Face Recognition Under Expressions, Occlusions and Pose Variations. IEEE Transactions on Pattern Analysis and Machine Intelligence, Institute of Electrical and Electronics Engineers, 2013, pp. 2270-2283. halshs-00783066

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
3D Face Recognition Under Expressions, Occlusions and Pose Variations
Hassen Drira, Boulbaba Ben Amor, Anuj Srivastava, Mohamed Daoudi, and Rim Slama

Abstract—We propose a novel geometric framework for analyzing 3D faces, with the specific goals of comparing, matching, and averaging their shapes. Here we represent facial surfaces by radial curves emanating from the nose tips and use elastic shape analysis of these curves to develop a Riemannian framework for analyzing shapes of full facial surfaces. This representation, along with the elastic Riemannian metric, seems natural for measuring facial deformations and is robust to challenges such as large facial expressions (especially those with open mouths), large pose variations, missing parts, and partial occlusions due to glasses, hair, etc. This framework is shown to be promising from both empirical and theoretical perspectives. In terms of the empirical evaluation, our results match or improve the state-of-the-art methods on three prominent databases: FRGCv2, GavabDB, and Bosphorus, each posing a different type of challenge. From a theoretical perspective, this framework allows for formal statistical inferences, such as the estimation of missing facial parts using PCA on tangent spaces and computing average shapes.

Index Terms—3D face recognition, shape analysis, biometrics, quality control, data restoration.
1 INTRODUCTION

Due to the natural, non-intrusive, and high throughput nature of face data acquisition, automatic face recognition has many benefits when compared to other biometrics. Accordingly, automated face recognition has received growing attention within the computer vision community over the past three decades. Amongst different modalities available for face imaging, 3D scanning has a major advantage over 2D color imaging in that nuisance variables, such as illumination and small pose changes, have a relatively smaller influence on the observations. However, 3D scans often suffer from the problem of missing parts due to self occlusions or external occlusions, or some imperfections in the scanning technology. Additionally, variations in face scans due to changes in facial expressions can also degrade face recognition performance. In order to be useful in real-world applications, a 3D face recognition approach should be able to handle these challenges, i.e., it should recognize people despite large facial expressions, occlusions and large pose variations. Some examples of face scans highlighting these issues are illustrated in Fig. 1.
We note that most recent research on 3D face analysis has been directed towards tackling changes in facial expressions, while only a relatively modest effort has been spent on handling occlusions and missing parts. Although a few approaches and corresponding results dealing with missing parts have been presented, none, to our knowledge, has been applied systematically to a full real database containing scans with missing parts. In this paper, we present a comprehensive Riemannian framework for analyzing facial shapes, in the process dealing with large expressions, occlusions and missing parts. Additionally, we provide some basic tools for statistical shape analysis of facial surfaces. These tools help us to compute a typical or average shape and measure the intra-class variability of shapes, and will even lead to face atlases in the future.

This paper was presented in part at BMVC 2010 [7].
H. Drira, B. Ben Amor and M. Daoudi are with LIFL (UMR CNRS 8022), Institut Mines-Télécom/TELECOM Lille 1, France. E-mail: hassen.drira@telecom-lille1.eu
R. Slama is with LIFL (UMR CNRS 8022), University of Lille 1, France.
A. Srivastava is with the Department of Statistics, FSU, Tallahassee, FL, 32306, USA.

Fig. 1. Different challenges of 3D face recognition: expressions, missing data and occlusions.
1.1 Previous Work

The task of recognizing 3D face scans has been approached in many ways, leading to varying levels of success. We refer the reader to one of many extensive surveys on the topic, e.g. see Bowyer et al. [3]. Below we summarize a smaller subset that is more relevant to our paper.

1. Deformable template-based approaches: There have been several approaches in recent years that rely on deforming facial surfaces into one another, under some chosen criteria, and use quantifications of these deformations as metrics for face recognition. Among these, the ones using non-linear deformations facilitate the local stretching, compression, and bending of surfaces to match each other and are referred to as elastic methods. For instance, Kakadiaris et al. [13] utilize an annotated face model to study geometrical variability across faces. The annotated face model is deformed elastically to fit each face, thus matching different anatomical areas such as the nose, eyes and mouth. In [25], Passalis et al. use automatic landmarking to estimate the pose and to detect occluded areas. The facial symmetry is used to overcome the challenges of missing data here. Similar approaches, but using manually annotated models, are presented in [31], [17]. For example, [17] uses manual landmarks to develop a thin-plate-spline based matching of facial surfaces. A strong limitation of these approaches is that the extraction of fiducial landmarks needed during learning is either manual or semi-automated, except in [13] where it is fully automated.
2. Local regions/features approaches: Another common framework, especially for handling expression variability, is based on matching only parts or regions rather than matching full faces. Lee et al. [15] use ratios of distances and angles between eight fiducial points, followed by an SVM classifier. Similarly, Gupta et al. [11] use Euclidean/geodesic distances between anthropometric fiducial points, in conjunction with linear classifiers. As stated earlier, the problem of automated detection of fiducial points is non-trivial and hinders automation of these methods. Gordon [10] argues that curvature descriptors have the potential for higher accuracy in describing surface features and are better suited to describe the properties of faces in areas such as the cheeks, forehead, and chin. These descriptors are also invariant to viewing angles. Li et al. [16] design a feature pooling and ranking scheme in order to collect various types of low-level geometric features, such as curvatures, and rank them according to their sensitivity to facial expressions. Along similar lines, Wang et al. [32] use a signed shape-difference map between two aligned 3D faces as an intermediate representation for shape comparison. McKeon and Russ [19] use a region ensemble approach that is based on Fisherfaces, i.e., face representations are learned using Fisher's discriminant analysis.

In [12], Huang et al. use a multi-scale Local Binary Pattern (LBP) for a 3D face jointly with shape index. Similarly, Moorthy et al. [20] use Gabor features around automatically detected fiducial points.

To avoid passing over deformable parts of faces encompassing discriminative information, Faltemier et al. [9] use 38 face regions that densely cover the face, and fuse scores and decisions after performing ICP on each region. A similar idea is proposed in [29] that uses PCA-LDA for feature extraction, treating the likelihood ratio as a matching score and using majority voting for face identification. Queirolo et al. [26] use the Surface Inter-penetration Measure (SIM) as a similarity measure to match two face images. The authentication score is obtained by combining the SIM values corresponding to the matching of four different face regions: circular and elliptical areas around the nose, forehead, and the entire face region. In [1], the authors use Average Region Models (ARMs) locally to handle the challenges of missing data and expression-related deformations. They manually divide the facial area into several meaningful components and the registration of faces is carried out by separate dense alignments to the corresponding ARMs. A strong limitation of this approach is the need for manual segmentation of a face into parts that can then be analyzed separately.
3. Surface-distance based approaches: There are several papers that utilize distances between points on facial surfaces to define features that are eventually used in recognition. (Some papers call it geodesic distance but, in order to distinguish it from our later use of geodesics on shape spaces of curves and surfaces, we shall call it surface distance.) These papers assume that surface distances are relatively invariant to small changes in facial expressions and, therefore, help generate features that are robust to facial expressions. Bronstein et al. [4] provide a limited experimental illustration of this invariance by comparing changes in surface distances with the Euclidean distances between corresponding points on a canonical face surface. To handle the open mouth problem, they first detect and remove the lip region, and then compute the surface distance in the presence of a hole corresponding to the removed part [5]. The assumption of preservation of surface distances under facial expressions motivates several authors to define distance-based features for facial recognition. Samir et al. [28] use the level curves of the surface distance function (from the tip of the nose) as features for face recognition. Since an open mouth affects the shape of some level curves, this method is not able to handle the problem of missing data due to occlusion or pose variations. A similar polar parametrization of the facial surface is proposed in [24] where the authors study local geometric attributes under this parameterization. To deal with the open mouth problem, they modify the parametrization by disconnecting the top and bottom lips. The main limitation of this approach is the need for detecting the lips, as proposed in [5]. Berretti et al. [2] use surface distances to define facial stripes which, in turn, are used as nodes in a graph-based recognition algorithm.

The main limitation of these approaches, apart from the issues resulting from open mouths, is that they assume that surface distances between facial points are preserved within face classes. This is not valid in the case of large expressions. Facial expressions result from the stretching or shrinking of underlying muscles and, consequently, the facial skin is deformed in a non-isometric manner. In other words, facial surfaces are also stretched or compressed locally, beyond a simple bending of parts.

In order to demonstrate this assertion, we placed four markers on a face and tracked the changes in the surface and Euclidean (straight-line) distances between the markers under large expressions. Fig. 2 shows some facial expressions leading to a significant shrinking or stretching of the skin surface and, thus, causing both Euclidean and surface distances between these points to change. In one case these distances decrease (from 113 mm to 103 mm for the Euclidean distance, and from 115 mm to 106 mm for the surface distance) while in the other two cases they increase. This clearly shows that large expressions can cause stretching and shrinking of facial surfaces, i.e., the facial deformation is elastic in nature. Hence, the assumption of an isometric deformation of the shape of the face is not strictly valid, especially for large expressions. This also motivates the use of elastic shape analysis in 3D face recognition.
Fig. 2. Significant changes in both Euclidean and surface distances under large facial expressions. [Figure: neutral vs. expressive face pairs under stretching and shrinking, annotated with straight-line (Euclidean) and along-surface (geodesic) distances in mm.]
1.2 Overview of Our Approach

This paper presents a Riemannian framework for 3D facial shape analysis. This framework is based on elastically matching and comparing radial curves emanating from the tip of the nose, and it handles several of the problems described above. The main contributions of this paper are:
  • It extracts, analyzes, and compares the shapes of radial curves of facial surfaces.
  • It develops an elastic shape analysis of 3D faces by extending the elastic shape analysis of curves [30] to 3D facial surfaces.
  • To handle occlusions, it introduces an occlusion detection and removal step that is based on recursive-ICP.
  • To handle the missing data, it introduces a restoration step that uses statistical estimation on shape manifolds of curves. Specifically, it uses PCA on tangent spaces of the shape manifold to model the normal curves and uses that model to complete the partially-observed curves.

The different stages and components of our method are laid out in Fig. 3. While some basic steps are common to all application scenarios, there are also some specialized tools suitable only for specific situations. The basic steps that are common to all situations include 3D scan preprocessing (nose tip localization, filling holes, smoothing, face cropping), coarse and fine alignment, radial curve extraction, quality filtering, and elastic shape analysis of curves (Component III and the quality module in Component II). This basic setup is evaluated on the FRGCv2 dataset following the standard protocol (see Section 4.2). It is also tested on the GAVAB dataset where, for each subject, four probe images out of nine have large pose variations (see Section 4.3). Some steps are only useful where one anticipates some data occlusion and missing data. These steps include occlusion detection (Component I) and missing data restoration (Component II). In these situations, the full processing includes Components I+II+III to process the given probes. This approach has been evaluated on a subset of the Bosphorus dataset that involves occlusions (see Section 4.4). In the last two experiments, except for the manual detection of nose coordinates, the remaining processing is automatic.
2 RADIAL, ELASTIC CURVES: MOTIVATION

An important contribution of this paper is its novel use of radial facial curves studied using elastic shape analysis; this section motivates both of these choices.
2.1 Motivation for Radial Curves

Why should one use the radial curves emanating from the tip of the nose for representing facial shapes? Firstly, why curves and not other kinds of facial features? Recently, there has been significant progress in the analysis of shapes of curves and the resulting algorithms are very sophisticated and efficient [30], [33]. The changes in facial expressions affect different regions of a facial surface differently. For example, during a smile, the top half of the face is relatively unchanged while the lip area changes a lot, and when a person is surprised the effect is often the opposite. If chosen appropriately, curves have the potential to capture regional shapes and that is why their role becomes important. The locality of shapes represented by facial curves is an important reason for their selection.

Fig. 3. Overview of the proposed method. [Flowchart: 3D scan preprocessing, coarse and fine registration, occlusion detection and removal (Component I), curve quality filter and completion (Component II), and elastic matching of radial curves/surfaces (Component III), with examples of inter-class and intra-class geodesics.]

Fig. 4. A smile (middle) changes the shapes of the curves in the lower part of the face, while the act of surprise changes the shapes of curves in the upper part of the face (right).

The next question is: Which facial curves are suitable for recognizing people? Curves on a surface can, in general, be defined either as the level curves of a function or as the streamlines of a gradient field. Ideally, one would like curves that maximally separate inter-class variability from intra-class variability (typically due to expression changes). The past usage of the level curves (of the surface distance function) has the limitation that each curve goes through different facial regions, which makes it difficult to isolate local variability. Indeed, the previous work on shape analysis of facial curves for 3D face recognition was mostly based on level curves [27], [28].
In contrast, the radial curves with the nose tip as
origin have a tremendous potential. This is because:
(i) the nose is in many ways the focal point of a face.
It is relatively easy and efficient to detect the nose
tip (compared to other facial parts) and to extract
radial curves, with nose tip as the center, in a com-
pletely automated fashion. It is much more difficult
to automatically extract other types of curves, e.g.
those used by sketch artists ( cheek contours, fore-
head profiles, eye boundaries, etc). ( ii) Different radial
curves pass through different regions and, hence, can
be associated with different facial expressions. For
instance, differences in the shapes of radial curves in
the upper-half of the face can be loosely attributed
to the inter-class variability while those for curves
passing through the lips and cheeks can largely be
due to changes in expressions. This is illustrated in
Fig. 4 which shows a neutral face (left), a smiling
face (middle), and a surprised face (right). The main
difference in the middle face, relative to the left face,
lies in the lower part of the face, while for the right
face the main differences lie in the top half. (iii) Radial
curves have more universal applicability. The curves
used in the past have worked well for some specific
tasks, e.g., lip contours in detecting certain expres-
sions, but they have not been as efficient for some
other tasks, such as face recognition. In contrast, radial
curves capture the full geometry and are applicable to
a variety of applications, including facial expression
recognition. (iv) In the case of missing parts and
partial occlusions, at least some part of every radial
curve is usually available. It is rare to miss a full
radial curve. In contrast, it is more common to miss
an eye due to occlusion by glasses, the forehead due
to hair, or parts of cheeks due to a bad angle for
laser reflection. This issue is important in handling the
missing data via reconstruction, as shall be described
later in this paper. (v) Natural face deformations

Citations
Journal ArticleDOI
TL;DR: A comprehensive review of recent advances in dataset creation, algorithm development, and investigations of the effects of occlusion critical for robust performance in FEA systems is presented in this paper.
Abstract: Automatic machine-based Facial Expression Analysis (FEA) has made substantial progress in the past few decades driven by its importance for applications in psychology, security, health, entertainment, and human–computer interaction. The vast majority of completed FEA studies are based on nonoccluded faces collected in a controlled laboratory environment. Automatic expression recognition tolerant to partial occlusion remains less understood, particularly in real-world scenarios. In recent years, efforts investigating techniques to handle partial occlusion for FEA have seen an increase. The context is right for a comprehensive perspective of these developments and the state of the art from this perspective. This survey provides such a comprehensive review of recent advances in dataset creation, algorithm development, and investigations of the effects of occlusion critical for robust performance in FEA systems. It outlines existing challenges in overcoming partial occlusion and discusses possible opportunities in advancing the technology. To the best of our knowledge, it is the first FEA survey dedicated to occlusion and aimed at promoting better-informed and benchmarked future work.

416 citations

Journal ArticleDOI
TL;DR: This paper proposes a new framework to extract a compact representation of a human action captured through a depth sensor, and enable accurate action recognition, and results with state-of-the-art methods are reported.
Abstract: Recognizing human actions in 3-D video sequences is an important open problem that is currently at the heart of many research domains including surveillance, natural interfaces and rehabilitation. However, the design and development of models for action recognition that are both accurate and efficient is a challenging task due to the variability of the human pose, clothing and appearance. In this paper, we propose a new framework to extract a compact representation of a human action captured through a depth sensor, and enable accurate action recognition. The proposed solution develops on fitting a human skeleton model to acquired data so as to represent the 3-D coordinates of the joints and their change over time as a trajectory in a suitable action space. Thanks to such a 3-D joint-based framework, the proposed solution is capable to capture both the shape and the dynamics of the human body, simultaneously. The action recognition problem is then formulated as the problem of computing the similarity between the shape of trajectories in a Riemannian manifold. Classification using k-nearest neighbors is finally performed on this manifold taking advantage of Riemannian geometry in the open curve shape space. Experiments are carried out on four representative benchmarks to demonstrate the potential of the proposed solution in terms of accuracy/latency for a low-latency action recognition. Comparative results with state-of-the-art methods are reported.

329 citations

Journal ArticleDOI
28 Jul 2014
TL;DR: This paper presents the first publicly available face database based on the Kinect sensor, and conducts benchmark evaluations on the proposed database using standard face recognition methods, and demonstrates the gain in performance when integrating the depth data with the RGB data via score-level fusion.
Abstract: The recent success of emerging RGB-D cameras such as the Kinect sensor depicts a broad prospect of 3-D data-based computer applications. However, due to the lack of a standard testing database, it is difficult to evaluate how the face recognition technology can benefit from this up-to-date imaging sensor. In order to establish the connection between the Kinect and face recognition research, in this paper, we present the first publicly available face database (i.e., KinectFaceDB 1 ) based on the Kinect sensor. The database consists of different data modalities (well-aligned and processed 2-D, 2.5-D, 3-D, and video-based face data) and multiple facial variations. We conducted benchmark evaluations on the proposed database using standard face recognition methods, and demonstrated the gain in performance when integrating the depth data with the RGB data via score-level fusion. We also compared the 3-D images of Kinect (from the KinectFaceDB) with the traditional high-quality 3-D scans (from the FRGC database) in the context of face biometrics, which reveals the imperative needs of the proposed database for face recognition research. 1 Online at http://rgb-d.eurecom.fr

257 citations


Cites background from "3D Face Recognition under Expressio..."

  • ..., [35] and [59]) are less affected by the illumination changes than 2-D methods; however, facial expression is still a major challenge in 3-D face recognition [60], [61]....

    [...]


Book ChapterDOI
08 Oct 2016
TL;DR: The proposed method iteratively and alternately applies two sets of cascaded regressors, one for updating 2D landmarks and the other for updating reconstructed pose-expression-normalized (PEN) 3D face shape, to simultaneously solve the two problems of face alignment and3D face reconstruction from an input 2D face image of arbitrary poses and expressions.
Abstract: We present an approach to simultaneously solve the two problems of face alignment and 3D face reconstruction from an input 2D face image of arbitrary poses and expressions. The proposed method iteratively and alternately applies two sets of cascaded regressors, one for updating 2D landmarks and the other for updating reconstructed pose-expression-normalized (PEN) 3D face shape. The 3D face shape and the landmarks are correlated via a 3D-to-2D mapping matrix. In each iteration, adjustment to the landmarks is firstly estimated via a landmark regressor, and this landmark adjustment is also used to estimate 3D face shape adjustment via a shape regressor. The 3D-to-2D mapping is then computed based on the adjusted 3D face shape and 2D landmarks, and it further refines the 2D landmarks. An effective algorithm is devised to learn these regressors based on a training dataset of pairing annotated 3D face shapes and 2D face images. Compared with existing methods, the proposed method can fully automatically generate PEN 3D face shapes in real time from a single 2D face image and locate both visible and invisible 2D landmarks. Extensive experiments show that the proposed method can achieve the state-of-the-art accuracy in both face alignment and 3D face reconstruction, and benefit face recognition owing to its reconstructed PEN 3D face shapes.

156 citations


Cites methods from "3D Face Recognition under Expressio..."

  • ...Moreover, existing methods always generate 3D faces that have the same pose and expression as the input image, which may not be desired in face recognition due to the challenge of matching 3D faces with expressions [12]....

    [...]

References
Journal ArticleDOI
TL;DR: This survey focuses on recognition performed by matching models of the three-dimensional shape of the face, either alone or in combination with matching corresponding two-dimensional intensity images.

1,069 citations

Journal ArticleDOI
TL;DR: This paper introduces a square-root velocity (SRV) representation for analyzing shapes of curves in euclidean spaces under an elastic metric and demonstrates a wrapped probability distribution for capturing shapes of planar closed curves.
Abstract: This paper introduces a square-root velocity (SRV) representation for analyzing shapes of curves in euclidean spaces under an elastic metric. In this SRV representation, the elastic metric simplifies to the IL2 metric, the reparameterization group acts by isometries, and the space of unit length curves becomes the unit sphere. The shape space of closed curves is the quotient space of (a submanifold of) the unit sphere, modulo rotation, and reparameterization groups, and we find geodesics in that space using a path straightening approach. These geodesics and geodesic distances provide a framework for optimally matching, deforming, and comparing shapes. These ideas are demonstrated using: 1) shape analysis of cylindrical helices for studying protein structure, 2) shape analysis of facial curves for recognizing faces, 3) a wrapped probability distribution for capturing shapes of planar closed curves, and 4) parallel transport of deformations for predicting shapes from novel poses.

636 citations


"3D Face Recognition under Expressio..." refers background in this paper

  • ...Let q2*(t) = sqrt(γ*'(t)) O* q2(γ*(t)) be the optimal element of [q2] associated with the optimal rotation O* and reparameterization γ* of the second curve; then the geodesic distance between [q1] and [q2] in S is ds([q1], [q2]) := dc(q1, q2*) and the geodesic is given by (1), with q2 replaced by q2*....

    [...]

  • ...For this reason, we need the quality filter that will isolate and remove curves associated with those parts....

    [...]

  • ...An important contribution of this paper is its novel use of radial facial curves studied using elastic shape analysis....

    [...]

  • ...This also motivates the use of elastic shape analysis in 3D face recognition....

    [...]

  • ...Furthermore, the geodesic path between any two points q1, q2 ∈ C is given by the great circle, ψ : [0, 1] →...

    [...]
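The SRV construction summarized in the abstract above can be sketched numerically for discretized curves. The details below (uniform sampling on [0, 1], a mean-based approximation of the L2 inner product, and the two test curves) are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def l2_inner(q1, q2):
    """Approximate L2 inner product of curves sampled uniformly on [0, 1]."""
    return float(np.mean(np.sum(q1 * q2, axis=1)))

def srv(beta):
    """Square-root velocity representation q(t) = beta'(t) / sqrt(|beta'(t)|),
    rescaled to unit L2 norm so that unit-length curves lie on the unit sphere."""
    n = len(beta)
    vel = np.gradient(beta, 1.0 / (n - 1), axis=0)      # finite-difference beta'
    speed = np.linalg.norm(vel, axis=1)
    q = vel / np.sqrt(np.maximum(speed, 1e-12))[:, None]
    return q / np.sqrt(l2_inner(q, q))

def sphere_distance(q1, q2):
    """Geodesic (great-circle) distance between two SRVs on the unit sphere."""
    return float(np.arccos(np.clip(l2_inner(q1, q2), -1.0, 1.0)))

t = np.linspace(0.0, 1.0, 200)[:, None]
line = np.hstack([t, np.zeros_like(t)])                  # straight segment
arc = np.hstack([np.cos(np.pi * t), np.sin(np.pi * t)])  # half circle

d_same = sphere_distance(srv(line), srv(line))   # identical shapes -> 0
d_diff = sphere_distance(srv(line), srv(arc))    # different shapes -> > 0
```

A full elastic comparison would additionally optimize over rotations and re-parameterizations of one of the curves before evaluating this distance.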

Journal ArticleDOI
TL;DR: The result is an efficient and accurate face recognition algorithm, robust to facial expressions, that can distinguish between identical twins and compare its performance to classical face recognition methods.
Abstract: An expression-invariant 3D face recognition approach is presented. Our basic assumption is that facial expressions can be modelled as isometries of the facial surface. This allows to construct expression-invariant representations of faces using the bending-invariant canonical forms approach. The result is an efficient and accurate face recognition algorithm, robust to facial expressions, that can distinguish between identical twins (the first two authors). We demonstrate a prototype system based on the proposed algorithm and compare its performance to classical face recognition methods. The numerical methods employed by our approach do not require the facial surface explicitly. The surface gradients field, or the surface metric, are sufficient for constructing the expression-invariant representation of any given face. It allows us to perform the 3D face recognition task while avoiding the surface reconstruction stage.

569 citations


"3D Face Recognition under Expressio..." refers background in this paper

  • ...[4] provide a limited experimental illustration of this invariance by comparing changes in surface distances with the euclidean distances between corresponding points on a canonical face surface....

    [...]

Journal ArticleDOI
TL;DR: This paper presents the computational tools and a hardware prototype for 3D face recognition and presents the results on the largest known, and now publicly available, face recognition grand challenge 3D facial database consisting of several thousand scans.
Abstract: In this paper, we present the computational tools and a hardware prototype for 3D face recognition. Full automation is provided through the use of advanced multistage alignment algorithms, resilience to facial expressions by employing a deformable model framework, and invariance to 3D capture devices through suitable preprocessing steps. In addition, scalability in both time and space is achieved by converting 3D facial scans into compact metadata. We present our results on the largest known, and now publicly available, face recognition grand challenge 3D facial database consisting of several thousand scans. To the best of our knowledge, this is the highest performance reported on the FRGC v2 database for the 3D modality

496 citations


"3D Face Recognition under Expressio..." refers background in this paper

  • ...Additionally, variations in face scans due to changes in facial expressions can also degrade face recognition performance....

    [...]

  • ...R. Slama is with Laboratoire d’Informatique Fondamentale de Lille (LIFL), (UMR CNRS 8022), University of Lille 1, Télécom Lille 1, Cité Scientifique, Rue G. Marconi, BP 20145, Villeneuve d’Ascq 59653, France....

    [...]

Frequently Asked Questions (11)
Q1. What are the advantages of using parallel techniques?

Regarding computational efficiency, parallel techniques can also be exploited to improve the performance of their approach, since the computation of curve distances, preprocessing, etc., are independent tasks. 
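Since the per-curve computations are independent, they map directly onto a worker pool. The sketch below uses a thread pool with a plain L2 placeholder distance; the paper's elastic geodesic distance would be dropped in instead, and the curve counts and sizes are invented for illustration:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def curve_distance(pair):
    """Placeholder root-mean-square distance between two discretized curves."""
    c1, c2 = pair
    return float(np.sqrt(np.mean(np.sum((c1 - c2) ** 2, axis=1))))

rng = np.random.default_rng(0)
probe = [rng.standard_normal((50, 3)) for _ in range(40)]    # curves of a probe face
gallery = [rng.standard_normal((50, 3)) for _ in range(40)]  # curves of a gallery face

pairs = list(zip(probe, gallery))
with ThreadPoolExecutor(max_workers=4) as pool:    # independent per-curve tasks
    dists = list(pool.map(curve_distance, pairs))  # result order is preserved

score = sum(dists) / len(dists)                    # fused face-to-face score
```

For CPU-bound distances, a process pool (or a vectorized batch) follows the same pattern, since no task depends on another.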

To avoid passing over deformable parts of faces encompassing discriminative information, Faltemier et al. [9] use 38 face regions that densely cover the face, and fuse scores and decisions after performing ICP on each region. 
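The region-wise matching mentioned above relies on ICP; a bare point-to-point ICP can be sketched as below. This is a generic textbook version (SVD-based rigid fit, brute-force nearest neighbours, synthetic data), not Faltemier et al.'s implementation:

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping paired src onto dst
    (Kabsch/SVD solution, sign-corrected to avoid reflections)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    U, _, Vt = np.linalg.svd((src - cs).T @ (dst - cd))
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, cd - R @ cs

def icp(src, dst, iters=20):
    """Point-to-point ICP: alternate nearest-neighbour pairing and rigid fit.
    Returns the aligned copy of src and the final RMS residual (the score)."""
    cur = src.copy()
    for _ in range(iters):
        # brute-force nearest neighbours (a k-d tree would be used in practice)
        nn = dst[np.argmin(((cur[:, None, :] - dst[None, :, :]) ** 2).sum(-1), axis=1)]
        R, t = best_rigid_transform(cur, nn)
        cur = cur @ R.T + t
    return cur, float(np.sqrt(np.mean(np.sum((cur - nn) ** 2, axis=1))))

# synthetic region: the same cloud under a small rigid motion
rng = np.random.default_rng(1)
dst = rng.standard_normal((200, 3))
a = np.deg2rad(5.0)
Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a),  np.cos(a), 0.0],
               [0.0,        0.0,       1.0]])
src = dst @ Rz.T + np.array([0.05, -0.02, 0.03])
aligned, rms = icp(src, dst)
```

In a region-based scheme, one such residual per region would be computed and the per-region scores fused (e.g., by sum or voting) into a final decision.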

Since the raw data contain a number of imperfections, such as holes and spikes, and include some undesired parts, such as clothes, neck, ears, and hair, the data preprocessing step is very important and nontrivial. 
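As a toy illustration of one such cleaning step, the sketch below flags spike pixels in a synthetic range image by comparing each depth value to its 3x3 neighbourhood median and replaces them with that median. The threshold and test data are invented; this is not the paper's preprocessing pipeline:

```python
import numpy as np

def remove_spikes(depth, tau=5.0):
    """Replace pixels that deviate from their 3x3 neighbourhood median by more
    than `tau` with that median; returns the cleaned image and the spike mask."""
    p = np.pad(depth, 1, mode='edge')
    h, w = depth.shape
    # stack the nine shifted copies of the padded image to get each 3x3 window
    stack = np.stack([p[i:i + h, j:j + w] for i in range(3) for j in range(3)])
    med = np.median(stack, axis=0)
    spikes = np.abs(depth - med) > tau
    return np.where(spikes, med, depth), spikes

# smooth synthetic range image with two injected spikes
z = np.fromfunction(lambda i, j: 0.05 * (i + j), (64, 64))
z_noisy = z.copy()
z_noisy[10, 10] += 40.0
z_noisy[30, 45] -= 25.0
clean, mask = remove_spikes(z_noisy)
```

Hole filling and trimming of non-face regions (neck, hair, clothes) would follow similar local, data-driven rules before curve extraction.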

A strong limitation of this approach is the need for manual segmentation of a face into parts that can then be analyzed separately. 

For instance, differences in the shapes of radial curves in the upper-half of the face can be loosely attributed to the inter-class variability while those for curves passing through the lips and cheeks can largely be due to changes in expressions. 

The total number of face scans is 4652; at least 54 scans are available for most of the subjects, while only 31 scans each are available for 34 of them. 

In the difficult scenario of neutral vs. expressions, the rank-1 recognition rate is 96.8%, which represents a high performance, while in the simpler case of neutral vs. neutral the rate is 99.2%. 

Another common framework, especially for handling expression variability, is based on matching only parts or regions rather than matching full faces. 

The main limitation of these approaches, apart from the issues resulting from open mouths, is that they assume that surface distances between facial points are preserved within face classes. 

In order to study the shapes of curves, one identifies all rotations and re-parameterizations of a curve as a single equivalence class. 
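The rotation part of this equivalence has a closed-form solution: the rotation best aligning two discretized representatives comes from an SVD of their cross-covariance (orthogonal Procrustes). The sketch below is a generic illustration on synthetic data; the paper additionally optimizes over re-parameterizations, which is not shown here:

```python
import numpy as np

def optimal_rotation(q1, q2):
    """Rotation O* minimising the L2 distance between q1 and the rotated q2,
    via SVD of the cross-covariance A = q1^T q2 (sign-corrected so det = +1)."""
    U, _, Vt = np.linalg.svd(q1.T @ q2)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    return U @ D @ Vt

# same discretized curve, one copy rotated by 30 degrees about z
rng = np.random.default_rng(2)
q1 = rng.standard_normal((100, 3))
a = np.deg2rad(30.0)
R = np.array([[np.cos(a), -np.sin(a), 0.0],
              [np.sin(a),  np.cos(a), 0.0],
              [0.0,        0.0,       1.0]])
q2 = q1 @ R.T
O = optimal_rotation(q1, q2)
residual = float(np.linalg.norm(q1 - q2 @ O.T))   # ~0: same equivalence class
```

Because the two representatives differ only by a rotation, the residual after alignment is numerically zero, i.e., they are the same point in the quotient shape space.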

The faces in the two bottom rows are examples of faces incorrectly recognized by their algorithm without restoration (as described earlier); after the restoration step, they are correctly recognized.