3D Face Recognition under Expressions, Occlusions, and Pose Variations
Summary
1 INTRODUCTION
- Due to the natural, non-intrusive, and high-throughput nature of face data acquisition, automatic face recognition has many benefits when compared to other biometrics.
- Amongst different modalities available for face imaging, 3D scanning has a major advantage over 2D color imaging in that nuisance variables, such as illumination and small pose changes, have a relatively smaller influence on the observations.
- 3D scans often suffer from missing parts due to self-occlusions, external occlusions, or imperfections in the scanning technology.
- Additionally, the authors provide some basic tools for statistical shape analysis of facial surfaces.
1.1 Previous Work
- The task of recognizing 3D face scans has been approached in many ways, leading to varying levels of success.
- Similar approaches, but using manually annotated models, are presented in [31], [17].
- To handle the open mouth problem, they first detect and remove the lip region, and then compute the surface distance in the presence of a hole corresponding to the removed part [5].
- Samir et al. [28] use the level curves of the surface distance function (from the tip of the nose) as features for face recognition.
- Fig. 2 shows some facial expressions leading to a significant shrinking or stretching of the skin surface and, thus, causing both Euclidean and surface distances between these points to change.
1.2 Overview of Our Approach
- This paper presents a Riemannian framework for 3D facial shape analysis.
- This framework is based on elastically matching and comparing radial curves emanating from the tip of the nose and it handles several of the problems described above.
- To handle the missing data, it introduces a restoration step that uses statistical estimation on shape manifolds of curves.
- This basic setup is evaluated on the FRGCv2 dataset following the standard protocol (see Section 4.2).
- These steps include occlusion detection (Component I) and missing data restoration (Component II).
2.1 Motivation for Radial Curves
- The changes in facial expressions affect different regions of a facial surface differently.
- In the case of the missing parts and partial occlusion, at least some part of every radial curve is usually available.
- Based on these arguments, the authors choose a novel geometrical representation of facial surfaces using radial curves that start from the nose tip.
2.2 Motivation for Elasticity
- Consider the two parameterized curves shown in Fig. 5; call them β1 and β2.
- The expression on the left has the mouth open whereas the expression on the right has the mouth closed.
- In order to compare their shapes, the authors need to register points across those curves.
- For curves, the problem of optimal registration is actually the same as that of optimal re-parameterization.
- This optimization leads to a proper distance (a geodesic distance) and an optimal deformation between the shapes of curves.
2.3 Automated Extraction of Radial Curves
- Each facial surface is represented by an indexed collection of radial curves that are defined and extracted as follows.
- Each radial curve is obtained by intersecting the facial surface with a plane Pα that has the nose tip as its origin and makes an angle α with the plane containing the reference curve.
- Using these curves, the authors will demonstrate that the elastic framework is well suited to modeling of deformations associated with changes in facial expressions and for handling missing data.
- The gallery face in this example belongs to the same person under the same expression.
- Since the curve extraction on the probe face is based on the gallery nose coordinates, which belong to another person, the curves may be shifted near the nose region.
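The plane-slicing construction described above can be sketched in a few lines. The function below is an illustrative approximation, not the authors' implementation: it works on a raw point cloud rather than a mesh, keeps the points lying within a tolerance `eps` of the half-plane at angle α, and orders them by distance from the nose tip.

```python
import numpy as np

def extract_radial_curve(points, nose_tip, alpha, eps=1.0):
    """Keep surface points within eps of the half-plane P_alpha rooted at
    the nose tip, ordered by distance from the nose tip.
    (Illustrative sketch; real pipelines slice a mesh, not a point cloud.)"""
    d = np.array([np.cos(alpha), np.sin(alpha), 0.0])        # in-plane direction
    normal = np.array([-np.sin(alpha), np.cos(alpha), 0.0])  # plane normal
    rel = points - nose_tip
    near_plane = np.abs(rel @ normal) < eps   # close to the slicing plane
    forward = rel @ d >= 0.0                  # keep only the alpha side
    curve = points[near_plane & forward]
    # order by distance from the nose tip to parameterize the curve
    order = np.argsort(np.linalg.norm(curve - nose_tip, axis=1))
    return curve[order]
```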
2.4 Curve Quality Filter
- In situations involving non-frontal 3D scans, some curves may be partially hidden due to self occlusion.
- The use of these curves in face recognition can severely degrade the recognition performance and, therefore, they should be identified and discarded.
- The authors introduce a quality filter that uses the continuity and the length of a curve to detect such curves.
- The discontinuity or the shortness of a curve results either from missing data or large noise.
- Recall that during the pre-processing step, there is a provision for filling holes.
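A minimal version of such a quality filter might look as follows; the threshold values `min_length` and `max_gap` are hypothetical, data-dependent choices, not taken from the paper.

```python
import numpy as np

def curve_quality_ok(curve, min_length=40.0, max_gap=5.0):
    """Quality filter: accept a curve (n x 3 array of ordered points) only
    if its total arc length is at least min_length and no gap between
    consecutive points exceeds max_gap (continuity check)."""
    if len(curve) < 2:
        return False
    gaps = np.linalg.norm(np.diff(curve, axis=0), axis=1)
    return bool(gaps.sum() >= min_length and gaps.max() <= max_gap)
```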
3.1 Background on the Shapes of Curves
- More precisely, as shown in [30], an elastic metric for comparing shapes of curves becomes the simple L2-metric under the SRVF representation.
- (A similar metric and representation for curves was also developed by Younes et al. [33] but it only applies to planar curves and not to facial curves).
- Furthermore, under the L2-metric, the re-parameterization group acts by isometries on the manifold of q functions, which is not the case for the original curve β.
- By iterating between these two optimizations (over rotations and over re-parameterizations), the authors can reach a solution for the joint optimization problem.
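The SRVF representation from [30] can be sketched for discretized curves as follows. This is an illustrative approximation, not the authors' code: it computes q(t) = β'(t)/√|β'(t)| by finite differences and omits the rotation and re-parameterization optimization entirely.

```python
import numpy as np

def srvf(beta, eps=1e-8):
    """Square-root velocity function of a discretized curve beta (n x d):
    q(t) = beta'(t) / sqrt(|beta'(t)|). Under this representation the
    elastic metric reduces to the ordinary L2 metric."""
    t = np.linspace(0.0, 1.0, len(beta))
    dbeta = np.gradient(beta, t, axis=0)          # finite-difference velocity
    speed = np.linalg.norm(dbeta, axis=1)
    return dbeta / np.sqrt(speed + eps)[:, None]  # eps guards zero speed

def l2_dist(q1, q2):
    """Discrete L2 distance between two SRVFs sampled uniformly on [0, 1]."""
    return float(np.sqrt(np.mean(np.sum((q1 - q2) ** 2, axis=1))))
```

A useful sanity check is translation invariance: shifting a curve by a constant vector leaves its SRVF unchanged, since only the derivative enters.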
3.2 Shape Metric for Facial Surfaces
- Now the authors extend the framework from radial curves to full facial surfaces.
- The indexing provides a correspondence between curves across faces.
- Since the authors have deformations (geodesic paths) between corresponding curves, they can combine these deformations to obtain deformations between full facial surfaces.
- Algorithm 1 is used to calculate the geodesic path in the shape space.
- The upper lips match the upper lips, for instance, and this helps produce a natural opening of the mouth as illustrated in the top row in Fig. 10.
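Combining per-curve distances into a face-level distance can be sketched as below. The dictionary-of-curves representation and the plain summation rule are illustrative assumptions; curves rejected by the quality filter of Section 2.4 are simply absent from the dictionary.

```python
def face_distance(face1, face2, curve_dist):
    """Face-level distance: sum per-curve distances over the angle indices
    present in both faces. `curve_dist` is any distance between two curve
    features (e.g., an elastic distance between SRVFs)."""
    common = sorted(set(face1) & set(face2))
    if not common:
        raise ValueError("no common curves to compare")
    return sum(curve_dist(face1[a], face2[a]) for a in common)
```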
3.3 Computation of the Mean Shape
- One can use the notion of Karcher mean [14] to define an average face that can serve as a representative face of a group of faces.
- The Karcher mean is then defined by S̄ = argmin_{S ∈ S^n} V(S), where V(S) is the sum of squared geodesic distances from S to the given faces.
- The algorithm for computing Karcher mean is a standard one, see e.g. [8], and is not repeated here to save space.
- This minimizer may not be unique and, in practice, one can pick any one of those solutions as the mean face.
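The standard gradient-descent algorithm referenced above can be illustrated on the unit sphere, where geodesics are great circles. This toy version sketches only the idea on ordinary unit vectors, not on the authors' shape space of curves.

```python
import numpy as np

def sphere_log(p, q):
    """Inverse exponential map on the unit sphere at base point p."""
    c = float(np.clip(p @ q, -1.0, 1.0))
    theta = np.arccos(c)
    if theta < 1e-12:
        return np.zeros_like(p)
    v = q - c * p
    return theta * v / np.linalg.norm(v)

def sphere_exp(p, v):
    """Exponential map on the unit sphere at base point p."""
    t = np.linalg.norm(v)
    if t < 1e-12:
        return p
    return np.cos(t) * p + np.sin(t) * v / t

def karcher_mean(points, iters=50, step=0.5):
    """Gradient descent for the Karcher mean: repeatedly average the
    log-map images of the data and step along the resulting tangent."""
    mu = points[0] / np.linalg.norm(points[0])
    for _ in range(iters):
        grad = np.mean([sphere_log(mu, q) for q in points], axis=0)
        mu = sphere_exp(mu, step * grad)
    return mu
```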
3.4 Completion of Partially-Obscured Curves
- Earlier the authors have introduced a filtering step that finds and removes curves with missing parts.
- Once the authors detect points that belong to the face and points that belong to the occluding object, they first remove the occluding object and use a statistical model in the shape space of radial curves to complete the broken curves.
- To keep the model simple, the authors use the PCA of the training data, in an appropriate vector space, to form an orthogonal basis representing training shapes.
- In order to evaluate this reconstruction step, the authors have compared the restored surface (shown in the top row of Fig. 12) with the complete neutral face of that class, as shown in Fig. 13.
- In the remainder of this paper, the authors will apply this comprehensive framework for 3D face recognition using a variety of well known and challenging datasets.
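The PCA-based completion can be sketched as follows: build an orthogonal basis from training curves flattened to vectors, estimate the coefficients from the observed coordinates only (a least-squares problem), and synthesize the full curve. This is an illustrative reconstruction under a plain vector-space PCA assumption, not the authors' shape-space implementation.

```python
import numpy as np

def fit_pca_basis(train, k):
    """Mean and top-k PCA basis from training curves (rows = samples,
    each curve flattened to a vector)."""
    mean = train.mean(axis=0)
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:k]

def restore_curve(partial, observed, mean, basis):
    """Complete a partially observed curve: solve for the PCA coefficients
    using only the observed coordinates, then synthesize the full curve."""
    a = basis[:, observed].T                 # observed rows of the basis
    b = partial[observed] - mean[observed]
    coef, *_ = np.linalg.lstsq(a, b, rcond=None)
    return mean + coef @ basis
```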
4.1 Data Preprocessing
- Since the raw data contains a number of imperfections, such as holes and spikes, and includes undesired parts, such as clothes, neck, ears, and hair, the data pre-processing step is very important and nontrivial.
- As illustrated in Fig. 14, this step includes the following items:
- The hole-filling filter identifies and fills holes in input meshes.
- The holes are created either because of the absorption of laser in dark areas, such as eyebrows and mustaches, or self-occlusion or open mouths.
- The nose tip is automatically detected for frontal scans and manually annotated for scans with occlusions and large pose variation.
4.2 Comparative Evaluation on the FRGCv2 Dataset
- For the first evaluation the authors use the FRGCv2 dataset in which the scans have been manually clustered into three categories: neutral expression, small expression, and large expression.
- Note that this method results in 97.7% rank-1 recognition rate in the case of neutral vs. all.
- To that end, one would need a systematic evaluation on a dataset with missing data issues, e.g., the GavabDB.
- For the standard protocol testing (the ROC III mask of FRGCv2), the authors obtain verification rates of around 97%, which is comparable to the best published results.
- Since scans in FRGCv2 are mostly frontal and have high quality, many methods are able to provide good performance.
4.3 Evaluation on the GavabDB Dataset
- Since GavabDB [21] has many noisy 3D face scans under large facial expressions, the authors will use that database to help evaluate their framework.
- Each subject was scanned nine times from different angles and under different facial expressions (six with the neutral expression and three with non-neutral expressions).
- As noted, their approach provides the highest recognition rate for faces with non-neutral expressions (94.54%).
- Fig. 17 illustrates examples of correct and incorrect matches for some probe faces.
- The performance decreases for scans from the left or right sides because more parts are occluded in those scans.
4.4 3D Face Recognition on the Bosphorus Dataset: Recognition Under External Occlusion
- In this section the authors will use components I (occlusion detection and removal) and II (missing data restoration) in the algorithm.
- In each iteration, the authors match the current face scan with the template using ICP and remove those points on the scan that are more than a certain threshold away from the corresponding points on the template.
- The rank-1 recognition rate is reported in Fig. 20 for different approaches depending upon the type of occlusion.
- The rank-1 recognition rate is 78.63% when the authors remove the occluded parts and apply the recognition algorithm using the remaining parts, as described in Section 2.4.
- Even if the part added by restoration introduces some error, it still allows the authors to use the shapes of the partially observed curves.
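The occlusion-removal loop described above can be sketched as below. For brevity this illustrative version uses a brute-force nearest-neighbor search and omits the rigid ICP re-alignment between iterations; only the correspondence-and-pruning step is shown.

```python
import numpy as np

def remove_occlusion(scan, template, threshold, iters=3):
    """Prune scan points farther than `threshold` from their nearest
    template point. Without re-alignment the loop converges as soon as
    no further points are removed."""
    for _ in range(iters):
        diff = scan[:, None, :] - template[None, :, :]
        d = np.linalg.norm(diff, axis=2).min(axis=1)  # NN distance per point
        keep = d <= threshold
        if keep.all():
            break
        scan = scan[keep]
    return scan
```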
5 DISCUSSION
- In order to study the performance of the proposed approach in the presence of different challenges, the authors have presented experimental results using three well-known 3D face databases.
- The authors have obtained competitive results relative to the state of the art for 3D face recognition in the presence of large expressions, non-frontal views, and occlusions.
- Table 4 also reports the computational time of their approach and some state of the art methods on the FRGCv2 dataset.
- For each approach, the authors report the time needed for preprocessing and/or feature extraction in the first column.
- In the case of GavabDB and Bosphorus, the nose tip was manually annotated for non-frontal and occluded faces.
6 CONCLUSION
- The authors have also presented results on 3D face recognition designed to handle variations of facial expression, pose variations and occlusions between gallery and probe scans.
- This method has several properties that make it appropriate for 3D face recognition in non-cooperative scenarios.
- Lastly, in the presence of occlusion, the authors have proposed to remove the occluded parts then to recover only the missing data on the 3D scan using statistical shape models.
- That is, the authors have constructed a low dimensional shape subspace for each element of the indexed collection of curves, and then represent a curve (with missing data) as a linear combination of its basis elements.
Frequently Asked Questions (11)
Q2. How do Faltemier et al. avoid passing over deformable parts of faces?
To avoid passing over deformable parts of faces encompassing discriminative information, Faltemier et al. [9] use 38 face regions that densely cover the face, and fuse scores and decisions after performing ICP on each region.
Q3. What is the importance of the pre-processing step?
Since the raw data contains a number of imperfections, such as holes and spikes, and includes undesired parts, such as clothes, neck, ears, and hair, the data pre-processing step is very important and nontrivial.
Q4. What is the main limitation of this approach?
A strong limitation of this approach is the need for manual segmentation of a face into parts that can then be analyzed separately.
Q5. What is the reason for the differences in the shape of the nose curve?
For instance, differences in the shapes of radial curves in the upper-half of the face can be loosely attributed to the inter-class variability while those for curves passing through the lips and cheeks can largely be due to changes in expressions.
Q6. How many scans are available for each face?
The number of total face scans is 4652; at least 54 scans each are available for most of the subjects, while there are only 31 scans each for 34 of them.
Q7. What is the rank-1 recognition rate in the difficult scenario of neutral vs. expressions?
In the difficult scenario of neutral vs. expressions, the rank-1 recognition rate is 96.8%, which represents a high performance, while in the simpler case of neutral vs. neutral the rate is 99.2%.
Q8. What is the common framework for handling expression variability?
Another common framework, especially for handling expression variability, is based on matching only parts or regions rather than matching full faces.
Q9. What is the main limitation of these approaches?
The main limitation of these approaches, apart from the issues resulting from open mouths, is that they assume that surface distances between facial points are preserved within face classes.
Q10. What is the equivalence class of a curve?
In order to study shapes of curves, one should identify all rotations and re-parameterizations of a curve as an equivalence class.
Q11. What is the correct recognition rate for the faces in the two bottom rows?
The faces in the two bottom rows are examples of incorrectly recognized faces by their algorithm without restoration (as described earlier), but after the restoration step, they are correctly recognized.