Book Chapter

Model Based Analysis of Face Images for Facial Feature Extraction

29 Aug 2009 - Vol. 5702, pp. 99-106
TL;DR: A comprehensive approach to extract a common feature set from image sequences using the active appearance model (AAM), which produced very promising recognition rates across applications sharing the same set of features and classifiers.
Abstract: This paper describes a comprehensive approach to extract a common feature set from image sequences. We use simple features which are easily extracted from a 3D wireframe model and efficiently used for different applications on a benchmark database. Feature versatility is evaluated on facial expression recognition, face recognition and gender classification. We experiment with different combinations of the features and find reasonable results with a combined-features approach which contains structural, textural and temporal variations. The idea is to fit a model to human face images and extract shape and texture information. We parametrize this extracted information from the image sequences using the active appearance model (AAM) approach. We further compute temporal parameters using optical flow to capture local feature variations. Finally we combine these parameters to form a feature vector for all the images in our database. These features are then used with a binary decision tree (BDT) and a Bayesian network (BN) for classification. We evaluated our results on image sequences of the Cohn-Kanade Facial Expression Database (CKFED). The proposed system produced very promising recognition rates for our applications with the same set of features and classifiers. The system is also real-time capable and automatic.

Summary (2 min read)

1 Introduction

  • In the recent decade, model based image analysis of human faces has become a challenging field due to its capability to deal with real world scenarios.
  • These capabilities of the system suggest applying it in interactive scenarios like human machine interaction, security of personalized utilities like tokenless devices, and facial analysis for person behavior and person security.
  • Temporal features are extracted using optical flow.
  • The remainder of the paper is divided into four main sections.
  • This covers everything from fitting the model to a face image through feature vector formation.

3 Our approach

  • The approach covers model fitting, image warping and parameter extraction for shape, texture and temporal information.
  • Texture information is mapped from the example image to a reference shape, which is the mean shape of all the shapes available in the database.
  • Texture warping between the triangulations is performed using an affine transformation.
  • The authors use reduced descriptors by trading off between accuracy and run time performance.
  • This computer vision task comprises various phases, shown in Figure 1, for which it exploits model-based techniques that accurately localize facial features, seamlessly track them through image sequences, and finally infer facial features.

4 Determining High-Level Information

  • To initialize, the authors apply the algorithm of Viola et al. [20] to roughly detect the face position within the image.
  • To extract descriptive features, the model parameters are exploited.
  • Local motion of feature points is observed using optical flow.
  • The authors extract 85 structural features, 74 textural features and 12 temporal features to form a combined feature vector for each image.
  • The face feature vector consists of the shape, texture and temporal variations, which sufficiently define the global and local variations of the face.

5 Experimental Evaluations

  • For experimentation purposes, the authors benchmark their results on the Cohn-Kanade Facial Expression Database.
  • The database contains 488 short image sequences of 97 different persons performing the six universal facial expressions [12].
  • It provides researchers with a large dataset for experimentation and benchmarking purposes.
  • Furthermore, the image sequences are taken in a laboratory environment with predefined illumination conditions, a solid background and frontal face views.
  • To evaluate feature versatility, the authors use two different classifiers with the same feature set on three different applications: face recognition, facial expression recognition and gender classification.

6 Conclusions

  • The feature set is applied to three different applications: face recognition, facial expression recognition and gender classification, and produced reasonable results in all three cases on the CKFED.
  • The authors consider different classifiers to check the versatility of their extracted features.
  • The authors use two different classifiers with the same specifications, which demonstrates the simplicity of their approach; however, the results could be further optimized by trying other classifiers.
  • The database consists of frontal views with uniform illumination.
  • A further extension of this work is to enhance the feature set to include information about pose and lighting variations.


Model Based Analysis of Face Images for Facial
Feature Extraction
Zahid Riaz, Christoph Mayer, Michael Beetz, and Bernd Radig
Technische Universität München,
Boltzmannstr. 3, 85748 Garching, Germany
{riaz,mayerc,beetz,radig}@in.tum.de
http://www9.in.tum.de
Abstract. This paper describes a comprehensive approach to extract a common feature set from image sequences. We use simple features which are easily extracted from a 3D wireframe model and efficiently used for different applications on a benchmark database. Feature versatility is evaluated on facial expression recognition, face recognition and gender classification. We experiment with different combinations of the features and find reasonable results with a combined-features approach which contains structural, textural and temporal variations. The idea is to fit a model to human face images and extract shape and texture information. We parametrize this extracted information from the image sequences using the active appearance model (AAM) approach. We further compute temporal parameters using optical flow to capture local feature variations. Finally we combine these parameters to form a feature vector for all the images in our database. These features are then used with a binary decision tree (BDT) and a Bayesian network (BN) for classification. We evaluated our results on image sequences of the Cohn-Kanade Facial Expression Database (CKFED). The proposed system produced very promising recognition rates for our applications with the same set of features and classifiers. The system is also real-time capable and automatic.
Key words: Feature Extraction, Face Image Analysis, Face Recognition, Facial Expressions Recognition, Human Robot Interaction
1 Introduction
In the recent decade, model based image analysis of human faces has become a challenging field due to its capability to deal with real world scenarios. Further, it outperforms previous techniques, which were constrained to user intervention: either the user had to interact manually with the system or had to face the camera frontally. Currently available model based techniques try to deal with some of the remaining challenges, like developing state-of-the-art algorithms, improving efficiency, fully automated system development and versatility across different applications. In this paper we deal with some of these challenges. We focus on a feature extraction technique which is fully automatic and versatile enough for different applications like face recognition, facial expression recognition and gender classification. These capabilities of the system suggest applying it in interactive scenarios like human machine interaction, security of personalized utilities like tokenless devices, and facial analysis for person behavior and person security.
Model-based image interpretation techniques extract information about facial expression, person identity and gender from images of human faces via facial changes. Models take benefit of prior knowledge about the object shape and hence try to match themselves with the object in an image for which they are designed. Face models impose knowledge about human faces and reduce high dimensional image data to a small number of expressive model parameters. We integrate the three-dimensional Candide-3 face model [8], which has been specifically designed for observing facial feature variations defined by the facial action coding system (FACS) [13]. The model parameters, together with extracted texture and motion information, are utilized to train classifiers that determine person-specific information. A combination of different facial features is used by the classifiers to classify the six basic facial expressions, i.e. anger, fear, surprise, sadness, laugh and disgust, as well as facial identity and gender.

Our feature vector for each image consists of structural, textural and temporal variations of the faces in the image sequence. Shape and textural parameters define active appearance models (AAM) in partial 3D space, with shape parameters extracted from 3D landmarks and texture from the 2D image. Temporal features are extracted using optical flow. These extracted features are more informative than AAM parameters since we consider local motion patterns in the image sequences in the form of temporal parameters.
The remainder of this paper is divided into four main sections. In section 2, work related to our applications is discussed. In section 3 we discuss our approach in detail. In section 4 the extraction of higher level features from model based image interpretation is described; this covers everything from fitting the model to a face image through feature vector formation. Section 5 discusses the evaluation of our results on the database. Finally we conclude with some future directions.
2 Related work
We build on the three step approach that has been suggested by Pantic et al. [1] for facial expression recognition. However, the generality of this approach makes it applicable not only to facial expression estimation but also to person identification and gender classification at the same time. The first step aims at determining the position and shape of the face in the image by fitting a model. Descriptive features are extracted in the second step. In the third step a classifier is applied to the features to determine high level information from them. Several face models and fitting approaches have been presented in recent years. Cootes et al. [5] introduced modeling face shapes with Active Contours. Further enhancements included the idea of expanding shape models with texture information [6].
In contrast, three-dimensional shape models such as the Candide-3 face model consider the real-world face structure rather than the appearance in the image. Blanz et al. propose a face model that considers both the three-dimensional structure and its texture [7]. However, the model parameters that describe the current image content need to be determined in order to extract high-level information, a process known as model fitting. To fit a model to an image, Van Ginneken et al. learned local objective functions from annotated training images [18]. In that work, image features are obtained by approximating the pixel values in a region around a pixel of interest, and the learning algorithm used to map image features to objective values is a k-Nearest-Neighbor classifier (kNN) learned from the data. We use a similar methodology developed by Wimmer et al. [4], which combines a multitude of qualitatively different features [19], determines the most relevant features using machine learning and learns objective functions from annotated images [18]. To extract descriptive features from the image, Michel et al. [14] extracted the location of 22 feature points within the face and determined their motion between an image that shows the neutral state of the face and an image that represents a facial expression. The very similar approach of Cohn et al. [15] uses hierarchical optical flow to determine the motion of 30 feature points. A set of training data formed from the extracted features is utilized to train a classifier. For facial expressions, some approaches infer the expressions from rules stated by Ekman and Friesen [13]. This approach is applied by Kotsia et al. [16] to design Support Vector Machines (SVM) for classification. Michel et al. [14] train a Support Vector Machine (SVM) that determines the visible facial expression within the video sequences of the Cohn-Kanade Facial Expression Database by comparing the first frame, showing the neutral expression, to the last frame, showing the peak expression. Many researchers have also applied model based approaches to face recognition. Edwards et al. [2] use a weighted distance classifier, the Mahalanobis distance measure, on AAM parameters. They isolate the sources of variation by maximizing the inter-class variations using Linear Discriminant Analysis (LDA), a holistic approach also used for the Fisherfaces representation [3]. However, they do not discuss face recognition under facial expression. Riaz et al. [17] apply similar features to face recognition using Bayesian networks, although their results are limited to the face recognition application only. They used an expression invariant technique for face recognition, which is also used in 3D scenarios by Bronstein et al. [9], without 3D reconstruction of the faces, using geodesic distance. Park et al. [10] apply a 3D model for face recognition on videos from the CMU Face in Action (FIA) database. They reconstruct a 3D model by acquiring views from 2D model fitting to the images.

3 Our approach
In this section we explain in detail the approach adopted in this paper, including model fitting, image warping and parameter extraction for shape, texture and temporal information.

We use a wireframe 3D face model known as Candide-3 [8]. The model is fitted to the face image using the objective function approach [4]. After fitting the model to the example face image, we use the projections of the 3D landmarks in 2D for texture mapping. Texture information is mapped from the example image to a reference shape, which is the mean shape of all the shapes available in the database; the choice of the mean shape is, however, arbitrary. The image texture is extracted using planar subdivisions of the reference and the example shapes. We use Delaunay triangulations of the distribution of our model points. Texture warping between the triangulations is performed using an affine transformation, as sketched below. Principal Component Analysis (PCA) is used to obtain the texture and shape parameters of the example image. This approach is similar to extracting AAM parameters. In addition to the AAM parameters, temporal features of the facial changes are also calculated. Local motion of the feature points is observed using optical flow. We use reduced descriptors by trading off between accuracy and run time performance. These features are then used for classification. Our approach achieves real-time performance and provides robustness against facial expressions in real-world scenarios. This computer vision task comprises various phases, shown in Figure 1, for which it exploits model-based techniques that accurately localize facial features, seamlessly track them through image sequences, and finally infer facial features. We specifically adapt state-of-the-art techniques to each of these challenging phases.
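To make the warping step concrete, here is a minimal Python sketch of the piecewise affine warp, assuming 2D landmark arrays and a color input image; the function name and the use of OpenCV and SciPy are illustrative choices, not the authors' implementation:

```python
# Minimal sketch: map face texture onto the mean/reference shape,
# triangle by triangle, using Delaunay subdivision and affine warps.
import numpy as np
import cv2
from scipy.spatial import Delaunay

def warp_to_reference(image, src_pts, ref_pts):
    """image: BGR face image; src_pts/ref_pts: (N, 2) landmark arrays."""
    tri = Delaunay(ref_pts)                       # subdivision of the reference shape
    h = int(ref_pts[:, 1].max()) + 1
    w = int(ref_pts[:, 0].max()) + 1
    out = np.zeros((h, w, 3), dtype=image.dtype)
    for simplex in tri.simplices:
        src_tri = src_pts[simplex].astype(np.float32)
        dst_tri = ref_pts[simplex].astype(np.float32)
        # Affine transform between the corresponding triangles
        M = cv2.getAffineTransform(src_tri, dst_tri)
        warped = cv2.warpAffine(image, M, (w, h))
        # Keep only the pixels inside the destination triangle
        mask = np.zeros((h, w), dtype=np.uint8)
        cv2.fillConvexPoly(mask, dst_tri.astype(np.int32), 1)
        out[mask == 1] = warped[mask == 1]
    return out
```

A production version would warp only each triangle's bounding box rather than the full image per triangle, but the per-triangle affine mapping is the same.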
4 Determining High-Level Information
To initialize, we apply the algorithm of Viola et al. [20] to roughly detect the face position within the image. Then, model parameters are estimated by applying the approach of Wimmer et al. [4], because it is able to robustly determine model parameters in real-time.
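A rough initialization of this kind can be reproduced with OpenCV's stock Viola-Jones cascade; the sketch below stands in for the detector of Viola et al. [20] and is not the authors' exact configuration:

```python
# Hypothetical initialization step using OpenCV's bundled Haar cascade.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_face(gray_image):
    """gray_image: 8-bit grayscale frame. Returns (x, y, w, h) or None."""
    faces = cascade.detectMultiScale(gray_image, scaleFactor=1.1,
                                     minNeighbors=5, minSize=(60, 60))
    # Take the largest detection as the rough face position
    return max(faces, key=lambda r: r[2] * r[3]) if len(faces) else None
```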
To extract descriptive features, the model parameters are exploited. The model configuration represents information about various facial features, such as lips, eye brows or eyes, and therefore contributes to the extracted features. These structural features capture the person's face structure, which helps to determine person-specific information such as gender or identity. Furthermore, changes in these features indicate shape changes and therefore contribute to the recognition of facial expressions.
The shape x is parametrized using the mean shape x_m and the matrix of eigenvectors P_s to obtain the parameter vector b_s [11]:

    x = x_m + P_s b_s    (1)

Fig. 1. Our Approach: Sequential flow for feature extraction
The extracted texture is parametrized using PCA, with the mean texture g_m and the matrix of eigenvectors P_g giving the parameter vector b_g [11]. Figure 2 shows the shape model fitting and the texture extracted from a face image.

    g = g_m + P_g b_g    (2)
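Equations (1) and (2) are two instances of the same PCA parametrization. A compact sketch, assuming each row of `data` is one flattened, aligned shape or texture sample (the function names are ours, for illustration):

```python
# PCA parametrization behind equations (1) and (2).
import numpy as np

def fit_pca(data, n_components):
    mean = data.mean(axis=0)                    # x_m or g_m
    u, s, vt = np.linalg.svd(data - mean, full_matrices=False)
    P = vt[:n_components].T                     # eigenvector matrix P_s or P_g
    return mean, P

def to_params(sample, mean, P):
    return P.T @ (sample - mean)                # b = P^T (x - x_m)

def from_params(b, mean, P):
    return mean + P @ b                         # x = x_m + P b
```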
Further, temporal features of the facial changes are calculated to take movement over time into consideration. Local motion of feature points is observed using optical flow. We do not specify the location of these feature points manually but distribute them equally over the whole face region. The number of feature points is chosen so that the system remains capable of real-time performance, and therefore inherits a trade-off between accuracy and runtime performance. Figure 3 shows motion patterns for some of the images from the database.
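One plausible reading of this step in Python uses pyramidal Lucas-Kanade optical flow on an evenly spaced grid of points; the grid spacing and the face bounding box below are assumptions, not the authors' values:

```python
# Sketch: motion vectors of grid points tracked between consecutive frames.
import numpy as np
import cv2

def grid_flow(prev_gray, next_gray, face_box, step=12):
    """prev_gray/next_gray: 8-bit grayscale frames; face_box: (x, y, w, h)."""
    x0, y0, w, h = face_box
    xs, ys = np.meshgrid(np.arange(x0, x0 + w, step),
                         np.arange(y0, y0 + h, step))
    pts = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(np.float32)
    pts = pts.reshape(-1, 1, 2)
    new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray,
                                                  pts, None)
    # Displacement vectors of the successfully tracked points
    return (new_pts - pts).reshape(-1, 2)[status.ravel() == 1]
```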
We combine all extracted features into a single feature vector. Single image
information is considered by the structural and textural features whereas image
sequence information is considered by the temporal features. The overall feature
vector becomes:
    u = (b_s1, ..., b_sm, b_g1, ..., b_gn, b_t1, ..., b_tp)    (3)

where b_s, b_g and b_t are the shape, textural and temporal parameters, respectively.
We extract 85 structural features, 74 textural features and 12 temporal features to form a combined feature vector for each image. These features are then used with a binary decision tree (BDT) and a Bayesian network (BN) for the different classification tasks. The face feature vector consists of the shape, texture and temporal variations, which sufficiently define the global and local variations of the face.
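The assembly and evaluation of the combined feature vector might look as follows; the scikit-learn decision tree is a stand-in for the Weka BDT and BN classifiers used by the authors, and the 10-fold cross validation mirrors the evaluation protocol mentioned for the Weka experiments [21]:

```python
# Sketch: build the combined feature vector u and evaluate a classifier.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def make_feature_vector(b_s, b_g, b_t):
    # 85 structural + 74 textural + 12 temporal parameters -> one vector u
    return np.concatenate([b_s, b_g, b_t])

def evaluate(X, y):
    """X: one feature vector per image; y: expression/identity/gender labels."""
    clf = DecisionTreeClassifier()
    # 10-fold cross validation, as in the reported Weka setup [21]
    return cross_val_score(clf, X, y, cv=10).mean()
```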

Citations
Journal ArticleDOI
28 Apr 2014-Sensors
TL;DR: A software architecture that is able to detect, recognize, classify and generate facial expressions in real time using FACS, the de facto standard for facial expression recognition and synthesis is presented.
Abstract: This paper presents a multi-sensor humanoid robotic head for human robot interaction. The design of the robotic head, Muecas, is based on ongoing research on the mechanisms of perception and imitation of human expressions and emotions. These mechanisms allow direct interaction between the robot and its human companion through the different natural language modalities: speech, body language and facial expressions. The robotic head has 12 degrees of freedom, in a human-like configuration, including eyes, eyebrows, mouth and neck, and has been designed and built entirely by IADeX (Engineering, Automation and Design of Extremadura) and RoboLab. A detailed description of its kinematics is provided along with the design of the most complex controllers. Muecas can be directly controlled by FACS (Facial Action Coding System), the de facto standard for facial expression recognition and synthesis. This feature facilitates its use by third party platforms and encourages the development of imitation and of goal-based systems. Imitation systems learn from the user, while goal-based ones use planning techniques to drive the user towards a final desired state. To show the flexibility and reliability of the robotic head, the paper presents a software architecture that is able to detect, recognize, classify and generate facial expressions in real time using FACS. This system has been implemented using the robotics framework, RoboComp, which provides hardware-independent access to the sensors in the head. Finally, the paper presents experimental results showing the real-time functioning of the whole system, including recognition and imitation of human facial expressions.

51 citations


Cites methods from "Model Based Analysis of Face Images..."

  • ...Others similar works are evaluated, which use the Cohn–Kanade Facial Expression Database and the mesh model Candide-3, but with different classification systems: Bayesian network [11] and model tree [10]....


  • ...However, there are many alternatives to this type of classifier, such as support vector machine (SVM) [9], model tree [10], binary decision tree [11] and neural networks [12], among others....


Proceedings ArticleDOI
24 Mar 2012
TL;DR: This paper attempts to overcome the variations of facial expression and proposes a biological vision-based facial description, namely Perceived Facial Images (PFIs), applied to facial images for 2D face recognition.
Abstract: Face recognition is becoming a difficult process because of the generally similar shapes of faces and because of the numerous variations between images of the same face. A face recognition system aims at recognizing a face in a manner that is as independent as possible of these image variations. Such variations make face recognition, on the basis of appearance, a difficult task. This paper attempts to overcome the variations of facial expression and proposes a biological vision-based facial description, namely Perceived Facial Images (PFIs), applied to facial images for 2D face recognition. Based on the intermediate facial description, SIFT-based feature matching is then carried out to calculate similarity measures between a given probe face and the gallery ones. Because the proposed biological vision-based facial description generates a PFI for each quantized gradient orientation of facial images, we further propose a weighted sum rule based fusion scheme. The proposed approach was tested on three facial expression databases: the Cohn and Kanade Facial Expression Database, the Japanese Female Facial Expression (JAFFE) Database and the FEEDTUM Database. The experimental results demonstrate the effectiveness of the proposed method.

20 citations

Proceedings ArticleDOI
01 Nov 2011
TL;DR: The experimental results show that fusion of visible and thermal infrared features can improve the accuracy rate of negative expressions and reduce the discrepancy, and can improved the expression recognition performance.
Abstract: In this paper, we propose a spontaneous facial expression recognition method by using feature-level fusion of visible and thermal infrared facial images. Firstly, the appearance features of visible images and statistic parameters of thermal infrared difference images are extracted. Then, analysis of variance is adopted to select the optimal feature subsets from both visible and thermal ones. These selected features are combined as the input of a K-Nearest Neighbors classifier. We experimentally evaluate the effectiveness of the proposed method on USTC-NVIE database. The experimental results show that fusion of visible and thermal infrared features can improve the accuracy rate of negative expressions and reduce the discrepancy. Thus, it can improve the expression recognition performance.

18 citations


Cites background from "Model Based Analysis of Face Images..."

  • ...Among those, most researches focus on the representation of visual information for facial expression [2][3][4]....


Journal ArticleDOI
TL;DR: A biological vision-based facial description, namely perceived facial images, applied to extract features from human face images is proposed and a good architecture of neural network classifier can be obtained.
Abstract: This study presents a modified constructive training algorithm for multilayer perceptron (MLP) which is applied to face recognition problem. An incremental training procedure has been employed where the training patterns are learned incrementally. This algorithm starts with a small number of training patterns and a single hidden-layer using an initial number of neurons. During the training, the hidden neurons number is increased when the mean square error (MSE) threshold of the training data (TD) is not reduced to a predefined value. Input patterns are trained incrementally until all patterns of TD are learned. The aim of this algorithm is to determine the adequate initial number of hidden neurons, the suitable number of training patterns in the subsets of each class and the number of iterations during the training step as well as the MSE threshold value. The proposed algorithm is applied in the classification stage in face recognition system. For the feature extraction stage, this paper proposes to use a biological vision-based facial description, namely perceived facial images, applied to extract features from human face images. Gabor features and Zernike moment have been used in order to determine the best feature extractor. The proposed approach is tested on the Cohn-Kanade Facial Expression Database. Experimental results indicate that a good architecture of neural network classifier can be obtained. The effectiveness of the proposed method compared with the fixed MLP architecture has been proved.

16 citations

Journal ArticleDOI
TL;DR: This paper proposes a unique design approach, which uses reverse engineering techniques of three dimensional measurement and analysis, to visualize some critical facial motion data, including facial skin localized deformations, motion directions of facial features, and displacements of facial skin elements on a human face in different facial expressional states.
Abstract: The static and dynamic realistic effects of the appearance are essential but challenging targets in the development of human face robots. Human facial anatomy is the primary theoretical foundation for designing the facial expressional mechanism in most existent human face robots. Based on the popular study of facial action units, actuators are arranged to connect to certain control points underneath the facial skin in prearranged directions to mimic the facial muscles involved in generating facial expressions. Most facial robots fail to generate realistic facial expressions because there are significant differences in the method of generating expressions between the contracting muscles and inner tissues of human facial skin and the wire pulling of a single artificial facial skin. This paper proposes a unique design approach, which uses reverse engineering techniques of three dimensional measurement and analysis, to visualize some critical facial motion data, including facial skin localized deformations, motion directions of facial features, and displacements of facial skin elements on a human face in different facial expressional states. The effectiveness and robustness of the proposed approach have been verified in real design cases on face robots.

12 citations

References
Book
25 Oct 1999
TL;DR: This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining.
Abstract: Data Mining: Practical Machine Learning Tools and Techniques offers a thorough grounding in machine learning concepts as well as practical advice on applying machine learning tools and techniques in real-world data mining situations. This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining. Thorough updates reflect the technical changes and modernizations that have taken place in the field since the last edition, including new material on Data Transformations, Ensemble Learning, Massive Data Sets, Multi-instance Learning, plus a new version of the popular Weka machine learning software developed by the authors. Witten, Frank, and Hall include both tried-and-true techniques of today as well as methods at the leading edge of contemporary research. *Provides a thorough grounding in machine learning concepts as well as practical advice on applying the tools and techniques to your data mining projects *Offers concrete tips and techniques for performance improvement that work by transforming the input or output in machine learning methods *Includes downloadable Weka software toolkit, a collection of machine learning algorithms for data mining tasks-in an updated, interactive interface. Algorithms in toolkit cover: data pre-processing, classification, regression, clustering, association rules, visualization

20,196 citations


"Model Based Analysis of Face Images..." refers methods in this paper

  • ...The results are evaluated using classifiers from weka [21] with 10-fold cross validation....


Journal ArticleDOI
TL;DR: In this paper, a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates is described; implemented on a conventional desktop, detection proceeds at 15 frames per second.
Abstract: This paper describes a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates. There are three key contributions. The first is the introduction of a new image representation called the “Integral Image” which allows the features used by our detector to be computed very quickly. The second is a simple and efficient classifier which is built using the AdaBoost learning algorithm (Freund and Schapire, 1995) to select a small number of critical visual features from a very large set of potential features. The third contribution is a method for combining classifiers in a “cascade” which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions. A set of experiments in the domain of face detection is presented. The system yields face detection performance comparable to the best previous systems (Sung and Poggio, 1998; Rowley et al., 1998; Schneiderman and Kanade, 2000; Roth et al., 2000). Implemented on a conventional desktop, face detection proceeds at 15 frames per second.

13,037 citations

Journal ArticleDOI
TL;DR: A face recognition algorithm which is insensitive to large variation in lighting direction and facial expression is developed, based on Fisher's linear discriminant and produces well separated classes in a low-dimensional subspace, even under severe variations in lighting and facial expressions.
Abstract: We develop a face recognition algorithm which is insensitive to large variation in lighting direction and facial expression. Taking a pattern classification approach, we consider each pixel in an image as a coordinate in a high-dimensional space. We take advantage of the observation that the images of a particular face, under varying illumination but fixed pose, lie in a 3D linear subspace of the high dimensional image space-if the face is a Lambertian surface without shadowing. However, since faces are not truly Lambertian surfaces and do indeed produce self-shadowing, images will deviate from this linear subspace. Rather than explicitly modeling this deviation, we linearly project the image into a subspace in a manner which discounts those regions of the face with large deviation. Our projection method is based on Fisher's linear discriminant and produces well separated classes in a low-dimensional subspace, even under severe variation in lighting and facial expressions. The eigenface technique, another method based on linearly projecting the image space to a low dimensional subspace, has similar computational requirements. Yet, extensive experimental results demonstrate that the proposed "Fisherface" method has error rates that are lower than those of the eigenface technique for tests on the Harvard and Yale face databases.

11,674 citations

Proceedings ArticleDOI
07 Jul 2001
TL;DR: A new image representation called the “Integral Image” is introduced which allows the features used by the detector to be computed very quickly and a method for combining classifiers in a “cascade” which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions.
Abstract: This paper describes a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates. There are three key contributions. The first is the introduction of a new image representation called the "Integral Image" which allows the features used by our detector to be computed very quickly. The second is a simple and efficient classifier which is built using the AdaBoost learning algo- rithm (Freund and Schapire, 1995) to select a small number of critical visual features from a very large set of potential features. The third contribution is a method for combining classifiers in a "cascade" which allows back- ground regions of the image to be quickly discarded while spending more computation on promising face-like regions. A set of experiments in the domain of face detection is presented. The system yields face detection perfor- mance comparable to the best previous systems (Sung and Poggio, 1998; Rowley et al., 1998; Schneiderman and Kanade, 2000; Roth et al., 2000). Implemented on a conventional desktop, face detection proceeds at 15 frames per second.

10,592 citations

Journal ArticleDOI
Abstract: We describe a new method of matching statistical models of appearance to images. A set of model parameters control modes of shape and gray-level variation learned from a training set. We construct an efficient iterative matching algorithm by learning the relationship between perturbations in the model parameters and the induced image errors.

6,200 citations
