Book Chapter

Model Based Analysis of Face Images for Facial Feature Extraction

29 Aug 2009 - Vol. 5702, pp. 99-106
TL;DR: A comprehensive approach to extract a common feature set from image sequences using the active appearance model (AAM), which produced very promising recognition rates across applications sharing the same set of features and classifiers.
Abstract: This paper describes a comprehensive approach to extract a common feature set from image sequences. We use simple features which are easily extracted from a 3D wireframe model and efficiently used for different applications on a benchmark database. Feature versatility is evaluated on facial expression recognition, face recognition and gender classification. We experiment with different combinations of the features and find reasonable results with a combined-features approach which contains structural, textural and temporal variations. The idea is to fit a model to human face images and extract shape and texture information. We parametrize this extracted information from the image sequences using the active appearance model (AAM) approach. We further compute temporal parameters using optical flow to capture local feature variations. Finally we combine these parameters to form a feature vector for all the images in our database. These features are then used with a binary decision tree (BDT) and a Bayesian network (BN) for classification. We evaluated our results on image sequences of the Cohn-Kanade Facial Expression Database (CKFED). The proposed system produced very promising recognition rates for our applications with the same set of features and classifiers. The system is also real-time capable and automatic.

Summary (2 min read)

1 Introduction

  • In the recent decade, model based image analysis of human faces has become a challenging field due to its capability to deal with real world scenarios.
  • These capabilities of the system suggest applying it in interactive scenarios like human machine interaction, security of personalized utilities like tokenless devices, and facial analysis for person behavior and person security.
  • Temporal features are extracted using optical flow.
  • The remainder of the paper is divided into four main sections.
  • This covers everything from fitting the model to a face image through feature vector formation.

3 Our approach

  • The approach covers model fitting, image warping and parameter extraction for shape, texture and temporal information.
  • Texture information is mapped from the example image to a reference shape, which is the mean shape of all the shapes available in the database.
  • Texture warping between the triangulations is performed using an affine transformation.
  • The authors use reduced descriptors by trading off between accuracy and run time performance.
  • This computer vision task comprises various phases, shown in Figure 1, for which it exploits model-based techniques that accurately localize facial features, seamlessly track them through image sequences, and finally infer facial features.

4 Determining High-Level Information

  • To initialize, the authors apply the algorithm of Viola et al. [20] to roughly detect the face position within the image.
  • To extract descriptive features, the model parameters are exploited.
  • Local motion of feature points is observed using optical flow.
  • The authors extract 85 structural features, 74 textural features and 12 temporal features to form a combined feature vector for each image.
  • The face feature vector consists of the shape, texture and temporal variations, which sufficiently define the global and local variations of the face.

5 Experimental Evaluations

  • For experimentation purposes, the authors benchmark their results on the Cohn-Kanade Facial Expression Database.
  • The database contains 488 short image sequences of 97 different persons performing the six universal facial expressions [12].
  • It provides researchers with a large dataset for experimentation and benchmarking purposes.
  • Furthermore, the image sequences are taken in a laboratory environment with predefined illumination conditions, a solid background and frontal face views.
  • To evaluate feature versatility, the authors use two different classifiers with the same feature set on three different applications: face recognition, facial expression recognition and gender classification.

6 Conclusions

  • The feature set is applied to three different applications: face recognition, facial expression recognition and gender classification, and produced reasonable results in all three cases on the CKFED.
  • The authors consider different classifiers to check the versatility of their extracted features.
  • The authors use two different classifiers with the same specifications, which demonstrates the simplicity of their approach; however, the results could be further optimized by trying other classifiers.
  • The database consists of frontal views with uniform illumination.
  • A further extension of this work is to enhance the feature set to include information about pose and lighting variations.


Model Based Analysis of Face Images for Facial
Feature Extraction
Zahid Riaz, Christoph Mayer, Michael Beetz, and Bernd Radig
Technische Universität München,
Boltzmannstr. 3, 85748 Garching, Germany
{riaz,mayerc,beetz,radig}@in.tum.de
http://www9.in.tum.de
Abstract. This paper describes a comprehensive approach to extract a common feature set from image sequences. We use simple features which are easily extracted from a 3D wireframe model and efficiently used for different applications on a benchmark database. Feature versatility is evaluated on facial expression recognition, face recognition and gender classification. We experiment with different combinations of the features and find reasonable results with a combined-features approach which contains structural, textural and temporal variations. The idea is to fit a model to human face images and extract shape and texture information. We parametrize this extracted information from the image sequences using the active appearance model (AAM) approach. We further compute temporal parameters using optical flow to capture local feature variations. Finally we combine these parameters to form a feature vector for all the images in our database. These features are then used with a binary decision tree (BDT) and a Bayesian network (BN) for classification. We evaluated our results on image sequences of the Cohn-Kanade Facial Expression Database (CKFED). The proposed system produced very promising recognition rates for our applications with the same set of features and classifiers. The system is also real-time capable and automatic.
Key words: Feature Extraction, Face Image Analysis, Face Recognition, Facial Expressions Recognition, Human Robot Interaction
1 Introduction
In the recent decade, model based image analysis of human faces has become a challenging field due to its capability to deal with real world scenarios. Further, it outperforms previous techniques, which were constrained to user intervention: either the user had to interact manually with the system or had to face the camera frontally. Currently available model based techniques try to deal with some of the remaining challenges, like developing state-of-the-art algorithms, improving efficiency, fully automated system development and versatility across different applications. In this paper we deal with some of these challenges. We focus on a feature extraction technique which is fully automatic and versatile enough for different applications like face recognition, facial expression recognition and gender classification. These capabilities of the system suggest applying it in interactive scenarios like human machine interaction, security of personalized utilities like tokenless devices, and facial analysis for person behavior and person security.
Model-based image interpretation techniques extract information about facial expression, person identity and gender from images of human faces via facial changes. Models take benefit of prior knowledge about the object shape and hence try to match themselves with the object in an image for which they are designed. Face models impose knowledge about human faces and reduce high dimensional image data to a small number of expressive model parameters. We integrate the three-dimensional Candide-3 face model [8], which has been specifically designed for observing facial feature variations defined by the facial action coding system (FACS) [13]. The model parameters, together with extracted texture and motion information, are utilized to train classifiers that determine person-specific information. A combination of different facial features is used by the classifiers to classify the six basic facial expressions, i.e. anger, fear, surprise, sadness, laugh and disgust, as well as facial identity and gender.

Our feature vector for each image consists of structural, textural and temporal variations of the faces in the image sequence. Shape and textural parameters define active appearance models (AAM) in partial 3D space, with shape parameters extracted from 3D landmarks and texture from the 2D image. Temporal features are extracted using optical flow. These extracted features are more informative than AAM parameters since we consider local motion patterns in the image sequences in the form of temporal parameters.
The remainder of this paper is divided into four main sections. In section 2, work related to our applications is discussed. In section 3 we discuss our approach in detail. In section 4 the extraction of higher level features from model based image interpretation is described; this covers everything from fitting the model to a face image through feature vector formation. Section 5 discusses the evaluation of our results on the database. Finally we conclude with some future directions.
2 Related work
We build on the three step approach that has been suggested by Pantic et al. [1] for facial expression recognition. However, the generality of this approach makes it applicable not only to facial expression estimation but also to person identification and gender classification at the same time. The first step aims at determining the position and shape of the face in the image by fitting a model. Descriptive features are extracted in the second step. In the third step a classifier is applied to the features to determine high level information from them. Several face models and fitting approaches have been presented in recent years. Cootes et al. [5] introduced modeling face shapes with Active Contours. Further enhancements included the idea of expanding shape models with texture information [6].
In contrast, three-dimensional shape models such as the Candide-3 face model consider the real-world face structure rather than the appearance in the image. Blanz et al. propose a face model that considers both the three-dimensional structure and its texture [7]. However, the model parameters that describe the current image content need to be determined in order to extract high-level information, a process known as model fitting. To fit a model to an image, Van Ginneken et al. learned local objective functions from annotated training images [18]. In that work, image features are obtained by approximating the pixel values in a region around a pixel of interest, and the learning algorithm used to map image features to objective values is a k-Nearest-Neighbor classifier (kNN) learned from the data. We use a similar methodology developed by Wimmer et al. [4], which combines a multitude of qualitatively different features [19], determines the most relevant features using machine learning and learns objective functions from annotated images [18]. To extract descriptive features from the image, Michel et al. [14] extracted the location of 22 feature points within the face and determined their motion between an image that shows the neutral state of the face and an image that represents a facial expression. The very similar approach of Cohn et al. [15] uses hierarchical optical flow to determine the motion of 30 feature points. A set of training data formed from the extracted features is utilized to train a classifier. For facial expressions, some approaches infer the expressions from rules stated by Ekman and Friesen [13]. This approach is applied by Kotsia et al. [16] to design Support Vector Machines (SVM) for classification. Michel et al. [14] train a Support Vector Machine (SVM) that determines the visible facial expression within the video sequences of the Cohn-Kanade Facial Expression Database by comparing the first frame, showing the neutral expression, to the last frame, showing the peak expression. Many researchers have also applied model based approaches to face recognition. Edwards et al. [2] use a weighted distance classifier, the Mahalanobis distance measure, on AAM parameters. They isolate the sources of variation by maximizing the inter-class variations using Linear Discriminant Analysis (LDA), a holistic approach also used for the Fisherfaces representation [3]. However, they do not discuss face recognition under facial expression. Riaz et al. [17] apply similar features to face recognition using Bayesian networks, although their results are limited to the face recognition application only. They used an expression invariant technique for face recognition, which is also used in 3D scenarios by Bronstein et al. [9], without 3D reconstruction of the faces, using geodesic distance. Park et al. [10] apply a 3D model for face recognition on videos from the CMU Face in Action (FIA) database. They reconstruct a 3D model by acquiring views from 2D model fitting to the images.

3 Our approach
In this section we explain in detail the approach adopted in this paper, including model fitting, image warping and parameter extraction for shape, texture and temporal information.

We use a wireframe 3D face model known as Candide-3 [8]. The model is fitted to the face image using the objective function approach [4]. After fitting the model to the example face image, we use the projections of the 3D landmarks in 2D for texture mapping. Texture information is mapped from the example image to a reference shape, which is the mean shape of all the shapes available in the database; the choice of the mean shape is, however, arbitrary. The image texture is extracted using planar subdivisions of the reference and the example shapes. We use Delaunay triangulations of the distribution of our model points. Texture warping between the triangulations is performed using an affine transformation, as sketched below. Principal Component Analysis (PCA) is used to obtain the texture and shape parameters of the example image. This approach is similar to extracting AAM parameters. In addition to the AAM parameters, temporal features of the facial changes are also calculated. Local motion of the feature points is observed using optical flow. We use reduced descriptors by trading off between accuracy and run time performance. These features are then used for classification. Our approach achieves real-time performance and provides robustness against facial expressions in real-world scenarios. This computer vision task comprises various phases, shown in Figure 1, for which it exploits model-based techniques that accurately localize facial features, seamlessly track them through image sequences, and finally infer facial features. We specifically adapt state-of-the-art techniques to each of these challenging phases.
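To make the warping step concrete, here is a minimal Python sketch of the piecewise affine warp, assuming 2D landmark arrays and a color input image; the function name and the use of OpenCV and SciPy are illustrative choices, not the authors' implementation:

```python
# Minimal sketch: map face texture onto the mean/reference shape,
# triangle by triangle, using Delaunay subdivision and affine warps.
import numpy as np
import cv2
from scipy.spatial import Delaunay

def warp_to_reference(image, src_pts, ref_pts):
    """image: BGR face image; src_pts/ref_pts: (N, 2) landmark arrays."""
    tri = Delaunay(ref_pts)                       # subdivision of the reference shape
    h = int(ref_pts[:, 1].max()) + 1
    w = int(ref_pts[:, 0].max()) + 1
    out = np.zeros((h, w, 3), dtype=image.dtype)
    for simplex in tri.simplices:
        src_tri = src_pts[simplex].astype(np.float32)
        dst_tri = ref_pts[simplex].astype(np.float32)
        # Affine transform between the corresponding triangles
        M = cv2.getAffineTransform(src_tri, dst_tri)
        warped = cv2.warpAffine(image, M, (w, h))
        # Keep only the pixels inside the destination triangle
        mask = np.zeros((h, w), dtype=np.uint8)
        cv2.fillConvexPoly(mask, dst_tri.astype(np.int32), 1)
        out[mask == 1] = warped[mask == 1]
    return out
```

A production version would warp only each triangle's bounding box rather than the full image per triangle, but the per-triangle affine mapping is the same.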
4 Determining High-Level Information
To initialize, we apply the algorithm of Viola et al. [20] to roughly detect the face position within the image. Then, model parameters are estimated by applying the approach of Wimmer et al. [4], because it is able to robustly determine model parameters in real-time.
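A rough initialization of this kind can be reproduced with OpenCV's stock Viola-Jones cascade; the sketch below stands in for the detector of Viola et al. [20] and is not the authors' exact configuration:

```python
# Hypothetical initialization step using OpenCV's bundled Haar cascade.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_face(gray_image):
    """gray_image: 8-bit grayscale frame. Returns (x, y, w, h) or None."""
    faces = cascade.detectMultiScale(gray_image, scaleFactor=1.1,
                                     minNeighbors=5, minSize=(60, 60))
    # Take the largest detection as the rough face position
    return max(faces, key=lambda r: r[2] * r[3]) if len(faces) else None
```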
To extract descriptive features, the model parameters are exploited. The model configuration represents information about various facial features, such as lips, eye brows or eyes, and therefore contributes to the extracted features. These structural features capture the person's face structure, which helps to determine person-specific information such as gender or identity. Furthermore, changes in these features indicate shape changes and therefore contribute to the recognition of facial expressions.
The shape x is parametrized using the mean shape x_m and the matrix of eigenvectors P_s to obtain the parameter vector b_s [11]:

    x = x_m + P_s b_s    (1)

Fig. 1. Our Approach: Sequential flow for feature extraction
The extracted texture is parametrized using PCA, with the mean texture g_m and the matrix of eigenvectors P_g giving the parameter vector b_g [11]. Figure 2 shows the shape model fitting and the texture extracted from a face image.

    g = g_m + P_g b_g    (2)
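Equations (1) and (2) are two instances of the same PCA parametrization. A compact sketch, assuming each row of `data` is one flattened, aligned shape or texture sample (the function names are ours, for illustration):

```python
# PCA parametrization behind equations (1) and (2).
import numpy as np

def fit_pca(data, n_components):
    mean = data.mean(axis=0)                    # x_m or g_m
    u, s, vt = np.linalg.svd(data - mean, full_matrices=False)
    P = vt[:n_components].T                     # eigenvector matrix P_s or P_g
    return mean, P

def to_params(sample, mean, P):
    return P.T @ (sample - mean)                # b = P^T (x - x_m)

def from_params(b, mean, P):
    return mean + P @ b                         # x = x_m + P b
```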
Further, temporal features of the facial changes are calculated to take movement over time into consideration. Local motion of feature points is observed using optical flow. We do not specify the location of these feature points manually but distribute them equally over the whole face region. The number of feature points is chosen so that the system remains capable of real-time performance, and therefore inherits a trade-off between accuracy and runtime performance. Figure 3 shows motion patterns for some of the images from the database.
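One plausible reading of this step in Python uses pyramidal Lucas-Kanade optical flow on an evenly spaced grid of points; the grid spacing and the face bounding box below are assumptions, not the authors' values:

```python
# Sketch: motion vectors of grid points tracked between consecutive frames.
import numpy as np
import cv2

def grid_flow(prev_gray, next_gray, face_box, step=12):
    """prev_gray/next_gray: 8-bit grayscale frames; face_box: (x, y, w, h)."""
    x0, y0, w, h = face_box
    xs, ys = np.meshgrid(np.arange(x0, x0 + w, step),
                         np.arange(y0, y0 + h, step))
    pts = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(np.float32)
    pts = pts.reshape(-1, 1, 2)
    new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray,
                                                  pts, None)
    # Displacement vectors of the successfully tracked points
    return (new_pts - pts).reshape(-1, 2)[status.ravel() == 1]
```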
We combine all extracted features into a single feature vector. Single image
information is considered by the structural and textural features whereas image
sequence information is considered by the temporal features. The overall feature
vector becomes:
    u = (b_s1, ..., b_sm, b_g1, ..., b_gn, b_t1, ..., b_tp)    (3)

where b_s, b_g and b_t are the shape, textural and temporal parameters, respectively.
We extract 85 structural features, 74 textural features and 12 temporal features to form a combined feature vector for each image. These features are then used with a binary decision tree (BDT) and a Bayesian network (BN) for the different classification tasks. The face feature vector consists of the shape, texture and temporal variations, which sufficiently define the global and local variations of the face.
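The assembly and evaluation of the combined feature vector might look as follows; the scikit-learn decision tree is a stand-in for the Weka BDT and BN classifiers used by the authors, and the 10-fold cross validation mirrors the evaluation protocol mentioned for the Weka experiments [21]:

```python
# Sketch: build the combined feature vector u and evaluate a classifier.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def make_feature_vector(b_s, b_g, b_t):
    # 85 structural + 74 textural + 12 temporal parameters -> one vector u
    return np.concatenate([b_s, b_g, b_t])

def evaluate(X, y):
    """X: one feature vector per image; y: expression/identity/gender labels."""
    clf = DecisionTreeClassifier()
    # 10-fold cross validation, as in the reported Weka setup [21]
    return cross_val_score(clf, X, y, cv=10).mean()
```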

Citations
Journal ArticleDOI
28 Apr 2014-Sensors
TL;DR: A software architecture that is able to detect, recognize, classify and generate facial expressions in real time using FACS, the de facto standard for facial expression recognition and synthesis is presented.
Abstract: This paper presents a multi-sensor humanoid robotic head for human robot interaction. The design of the robotic head, Muecas, is based on ongoing research on the mechanisms of perception and imitation of human expressions and emotions. These mechanisms allow direct interaction between the robot and its human companion through the different natural language modalities: speech, body language and facial expressions. The robotic head has 12 degrees of freedom, in a human-like configuration, including eyes, eyebrows, mouth and neck, and has been designed and built entirely by IADeX (Engineering, Automation and Design of Extremadura) and RoboLab. A detailed description of its kinematics is provided along with the design of the most complex controllers. Muecas can be directly controlled by FACS (Facial Action Coding System), the de facto standard for facial expression recognition and synthesis. This feature facilitates its use by third party platforms and encourages the development of imitation and of goal-based systems. Imitation systems learn from the user, while goal-based ones use planning techniques to drive the user towards a final desired state. To show the flexibility and reliability of the robotic head, the paper presents a software architecture that is able to detect, recognize, classify and generate facial expressions in real time using FACS. This system has been implemented using the robotics framework, RoboComp, which provides hardware-independent access to the sensors in the head. Finally, the paper presents experimental results showing the real-time functioning of the whole system, including recognition and imitation of human facial expressions.

51 citations


Cites methods from "Model Based Analysis of Face Images..."

  • ...Others similar works are evaluated, which use the Cohn–Kanade Facial Expression Database and the mesh model Candide-3, but with different classification systems: Bayesian network [11] and model tree [10]....


  • ...However, there are many alternatives to this type of classifier, such as support vector machine (SVM) [9], model tree [10], binary decision tree [11] and neural networks [12], among others....


Proceedings ArticleDOI
24 Mar 2012
TL;DR: This paper attempts to overcome the variations of facial expression and proposes a biological vision-based facial description, namely Perceived Facial Images (PFIs), applied to facial images for 2D face recognition.
Abstract: Face recognition is becoming a difficult process because of the generally similar shapes of faces and because of the numerous variations between images of the same face. A face recognition system aims at recognizing a face in a manner that is as independent as possible of these image variations. Such variations make face recognition, on the basis of appearance, a difficult task. This paper attempts to overcome the variations of facial expression and proposes a biological vision-based facial description, namely Perceived Facial Images (PFIs), applied to facial images for 2D face recognition. Based on the intermediate facial description, SIFT-based feature matching is then carried out to calculate similarity measures between a given probe face and the gallery ones. Because the proposed biological vision-based facial description generates a PFI for each quantized gradient orientation of facial images, we further propose a weighted sum rule based fusion scheme. The proposed approach was tested on three facial expression databases: the Cohn and Kanade Facial Expression Database, the Japanese Female Facial Expression (JAFFE) Database and the FEEDTUM Database. The experimental results demonstrate the effectiveness of the proposed method.

20 citations

Proceedings ArticleDOI
01 Nov 2011
TL;DR: The experimental results show that fusion of visible and thermal infrared features can improve the accuracy rate of negative expressions and reduce the discrepancy, and can improved the expression recognition performance.
Abstract: In this paper, we propose a spontaneous facial expression recognition method by using feature-level fusion of visible and thermal infrared facial images. Firstly, the appearance features of visible images and statistic parameters of thermal infrared difference images are extracted. Then, analysis of variance is adopted to select the optimal feature subsets from both visible and thermal ones. These selected features are combined as the input of a K-Nearest Neighbors classifier. We experimentally evaluate the effectiveness of the proposed method on USTC-NVIE database. The experimental results show that fusion of visible and thermal infrared features can improve the accuracy rate of negative expressions and reduce the discrepancy. Thus, it can improve the expression recognition performance.

18 citations


Cites background from "Model Based Analysis of Face Images..."

  • ...Among those, most researches focus on the representation of visual information for facial expression [2][3][4]....


Journal ArticleDOI
TL;DR: A biological vision-based facial description, namely perceived facial images, applied to extract features from human face images is proposed and a good architecture of neural network classifier can be obtained.
Abstract: This study presents a modified constructive training algorithm for multilayer perceptron (MLP) which is applied to face recognition problem. An incremental training procedure has been employed where the training patterns are learned incrementally. This algorithm starts with a small number of training patterns and a single hidden-layer using an initial number of neurons. During the training, the hidden neurons number is increased when the mean square error (MSE) threshold of the training data (TD) is not reduced to a predefined value. Input patterns are trained incrementally until all patterns of TD are learned. The aim of this algorithm is to determine the adequate initial number of hidden neurons, the suitable number of training patterns in the subsets of each class and the number of iterations during the training step as well as the MSE threshold value. The proposed algorithm is applied in the classification stage in face recognition system. For the feature extraction stage, this paper proposes to use a biological vision-based facial description, namely perceived facial images, applied to extract features from human face images. Gabor features and Zernike moment have been used in order to determine the best feature extractor. The proposed approach is tested on the Cohn-Kanade Facial Expression Database. Experimental results indicate that a good architecture of neural network classifier can be obtained. The effectiveness of the proposed method compared with the fixed MLP architecture has been proved.

16 citations

Journal ArticleDOI
TL;DR: This paper proposes a unique design approach, which uses reverse engineering techniques of three dimensional measurement and analysis, to visualize some critical facial motion data, including facial skin localized deformations, motion directions of facial features, and displacements of facial skin elements on a human face in different facial expressional states.
Abstract: The static and dynamic realistic effects of the appearance are essential but challenging targets in the development of human face robots. Human facial anatomy is the primary theoretical foundation for designing the facial expressional mechanism in most existent human face robots. Based on the popular study of facial action units, actuators are arranged to connect to certain control points underneath the facial skin in prearranged directions to mimic the facial muscles involved in generating facial expressions. Most facial robots fail to generate realistic facial expressions because there are significant differences in the method of generating expressions between the contracting muscles and inner tissues of human facial skin and the wire pulling of a single artificial facial skin. This paper proposes a unique design approach, which uses reverse engineering techniques of three dimensional measurement and analysis, to visualize some critical facial motion data, including facial skin localized deformations, motion directions of facial features, and displacements of facial skin elements on a human face in different facial expressional states. The effectiveness and robustness of the proposed approach have been verified in real design cases on face robots.

12 citations

References
Book
25 Oct 1999
TL;DR: This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining.
Abstract: Data Mining: Practical Machine Learning Tools and Techniques offers a thorough grounding in machine learning concepts as well as practical advice on applying machine learning tools and techniques in real-world data mining situations. This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining. Thorough updates reflect the technical changes and modernizations that have taken place in the field since the last edition, including new material on Data Transformations, Ensemble Learning, Massive Data Sets, Multi-instance Learning, plus a new version of the popular Weka machine learning software developed by the authors. Witten, Frank, and Hall include both tried-and-true techniques of today as well as methods at the leading edge of contemporary research. *Provides a thorough grounding in machine learning concepts as well as practical advice on applying the tools and techniques to your data mining projects *Offers concrete tips and techniques for performance improvement that work by transforming the input or output in machine learning methods *Includes downloadable Weka software toolkit, a collection of machine learning algorithms for data mining tasks-in an updated, interactive interface. Algorithms in toolkit cover: data pre-processing, classification, regression, clustering, association rules, visualization

20,196 citations


"Model Based Analysis of Face Images..." refers methods in this paper

  • ...The results are evaluated using classifiers from weka [21] with 10-fold cross validation....


Journal ArticleDOI
TL;DR: In this paper, a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates is described; implemented on a conventional desktop, detection proceeds at 15 frames per second.
Abstract: This paper describes a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates. There are three key contributions. The first is the introduction of a new image representation called the “Integral Image” which allows the features used by our detector to be computed very quickly. The second is a simple and efficient classifier which is built using the AdaBoost learning algorithm (Freund and Schapire, 1995) to select a small number of critical visual features from a very large set of potential features. The third contribution is a method for combining classifiers in a “cascade” which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions. A set of experiments in the domain of face detection is presented. The system yields face detection performance comparable to the best previous systems (Sung and Poggio, 1998; Rowley et al., 1998; Schneiderman and Kanade, 2000; Roth et al., 2000). Implemented on a conventional desktop, face detection proceeds at 15 frames per second.

13,037 citations

Journal ArticleDOI
TL;DR: A face recognition algorithm which is insensitive to large variation in lighting direction and facial expression is developed, based on Fisher's linear discriminant and produces well separated classes in a low-dimensional subspace, even under severe variations in lighting and facial expressions.
Abstract: We develop a face recognition algorithm which is insensitive to large variation in lighting direction and facial expression. Taking a pattern classification approach, we consider each pixel in an image as a coordinate in a high-dimensional space. We take advantage of the observation that the images of a particular face, under varying illumination but fixed pose, lie in a 3D linear subspace of the high dimensional image space-if the face is a Lambertian surface without shadowing. However, since faces are not truly Lambertian surfaces and do indeed produce self-shadowing, images will deviate from this linear subspace. Rather than explicitly modeling this deviation, we linearly project the image into a subspace in a manner which discounts those regions of the face with large deviation. Our projection method is based on Fisher's linear discriminant and produces well separated classes in a low-dimensional subspace, even under severe variation in lighting and facial expressions. The eigenface technique, another method based on linearly projecting the image space to a low dimensional subspace, has similar computational requirements. Yet, extensive experimental results demonstrate that the proposed "Fisherface" method has error rates that are lower than those of the eigenface technique for tests on the Harvard and Yale face databases.

11,674 citations

Proceedings ArticleDOI
07 Jul 2001
TL;DR: A new image representation called the “Integral Image” is introduced which allows the features used by the detector to be computed very quickly and a method for combining classifiers in a “cascade” which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions.
Abstract: This paper describes a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates. There are three key contributions. The first is the introduction of a new image representation called the "Integral Image" which allows the features used by our detector to be computed very quickly. The second is a simple and efficient classifier which is built using the AdaBoost learning algo- rithm (Freund and Schapire, 1995) to select a small number of critical visual features from a very large set of potential features. The third contribution is a method for combining classifiers in a "cascade" which allows back- ground regions of the image to be quickly discarded while spending more computation on promising face-like regions. A set of experiments in the domain of face detection is presented. The system yields face detection perfor- mance comparable to the best previous systems (Sung and Poggio, 1998; Rowley et al., 1998; Schneiderman and Kanade, 2000; Roth et al., 2000). Implemented on a conventional desktop, face detection proceeds at 15 frames per second.

10,592 citations

Journal ArticleDOI
Abstract: We describe a new method of matching statistical models of appearance to images. A set of model parameters control modes of shape and gray-level variation learned from a training set. We construct an efficient iterative matching algorithm by learning the relationship between perturbations in the model parameters and the induced image errors.

6,200 citations
