
Proceedings ArticleDOI

3D face recognition by projection-based methods

02 Feb 2006-Vol. 6072, pp 194-204

TL;DR: The feature extraction techniques are applied to three different representations of registered faces, namely, 3D point clouds, 2D depth images and 3D voxel, and the resulting feature vectors are matched using Linear Discriminant Analysis.

Abstract: In this paper, we investigate recognition performances of various projection-based features applied on registered 3D scans of faces. Some features are data driven, such as ICA-based features or NNMF-based features. Other features are obtained using DFT or DCT-based schemes. We apply the feature extraction techniques to three different representations of registered faces, namely, 3D point clouds, 2D depth images and 3D voxel representations. We consider both global and local features. Global features are extracted from the whole face data, whereas local features are computed over the blocks partitioned from 2D depth images. The block-based local features are fused both at feature level and at decision level. The resulting feature vectors are matched using Linear Discriminant Analysis. Experiments using different combinations of representation types and feature vectors are conducted on the 3D-RMA dataset.

Summary (3 min read)

Introduction

  • In intensity images, faces acquired from the same person show high variability due to lighting conditions.
  • Section 3 describes the projection-based features and their extraction from different representation types.

2. REPRESENTATION TYPES OF FACE DATA

  • The authors have compared three different representation schemes and extracted the features from these representations.
  • These representation types are 3D point cloud, 2D depth image and 3D voxel representation.
  • All these representations are derived from registered and cropped face data.
  • The faces are registered using the ICP algorithm described by Akarun et al.

2.1. 3D point cloud

  • The 3D point cloud representation is the set of 3D coordinates {x, y, z} of the range data, obtained after registration.
  • The authors have all the correspondences, defined at the registration process.
  • Thus the authors can treat the ordered set of the coordinates as the signal describing the face.
  • Another way of arranging the set of coordinates is to form an Nx3 matrix, where each dimension is placed into one of the columns.

2.2. 2D depth image

  • 2D depth image is a commonly used representation type for face recognition.
  • The point cloud is placed onto a regular X-Y grid, and the Z coordinates are mapped onto this grid to form the depth image I(x, y).
  • This representation type is similar to intensity images in structure; therefore many techniques applied to intensity images can also be applied to I(x, y).
  • The authors have tested the following descriptors, which were previously applied to 2D intensity images, with the depth images: DFT, DCT, block-based versions of DFT and DCT, Independent Component Analysis (ICA) and Nonnegative Matrix Factorization (NNMF).

2.3. 3D voxel representation

  • To obtain the voxel function V_d(x, y, z), the authors place the point cloud in an NxNxN voxel grid.
  • The center of this voxel grid should coincide with the center of mass of the point cloud.
  • Then the authors define a binary function V(x, y, z) on the voxel grid.
  • If, in a particular voxel at location (x, y, z), there does not exist any point from the face, then V(x, y, z) at that voxel is set to zero.
  • By using the distance transform the authors distribute the shape information of the surface throughout the 3D space and obtain a richer representation.

3.1. Global DFT/DCT

  • The authors could have concatenated the X, Y and Z coordinates and computed the one-dimensional DFT; however, they would then lose the inherent relation within the coordinates of a point in the face.
  • One should note that most of the energy is concentrated in the band-pass region due to the zigzag scan of the face, as can be observed from the plots of the coordinates in Figure 1.
  • In order to obtain global DFT-based features from the depth image, the authors apply 2D-DFT to the function ),( yxI .
  • The authors extract the first KxK coefficients of this matrix and obtain a feature vector of size 2K² − 1 by concatenating the real and imaginary parts.

3.2. Block-based DFT/DCT

  • In addition to the global DFT/DCT-based techniques, the authors also extract local features, based on the calculation of DFT coefficients on blocks.
  • The authors perform fusion at decision level by using the sum rule.
  • The depth image of an input face to be recognized is partitioned into blocks and each block is matched with the corresponding blocks of the depth images in the database.
  • From this comparison, each face in the database gets a rank.
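The sum-rule fusion described above can be sketched as follows; summing each block's dissimilarity scores against the gallery (rather than the per-block ranks) is an illustrative simplification:

```python
import numpy as np

def fuse_by_sum_rule(block_scores):
    """Decision-level fusion by the sum rule.

    block_scores: (n_blocks, n_gallery) array; entry (b, g) is the
    dissimilarity of block b of the probe to gallery face g.
    Returns the index of the gallery face with the lowest summed score.
    """
    totals = block_scores.sum(axis=0)   # accumulate evidence over blocks
    return int(np.argmin(totals))

# Toy example: 3 blocks scored against 4 gallery faces.
scores = np.array([[0.9, 0.2, 0.7, 0.8],
                   [0.8, 0.3, 0.6, 0.9],
                   [0.7, 0.1, 0.9, 0.6]])
best = fuse_by_sum_rule(scores)   # gallery face 1 wins every block here
```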

3.3. Independent Component Analysis (ICA)

  • Let X be the data matrix, where each column includes the data from one face; then the authors can represent X as X = AS, where A is the mixing matrix.
  • For the point cloud, the (x, y, z) coordinates are concatenated to form a one-dimensional vector.
  • For depth images, the authors follow a similar procedure.
  • Then PCA is applied to the face database and ICA-based features are derived from the PCA coefficients of the faces.
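A minimal sketch of this PCA-then-ICA pipeline; the FastICA algorithm from scikit-learn and the random stand-in data are assumptions, since the paper does not name a specific ICA implementation:

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(0)
# Stand-in data: 50 "faces", each flattened to a 300-dim vector
# (for point clouds, the x, y and z coordinates would be concatenated).
X = rng.laplace(size=(50, 300))          # non-Gaussian, as ICA requires

# Dimensionality reduction first, then ICA on the PCA coefficients.
X_pca = PCA(n_components=20).fit_transform(X)           # (50, 20)
features = FastICA(n_components=20, random_state=0,
                   max_iter=1000).fit_transform(X_pca)  # ICA-based features
```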

3.4. Nonnegative Matrix Factorization (NNMF)

  • W and H are obtained using the multiplicative update rules described by Lee and Seung [14].
  • To construct the data matrix X , the authors either use the point cloud representations or the depth images.
  • Figure 13 shows the first five basis faces obtained from NNMF of the depth images.
  • Since the nonnegativity constraints only allow additive combinations, NNMF provides a parts-based representation.
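The factorization X ≈ WH with the multiplicative update rules of Lee and Seung can be sketched directly; the toy data matrix and rank are illustrative:

```python
import numpy as np

def nnmf(X, r, n_iter=200, eps=1e-9):
    """Nonnegative matrix factorization X ~ W H via the multiplicative
    update rules of Lee and Seung (nonnegativity is preserved because
    the updates only multiply by nonnegative ratios)."""
    n, m = X.shape
    rng = np.random.default_rng(0)
    W = rng.random((n, r)) + eps
    H = rng.random((r, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # update encodings
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # update basis "faces"
    return W, H

# Toy data matrix: each column plays the role of one face.
X = np.abs(np.random.default_rng(1).normal(size=(40, 15)))
W, H = nnmf(X, r=5)
err = np.linalg.norm(X - W @ H)   # reconstruction error after the updates
```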

4. MATCHING FEATURES

  • The authors use linear discrimination for classifying an input feature vector.
  • The authors estimate the covariance matrix of the feature vectors in the training set and fit a multivariate normal density to each class (person) using this global covariance matrix.
  • When there is an input face to be recognized, the feature vector of the face is extracted and the Mahalanobis distances of the input feature vector to the class centers are calculated.
  • The class giving the smallest Mahalanobis distance is chosen as the identity of the input face.
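A minimal sketch of this classifier: class means plus a single pooled covariance matrix, with assignment by smallest Mahalanobis distance (the toy two-class data stands in for real feature vectors):

```python
import numpy as np

def fit_classifier(features, labels):
    """Fit class means and the inverse of a shared (pooled) covariance."""
    classes = np.unique(labels)
    means = np.array([features[labels == c].mean(axis=0) for c in classes])
    centered = features - means[np.searchsorted(classes, labels)]
    pooled_cov = centered.T @ centered / (len(features) - len(classes))
    return classes, means, np.linalg.inv(pooled_cov)

def classify(x, classes, means, cov_inv):
    """Assign x to the class center at smallest Mahalanobis distance."""
    diffs = means - x
    d2 = np.einsum('ij,jk,ik->i', diffs, cov_inv, diffs)
    return classes[np.argmin(d2)]

# Toy example: two well-separated classes ("persons") in 2D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(5, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
model = fit_classifier(X, y)
pred = classify(np.array([4.8, 5.2]), *model)   # nearest to class 1's center
```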

5. EXPERIMENTAL RESULTS

  • The authors have used the 3D-RMA face database [16] for comparing the schemes discussed above.
  • The 3D-RMA database contains face scans of 106 subjects.
  • The authors have used 4 sessions for training (424 face scans) and utilized the remaining 193 faces for testing.
  • Table 2 gives the identification results of all the schemes, averaged over the 5 experiments.

6. CONCLUSION

  • Several feature types are proposed for the recognition of pre-registered 3D face data.
  • The features are extracted from three different face representations of the face data.
  • Experimental results show that the point cloud representation along with ICA-based or NNMF-based features gave superior results, with 99.8% recognition performance.
  • On the other hand, ICA and NNMF have the ability to extract the essence of the information present in the large data matrices.
  • Several fusion methods at both feature and decision levels can be applied for block-based DFT-DCT methods.


* This work was partially supported by TÜBİTAK projects 103E038 and 104E080, and BU Research Fund 05HA203.
3D Face Recognition by Projection Based Methods
Helin Dutağacı (1), Bülent Sankur (1), Yücel Yemez (2)
(1) Electrical and Electronic Engineering Department, Boğaziçi University, Bebek, İstanbul, Turkey
[dutagach, bulent.sankur]@boun.edu.tr
Telephone: (90) 212 359 6414, Fax: (90) 212 287 2465
(2) Computer Engineering Department, Koç University, Bebek, İstanbul, Turkey
yyemez@ku.edu.tr
Corresponding author: Bülent Sankur
ABSTRACT
In this paper, we investigate recognition performances of various projection-based features applied on registered 3D
scans of faces. Some features are data driven, such as ICA-based features or NNMF-based features. Other features are
obtained using DFT or DCT-based schemes. We apply the feature extraction techniques to three different representations
of registered faces, namely, 3D point clouds, 2D depth images and 3D voxel. We consider both global and local features.
Global features are extracted from the whole face data, whereas local features are computed over the blocks partitioned
from 2D depth images. The block-based local features are fused both at feature level and at decision level. The resulting
feature vectors are matched using Linear Discriminant Analysis. Experiments using different combinations of
representation types and feature vectors are conducted on the 3D-RMA dataset.
Keywords: Face biometry, 3D face recognition, Independent Component Analysis, Nonnegative Matrix Factorization
1. INTRODUCTION
There are a number of challenges encountered with face recognition from 2D intensity images. In intensity images, faces
acquired from the same person show high variability due to lighting conditions. Face segmentation from a cluttered
background is another problem. Since 3D acquisition devices measure shape information, 3D face models are
independent of lighting conditions. In addition, segmentation of 3D faces from background is relatively an easy task, for
range images, as far as the face is within the range of the scanner. Furthermore, 3D face information can model small
pose variations as opposed to intensity images.
The shape information of 3D faces is descriptive enough to distinguish between people. This information can either be
used alone, or can be fused with 2D intensity information to increase recognition performance. In this work, we have
used only 3D range images for identification purposes.
We can summarize the work on 3D face identification as follows: Phillips et al. [1, 2, 3] proposed a 3D face recognition system based on curvature calculation on range data. Tanaka et al. [4] utilize the Extended Gaussian Image, which includes information of principal curvatures and their directions. Different EGIs are compared using Fisher's spherical correlation. Another work based on the Extended Gaussian Image can be found in the paper of Lee et al. [5]. Gordon [6] proposed a template-based recognition system, which again involves curvature calculation. Chua et al. [7] have used point signatures, a free-form surface representation technique. Beumier et al. [8] proposed the use of face profiles for identification. They extracted central and lateral profiles, and compared curvature values along these profiles. Chang et al. [9] applied Principal Component Analysis on 3D range data, together with the intensity images.
In this work we have utilized 3D face data registered by the algorithm described by Akarun et al. [10, 11, 12]. After registration, they have used the following methods for matching faces: Euclidean distance between point clouds, matching surface normals, principal component analysis, linear discriminant analysis and matching central and lateral profiles.
Security, Steganography, and Watermarking of Multimedia Contents VIII, edited by Edward J. Delp III, Ping Wah Wong,
Proc. of SPIE-IS&T Electronic Imaging, SPIE Vol. 6072, 60720I, © 2006 SPIE-IS&T · 0277-786X/06/$15

We propose three different representation schemes of 3D face information and a number of projection-based features,
and compare their recognition performance. The representation schemes are 3D point cloud, 2D depth image and 3D
voxel representation. Table 1 summarizes the proposed features extracted from these representations.
Table 1: Representation schemes and features used for 3D face recognition

3D Point Cloud:
  2D DFT (Discrete Fourier Transform)
  ICA (Independent Component Analysis)
  NNMF (Nonnegative Matrix Factorization)

2D Depth Image:
  Global DFT
  Global DCT
  Block-based DFT (fusion at feature level)
  Block-based DCT (fusion at feature level)
  Block-based DFT (fusion at decision level)
  Block-based DCT (fusion at decision level)
  ICA (Independent Component Analysis)
  NNMF (Nonnegative Matrix Factorization)

3D Voxel Representation:
  3D DFT (Discrete Fourier Transform)
The paper is organized as follows: Section 2 introduces the representation types of 3D face data. Section 3 describes the projection-based features and their extraction from different representation types. Section 4 briefly explains the distance measure between feature vectors. In Section 5, we give the experimental results. Finally, we conclude in Section 6.
2. REPRESENTATION TYPES OF FACE DATA
In this work, we have compared three different representation schemes and extracted the features from these
representations. These representation types are 3D point cloud, 2D depth image and 3D voxel representation. All these
representations are derived from registered and cropped face data. The faces are registered using the ICP algorithm
described by Akarun et al. [10, 11, 12].
2.1. 3D point cloud
The 3D point cloud representation is the set of 3D coordinates {x, y, z} of the range data, obtained after registration. We have all the correspondences, defined at the registration process. Thus we can treat the ordered set of the coordinates as the signal describing the face. Figure 1.a shows a sample point cloud representation, while Figures 1.b, c and d show the x, y and z vectors respectively, as a function of the vector index.
If we have N points in the face data, we can concatenate the x, y and z coordinates and obtain a one-dimensional signal of length 3N. Another way of arranging the set of coordinates is to form an Nx3 matrix, where each dimension is placed into one of the columns. We choose one of these two arrangements of the data, depending on the feature type we would like to estimate.
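The two arrangements of the registered coordinates can be sketched as follows (the random stand-in cloud replaces real registered scan data):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000
x, y, z = rng.normal(size=(3, N))       # stand-in registered point cloud

# Arrangement 1: one-dimensional signal of length 3N.
signal_1d = np.concatenate([x, y, z])   # shape (3N,)

# Arrangement 2: Nx3 matrix, one coordinate per column.
P = np.column_stack([x, y, z])          # shape (N, 3)
```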

Figure 1: (a) Point cloud representation. (b, c, d) x, y and z vectors respectively, as a function of the vector index.
2.2. 2D depth image
2D depth image is a commonly used representation type for face recognition. It is sometimes called a 2½D image since it encodes 3D information. The point cloud is placed onto a regular X-Y grid, and the Z coordinates are mapped onto this grid to form the depth image I(x, y) (Figure 2). This representation type is similar to intensity images in structure; therefore many techniques applied to intensity images can also be applied to I(x, y). We have tested the following descriptors, which were previously applied to 2D intensity images, with the depth images: DFT, DCT, block-based versions of DFT and DCT, Independent Component Analysis (ICA) and Nonnegative Matrix Factorization (NNMF).
Figure 2: 2D depth images from side and from top.
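A minimal sketch of building I(x, y): bin the points onto a regular X-Y grid and keep one Z value per cell. Taking the largest Z as the visible surface, the grid size, and the background fill value are illustrative choices the paper does not specify:

```python
import numpy as np

def depth_image(points, grid=64):
    """Map a point cloud onto a regular X-Y grid to form I(x, y)."""
    xy = points[:, :2]
    lo, hi = xy.min(axis=0), xy.max(axis=0)
    idx = ((xy - lo) / (hi - lo + 1e-12) * (grid - 1)).astype(int)
    I = np.full((grid, grid), np.nan)
    for (ix, iy), zval in zip(idx, points[:, 2]):
        if np.isnan(I[ix, iy]) or zval > I[ix, iy]:
            I[ix, iy] = zval            # keep the closest (largest-Z) point
    return np.nan_to_num(I, nan=points[:, 2].min())   # fill empty cells

pts = np.random.default_rng(0).normal(size=(5000, 3))
I = depth_image(pts)                    # a 64 x 64 depth image
```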

2.3. 3D voxel representation
3D voxel representation can be regarded as a function V_d(x, y, z) filling the 3D space. To obtain this function, we implement the following steps: The point cloud is placed in an NxNxN voxel grid. The center of this voxel grid should coincide with the center of mass of the point cloud. Then we define a binary function V(x, y, z) on the voxel grid. If, in a particular voxel at location (x, y, z), there does not exist any point from the face, then V(x, y, z) at that voxel is set to zero. Otherwise, V(x, y, z) gets a value of one. Figure 3 shows a sample point cloud, and the corresponding binary V(x, y, z) displayed as a negative image.

Figure 3: Point cloud and its binary voxel representation.

After the 3D binary function is obtained, we apply a 3D distance transform on this binary function to get V_d(x, y, z). This function gets a value of zero on the face surface, and the value increases as we get further away from the surface. By using the distance transform we distribute the shape information of the surface throughout the 3D space and obtain a richer representation. Figure 4 gives slices from the voxel representation based on the distance transform.
Figure 4: Slices from the voxel representation based on the distance transform.
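The voxelization steps above can be sketched with SciPy's Euclidean distance transform; the grid size and the random stand-in cloud are illustrative:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def voxelize(points, N=32):
    """Binary occupancy V(x, y, z) on an NxNxN grid centered at the
    point cloud's center of mass, plus the distance transform V_d."""
    centered = points - points.mean(axis=0)
    half = np.abs(centered).max() + 1e-9
    idx = ((centered / half + 1) / 2 * (N - 1)).round().astype(int)
    V = np.zeros((N, N, N), dtype=bool)
    V[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    # distance_transform_edt gives each voxel its distance to the nearest
    # zero voxel, so invert V: V_d is 0 on occupied (surface) voxels and
    # grows with distance from the surface.
    V_d = distance_transform_edt(~V)
    return V, V_d

pts = np.random.default_rng(0).normal(size=(2000, 3))
V, V_d = voxelize(pts)
```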
3. FEATURES FOR FACE RECOGNITION
In Table 1, we have summarized the features to be compared with respect to their face recognition performance. These
features can be grouped into four categories: global DFT/DCT-based features, block DFT/DCT-based features, ICA coefficients and NNMF-based features.
3.1. Global DFT/DCT
In order to compute DFT coefficients from the point cloud of N points, we first define an Nx3 matrix P, and place the coordinates of each dimension into one column of P:

P(Nx3) = [ X(Nx1)  Y(Nx1)  Z(Nx1) ]
This matrix can be regarded as a 2D function, and we apply the 2D DFT to it. We could have concatenated the X, Y and Z coordinates and computed the one-dimensional DFT; however, we would then lose the inherent relation within the coordinates of a point in the face. DFT coefficients are strongly dependent on the order of the data, and we

intended to keep the X, Y and Z coordinates of a point close in the data structure. The 2D-DFT coefficients of P are then computed as follows:

FP_ij = DFT{P}_ij = Σ_{n=1}^{N} Σ_{d=1}^{3} P_nd exp(−2π√−1 · ni/N) exp(−2π√−1 · dj/3)

FP is a matrix of size Nx3. We take the first K coefficients of the first column of this matrix, and obtain a feature vector of size 2K − 1 by concatenating the real and imaginary parts of the K complex coefficients. Figure 5 shows a sample DFT-based feature vector of the point cloud. One should note that most of the energy is concentrated in the band-pass region due to the zigzag scan of the face, as can be observed from the plots of the coordinates in Figure 1. One line of future work is the investigation of the appropriate band that will give superior recognition results.
Figure 5: Sample DFT-based feature vector obtained from point cloud.
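A minimal sketch of this feature extraction using NumPy's FFT; reading the 2K − 1 length as dropping the imaginary part of the DC coefficient (always zero for real data) is an interpretation:

```python
import numpy as np

def dft_point_cloud_features(P, K=30):
    """First K complex coefficients of the first column of the 2D DFT
    of the Nx3 matrix P, real and imaginary parts concatenated."""
    FP = np.fft.fft2(P)                  # 2D DFT, same Nx3 size as P
    c = FP[:K, 0]                        # first K coefficients, column 0
    return np.concatenate([c.real, c.imag[1:]])   # length 2K - 1

P = np.random.default_rng(0).normal(size=(1000, 3))
f = dft_point_cloud_features(P)          # feature vector of length 59
```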
In order to obtain global DFT-based features from the depth image, we apply the 2D-DFT to the function I(x, y). The resulting DFT coefficient matrix is of the same size as the depth image. We extract the first KxK coefficients of this matrix and obtain a feature vector of size 2K² − 1 by concatenating the real and imaginary parts (Figure 6). Likewise, we get the global DCT-based features; however, in this case we obtain a feature vector of size K², since DCT coefficients are real.
Figure 6: Extraction of global DFT-based features from depth image.
We also extract DFT-based descriptors from the voxel representation. We compute the 3D-DFT coefficients of the distance transform function V_d(x, y, z), and extract the first KxKxK coefficients to form the feature vector. By concatenating the real and imaginary parts, we obtain a feature vector of size 2K³ − 1 (Figure 7).
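Both the 2D and 3D constructions can be sketched the same way with NumPy's FFT; reading the 2K² − 1 and 2K³ − 1 lengths as dropping the always-zero imaginary DC term is an interpretation, and the random inputs stand in for a real depth image and distance-transform volume:

```python
import numpy as np

def dft_depth_features(I, K=8):
    """First KxK 2D-DFT coefficients of the depth image I(x, y),
    real and imaginary parts concatenated: 2K^2 - 1 values."""
    F = np.fft.fft2(I)[:K, :K].ravel()
    return np.concatenate([F.real, F.imag[1:]])

def dft_voxel_features(V_d, K=4):
    """Same construction in 3D on the distance-transform volume:
    first KxKxK coefficients of the 3D DFT, 2K^3 - 1 values."""
    F = np.fft.fftn(V_d)[:K, :K, :K].ravel()
    return np.concatenate([F.real, F.imag[1:]])

I = np.random.default_rng(0).normal(size=(64, 64))
V_d = np.random.default_rng(1).normal(size=(32, 32, 32))
f2d = dft_depth_features(I)    # length 2*8**2 - 1 = 127
f3d = dft_voxel_features(V_d)  # length 2*4**3 - 1 = 127
```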

Citations

01 Jan 2008
TL;DR: This book presents an introduction to the principles of the fast Fourier transform, which covers FFTs, frequency domain filtering, and applications to video and audio signal processing.
Abstract: This manuscript describes a number of algorithms that can be used to quickly evaluate a polynomial over a collection of points and interpolate these evaluations back into a polynomial. Engineers define the “Fast Fourier Transform” as a method of solving the interpolation problem where the coefficient ring used to construct the polynomials has a special multiplicative structure. Mathematicians define the “Fast Fourier Transform” as a method of solving the evaluation problem. One purpose of the document is to provide a mathematical treatment of the topic of the “Fast Fourier Transform” that can also be understood by someone who has an understanding of the topic from the engineering perspective. The manuscript will also introduce several new algorithms that solve the fast multipoint evaluation problem over certain finite fields and require fewer finite field operations than existing techniques. The document will also demonstrate that these new algorithms can be used to multiply polynomials with finite field coefficients with fewer operations than Schonhage's algorithm in most circumstances. A third objective of this document is to provide a mathematical perspective of several algorithms which can be used to multiply polynomials of size which is not a power of two. Several improvements to these algorithms will also be discussed. Finally, the document will describe several applications of the “Fast Fourier Transform” algorithms presented and will introduce improvements in several of these applications. In addition to polynomial multiplication, the applications of polynomial division with remainder, the greatest common divisor, decoding of Reed-Solomon error-correcting codes, and the computation of the coefficients of a discrete Fourier Series will be addressed.

240 citations


Journal ArticleDOI
TL;DR: A robust multilevel fusion strategy involving cascaded multimodal fusion of audio-lip-face motion, correlation and depth features for biometric person authentication is proposed, which shows improved fusion performance for a range of tested levels of audio and video degradation.
Abstract: In this paper, we propose a robust multilevel fusion strategy involving cascaded multimodal fusion of audio-lip-face motion, correlation and depth features for biometric person authentication. The proposed approach combines the information from different audio-video based modules, namely: audio-lip motion module, audio-lip correlation module, 2D+3D motion-depth fusion module, and performs a hybrid cascaded fusion in an automatic, unsupervised and adaptive manner, by adapting to the local performance of each module. This is done by taking the output-score based reliability estimates (confidence measures) of each of the module into account. The module weightings are determined automatically such that the reliability measure of the combined scores is maximised. To test the robustness of the proposed approach, the audio and visual speech (mouth) modalities are degraded to emulate various levels of train/test mismatch; employing additive white Gaussian noise for the audio and JPEG compression for the video signals. The results show improved fusion performance for a range of tested levels of audio and video degradation, compared to the individual module performances. Experiments on a 3D stereovision database AVOZES show that, at severe levels of audio and video mismatch, the audio, mouth, 3D face, and tri-module (audio-lip motion, correlation and depth) fusion EERs were 42.9%, 32%, 15%, and 7.3%, respectively, for biometric person authentication task.

37 citations


Proceedings ArticleDOI
23 Oct 2009
TL;DR: The results show that the most energetic features, low frequency components, are not the most discriminating features in this 3D face recognition method.
Abstract: This paper presents a 3D face recognition method. In this method, the 3D Discrete Cosine Transform (DCT) is used to extract features. Before the feature extraction, faces are aligned with respect to the nose tip and then registered two times: according to the average nose and the average face. Then the coefficients of the 3D transformation are calculated. The most discriminating 3D transform coefficients are selected as the feature vector, where the ratio of between-class variance to within-class variance is used for discriminant coefficient selection. The results show that the most energetic features, low frequency components, are not the most discriminating features. The method was also modified based on the 3D Discrete Fourier Transform (DFT) for feature selection, regarding real and complex DFT coefficients as independent features. Discriminating features were matched by using the Nearest Neighbor classifier. Recognition experiments were realized on the 3D-RMA face database. The proposed method yields a recognition rate above 99% for 3D DCT based features.

14 citations


Cites methods from "3D face recognition by projection-b..."

  • ...In the range image based approaches, well known 2D recognition methods such as Eigenface [7], Fisherface [8], Gabor Features [9], and DCT [ 10 ,11], are directly applied to range images....

    [...]

  • ...Dutagaci et. al. [ 10 ] evaluates several projection based methods like ICA, DFT, DCT and NNMF by using both point clouds and range images....

    [...]

  • ...2D DCT and DFT are successfully used for feature extraction from 2D intensity images and range images for feature extraction [ 10 ]....

    [...]


Journal ArticleDOI
TL;DR: A novel multimedia sensor fusion approach based on heterogeneous sensors for biometric access control applications uses multiple acoustic and visual sensors for extracting dominant biometric cues, and combines them with nondominant cues.
Abstract: In this article, we propose a novel multimedia sensor fusion approach based on heterogeneous sensors for biometric access control applications. The proposed fusion technique uses multiple acoustic and visual sensors for extracting dominant biometric cues, and combines them with nondominant cues. The performance evaluation of the proposed fusion protocol and a novel cascaded authentication approach using a 3D stereovision database shows a significant improvement in performance and robustness, with equal error rates of 42.9% (audio only), 32% (audio + 3D face + 2D lip features), 15% (audio + 3D face + 2D eye features), and 7.3% (audio + 3D face + 2D lip + 2D eye-eyebrows) respectively.

11 citations


Book ChapterDOI
22 Nov 2010
TL;DR: A novel multimedia sensor fusion approach based on heterogeneous sensors for biometric access control applications that uses multiple acoustic and visual sensors for extracting dominant biometric cues, and combines them with nondominant cues.
Abstract: In this paper, we propose a novel multimedia sensor fusion approach based on heterogeneous sensors for biometric access control applications. The proposed fusion technique uses multiple acoustic and visual sensors for extracting dominant biometric cues, and combines them with nondominant cues. The performance evaluation of the proposed fusion protocol and a novel cascaded authentication approach using a 3D stereovision database shows a significant improvement in performance and robustness.

2 citations


References

Journal ArticleDOI
21 Oct 1999-Nature
TL;DR: An algorithm for non-negative matrix factorization is demonstrated that is able to learn parts of faces and semantic features of text and is in contrast to other methods that learn holistic, not parts-based, representations.
Abstract: Is perception of the whole based on perception of its parts? There is psychological and physiological evidence for parts-based representations in the brain, and certain computational theories of object recognition rely on such representations. But little is known about how brains or computers might learn the parts of objects. Here we demonstrate an algorithm for non-negative matrix factorization that is able to learn parts of faces and semantic features of text. This is in contrast to other methods, such as principal components analysis and vector quantization, that learn holistic, not parts-based, representations. Non-negative matrix factorization is distinguished from the other methods by its use of non-negativity constraints. These constraints lead to a parts-based representation because they allow only additive, not subtractive, combinations. When non-negative matrix factorization is implemented as a neural network, parts-based representations emerge by virtue of two properties: the firing rates of neurons are never negative and synaptic strengths do not change sign.

9,911 citations




Journal ArticleDOI
TL;DR: The basic theory and applications of ICA are presented, and the goal is to find a linear representation of non-Gaussian data so that the components are statistically independent, or as independent as possible.
Abstract: A fundamental problem in neural network research, as well as in many other disciplines, is finding a suitable representation of multivariate data, i.e. random vectors. For reasons of computational and conceptual simplicity, the representation is often sought as a linear transformation of the original data. In other words, each component of the representation is a linear combination of the original variables. Well-known linear transformation methods include principal component analysis, factor analysis, and projection pursuit. Independent component analysis (ICA) is a recently developed method in which the goal is to find a linear representation of non-Gaussian data so that the components are statistically independent, or as independent as possible. Such a representation seems to capture the essential structure of the data in many applications, including feature extraction and signal separation. In this paper, we present the basic theory and applications of ICA, and our recent work on the subject.

7,434 citations


Journal ArticleDOI
TL;DR: Two of the most critical requirements in support of producing reliable face-recognition systems are a large database of facial images and a testing procedure to evaluate systems.
Abstract: Two of the most critical requirements in support of producing reliable face-recognition systems are a large database of facial images and a testing procedure to evaluate systems. The Face Recognition Technology (FERET) program has addressed both issues through the FERET database of facial images and the establishment of the FERET tests. To date, 14,126 images from 1,199 individuals are included in the FERET database, which is divided into development and sequestered portions of the database. In September 1996, the FERET program administered the third in a series of FERET face-recognition tests. The primary objectives of the third test were to 1) assess the state of the art, 2) identify future areas of research, and 3) measure algorithm performance.

4,690 citations


Proceedings ArticleDOI
17 Oct 2003
TL;DR: Results show that recognition from indoor images has made substantial progress since FRVT 2000 and that three-dimensional morphable models and normalization increase performance and that face recognition from video sequences offers only a limited increase in performance over still images.
Abstract: Summary form only given. The face recognition vendor test (FRVT) 2002 is an independently administered technology evaluation of mature face recognition systems. FRVT 2002 provides performance measures for assessing the capability of face recognition systems to meet requirements for large-scale, real-world applications. Participation in FRVT 2002 was open to commercial and mature prototype systems from universities, research institutes, and companies. Ten companies submitted either commercial or prototype systems. FRVT 2002 computed performance statistics on an extremely large data set: 121,589 operational facial images of 37,437 individuals. FRVT 2002 1) characterized identification and watch list performance as a function of database size, 2) estimated the variability in performance for different groups of people, 3) characterized performance as a function of elapsed time between enrolled and new images of a person, and 4) investigated the effect of demographics on performance. FRVT 2002 showed that recognition from indoor images has made substantial progress since FRVT 2000. Demographic results show that males are easier to recognize than females and that older people are easier to recognize than younger people. FRVT 2002 also assessed the impact of three new techniques for improving face recognition: three-dimensional morphable models, normalization of similarity scores, and face recognition from video sequences. Results show that three-dimensional morphable models and normalization increase performance and that face recognition from video sequences offers only a limited increase in performance over still images. A new XML-based evaluation protocol was developed for FRVT 2002. This protocol is flexible and supports evaluations of biometrics in general. The FRVT 2002 reports can be found at http://www.frvt.org.

397 citations


"3D face recognition by projection-b..." refers background in this paper

  • ...3D Face Recognition by Projection Based Methods Helin Dutağacı ((1)), Bülent Sankur ((1)), Yücel Yemez (2) (1) Electrical and Electronic Engineering Department, Boğaziçi University, Bebek, İstanbul, Turkey [dutagach, bulent....

    [...]


Frequently Asked Questions (1)
Q1. What contributions have the authors mentioned in the paper "3d face recognition by projection based methods" ?

In this paper, the authors investigate recognition performances of various projection-based features applied on registered 3D scans of faces. The authors apply the feature extraction techniques to three different representations of registered faces, namely, 3D point clouds, 2D depth images and 3D voxel. The authors consider both global and local features.