Proceedings ArticleDOI

Combined face and gait recognition using alpha matte preprocessing

TL;DR: This method is based on combining an improved gait recognition method with an adapted low resolution face recognition method, and reaches the highest recognition rates and the largest absolute number of correct detections to date.
Abstract: This paper presents advances on the Human ID Gait Challenge. Our method is based on combining an improved gait recognition method with an adapted low resolution face recognition method. For this, we experiment with a new automated segmentation technique based on alpha-matting. This allows better construction of feature images used for gait recognition. The same segmentation is also used as a basis for finding and recognizing low-resolution facial profile images in the same database. Both, gait and face recognition methods show results comparable to the state of the art. Next, the two approaches are fused (which to our knowledge, has not yet been done for the Human ID Gait Challenge). With this fusion gain, we show significant performance improvement. Moreover, we reach the highest recognition rates and the largest absolute number of correct detections to date.

Summary (3 min read)

1. Introduction

  • The focus of this paper is on recognizing people from larger distances.
  • In their approach the authors make use of gait recognition combined with person identification based on low-resolution face profile images.
  • Early studies in 1977 by Cutting and Kozlowski [2] suggest that it is possible to recognize friends from just their way of walking.
  • A major advantage of these behavior based features over other physiologic features is the possibility to identify people from large distances and without the person’s direct cooperation.
  • However the authors feel that a lot of identity information gets lost by this early binarization.

3. Segmentation using Alpha Mattes

  • Current gait recognition methods rely on good segmentation to extract the contour and the silhouettes of the foreground objects.
  • Then the foreground is estimated by finding the pixels with significant deviation from the background model.
  • To cope with the high number of unknowns, proximity and smoothness assumptions are made.
  • Also, the typical matting application has a human in the loop who has to provide some scribbles for foreground and background, leading to the so-called trimap.
  • It can be seen that this segmentation is superior to the initial background segmentation.

4.1. Feature Extraction using α-GEI

  • For gait recognition the authors use a method based on the classical Gait Energy Image (GEI) [3].
  • Instead of using binary silhouettes, the authors use the alpha channel from the alpha matting as described in the previous section.
  • The authors call this the Alpha Gait Energy Image (α-GEI). In essence, the Alpha Gait Energy Image is an arithmetic mean of the alpha channel.

4.2. Feature Space Reduction

  • Thus the feature vector is still large with 11264 dimensions.
  • The authors apply principal component analysis (PCA) followed by multiple discriminant analysis (MDA) to reduce the size of the feature vector.
  • While PCA seeks a projection that best represents the data, MDA seeks a projection that best separates the data.
  • This matrix results from optimizing the ratio of the between-class scatter matrix S_B and the within-class scatter matrix S_W: J(U_mda) = |S̃_B| / |S̃_W| = |U_mda^T S_B U_mda| / |U_mda^T S_W U_mda|.

4.3. Classification

  • Each class c is modeled with only one vector, the mean feature vector z_c = (1/|Z_c|) Σ_{z ∈ Z_c} z (7). For each α-GEI ĝ_i from the test set, the authors perform the transformation in Equation 6 to get the reduced feature vector ẑ_i.
  • A distance D_i^gait(c) is defined, giving for each sequence i the distance to the c-th class.
  • Final person identification using gait then becomes a nearest-neighbor classification.
  • The authors assign a class label L_i to each test gait image according to L_i = argmin_c D_i^gait(c) (8).

5.1. Pre-faces

  • In the first part of the algorithm, the gallery set is processed.
  • Thus, five consecutive pre-faces are always combined.
  • Those five pre-faces are registered using sum of absolute differences.
  • Note that due to the alpha matte preprocessing the segmentations contain only color foreground regions.
  • This way, multiple aspects of the person are captured and in addition, the influence of erroneous segmentations is reduced.

5.2. Eigenface Calculation

  • The authors apply the classical eigenface method [18] for face recognition.
  • This means that the average face is calculated as the mean over all gallery faces.
  • This average face is subtracted from the gallery faces and a covariance matrix is estimated from the gallery data.
  • In order to capture color information like skin and hair color, all three color channels are appended and used for the calculation of the covariance matrix.

5.3. Classification

  • Face recognition is done similarly to gait recognition.
  • Typically one would use k-nearest neighbor in such a case.
  • For the later fusion step, however, the authors need a continuous score for each potential class.
  • Thus, for each of the sub faces of a test sequence, the authors calculate the distance to all sub faces of all training sequences.
  • Out of these matches, the authors only keep the k nearest matches.

6. Fusion of Face and Gait

  • This means that the distance scores Dgait(c) and Dface(c) are combined before decision making.
  • There are multiple ways of fusing the results.
  • The distances result from different modalities, thus the values are not directly comparable.

7. Results and Comparison

  • Figure 5 shows the quantitative results on the Human ID Gait database.
  • Summarizing results are shown in Table 1 (largely taken from [5]).
  • Both their face (54.6%) and their gait recognition method (53.6%) alone cannot compete with the current state of the art.
  • When combining these multimodal methods, recognition rates exceed all previous approaches.
  • It can be seen that simple product and sum rules lead to good fusion results and to a dramatic increase in performance.

8. Conclusion and Outlook

  • A new preprocessing method using closed form alpha matting was introduced.
  • It was applied to both face and gait recognition.
  • In order to use this method, which typically requires a "human in the loop", an automated generation of the trimap was presented.
  • Similar fusion techniques have currently only been carried out on other datasets.
  • It can be foreseen that recognition rates could improve even further.


Combined Face and Gait Recognition using Alpha Matte Preprocessing
Martin Hofmann¹, Stephan M. Schmidt¹, A. N. Rajagopalan¹,², Gerhard Rigoll¹
¹Institute for Human-Machine Communication, Technische Universität München, Germany
²Department of Electrical Engineering, Indian Institute of Technology Madras, India
martin.hofmann@tum.de, stephan.schmidt@mytum.de, raju@iitm.ac.in, rigoll@tum.de
Abstract
This paper presents advances on the Human ID Gait Challenge. Our method is based on combining an improved gait recognition method with an adapted low resolution face recognition method. For this, we experiment with a new automated segmentation technique based on alpha-matting. This allows better construction of feature images used for gait recognition. The same segmentation is also used as a basis for finding and recognizing low-resolution facial profile images in the same database. Both gait and face recognition methods show results comparable to the state of the art. Next, the two approaches are fused (which, to our knowledge, has not yet been done for the Human ID Gait Challenge). With this fusion gain, we show significant performance improvement. Moreover, we reach the highest recognition rates and the largest absolute number of correct detections to date.
1. Introduction
The focus of this paper is on recognizing people from larger distances. At a distance, many typical physiologic features, such as fingerprint, DNA, hand, ear, retina and face, are obscured or cannot be obtained at all. By contrast, behavior based features such as gait features can be extracted from walking people at a distance.

In our approach we make use of gait recognition combined with person identification based on low-resolution face profile images. As such we combine physiologic and behavior based features. We show that both modalities lead to good results on their own. When combining them, we observe a significant improvement in recognition performance, which demonstrates the strength of a multimodal approach.

Primarily our approach is motivated by the success of gait recognition methods for recognition at a distance. In 1967, Murray [11] suggested that if all gait movements are considered, gait is unique. Early studies in 1977 by Cutting and Kozlowski [2] suggest that it is possible to recognize friends from just their way of walking. Later, Stevenage et al. [15] showed that people can be recognized without any information on the body-shape, only using gait features. A major advantage of these behavior based features over other physiologic features is the possibility to identify people from large distances and without the person's direct cooperation. Also no direct interaction with a sensing device is necessary, which allows for undisclosed identification. Thus gait recognition has great potential in video surveillance, tracking and monitoring.

For low resolution data, gait recognition has its clear advantages. However, in our approach, we also use low resolution face data. Even though face recognition has its performance peak at high resolution frontal face images, it can still be seen that facial profile recognition can contribute to the performance, when combined correctly.

A multitude of gait recognition algorithms (see Table 1) have so far been proposed, which leads to a rich set of results we can compare to. Most of these methods build solely on the binarized silhouette images. However we feel that a lot of identity information gets lost by this early binarization. Thus instead of binarizing, both our face and gait recognition methods build on a novel automated color foreground segmentation method based on alpha-matting. For gait recognition we use the continuous alpha-matte segmentation and show a small increase in performance. To our knowledge, so far face recognition has not been applied to the Human ID Gait database [12], so we cannot compare these results directly. When fusing gait and face features we observe a significant performance gain, such that our combined method outperforms the state of the art.
2. Related Work
Generally speaking, there are two kinds of gait recognition methods: model-based methods on the one hand and model-free methods on the other. Model based methods [1][21] define a (simplified) human model and match the gait sequences to this model. Gait recognition is then performed on the temporal change of the model parameters, such as leg angles [21]. Those methods are typically very demanding and good results are hard to achieve. Model-free methods [3][5][7][9][12][17][19][20] on the other hand have shown more success in the recent past. Here, the person identity is directly inferred from the features without an intermediate person model. Most methods build on a silhouette extraction for each frame in a gait cycle. Silhouettes are either averaged [3][9][19], or all silhouettes are used simultaneously [7][12][16]. Different classifiers ranging from nearest neighbor [3] to SVM and HMM [7][16] have been applied with similarly good results.

Figure 1: Left to right: (a) input image; (b) foreground segmentation; (c) tri-state labeling with morphologic operations; (d) alpha matte; (e) final segmentation

Recently gait recognition has been combined with face recognition [6][10][22]. Typical face recognition methods require a high resolution frontal face image. However, for gait recognition, persons are only captured in low-resolution side view images. In [6], for face recognition, only the final segment of the gait video, where the person is visible in near frontal view, is used. In [13], multiple cameras are used to ensure that both the side view as well as the frontal view are available. To avoid these special cases, face recognition can be performed on the low-resolution side view images [22]. Our approach is similar to the latter one, because we also do not depend on specialized data, but instead work directly on the low-resolution side view videos.

For performance evaluation, many databases have been recorded. However, the most popular and widely used database is probably the Human ID Gait database [12]. This database features video sequences of a total of 122 subjects, who walk perpendicular to the camera at a distance. While many methods have been applied to this dataset, so far no fusion method using gait and face was ever applied to this database.
3. Segmentation using Alpha Mattes
In this work, we investigate a new segmentation technique which we apply to both gait recognition and face recognition. Current gait recognition methods rely on good segmentation to extract the contour and the silhouettes of the foreground objects. Typically, a background is estimated by calculating the mean and variance of the scene over a certain period. Then the foreground is estimated by finding the pixels with significant deviation from the background model. This leads to a noisy, binary segmentation as depicted in Figure 1b). However, due to the nature of the image capturing, there is a band on the silhouette which belongs partially to the foreground and partially to the background. Thus at each pixel (x, y), the image I is modeled as a linear composition of the foreground F and the background B:

I(x, y) = α(x, y) F(x, y) + (1 − α(x, y)) B(x, y)   (1)

Here, α(x, y) is the opacity of the pixel at (x, y). F(x, y), B(x, y) and α(x, y) are unknown. For a typical color image with three color channels we thus have 7 unknowns to solve for at each pixel. This kind of problem statement is typical for matting problems. To cope with the high number of unknowns, proximity and smoothness assumptions are made. Also, the typical matting application has a human in the loop who has to provide some scribbles for foreground and background, leading to the so-called trimap. This map contains regions which are definitely foreground (α(x, y) = 1), some which are definitely background (α(x, y) = 0) and some unknown regions for which the matting method determines the α(x, y).

However, for automated gait recognition it is infeasible to have a human in the loop. We therefore automatically generate the trimap from the noisy foreground segmentation. We get the definite-foreground regions (α(x, y) = 1) by eroding the foreground segmentation with a circular structure element with radius r = 4. The definite-background regions are obtained by eroding the background region with the same circular structure element. The resulting trimap is shown in Figure 1c).

For background segmentation we use Gaussian mixture models [14]; for alpha matting we use closed form matting [8].
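The automated trimap generation described above can be sketched in a few lines; this is a minimal illustration assuming a binary foreground mask is already available from the background model (the radius r = 4 follows the text, the function name and everything else are hypothetical):

```python
import numpy as np
from scipy.ndimage import binary_erosion

def make_trimap(fg_mask, radius=4):
    """Build a trimap from a noisy binary foreground mask.

    Definite foreground (alpha = 1): erosion of the foreground region.
    Definite background (alpha = 0): erosion of the background region.
    Everything else stays unknown and is left to the matting stage.
    """
    # Circular structuring element with the given radius.
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    disk = (x * x + y * y) <= radius * radius

    sure_fg = binary_erosion(fg_mask, structure=disk)
    # Outside the image counts as background (border_value=1),
    # so background regions near the border survive the erosion.
    sure_bg = binary_erosion(~fg_mask, structure=disk, border_value=1)

    trimap = np.full(fg_mask.shape, 0.5)  # unknown band
    trimap[sure_fg] = 1.0
    trimap[sure_bg] = 0.0
    return trimap
```

The resulting map can then be handed to any matting solver that accepts a trimap, such as closed form matting.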
The resulting foreground segmentation, the alpha matte, is depicted in Figure 1d). It can be seen that this segmentation is superior to the initial background segmentation. Holes are closed, erroneous pixels are removed and, most of all, the smooth transition from the foreground to the background is captured. Furthermore, by F(x, y) = I(x, y) · α(x, y) we can approximate a precise color segmentation of the foreground object (see Figure 1e)). This color segmentation is used for the face recognition part.
4. Gait recognition
4.1. Feature Extraction using α-GEI
For gait recognition we use a method based on the classical Gait Energy Image (GEI) [3]. However, instead of using binary silhouettes, we use the alpha channel from the alpha matting as described in the previous section. We call this the Alpha Gait Energy Image (α-GEI).

In essence, the Alpha Gait Energy Image is an arithmetic mean of the alpha channel. Denote α_t the alpha matte in frame t. Then, the α-GEI g is formally defined as the alpha matte average over one full gait cycle:

g(x, y) = (1/T) Σ_{t=1}^{T} α_t(x, y)   (2)
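Equation 2 is a pixel-wise mean over the mattes of one gait cycle; a minimal sketch (stacking the mattes into a (T, H, W) array is an assumption of this illustration, not something the paper prescribes):

```python
import numpy as np

def alpha_gei(alpha_mattes):
    """Alpha Gait Energy Image (Eq. 2): the arithmetic mean of the
    alpha mattes alpha_t over one full gait cycle.

    alpha_mattes: array of shape (T, H, W) with values in [0, 1].
    """
    alpha_mattes = np.asarray(alpha_mattes, dtype=np.float64)
    return alpha_mattes.mean(axis=0)
```

Unlike the classical GEI, the inputs here are continuous-valued mattes rather than binary silhouettes, so soft boundary pixels contribute fractionally to the average.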
4.2. Feature Space Reduction
The gait energy images g(x, y) have a resolution of 88 × 128 pixels. Thus the feature vector is still large with 11264 dimensions. We apply principal component analysis (PCA) followed by multiple discriminant analysis (MDA) to reduce the size of the feature vector. A combination of PCA and MDA, as proposed in [4], results in the best recognition performance. While PCA seeks a projection that best represents the data, MDA seeks a projection that best separates the data.

Assume that the training set, consisting of N d-dimensional training vectors {g_1, g_2, ..., g_N}, is given. Then the projection to the d' < d dimensional PCA space is given by

y_k = U_pca (g_k − ḡ),  k = 1, ..., N   (3)

Here U_pca is the d' × d transformation matrix with the first d' orthonormal basis vectors obtained using PCA on the training set {g_1, g_2, ..., g_N}, and ḡ = (1/N) Σ_{k=1}^{N} g_k is the mean of the training set.

After PCA, MDA is performed. It is assumed that the reduced vectors Y = {y_1, y_2, ..., y_N} belong to c classes. Thus the set of reduced training vectors Y is composed of its c disjunct subsets Y = Y_1 ∪ Y_2 ∪ ... ∪ Y_c. The MDA projection has by construction (c − 1) dimensions. These (c − 1) dimensional vectors z_k are obtained as follows:

z_k = U_mda y_k,  k = 1, ..., N   (4)

where U_mda is the transformation matrix obtained using MDA. This matrix results from optimizing the ratio of the between-class scatter matrix S_B and the within-class scatter matrix S_W:

J(U_mda) = |S̃_B| / |S̃_W| = |U_mda^T S_B U_mda| / |U_mda^T S_W U_mda|   (5)

Here the within-class scatter matrix S_W is defined as S_W = Σ_{i=1}^{c} S_i, with S_i = Σ_{y ∈ Y_i} (y − m_i)(y − m_i)^T and m_i = (1/N_i) Σ_{y ∈ Y_i} y, where N_i = |Y_i| is the number of vectors in Y_i. The between-class scatter S_B is defined as S_B = Σ_{i=1}^{c} N_i (m_i − m)(m_i − m)^T, with m = (1/N) Σ_{i=1}^{c} N_i m_i.

Finally, for each Gait Energy Image, the corresponding gait feature vector is computed as follows:

z_k = U_mda U_pca (g_k − ḡ) = T (g_k − ḡ),  k = 1, ..., N   (6)
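The PCA-then-MDA cascade of Equations 3–6 can be reproduced with standard tools; in this sketch scikit-learn's LinearDiscriminantAnalysis stands in for the MDA step (it optimizes the same scatter-ratio criterion), and the number of PCA components is an illustrative assumption, not the paper's choice:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def fit_pca_mda(gait_images, labels, n_pca=10):
    """Fit PCA (best representation) followed by MDA/LDA (best
    separation) on flattened alpha-GEIs.

    Returns a function mapping a flattened alpha-GEI g to the reduced
    (c-1)-dimensional feature vector z = U_mda U_pca (g - mean).
    """
    pca = PCA(n_components=n_pca).fit(gait_images)   # Eq. 3
    y = pca.transform(gait_images)
    mda = LinearDiscriminantAnalysis().fit(y, labels)  # Eqs. 4-5

    def transform(g):
        # Composite projection T of Eq. 6.
        return mda.transform(pca.transform(np.atleast_2d(g)))

    return transform
```

With c classes the output has c − 1 dimensions, matching the dimensionality argument in the text.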
Figure 2: a) Rough definition of the pre-face around the face region. b) Registration of the pre-faces using sum of absolute differences.
4.3. Classification
Each class c is modeled with only one vector, which is the mean feature vector z̄_c:

z̄_c = (1/|Z_c|) Σ_{z ∈ Z_c} z   (7)

For each α-GEI ĝ_i from the test set, we perform the transformation in Equation 6 to get the reduced feature vector ẑ_i. A distance D_i^gait(c) = ||ẑ_i − z̄_c|| using the Euclidean distance measure is defined. It gives, for each sequence i, the distance to the c-th class. Final person identification using gait then becomes a nearest-neighbor classification. We assign a class label L_i to each test gait image according to

L_i = argmin_c D_i^gait(c)   (8)
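Equations 7 and 8 amount to a nearest-class-mean rule; a minimal sketch, assuming the reduced feature vectors from Equation 6 are already available (all names are illustrative):

```python
import numpy as np

def class_means(features, labels):
    """Model each class c by its mean feature vector z_c (Eq. 7)."""
    return {c: features[labels == c].mean(axis=0)
            for c in np.unique(labels)}

def classify_gait(z_test, means):
    """Assign the label of the class whose mean is closest in
    Euclidean distance (Eq. 8)."""
    dists = {c: np.linalg.norm(z_test - m) for c, m in means.items()}
    return min(dists, key=dists.get)
```

For the fusion step later on, the per-class distances themselves (not just the argmin) are what gets combined with the face scores.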
5. Face recognition
5.1. Pre-faces
In the first part of the algorithm, the gallery set is processed. The goal is to find a 20×20 patch of the face profile of each person. To robustly achieve this and to avoid erroneous segmentations, first for each gallery sequence a pre-face is calculated. To this end, the mean of all frames in a sequence is calculated (similar to the GEI), in order to find the person more precisely than using a bounding box. Over this mean image, a 30×40 patch is defined, which is used to cut the region from all frames (see Figure 2).

Because viewing direction and body position change slightly when the person walks across the scene, instead of only extracting one face per sequence, multiple such faces, which are evenly spread over the sequence, are extracted. This ensures that as much information about the person is captured as possible. Thus, five consecutive pre-faces are always combined. These five pre-faces are registered using the sum of absolute differences. After registration, the mean is taken to find the averaged pre-face.

Finally, to find the precise head location within the 30×40 pixel pre-face, a simple threshold method is used to find the highest point (top of head) and the left-most point (nose). Using these two points, a 20×20 pixel patch is extracted, which captures the final segmentation of the face. Results of the segmentation can be seen in Figure 3. Note that due to the alpha matte preprocessing the segmentations contain only color foreground regions; disturbing background pixels are eliminated.

Figure 3: a) The alpha matte based segmentation, the roughly cropped pre-face, and the final face segmentation. b) Several sub faces of a specific sequence. It can be clearly seen that the appearance of a face changes within the sequence.

The same segmentation is carried out on the test sequences. The splitting of the test sequences has the advantage that, for each sequence, multiple sub faces of each person can be used for classification. This way, multiple aspects of the person are captured and, in addition, the influence of erroneous segmentations is reduced.
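The sum-of-absolute-differences registration of consecutive pre-faces can be sketched as an exhaustive integer-shift search; the search range, the cyclic shift (a stand-in for true windowed matching), and all names are assumptions of this illustration:

```python
import numpy as np

def sad_register(ref, patch, max_shift=3):
    """Find the integer (dy, dx) shift of `patch` that minimizes the
    sum of absolute differences against `ref`.

    np.roll shifts cyclically; for small shifts of a patch whose
    content sits away from the border this approximates a windowed
    SAD search well enough for illustration.
    """
    best, best_shift = np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(patch, dy, axis=0), dx, axis=1)
            sad = np.abs(ref - shifted).sum()
            if sad < best:
                best, best_shift = sad, (dy, dx)
    return best_shift
```

After each of the five pre-faces is aligned to a common reference this way, taking their mean yields the averaged pre-face described above.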
5.2. Eigenface Calculation
We apply the classical eigenface method [18] for face recognition. This means that the average face is calculated by taking the mean over all gallery faces. This average face is subtracted from the gallery faces and a covariance matrix is estimated from the gallery data. Thus a PCA is performed. In order to capture color information like skin and hair color, all three color channels are appended and used for the calculation of the covariance matrix.

Let {f_1, f_2, ..., f_M} be the set of M 20×20×3 color face patches in the gallery set. Here M is the number of all sub faces, so it is roughly 40 times larger than the number of people in the gallery set. Then the resulting transformation is

v_k = U_face (f_k − f̄)   (9)

where f̄ = (1/M) Σ_{k=1}^{M} f_k is the mean face and U_face is the transformation matrix learned by PCA.
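A sketch of this color eigenface computation: each 20×20×3 patch is flattened so all three color channels enter one PCA, as the text describes; the number of retained components and all names are illustrative assumptions:

```python
import numpy as np

def fit_eigenfaces(gallery, n_components=5):
    """Classical eigenface PCA on flattened 20x20x3 color patches.

    gallery: array of shape (M, 20, 20, 3).
    Returns the mean face, the eigenvector matrix U_face, and a
    projection function implementing v_k = U_face (f_k - mean) (Eq. 9).
    """
    M = gallery.shape[0]
    F = gallery.reshape(M, -1).astype(np.float64)  # append channels
    mean_face = F.mean(axis=0)
    X = F - mean_face
    # Eigenvectors of the sample covariance via SVD of the centered
    # data matrix (rows of Vt are the principal directions).
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    U_face = Vt[:n_components]

    def project(face):
        return U_face @ (face.reshape(-1) - mean_face)

    return mean_face, U_face, project
```

Using the SVD of the centered data rather than forming the 1200×1200 covariance matrix explicitly is a standard trick when M is much smaller than the patch dimensionality.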
5.3. Classification
Face recognition is done similarly to gait recognition. However, instead of having one average gait template, we have several sub faces for each sequence as described above. Typically one would use k-nearest neighbor in such a case. For the later fusion step, however, we need a continuous score for each potential class. Thus for each of the sub faces of a test sequence we calculate the distance to all sub faces of all training sequences (see Figure 4). Out of these matches, we only keep the k nearest matches. Within these k matches, the distances to each comprised class are averaged, resulting in a distance D_i^face(c). If a class c is not comprised in the k best matches at all, then the distance is set to D_i^face(c) = ∞. In our experiments we set k = 100; however, the method is not sensitive to this value as long as it is big enough (> 10).

For pure face classification, the class c with the minimal distance argmin_c D_i^face(c) is taken as the recognition result.
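The continuous face score can be sketched as follows: all test-sub-face/gallery-sub-face distances are computed, only the k nearest matches are kept, and the per-class average over those matches becomes D_face; classes absent from the top k get an infinite distance. The value of k and the data layout here are illustrative:

```python
import numpy as np

def face_scores(test_subfaces, gallery_subfaces, gallery_labels, k=100):
    """Per-class distance score D_face(c) from the k nearest
    sub-face matches; classes not among them get inf."""
    # All pairwise Euclidean distances (test x gallery), flattened.
    d = np.linalg.norm(test_subfaces[:, None, :] -
                       gallery_subfaces[None, :, :], axis=2).ravel()
    cols = np.tile(np.arange(len(gallery_subfaces)),
                   len(test_subfaces))
    order = np.argsort(d)[:k]           # k nearest matches overall
    top_d, top_c = d[order], gallery_labels[cols[order]]
    scores = {}
    for c in np.unique(gallery_labels):
        mask = top_c == c
        scores[c] = top_d[mask].mean() if mask.any() else np.inf
    return scores
```

The pure face decision is then simply the class with the minimal score, while the fusion step consumes the full score vector.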
Figure 4: Illustration of the face classification (shown for the first two eigenvectors). For a given test sequence A, the k closest matches among the sub faces of the gallery sequences B and C are found (here k = 5). Within those top k matches, the class averages (here, to class B and C, respectively) are a measure for the similarity to these classes.

Figure 5: Quantitative results on the Human ID Gait database [12] for probes A–L and the mean: (1) using only gait information, (2) using only face, (3) fusion using product rule, (4) fusion using sum rule, (5) fusion using max rule.
6. Fusion of Face and Gait
In this work we use score level fusion. This means that the distance scores D_gait(c) and D_face(c) are combined before decision making. There are multiple ways of fusing the results. We use the product, sum and max rules:

D_i(c) = D_i^gait(c) · D_i^face(c)   (10)
D_i(c) = D_i^gait(c) + D_i^face(c)   (11)
D_i(c) = max(D_i^gait(c), D_i^face(c))   (12)

The distances result from different modalities, thus the values are not directly comparable. Therefore normalization of the vectors is of central importance. Before fusion, the vectors are normalized to unit sum, i.e. D(c) ← D(c) / Σ_ĉ D(ĉ).
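The score-level fusion with the normalization step can be sketched as below; the sum normalization follows the formula in the text, while the function names are illustrative:

```python
import numpy as np

def fuse(d_gait, d_face, rule="product"):
    """Score-level fusion of gait and face distance vectors
    (Eqs. 10-12). Each vector is first normalized by its sum so
    scores from the two modalities become comparable."""
    g = d_gait / d_gait.sum()
    f = d_face / d_face.sum()
    if rule == "product":
        return g * f
    if rule == "sum":
        return g + f
    if rule == "max":
        return np.maximum(g, f)
    raise ValueError(rule)

def identify(d_gait, d_face, rule="product"):
    """Final label: the class with the smallest fused distance."""
    return int(np.argmin(fuse(d_gait, d_face, rule)))
```

Since both inputs are distances (smaller is better), the fused decision remains an argmin, matching the nearest-neighbor rule used for each modality alone.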
7. Results and Comparison
Figure 5 shows the quantitative results on the Human ID Gait database. It can be seen that fusion using either the product rule or the sum rule greatly improves the recognition rates, except for probe B, where fusion slightly reduces the recognition rate of gait, but greatly increases the result of face recognition. The max rule shows inferior performance.

For performance evaluation, we compare our method to several state-of-the-art results. Summarizing results are shown in Table 1 (largely taken from [5]). Here, recognition rates for all 12 experiments, as well as the weighted recognition average, are shown.

It can be seen that our α-GEI (53.6%) - which does not use synthetic images as in [3] - outperforms the standard GEI (48.2%). This demonstrates the effectiveness of the alpha matte preprocessing, and it can be foreseen that, when implementing synthetic images, recognition rates can be improved even further. We cannot compare our α-eigenface method, since currently no other face recognition method has been applied to the Human ID Gait database.

Both our face (54.6%) and our gait recognition method (53.6%) alone cannot compete with the current state of the art. However, when combining these multimodal methods, recognition rates exceed all previous approaches. This shows the importance of simultaneously using multiple modalities and fusing them. It can be seen that simple product and sum rules lead to good fusion results and to a dramatic increase in performance.
8. Conclusion and Outlook
In this work, a new preprocessing method using closed form alpha matting was introduced. It was applied to both face and gait recognition. In order to use this method, which typically requires a "human in the loop", an automated generation of the trimap was presented. Using this preprocessing it was possible to increase the performance of the standard Gait Energy Image.

Combining both the modified face and gait recognition methods, it was possible to achieve unprecedented performance results on the Human ID Gait challenge. Similar fusion techniques have currently only been carried out on other (smaller) datasets.

For future work, stronger and better face and gait methods should be combined. It can be foreseen that recognition rates could improve even further.
References
[1] C. BenAbdelkader, R. Cutler, and L. Davis. Stride and cadence as a biometric in automatic person identification and verification. In Proceedings Fifth IEEE International Conference on Automatic Face and Gesture Recognition, pages 372–377. IEEE, 2002.
[2] J. Cutting and L. Kozlowski. Recognizing friends by their walk: Gait perception without familiarity cues. Bulletin of the Psychonomic Society, 9(5):353–356, 1977.
[3] J. Han and B. Bhanu. Individual recognition using gait energy image. IEEE Transactions on Pattern Analysis and Machine Intelligence, pages 316–322, 2006.
[4] P. Huang, C. Harris, and M. Nixon. Recognising humans by gait via parametric canonical space. Journal of Artificial Intelligence in Engineering, 13(4):359–366, November 1999.
[5] Y. Huang, D. Xu, and T.-J. Cham. Face and human gait recognition using image-to-class distance. IEEE Trans. Circuits Syst. Video Techn., 20(3):431–438, 2010.

Citations
More filters
Journal ArticleDOI
TL;DR: This work provides a means for multimodal gait recognition, by introducing the freely available TUM Gait from Audio, Image and Depth (GAID) database, which simultaneously contains RGB video, depth and audio.

195 citations


Cites background or methods from "Combined face and gait recognition ..."

  • ...Many recent gait recognition approaches rely solely on visual data [1, 2, 3, 4, 5, 6, 7]....

    [...]

  • ...Profile and side-view approaches as well as multiview approaches have been used in combination with gait recognition [7, 17, 18, 19, 20, 21]....

    [...]

Journal ArticleDOI
TL;DR: A comprehensive overview of existing robust gait recognition methods is provided to provide researchers with state of the art approaches in order to help advance the research topic through an understanding of basic taxonomies, comparisons, and summaries of the state-of-the-art performances on several widely used gait Recognition datasets.
Abstract: Gait recognition has emerged as an attractive biometric technology for the identification of people by analysing the way they walk. However, one of the main challenges of the technology is to address the effects of inherent various intra-class variations caused by covariate factors such as clothing, carrying conditions, and view angle that adversely affect the recognition performance. The main aim of this survey is to provide a comprehensive overview of existing robust gait recognition methods. This is intended to provide researchers with state of the art approaches in order to help advance the research topic through an understanding of basic taxonomies, comparisons, and summaries of the state-of-the-art performances on several widely used gait recognition datasets.

83 citations

Journal ArticleDOI
TL;DR: An implementation blueprint for a multi-biometric system is presented in the form of a list of questions to be answered when designing the system, and a comprehensive review of current issues, including sensor spoofing, template security, and biometric encryption are discussed.
Abstract: Increasing operational and security demands changed biometrics by shifting the focus from single to multi-biometrics. Multi-biometrics are mandatory in the current context of large international biometric databases and to accommodate new emerging security demands. Our paper is a comprehensive survey on multi-biometrics, covering two important topics related to the multi-biometric field: fusion methods and security. Fusion is a core requirement in multi-biometric systems, being the method used to combine multiple biometric methods into a single system. The fusion section surveys recent multi-biometric schemes categorized from the perspective of fusion method. The security section is a comprehensive review of current issues, such as sensor spoofing, template security, and biometric encryption. New research trends and open challenges are discussed, such as soft, adaptive contextual-based biometrics. Finally, an implementation blueprint for a multi-biometric system is presented in the form of a list of questions to be answered when designing the system.

59 citations


Cites methods from "Combined face and gait recognition ..."

  • ...[132] using a foreground segmentation technique based on alpha-matting....

    [...]

Journal ArticleDOI
TL;DR: The proposed postured-based gait recognition technique outperforms the existing techniques in both fixed direction and freestyle walk scenarios, which suggests that a set of postures and quick movements are sufficient to identify a person.
Abstract: With the increase of terrorist threats around the world, human identification research has become a sought after area of research. Unlike standard biometric recognition techniques, gait recognition is a non-intrusive technique. Both data collection and classification processes can be done without a subject’s cooperation. In this paper, we propose a new model-based gait recognition technique called postured-based gait recognition. It consists of two elements: posture-based features and posture-based classification. Posture-based features are composed of displacements of all joints between current and adjacent frames and center-of-body (CoB) relative coordinates of all joints, where the coordinates of each joint come from its relative position to four joints: hip-center, hip-left, hip-right, and spine joints, from the front forward. The CoB relative coordinate system is a critical part to handle the different observation angle issue. In posture-based classification, postured-based gait features of all frames are considered. The dominant subject becomes a classification result. The postured-based gait recognition technique outperforms the existing techniques in both fixed direction and freestyle walk scenarios, where turning around and changing directions are involved. This suggests that a set of postures and quick movements are sufficient to identify a person. The proposed technique also performs well under the gallery-size test and the cumulative match characteristic test, which implies that the postured-based gait recognition technique is not gallery-size sensitive and is a good potential tool for forensic and surveillance use.

57 citations
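The posture-based features described above (inter-frame joint displacements plus center-of-body-relative coordinates) can be sketched in a few lines. The joint count and the indices of the four CoB reference joints below are illustrative assumptions, not taken from the cited paper:

```python
import numpy as np

def posture_features(joints_prev, joints_curr, cob_ids=(0, 1, 2, 3)):
    """Sketch of posture-based gait features: per-joint displacement
    between adjacent frames, concatenated with coordinates relative to
    a center-of-body (CoB) estimated from four reference joints.

    joints_*: (J, 3) arrays of 3-D joint positions; cob_ids picks the
    hip-center, hip-left, hip-right and spine joints (indices are
    hypothetical placeholders)."""
    displacement = joints_curr - joints_prev           # (J, 3)
    cob = joints_curr[list(cob_ids)].mean(axis=0)      # CoB estimate
    relative = joints_curr - cob                       # view-robust coords
    return np.concatenate([displacement.ravel(), relative.ravel()])

# Example: 20 joints tracked over two consecutive frames
f0 = np.random.rand(20, 3)
f1 = f0 + 0.01
feat = posture_features(f0, f1)
print(feat.shape)  # (120,)
```

Because the relative coordinates are taken with respect to the body itself rather than the camera, the feature is less sensitive to the observation angle, which is the point the abstract stresses.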


Cites background from "Combined face and gait recognition ..."

  • ...Some works, [13], [16], [17], have addressed the issue of the silhouette quality and background noise of GEI by using a standard gait model or improving the segmentation pre-processing step....

    [...]

Journal ArticleDOI
Ke Yang1, Yong Dou1, Shaohe Lv1, Fei Zhang1, Qi Lv1 
TL;DR: This study shows that gait features can effectively distinguish between different human beings through a novel representation - relative distance-based gait features - and indicates that the relative distance feature is quite effective and worthy of further study in more general scenarios.

52 citations


Cites background from "Combined face and gait recognition ..."

  • ...An enhanced version of GEI, namely, alphaGEI is proposed to mitigate such a non-random noise in reference [23]....

    [...]

References
More filters
Journal ArticleDOI
TL;DR: In this paper, a face detection framework is described that is capable of processing images extremely rapidly while achieving high detection rates; implemented on a conventional desktop, detection proceeds at 15 frames per second.
Abstract: This paper describes a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates. There are three key contributions. The first is the introduction of a new image representation called the “Integral Image” which allows the features used by our detector to be computed very quickly. The second is a simple and efficient classifier which is built using the AdaBoost learning algorithm (Freund and Schapire, 1995) to select a small number of critical visual features from a very large set of potential features. The third contribution is a method for combining classifiers in a “cascade” which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions. A set of experiments in the domain of face detection is presented. The system yields face detection performance comparable to the best previous systems (Sung and Poggio, 1998; Rowley et al., 1998; Schneiderman and Kanade, 2000; Roth et al., 2000). Implemented on a conventional desktop, face detection proceeds at 15 frames per second.

13,037 citations
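The "Integral Image" named in the abstract is a cumulative-sum table that makes any rectangle sum cost four lookups, regardless of the rectangle's size. A generic illustration (not the authors' implementation):

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero border: ii[y, x] = sum of img[:y, :x]."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, y0, x0, y1, x1):
    """Sum of img[y0:y1, x0:x1] in O(1) via four table lookups."""
    return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 3, 3))  # 5 + 6 + 9 + 10 = 30
```

The Haar-like features used by the detector are differences of such rectangle sums, which is why the representation makes feature evaluation so fast.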

Proceedings ArticleDOI
07 Jul 2001
TL;DR: A new image representation called the “Integral Image” is introduced which allows the features used by the detector to be computed very quickly and a method for combining classifiers in a “cascade” which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions.
Abstract: This paper describes a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates. There are three key contributions. The first is the introduction of a new image representation called the "Integral Image" which allows the features used by our detector to be computed very quickly. The second is a simple and efficient classifier which is built using the AdaBoost learning algorithm (Freund and Schapire, 1995) to select a small number of critical visual features from a very large set of potential features. The third contribution is a method for combining classifiers in a "cascade" which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions. A set of experiments in the domain of face detection is presented. The system yields face detection performance comparable to the best previous systems (Sung and Poggio, 1998; Rowley et al., 1998; Schneiderman and Kanade, 2000; Roth et al., 2000). Implemented on a conventional desktop, face detection proceeds at 15 frames per second.

10,592 citations

Proceedings ArticleDOI
23 Jun 1999
TL;DR: This paper discusses modeling each pixel as a mixture of Gaussians and using an on-line approximation to update the model, resulting in a stable, real-time outdoor tracker which reliably deals with lighting changes, repetitive motions from clutter, and long-term scene changes.
Abstract: A common method for real-time segmentation of moving regions in image sequences involves "background subtraction", or thresholding the error between an estimate of the image without moving objects and the current image. The numerous approaches to this problem differ in the type of background model used and the procedure used to update the model. This paper discusses modeling each pixel as a mixture of Gaussians and using an on-line approximation to update the model. The Gaussian, distributions of the adaptive mixture model are then evaluated to determine which are most likely to result from a background process. Each pixel is classified based on whether the Gaussian distribution which represents it most effectively is considered part of the background model. This results in a stable, real-time outdoor tracker which reliably deals with lighting changes, repetitive motions from clutter, and long-term scene changes. This system has been run almost continuously for 16 months, 24 hours a day, through rain and snow.

7,660 citations
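The adaptive mixture model described above can be illustrated with a deliberately simplified single-Gaussian-per-pixel sketch; the full method keeps K weighted Gaussians per pixel and ranks them by weight over variance to decide which model the background. The threshold k and learning rate alpha below are assumptions:

```python
import numpy as np

def update_background(frame, mean, var, alpha=0.05, k=2.5):
    """One-mode sketch of adaptive background subtraction: a pixel far
    from its running mean (in units of its running std) is foreground;
    background statistics are updated in place with learning rate alpha."""
    dist = np.abs(frame - mean)
    foreground = dist > k * np.sqrt(var)
    bg = ~foreground
    # update the per-pixel Gaussian only where the pixel matched the model
    mean[bg] += alpha * (frame[bg] - mean[bg])
    var[bg] += alpha * ((frame[bg] - mean[bg]) ** 2 - var[bg])
    return foreground

# toy scene: static gray background, one bright moving pixel
mean = np.full((4, 4), 100.0)
var = np.full((4, 4), 4.0)
frame = mean.copy()
frame[2, 2] = 200.0
fg = update_background(frame, mean, var)
print(fg[2, 2], fg[0, 0])  # True False
```

Running this per frame yields the slow adaptation to lighting changes that the abstract describes; the multi-Gaussian version additionally absorbs repetitive motion (e.g. swaying branches) into extra background modes.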


"Combined face and gait recognition ..." refers methods in this paper

  • ...For background segmentation we use Gaussian mixture models [14], for alpha matting we used closed form matting [8]....

    [...]

Journal ArticleDOI
TL;DR: A closed-form solution to natural image matting that allows us to find the globally optimal alpha matte by solving a sparse linear system of equations and predicts the properties of the solution by analyzing the eigenvectors of a sparse matrix, closely related to matrices used in spectral image segmentation algorithms.
Abstract: Interactive digital matting, the process of extracting a foreground object from an image based on limited user input, is an important task in image and video editing. From a computer vision perspective, this task is extremely challenging because it is massively ill-posed - at each pixel we must estimate the foreground and the background colors, as well as the foreground opacity ("alpha matte") from a single color measurement. Current approaches either restrict the estimation to a small part of the image, estimating foreground and background colors based on nearby pixels where they are known, or perform iterative nonlinear estimation by alternating foreground and background color estimation with alpha estimation. In this paper, we present a closed-form solution to natural image matting. We derive a cost function from local smoothness assumptions on foreground and background colors and show that in the resulting expression, it is possible to analytically eliminate the foreground and background colors to obtain a quadratic cost function in alpha. This allows us to find the globally optimal alpha matte by solving a sparse linear system of equations. Furthermore, the closed-form formula allows us to predict the properties of the solution by analyzing the eigenvectors of a sparse matrix, closely related to matrices used in spectral image segmentation algorithms. We show that high-quality mattes for natural images may be obtained from a small amount of user input.

1,851 citations
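The sparse linear system mentioned in the abstract can be sketched for the grayscale case; the paper derives the matting Laplacian for RGB with a per-window covariance matrix, and a single channel keeps the sketch short. Window radius, eps, and the constraint weight lam below are illustrative choices:

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def matting_laplacian_gray(I, eps=1e-5, r=1):
    """Grayscale simplification of the closed-form matting Laplacian.
    Each (2r+1)^2 window w contributes, for pixels i, j in w:
    delta_ij - (1/|w|)(1 + (I_i - mu)(I_j - mu) / (sigma^2 + eps/|w|))."""
    h, w = I.shape
    n = h * w
    wsz = (2 * r + 1) ** 2
    idx = np.arange(n).reshape(h, w)
    rows, cols, vals = [], [], []
    for y in range(r, h - r):
        for x in range(r, w - r):
            patch = I[y - r:y + r + 1, x - r:x + r + 1].ravel()
            ids = idx[y - r:y + r + 1, x - r:x + r + 1].ravel()
            d = patch - patch.mean()
            G = (1.0 + np.outer(d, d) / (patch.var() + eps / wsz)) / wsz
            A = np.eye(wsz) - G          # per-window Laplacian contribution
            rows.extend(np.repeat(ids, wsz))
            cols.extend(np.tile(ids, wsz))
            vals.extend(A.ravel())
    return sparse.coo_matrix((vals, (rows, cols)), shape=(n, n)).tocsr()

def solve_alpha(I, scribbles, lam=100.0):
    """Globally optimal alpha from (L + lam*D) a = lam*D*b, where D marks
    scribbled pixels. scribbles: 1 = foreground, 0 = background, nan = unknown."""
    L = matting_laplacian_gray(I)
    known = ~np.isnan(scribbles.ravel())
    D = sparse.diags(known.astype(float))
    b = np.nan_to_num(scribbles.ravel())
    alpha = spsolve((L + lam * D).tocsc(), lam * (D @ b))
    return np.clip(alpha, 0.0, 1.0).reshape(I.shape)

# toy image: dark left half (background), bright right half (foreground)
I = np.hstack([np.full((6, 3), 0.1), np.full((6, 3), 0.9)])
scribbles = np.full((6, 6), np.nan)
scribbles[0, 0] = 0.0   # one background scribble
scribbles[5, 5] = 1.0   # one foreground scribble
alpha = solve_alpha(I, scribbles)
```

The two scribbles propagate through the Laplacian, so the recovered matte is near 0 on the dark half and near 1 on the bright half; this global propagation from sparse input is the property the abstract highlights.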

Journal ArticleDOI
TL;DR: Experimental results show that the proposed GEI is an effective and efficient gait representation for individual recognition, and the proposed approach achieves highly competitive performance with respect to the published gait recognition approaches.
Abstract: In this paper, we propose a new spatio-temporal gait representation, called Gait Energy Image (GEI), to characterize human walking properties for individual recognition by gait. To address the problem of the lack of training templates, we also propose a novel approach for human recognition by combining statistical gait features from real and synthetic templates. We directly compute the real templates from training silhouette sequences, while we generate the synthetic templates from training sequences by simulating silhouette distortion. We use a statistical approach for learning effective features from real and synthetic templates. We compare the proposed GEI-based gait recognition approach with other gait recognition approaches on USF HumanID Database. Experimental results show that the proposed GEI is an effective and efficient gait representation for individual recognition, and the proposed approach achieves highly competitive performance with respect to the published gait recognition approaches

1,670 citations
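The GEI described above is simply the pixel-wise average of aligned binary silhouettes over a gait cycle, so a minimal sketch is short (alignment and size normalization are assumed to be done beforehand):

```python
import numpy as np

def gait_energy_image(silhouettes):
    """GEI: pixel-wise mean of size-normalized, centered binary
    silhouettes over one gait cycle. silhouettes: (T, H, W) in {0, 1}."""
    return np.asarray(silhouettes, dtype=float).mean(axis=0)

# toy cycle: a static 'torso' column in every frame, a swinging 'leg'
frames = np.zeros((4, 8, 6))
frames[:, :, 2] = 1        # torso pixels present in all 4 frames -> 1.0
frames[0, 6, 4] = 1        # leg pixel present in 1 of 4 frames   -> 0.25
gei = gait_energy_image(frames)
print(gei[0, 2], gei[6, 4])  # 1.0 0.25
```

High-energy pixels capture the static body shape while intermediate values encode limb dynamics, which is why a single GEI summarizes a whole sequence for recognition.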


"Combined face and gait recognition ..." refers background in this paper

  • ...Here, the person identity is directly inferred from the features without an intermediate person model....

    [...]

Frequently Asked Questions (8)
Q1. What contributions have the authors mentioned in the paper "Combined face and gait recognition using alpha matte preprocessing" ?

This paper presents advances on the Human ID Gait Challenge. By fusing gait and face recognition, the authors show significant performance improvement. 

For future work, stronger and better face and gait methods should be combined. It can be foreseen that recognition rates could improve even further. 

The splitting of the test sequences has the advantage that, for each sequence, multiple sub-faces of each person can be used for classification. 

A major advantage of these behavior based features over other physiologic features is the possibility to identify people from large distances and without the person’s direct cooperation. 

Due to the nature of the image capturing, there is a band on the silhouette which belongs partially to the foreground and partially to the background. 

These (c − 1)-dimensional vectors z_k are obtained as follows:

z_k = U_mda y_k,  k = 1, . . . , N   (4)

where U_mda is the transformation matrix obtained using MDA. 

Then the projection to the d′ < d dimensional PCA space is given by:

y_k = U_pca (g_k − ḡ),  k = 1, . . . , N   (3)

Here U_pca is the d′ × d transformation matrix with the first d′ orthonormal basis vectors obtained using PCA on the training set {g_1, g_2, . . . , g_N}, and ḡ = (1/N) Σ_{k=1}^{N} g_k is the mean of the training set. 

Even though face recognition has its performance peak at high-resolution frontal face images, it can still be seen that facial profile recognition can contribute to the performance when combined correctly.
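The PCA projection of Eq. (3) quoted above can be sketched with a plain SVD; the subsequent MDA stage of Eq. (4) would multiply the resulting y_k by U_mda and is omitted here. Dimensions are illustrative, not from the paper:

```python
import numpy as np

def pca_projection(G, d_prime):
    """Eq. (3)-style projection: y_k = U_pca (g_k - g_mean), keeping the
    first d' orthonormal basis vectors from PCA on the training set.
    G: (N, d) matrix whose rows are the training feature images g_k."""
    g_mean = G.mean(axis=0)
    X = G - g_mean                              # center the training set
    # principal directions = right singular vectors of the centered data
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    U_pca = Vt[:d_prime]                        # (d', d) transformation matrix
    Y = X @ U_pca.T                             # projected training vectors y_k
    return Y, U_pca, g_mean

# 50 training feature images of dimension d = 100, reduced to d' = 10
G = np.random.rand(50, 100)
Y, U_pca, g_mean = pca_projection(G, 10)
print(Y.shape)  # (50, 10)
```

A test vector g is then projected with `U_pca @ (g - g_mean)`; the MDA step would further reduce these y_k to (c − 1) dimensions for a c-class problem.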