Combined face and gait recognition using alpha matte preprocessing
Summary (3 min read)
1. Introduction
- The focus of this paper is on recognizing people from larger distances.
- In their approach the authors make use of gait recognition combined with person identification based on low-resolution face profile images.
- Early studies in 1977 by Cutting and Kozlowski [2] suggest that it is possible to recognize friends from just their way of walking.
- A major advantage of these behavior based features over other physiologic features is the possibility to identify people from large distances and without the person’s direct cooperation.
- However, the authors feel that a lot of identity information gets lost through this early binarization.
3. Segmentation using Alpha Mattes
- Current gait recognition methods rely on good segmentation to extract the contour and the silhouettes of the foreground objects.
- Then the foreground is estimated by finding the pixels with significant deviation from the background model.
- To cope with the high number of unknowns, proximity and smoothness assumptions are made.
- Also, the typical matting application has a human in the loop who has to provide some scribbles for foreground and background, leading to the so-called trimap.
- It can be seen that this segmentation is superior to the initial background segmentation.
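The summary does not say how the scribbles are replaced, so the following is only a minimal sketch of one common way to build a trimap automatically from a binary background-subtraction mask. It assumes that eroding the mask gives sure foreground, that everything outside the dilated mask is sure background, and that the band in between is left unknown for the matting step. The names `make_trimap` and `band` are hypothetical and not taken from the paper.

```python
import numpy as np

def _dilate(mask: np.ndarray, steps: int) -> np.ndarray:
    """Binary dilation with a 3x3 square element, repeated `steps` times."""
    out = mask.astype(bool)
    h, w = out.shape
    for _ in range(steps):
        padded = np.pad(out, 1, mode="constant", constant_values=False)
        grown = np.zeros_like(out)
        for dy in range(3):
            for dx in range(3):
                grown |= padded[dy:dy + h, dx:dx + w]
        out = grown
    return out

def make_trimap(fg_mask: np.ndarray, band: int = 2) -> np.ndarray:
    """Return 1.0 for sure foreground, 0.0 for sure background, 0.5 for unknown."""
    fg = fg_mask.astype(bool)
    sure_bg = ~_dilate(fg, band)    # pixels far enough outside the mask
    sure_fg = ~_dilate(~fg, band)   # erosion = complement of the dilated complement
    trimap = np.full(fg.shape, 0.5)
    trimap[sure_fg] = 1.0
    trimap[sure_bg] = 0.0
    return trimap

# Synthetic 6x6 square standing in for a segmented person (assumption).
mask = np.zeros((12, 12), dtype=bool)
mask[3:9, 3:9] = True
trimap = make_trimap(mask, band=1)
```

The width of the unknown band controls how much of the silhouette border is handed to the matting algorithm; a wider band tolerates a poorer initial segmentation at the cost of more unknowns.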
4.1. Feature Extraction using α-GEI
- For gait recognition the authors use a method based on the classical Gait Energy Image (GEI) [3].
- Instead of using binary silhouettes, the authors use the alpha channel from the alpha matting as described in the previous section.
- The authors call this the Alpha Gait Energy Image (α-GEI). In essence, the Alpha Gait Energy Image is an arithmetic mean of the alpha channel.
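Written out, the arithmetic mean over T aligned alpha mattes is $G(x, y) = \frac{1}{T} \sum_{t=1}^{T} \alpha_t(x, y)$. The sketch below illustrates it on synthetic data. The function name `alpha_gei` is hypothetical, and the 128 × 88 frame size is an assumption chosen only because it yields the 11264 dimensions mentioned in Section 4.2.

```python
import numpy as np

def alpha_gei(alpha_frames: np.ndarray) -> np.ndarray:
    """Average a stack of aligned alpha mattes of shape (T, H, W), values in [0, 1]."""
    frames = np.asarray(alpha_frames, dtype=np.float64)
    if frames.ndim != 3:
        raise ValueError("expected an array of shape (T, H, W)")
    return frames.mean(axis=0)

rng = np.random.default_rng(0)
frames = rng.random((30, 128, 88))   # synthetic stand-in for 30 alpha mattes
gei = alpha_gei(frames)
feature = gei.reshape(-1)            # flattened feature vector
```

Because the alpha values are continuous, the average keeps the soft boundary information that a binary silhouette would discard.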
4.2. Feature Space Reduction
- Thus the feature vector is still large with 11264 dimensions.
- The authors apply principal component analysis (PCA) followed by multiple discriminant analysis (MDA) to reduce the size of the feature vector.
- While PCA seeks a projection that best represents the data, MDA seeks a projection that best separates the data.
- This matrix results from optimizing the ratio of the between-class scatter matrix $S_B$ and the within-class scatter matrix $S_W$: $J(U_{mda}) = \frac{|\tilde{S}_B|}{|\tilde{S}_W|} = \frac{|U_{mda}^T S_B U_{mda}|}{|U_{mda}^T S_W U_{mda}|}$.
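The two-stage reduction can be sketched as follows. This is an illustration on synthetic data, not the authors' implementation: the helper names `fit_pca` and `fit_mda` are hypothetical, the data sizes are arbitrary, and the scatter ratio is maximized here by taking the leading eigenvectors of $S_W^{-1} S_B$, which is the textbook solution.

```python
import numpy as np

def fit_pca(G: np.ndarray, d_prime: int):
    """G has shape (N, d). Returns the mean and a (d_prime, d) projection matrix."""
    mean = G.mean(axis=0)
    # Rows of Vt are orthonormal principal directions, sorted by variance.
    _, _, Vt = np.linalg.svd(G - mean, full_matrices=False)
    return mean, Vt[:d_prime]

def fit_mda(Y: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Y has shape (N, d_prime). Returns a (C - 1, d_prime) projection matrix."""
    classes = np.unique(labels)
    overall = Y.mean(axis=0)
    dim = Y.shape[1]
    S_w = np.zeros((dim, dim))
    S_b = np.zeros((dim, dim))
    for c in classes:
        Yc = Y[labels == c]
        mu = Yc.mean(axis=0)
        S_w += (Yc - mu).T @ (Yc - mu)
        S_b += len(Yc) * np.outer(mu - overall, mu - overall)
    # Directions maximizing the scatter ratio solve S_w^{-1} S_b u = lambda u.
    evals, evecs = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(-evals.real)[: len(classes) - 1]
    return evecs.real[:, order].T

# Synthetic data: 3 classes, 10 samples each, 50 dimensions (all assumptions).
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 10)
centers = rng.normal(0.0, 5.0, size=(3, 50))
G = centers[labels] + rng.normal(0.0, 1.0, size=(30, 50))

g_mean, U_pca = fit_pca(G, d_prime=10)
Y = (G - g_mean) @ U_pca.T    # PCA projection of every sample
U_mda = fit_mda(Y, labels)
Z = Y @ U_mda.T               # MDA projection, C - 1 = 2 dimensions
```

Running PCA first keeps the within-class scatter matrix well conditioned, which matters when the raw feature dimension far exceeds the number of training samples.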
4.3. Classification
- Each class c is modeled with only one vector, the mean feature vector $z_c$: $z_c = \frac{1}{|Z_c|} \sum_{z \in Z_c} z$ (7). For each α-GEI $\hat{g}_i$ from the test set, the authors perform the transformation in Equation 6 to get the reduced feature vector $\hat{z}_i$.
- The distance $D^{gait}_i(c)$ defines, for every test sequence i, the distance to the c-th class.
- Final person identification using gait then becomes a nearest-neighbor classification.
- The authors assign a class label $L_i$ to each test gait image according to $L_i = \arg\min_c D^{gait}_i(c)$ (8).
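The classification step above can be sketched in a few lines. The summary does not name the distance measure, so Euclidean distance is an assumption here, and the helper names and toy vectors are hypothetical.

```python
import numpy as np

def class_means(Z: np.ndarray, labels: np.ndarray):
    """One mean feature vector per class, as in Equation 7."""
    classes = np.unique(labels)
    means = np.stack([Z[labels == c].mean(axis=0) for c in classes])
    return classes, means

def gait_distances(Z_test: np.ndarray, means: np.ndarray) -> np.ndarray:
    """D[i, c] = Euclidean distance (assumed) between test vector i and class mean c."""
    diff = Z_test[:, None, :] - means[None, :, :]
    return np.linalg.norm(diff, axis=2)

def classify(D: np.ndarray, classes: np.ndarray) -> np.ndarray:
    """Pick the class with the smallest distance, as in Equation 8."""
    return classes[np.argmin(D, axis=1)]

# Toy reduced feature vectors for two classes (assumption).
Z_train = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.0, 5.2]])
train_labels = np.array([0, 0, 1, 1])
Z_test = np.array([[0.1, 0.3], [4.8, 5.1]])

classes, means = class_means(Z_train, train_labels)
D_gait = gait_distances(Z_test, means)
predicted = classify(D_gait, classes)
```

Keeping the full distance matrix rather than only the winning label is what makes the later fusion with the face scores possible.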
5.1. Pre-faces
- In the first part of the algorithm, the gallery set is processed.
- Thus, five consecutive pre-faces are always combined.
- Those five pre-faces are registered using sum of absolute differences.
- Note that due to the alpha matte preprocessing the segmentations contain only color foreground regions.
- This way, multiple aspects of the person are captured and in addition, the influence of erroneous segmentations is reduced.
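Registration by sum of absolute differences (SAD) can be illustrated with a brute-force search over small integer shifts. This is a sketch under stated assumptions: the search range `max_shift`, the use of the first pre-face as reference, and the averaging of the aligned pre-faces are all hypothetical choices, since the summary only says that five pre-faces are registered and combined.

```python
import numpy as np

def sad_register(reference: np.ndarray, image: np.ndarray, max_shift: int = 3):
    """Find the integer (dy, dx) roll of `image` that minimizes SAD against `reference`."""
    h, w = reference.shape
    m = max_shift
    # Compare only the interior so pixels wrapped around by np.roll are ignored.
    core = reference[m:h - m, m:w - m].astype(np.float64)
    best_shift, best_cost = (0, 0), np.inf
    for dy in range(-m, m + 1):
        for dx in range(-m, m + 1):
            shifted = np.roll(image, (dy, dx), axis=(0, 1))[m:h - m, m:w - m]
            cost = np.abs(core - shifted).sum()
            if cost < best_cost:
                best_shift, best_cost = (dy, dx), cost
    return best_shift

def combine(prefaces, max_shift: int = 3) -> np.ndarray:
    """Register every pre-face to the first one and average them (assumed combination)."""
    ref = prefaces[0]
    aligned = [ref.astype(np.float64)]
    for img in prefaces[1:]:
        dy, dx = sad_register(ref, img, max_shift)
        aligned.append(np.roll(img, (dy, dx), axis=(0, 1)).astype(np.float64))
    return np.mean(aligned, axis=0)

# Five synthetic pre-faces: circularly shifted copies of one random image.
rng = np.random.default_rng(1)
base = rng.random((20, 20))
shifts = [(0, 0), (1, 0), (0, -2), (2, 1), (-1, 1)]
prefaces = [np.roll(base, s, axis=(0, 1)) for s in shifts]

recovered = sad_register(base, prefaces[3])
face = combine(prefaces)
```

In this toy setup the recovered shift simply undoes the synthetic displacement; on real pre-faces the minimum SAD would be nonzero and the average would smooth out segmentation errors.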
5.2. Eigenface Calculation
- The authors apply the classical eigenface method [18] for face recognition.
- This means that the average face is calculated by taking the mean.
- This average face is subtracted from the gallery faces and a covariance matrix is estimated from the gallery data.
- In order to capture color information like skin and hair color, all three color channels are appended and used for the calculation of the covariance matrix.
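The eigenface computation with appended color channels can be sketched as below. The function names, the image size, and the number of components are hypothetical. The sketch also uses an SVD of the centered data instead of forming the covariance matrix explicitly, which gives the same eigenvectors and is a substitution made here for efficiency, not something stated in the summary.

```python
import numpy as np

def _flatten(faces: np.ndarray) -> np.ndarray:
    """Append the three color channels into one long vector per face."""
    n = faces.shape[0]
    return faces.transpose(0, 3, 1, 2).reshape(n, -1).astype(np.float64)

def fit_eigenfaces(faces: np.ndarray, n_components: int):
    """faces: (N, H, W, 3). Returns mean face, eigenfaces, and their variances."""
    X = _flatten(faces)
    mean_face = X.mean(axis=0)
    A = X - mean_face                       # subtract the average face
    # Right singular vectors of A are the eigenvectors of the covariance A^T A / (N - 1).
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    variances = (s[:n_components] ** 2) / (X.shape[0] - 1)
    return mean_face, Vt[:n_components], variances

def project(faces: np.ndarray, mean_face: np.ndarray, eigenfaces: np.ndarray) -> np.ndarray:
    """Coefficients of each face in the eigenface basis."""
    return (_flatten(faces) - mean_face) @ eigenfaces.T

# Synthetic gallery of 12 color faces, 16x16 pixels (assumption).
rng = np.random.default_rng(2)
gallery = rng.random((12, 16, 16, 3))
mean_face, eigenfaces, variances = fit_eigenfaces(gallery, n_components=5)
weights = project(gallery, mean_face, eigenfaces)
```

Concatenating the channels triples the vector length, so skin and hair color contribute to the principal directions alongside shape.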
5.3. Classification
- Face recognition is done similarly to gait recognition.
- Typically one would use k-nearest neighbor in such a case.
- For the later fusion step, however, the authors need a continuous score for each potential class.
- Thus, for each of the sub faces of a test sequence, the authors calculate the distance to all sub faces of all training sequences.
- Out of these matches, the authors only keep the k nearest matches.
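The summary does not state how the k kept matches are turned into one continuous score per class, so the following is only one possible sketch. It assumes Euclidean distances, assigns each class the smallest distance among its kept matches, and gives classes without any kept match the largest kept distance. The function name `face_scores` and these aggregation choices are hypothetical.

```python
import numpy as np

def face_scores(test_sub, train_sub, train_labels, n_classes, k=5):
    """Return one distance-like score per class from the k nearest sub-face matches."""
    # All pairwise Euclidean distances between test and training sub faces.
    d = np.linalg.norm(test_sub[:, None, :] - train_sub[None, :, :], axis=2)
    flat = d.ravel()
    match_labels = np.tile(train_labels, len(test_sub))  # class of each flattened match
    keep = np.argsort(flat)[:k]                          # the k nearest matches overall
    # ASSUMPTION: classes without a kept match get the largest kept distance.
    scores = np.full(n_classes, flat[keep].max())
    for idx in keep:
        c = match_labels[idx]
        scores[c] = min(scores[c], flat[idx])            # ASSUMPTION: best match per class
    return scores

# Toy sub-face coefficients for three classes (assumption).
train_sub = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0],
                      [5.0, 6.0], [10.0, 0.0], [10.0, 1.0]])
train_labels = np.array([0, 0, 1, 1, 2, 2])
test_sub = np.array([[5.0, 5.2], [4.9, 5.5]])

D_face = face_scores(test_sub, train_sub, train_labels, n_classes=3, k=3)
```

Whatever aggregation is used, the point is that every class receives a numeric score, not just the winning one, so the result can be fused with the gait distances.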
6. Fusion of Face and Gait
- This means that the distance scores $D^{gait}(c)$ and $D^{face}(c)$ are combined before decision making.
- There are multiple ways of fusing the results.
- The distances result from different modalities, thus the values are not directly comparable.
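Score-level fusion can be sketched as a normalization followed by a combination rule. The results section names simple sum and product rules; the normalization is not given in this summary, so dividing each modality's distances by their sum is an assumption made here so that both lie on a comparable scale. The function names and toy distances are hypothetical.

```python
import numpy as np

def normalize(d) -> np.ndarray:
    """ASSUMPTION: scale distances of one modality so that they sum to 1."""
    d = np.asarray(d, dtype=np.float64)
    total = d.sum()
    if total <= 0:
        raise ValueError("distances must be non-negative with a positive sum")
    return d / total

def fuse(d_gait, d_face, rule: str = "sum") -> np.ndarray:
    """Combine normalized per-class distances; smaller fused values are better."""
    g, f = normalize(d_gait), normalize(d_face)
    if rule == "sum":
        return g + f
    if rule == "product":
        return g * f
    raise ValueError(f"unknown rule: {rule}")

# Toy per-class distances on very different scales (assumption).
D_gait = np.array([1.0, 2.0, 4.0])
D_face = np.array([30.0, 50.0, 20.0])

fused_sum = fuse(D_gait, D_face, rule="sum")
fused_product = fuse(D_gait, D_face, rule="product")
label_sum = int(np.argmin(fused_sum))
label_product = int(np.argmin(fused_product))
```

In this toy case gait alone prefers class 0 and face alone prefers class 2; after fusion both rules choose class 0, because the gait evidence for it is comparatively stronger.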
7. Results and Comparison
- Figure 5 shows the quantitative results on the Human ID Gait database.
- Summarizing results are shown in Table 1 (largely taken from [5]).
- Neither their face recognition method (54.6%) nor their gait recognition method (53.6%) alone can compete with the current state of the art.
- When combining these multimodal methods, recognition rates exceed all previous approaches.
- It can be seen that simple product and sum rules lead to good fusion results and to a dramatic increase in performance.
8. Conclusion and Outlook
- A new preprocessing method using closed form alpha matting was introduced.
- It was applied to both face and gait recognition.
- In order to use this method, which typically requires a "human in the loop", an automated generation of the trimap was presented.
- Similar fusion techniques have currently only been carried out on other datasets.
- It can be foreseen that recognition rates could improve even further.
Citations
Cites background or methods from "Combined face and gait recognition ..."
...Many recent gait recognition approaches rely solely on visual data [1, 2, 3, 4, 5, 6, 7]....
...Profile and side-view approaches as well as multiview approaches have been used in combination with gait recognition [7, 17, 18, 19, 20, 21]....
Cites methods from "Combined face and gait recognition ..."
...[132] using a foreground segmentation technique based on alpha-matting....
Cites background from "Combined face and gait recognition ..."
...Some works, [13], [16], [17], have addressed the issue of the silhouette quality and background noise of GEI by using a standard gait model or improving the segmentation pre-processing step....
Cites background from "Combined face and gait recognition ..."
...An enhanced version of GEI, namely, alphaGEI is proposed to mitigate such a non-random noise in reference [23]....
References
"Combined face and gait recognition ..." refers methods in this paper
...For background segmentation we use Gaussian mixture models [14], for alpha matting we used closed form matting [8]....
"Combined face and gait recognition ..." refers background in this paper
...Here, the person identity is directly inferred from the features without an intermediate person model....
Frequently Asked Questions (8)
Q2. What future work have the authors mentioned in the paper "Combined face and gait recognition using alpha matte preprocessing"?
For future work, stronger and better face and gait methods should be combined. It can be foreseen that recognition rates could improve even further.
Q3. What is the advantage of the splitting of the test sequences?
The splitting of the test sequences has the advantage that, for each sequence, multiple sub faces of each person can be used for classification.
Q4. What is the main advantage of behavior based features over other physiologic features?
A major advantage of these behavior based features over other physiologic features is the possibility to identify people from large distances and without the person’s direct cooperation.
Q5. Why is there a band on the silhouette?
Due to the nature of the image capturing, there is a band on the silhouette which belongs partially to foreground and partially to background.
Q6. What is the dimensional transformation matrix obtained using MDA?
These (c − 1)-dimensional vectors $z_k$ are obtained as follows: $z_k = U_{mda} y_k, \quad k = 1, \dots, N$ (4), where $U_{mda}$ is the transformation matrix obtained using MDA.
Q7. What is the d′ < d dimensional PCA space?
The projection to the d′ < d dimensional PCA space is given by $y_k = U_{pca}(g_k - \bar{g}), \quad k = 1, \dots, N$ (3). Here $U_{pca}$ is the d′ × d transformation matrix with the first d′ orthonormal basis vectors obtained using PCA on the training set $\{g_1, g_2, \dots, g_N\}$, and $\bar{g} = \frac{1}{N} \sum_{k=1}^{N} g_k$ is the mean of the training set.
Q8. What is the way to recognize a face?
Even though face recognition has its performance peak at high resolution frontal face images, it can still be seen that facial profile recognition can contribute to the performance, when combined correctly.