On Co-training Online Biometric Classifiers
Himanshu S. Bhatt, Samarth Bharadwaj, Richa Singh, Mayank Vatsa
IIIT Delhi, India
{himanshub, samarthb, rsingh, mayank}@iiitd.ac.in
Afzel Noore, Arun Ross
West Virginia University, USA
{afzel.noore, arun.ross}@mail.wvu.edu
Abstract

In an operational biometric verification system, changes in biometric data over a period of time can affect the classification accuracy. Online learning has been used for updating the classifier decision boundary. However, this requires labeled data that is only available during new enrolments. This paper presents a biometric classifier update algorithm in which the classifier decision boundary is updated using both labeled enrolment instances and unlabeled probe instances. The proposed co-training online classifier update algorithm is presented as a semi-supervised learning task and is applied to a face verification application. Experiments indicate that the proposed algorithm improves the performance both in terms of classification accuracy and computational time.
1. Introduction

A biometric verification system typically uses a classifier to determine if the unlabeled probe data matches the labeled gallery data. The performance of such a classifier is affected by the intra-class and inter-class dynamics as biometric data is acquired over a period of time [21]. New information that can affect the biometric data distribution (e.g., match scores) is available from two fronts: (1) new subjects enrolling into the biometric system (labeled data) and (2) previously enrolled subjects interacting with the system and providing new probes (unlabeled data). New enrolments can lead to variations in genuine and impostor score distributions, while probe images may introduce wide intra-class variations (due to temporal changes). To maintain the performance and to accommodate the variations caused by new enrolments and probes, biometric systems generally require re-training. Since re-training with existing and new information in batch mode requires a huge amount of time, it is not pragmatic for large-scale applications. However, if the classifiers are not re-trained, then the verification performance can be compromised.
Online learning [18] and co-training [7] are used to update classifiers in real time and make them scalable. These paradigms can also be used for updating biometric classifiers. Intuitively,

∙ labeled information from newly enrolled individuals can be used to update the classifier in incremental-decremental learning mode, also known as online learning. Since the corresponding labels ("genuine" or "impostor") are available during enrolment, classifier update using online learning can be viewed as a supervised learning approach.

∙ unlabeled information obtained at the probe level can be used to update the classifier using co-training. In the co-training framework, two classifiers evolve by co-training each other using unlabeled probe information. If the first classifier confidently predicts the class (genuine or impostor) for an instance, while the second classifier is unsure of its classification decision, then this data instance is added to re-train the second classifier with the pseudo label assigned by the first classifier.

If we incorporate both paradigms, then updating a biometric classifier can be posed as a semi-supervised learning [9] task that seamlessly exploits unlabeled data in addition to the labeled data.
In the literature, incremental (online) learning approaches for principal component analysis [18] and linear discriminant analysis [22] have shown the effectiveness of this paradigm. Kim et al. [14] have shown that online learning algorithms can be used for biometric score fusion in order to resolve the computational problems associated with an increasing number of users. Singh et al. [21] have proposed an online learning approach for updating a face classifier. Their results show that the performance of online SVM classifiers is comparable to that of their batch mode counterparts. Further, online SVM classifiers have a significant advantage of reduced re-training time, using only the new sample points to update the decision boundary.
In co-training, as proposed by Blum and Mitchell [7], two classifiers that are trained on separate views (features) co-train each other based on their confidence in predicting the labels. Nonetheless, the success of a co-training framework is susceptible to various assumptions. Blum and Mitchell [7] showed that the two classifiers should have sufficient individual accuracy and should be conditionally independent of each other. Later, Abney [2] showed that weak dependence between the two classifiers can also guarantee successful co-training. Wang and Zhou [23] also reported the sufficient and necessary conditions for the success of a co-training framework.
Though co-training has been used in several computer vision applications, in the biometrics literature the use of unlabeled data for updating the system has been mainly restricted to biometric template updates. Jiang and Ser [13] proposed a method to improve fingerprint templates by merging and averaging minutiae from multiple samples of a fingerprint. Ryu et al. [20] also proposed a method to update fingerprint templates by appending new minutiae from the query fingerprint to the gallery fingerprint template. Balcan et al. [5] developed a method to address the problem of person identification in low quality web-camera images; they formulated the task of person identification in web-camera images as a graph-based semi-supervised learning problem. Roli et al. [19] designed a biometric system that uses co-training to address the temporal variations in a face and fingerprint based multimodal system. Liu et al. [15] proposed to retrain the Eigenspace in a face recognition system using the unlabeled data stream. Recently, Poh et al. [17] performed a study on the goal of semi-supervised learning, focusing on some of the challenges and research directions for designing adaptive biometric systems.
This research focuses on seamlessly improving the performance of a biometric classifier by updating the classifier's knowledge using additional labeled data obtained during new enrolments as well as unlabeled data obtained during probe verification. The paper presents a framework for co-training biometric classifiers in an online manner. Specifically, the concepts of co-training and online learning are applied to a support vector machine (SVM) based biometric classifier update scenario. While online learning updates the SVM classifier using labeled enrolment data, co-training updates the SVM decision boundaries with a large number of unlabeled probe examples. The performance of the proposed co-training framework is evaluated in the context of multi-classifier SVM based face verification, where it shows improvements in both verification accuracy and computational time.
2. Proposed Co-training Online Framework

Mathematically, for a two-classifier biometric verification system, the process is as follows. Two types of data instances are available: a set of labeled data instances, $\{(\mathbf{u}_1, z_1), (\mathbf{u}_2, z_2), \ldots, (\mathbf{u}_n, z_n)\}$, available when new users are enrolled into the system, and a set of unlabeled data instances, $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$, available during probe verification. Every instance $\mathbf{u}_i$ (labeled or unlabeled) has two views, $\mathbf{u}_i = \{x_{i,1}, x_{i,2}\}$; here $x_{i,1}$ and $x_{i,2}$ represent the match scores obtained from the two classifiers, and the label $z_i \in \{+1, -1\}$ represents the genuine or impostor class. For labeled data instances available during enrolments, classifier $c_j$ predicts the label for every instance: $c_j(x_{i,j}) \rightarrow y_{i,j}$, where $y_{i,j}$ is the predicted label for the $i$-th instance on the $j$-th view, $i = 1, 2, \ldots, m$, $m$ is the total number of scores generated when a newly enrolled user is compared against the existing gallery and its own multiple samples, and $j$ denotes the view¹ (classifier), $j = 1, 2$. In online learning, classifiers $c_1$ and $c_2$ are updated for every incorrect prediction (i.e., when $y_i \neq z_i$), while no action is taken when the instances are correctly classified. For unlabeled instances, classifiers $c_1$ and $c_2$ predict labels on the two separate views, $c_1(x_{i,1}) \rightarrow y_{i,1}$ and $c_2(x_{i,2}) \rightarrow y_{i,2}$. Here, $x_{i,1}$ and $x_{i,2}$ are the two views of the $i$-th instance $\mathbf{u}_i$, and $y_{i,1}$ and $y_{i,2}$ are the corresponding predicted labels. Classifiers are co-trained for a given instance if one classifier confidently predicts the label of the instance while the other classifier is unsure of its prediction.

¹The terms "views" and "classifiers" are used interchangeably because each classifier is trained on a single view and, therefore, there are as many classifiers as there are views.
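For concreteness, the sketch below shows one possible way to represent these two-view score instances in code; the names (`Instance`, `x1`, `x2`, `z`) are illustrative assumptions and not taken from the paper.

```python
# A minimal sketch (assumed names) of the two-view match-score instances
# described above: x1 and x2 are the scores from the two classifiers/views,
# and z is +1 (genuine), -1 (impostor), or None for an unlabeled probe.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Instance:
    x1: float
    x2: float
    z: Optional[int] = None  # None for unlabeled probe instances

# Labeled instances arrive at enrolment; unlabeled instances arrive as probes.
enrolment_batch = [Instance(0.82, 0.77, +1), Instance(0.12, 0.30, -1)]
probe_batch = [Instance(0.64, 0.21)]
```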
2.1. Online SVM Classifiers

Let $\{\mathbf{u}_i, z_i\}$ be the set of data instances (scores), where $i = 1, \ldots, N$, $N$ is the total number of instances, and $z_i$ is the label such that $z_i \in \{+1, -1\}$. Since $\mathbf{u}_i$ represents the two individual views (classifiers), SVMs are trained individually for both views using $x_{i,j}$, where $j = 1, 2$. The basic principle behind SVM is to find the hyperplane $w\,\phi(x_{i,j}) + b = 0$ that separates the two classes with the widest margin, i.e., to maximize the margin or, equivalently, to minimize:

$$\min_{w,b,\epsilon} \; \frac{1}{2}\,\|w\|^2 + C \sum_{i=1}^{N} \epsilon_i \tag{1}$$

subject to the constraints:

$$z_i\left(w\,\phi(x_{i,j}) + b\right) \geq 1 - \epsilon_i, \quad \epsilon_i \geq 0, \quad i = 1, \ldots, N \tag{2}$$

where $\epsilon_i$ are the slack variables, $b$ is the offset of the decision hyperplane, $w$ is the normal weight vector, $\phi(x_{i,j})$ is the mapping function used to map the data space to the feature space, and $C$ is the tradeoff parameter between the permissible error in the samples and the margin. Note that, in this context, the input to the two-class SVM is match scores with labels $\{+1, -1\}$ representing the genuine and impostor classes. In large-scale biometrics applications, re-training the SVM classifiers is computationally expensive. Existing approaches allow the training of an SVM in an online manner using only the support vectors and the new data points. Methods to add or remove one sample at a time to update the SVM (in an online manner) are proposed in [8], [21], where an exact solution for $N \pm 1$ samples can be obtained using the $N$ old samples and the one sample to be added or removed.
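As a rough illustration of this idea, the sketch below trains a per-view SVM on one-dimensional match scores and approximates the online update by retraining on the retained support vectors plus the new sample. This is an assumed scikit-learn based stand-in, not the exact incremental/decremental solution of [8].

```python
# Sketch (assumptions: scikit-learn SVC with an RBF kernel) of a per-view SVM
# over 1-D match scores, with an approximate "online" step that retrains on
# the current support vectors plus the newly arrived score. The exact N±1
# incremental/decremental update of [8], [21] is not reproduced here.
import numpy as np
from sklearn.svm import SVC

def train_view_svm(scores, labels, C=1.0):
    X = np.asarray(scores, dtype=float).reshape(-1, 1)
    return SVC(kernel="rbf", C=C).fit(X, np.asarray(labels))

def approximate_online_update(clf, scores, labels, new_score, new_label, C=1.0):
    X = np.asarray(scores, dtype=float).reshape(-1, 1)
    y = np.asarray(labels)
    keep = clf.support_                      # indices of current support vectors
    X_new = np.vstack([X[keep], [[new_score]]])
    y_new = np.append(y[keep], new_label)
    return SVC(kernel="rbf", C=C).fit(X_new, y_new)
```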
Figure 1. Illustrating the online learning process where each classifier learns from the incorrectly classified instances.
Figure 1 shows the proposed online learning approach when two SVMs are used as biometric classifiers. SVM classifiers for each view/score are first trained on the initial enrolment training data $D_L$. A unique identification number is assigned to every user enrolled in the biometric system. Note that, during enrolment, we can store multiple samples from each individual to accommodate intra-class variations and to perform online learning on the SVM classifier. Biometric features of the new user are extracted and compared against the gallery of other individuals to compute the impostor match scores. For genuine match score computation, we use the multiple samples captured during enrolment. SVM classifiers are then used to classify each of these match scores as genuine or impostor. In the enrolment stage, the labels (ground truth) corresponding to the match scores are compared with the predictions of the classifier. The match scores for which the classifier makes incorrect predictions are used to update the decision boundary of the SVM classifier using online learning [21]. This online learning process is performed for both classifiers, and the two classifiers are updated independently. The online learning algorithm to update the classifiers is described in Algorithm 1.
Algorithm 1 Online Classifier Update
Input: Initial labeled enrolment training data $D_L$; a set of additional labeled instances $\{u_i, z_i\}$ obtained from enrolments, $i = 1, 2, \ldots, N$, where $N$ is the number of additional instances. Each instance $u_i = (x_{i,1}, x_{i,2})$ represents two views (or scores).
Iterate: $j = 1$ to number of views (number of classifiers)
Process: Train classifier $c_j$ on the $j$-th view of $D_L$
for $i = 1$ to $N$ do
    Predict label: $c_j(x_{i,j}) \rightarrow y_i$
    if $y_i \neq z_i$ then
        Update $c_j$ with labeled instance $\{x_{i,j}, z_i\}$
    end if
end for
End iterate
Output: Updated classifiers $c_1$ and $c_2$.
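A compact sketch of this update loop is given below; `update(clf, score, label)` is any caller-supplied online SVM step (for example, a wrapper around the earlier `approximate_online_update` sketch), and the names are assumptions rather than the paper's implementation.

```python
# Sketch of Algorithm 1: each per-view classifier is updated only on the
# enrolment instances it misclassifies. `classifiers` is [c1, c2] and
# `enrolment_instances` is a list of (x1, x2, z) tuples.
def online_classifier_update(classifiers, enrolment_instances, update):
    for j, clf in enumerate(classifiers):            # iterate over the two views
        for x1, x2, z in enrolment_instances:
            x_j = (x1, x2)[j]
            y_pred = clf.predict([[x_j]])[0]
            if y_pred != z:                          # update only on errors
                clf = update(clf, x_j, z)
        classifiers[j] = clf
    return classifiers
```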
2.2. Co-training SVM Classifiers

In biometrics, obtaining a large number of labeled examples is a difficult and expensive task. On the other hand, obtaining large-scale unlabeled examples is relatively easy. In a semi-supervised co-training framework, a small initial labeled training set is available for training the classifiers, and then a large number of unlabeled instances (scores generated during probe verification) become available sequentially once the system is in use. In the proposed framework, co-training is used to leverage the availability of multiple classifiers and unlabeled instances to update the decision boundaries of both classifiers and to account for the wide intra-class variations introduced by the probe set. It assumes the availability of two classifiers trained on separate views, where the classifier for each view has sufficient (better than random) classification performance. Further, it is important that the classifiers have low correlation in their match scores. This is because, with low correlation, the two classifiers potentially yield different results. For example, one classifier may correctly classify an unlabeled instance with high confidence, while the other classifier may make a mistake or may not be confident of its prediction. However, even with limited dependence, the proposed co-training framework can improve the performance of the individual classifiers, as discussed in [2].
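A simple empirical check of this weak-dependence assumption (an illustrative sketch, not part of the paper's method) is to measure the Pearson correlation between the two views' match scores separately for genuine and impostor comparisons; the paper later reports values of 0.58 and 0.46 for its two views.

```python
# Sketch: Pearson correlation between the two views' match scores, computed
# separately over genuine and impostor comparisons, as a rough check of the
# low-correlation assumption discussed above.
import numpy as np

def view_correlations(x1, x2, labels):
    x1, x2, labels = (np.asarray(a, dtype=float) for a in (x1, x2, labels))
    gen = labels == +1
    imp = labels == -1
    r_gen = np.corrcoef(x1[gen], x2[gen])[0, 1]
    r_imp = np.corrcoef(x1[imp], x2[imp])[0, 1]
    return r_gen, r_imp
```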
The two classifiers are first trained on an initial small labeled data set. During probe verification, instances (scores) are generated by comparing probe images against the gallery. Unlike online learning, the instances obtained during probe verification are unlabeled. For every query given to the biometric system, both classifiers are used to classify the instance. Here, each instance has two views, $\mathbf{u} = \{x_1, x_2\}$, and constitutes the unlabeled set $D_U$. If one classifier confidently predicts the genuine label for the instance while the other classifier predicts the impostor label with low confidence, then this instance is added as a labeled re-training sample for the second classifier, and vice-versa. In this manner, the co-training framework transforms unlabeled scores into labeled training data to update the classifiers.
Figure 2. Illustrates the process of computing the confidence of prediction for the SVM classifier.

Figure 3. Illustrates the co-training process where each online classifier provides informative labeled instances to the other classifier.

In the co-training approach, as shown in Figure 2, the confidence of prediction by each SVM classifier is measured in terms of the distance of the instance from the decision hyperplane. A genuine threshold is computed as the distance of the farthest impostor point that is erroneously classified as a genuine point. An impostor threshold is computed as the distance of the farthest genuine point that is erroneously classified as an impostor. For an instance to be confidently placed in the genuine class, its distance from the decision hyperplane should be greater than the genuine threshold. Similarly, for an instance to be confidently placed in the impostor class, its distance from the decision hyperplane should be greater than the impostor threshold. Varying the thresholds changes the number of instances on which co-training is performed: high threshold values imply conservative co-training, while smaller threshold values lead to aggressive co-training. The proposed co-training framework is illustrated in Figure 3 and described in Algorithm 2.

Algorithm 2 Co-training
Input: Set of labeled training data $D_L$; set of unlabeled instances $D_U$, where each instance $\mathbf{u} = (x_{i,1}, x_{i,2})$ represents two views/scores.
Process: Train classifier $c_j$ on the separate views of $D_L$. Compute confidence threshold $T_j$ for each view $j$.
for $i = 1$ to sizeof($D_U$) do
    Predict labels: $c_j(x_{i,j}) \rightarrow y_{i,j}$, where $\alpha_j$ represents the confidence of prediction
    if $\alpha_1 > T_1$ and $\alpha_2 < T_2$ then
        Update $c_2$ with labeled instance $\{x_{i,2}, y_{i,1}\}$ and recompute $T_2$
    end if
    if $\alpha_1 < T_1$ and $\alpha_2 > T_2$ then
        Update $c_1$ with labeled instance $\{x_{i,1}, y_{i,2}\}$ and recompute $T_1$
    end if
end for
Output: Updated classifiers $c_1$ and $c_2$.
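In code, a hedged sketch of Algorithm 2 might look like the following. It uses the signed distance returned by the SVM decision function as the confidence $\alpha_j$, assumes a single threshold $T_j$ per classifier, leaves threshold recomputation to the caller, and reuses the caller-supplied `update` step assumed earlier; none of these choices is claimed to be the paper's exact implementation.

```python
# Sketch of Algorithm 2 (assumptions: scikit-learn style classifiers with a
# decision_function, one confidence threshold per view, threshold
# recomputation omitted). One classifier teaches the other with a pseudo
# label whenever it is confident and its peer is not.
import numpy as np

def confidence(clf, score):
    # signed distance of the score from the decision hyperplane
    return float(clf.decision_function(np.array([[score]]))[0])

def co_train(c1, c2, probe_instances, T1, T2, update):
    """probe_instances: list of unlabeled (x1, x2) score pairs."""
    for x1, x2 in probe_instances:
        a1, a2 = confidence(c1, x1), confidence(c2, x2)
        y1 = +1 if a1 >= 0 else -1          # pseudo label from classifier 1
        y2 = +1 if a2 >= 0 else -1          # pseudo label from classifier 2
        if abs(a1) > T1 and abs(a2) < T2:
            c2 = update(c2, x2, y1)         # c1 confidently teaches c2
        elif abs(a2) > T2 and abs(a1) < T1:
            c1 = update(c1, x1, y2)         # c2 confidently teaches c1
    return c1, c2
```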
2.3. Co-training Online SVM Classifiers

The online learning and co-training approaches are extended to propose a framework that simultaneously uses online learning and co-training to update the classifiers using labeled and unlabeled data as and when they arrive. The classifiers are initially trained on a small labeled training data set. For every new user being enrolled in the system, online learning is used to update the classifiers using the labeled data generated during enrolment. During probe verification, whenever a user queries the system, co-training is used to update the classifiers using the unlabeled data.
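Putting the two pieces together, one hedged sketch of the combined framework (reusing the `online_classifier_update` and `co_train` sketches above, with an assumed event stream mixing enrolments and probes) is:

```python
# Sketch of Section 2.3: labeled enrolment events drive online learning and
# unlabeled probe events drive co-training, processed as they arrive.
# `events` is an assumed iterable of dicts, e.g.
#   {"type": "enrolment", "instances": [(x1, x2, z), ...]}
#   {"type": "probe",     "instances": [(x1, x2), ...]}
def co_training_online(c1, c2, events, T1, T2, update):
    for event in events:
        if event["type"] == "enrolment":
            c1, c2 = online_classifier_update([c1, c2], event["instances"], update)
        elif event["type"] == "probe":
            c1, c2 = co_train(c1, c2, event["instances"], T1, T2, update)
    return c1, c2
```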
3. Case Study: Multi-classifier Face Verification

To evaluate the effectiveness of the proposed co-training framework, experiments are performed using a multi-classifier face verification application. The case study on multi-classifier face verification comprises two classifiers trained on separate views (scores) of a face image. Point-based Speeded Up Robust Features (SURF) [6] and texture-based Uniform Circular Local Binary Patterns (UCLBP) [3] are used as facial feature extractors, along with the $\chi^2$ distance for matching. UCLBP and SURF are used for facial feature extraction because they are fast, discriminating, rotation invariant, and robust to changes in gray level intensities due to illumination variations. Further, selecting point and texture based extractors ensures that the two views have lower dependence². Two SVM classifiers, one for SURF (classifier1) and another for UCLBP (classifier2), are trained to classify the scores as genuine or impostor. The SVM classifiers are then updated using the proposed framework for the labeled and unlabeled instances as and when they arrive. The final classification is obtained by combining the responses from the two updated classifiers using SVM fusion [11].

²In our experiments, SURF and UCLBP had a genuine Pearson's correlation of 0.58 and an impostor Pearson's correlation of 0.46.
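The sketch below illustrates the two matching-and-fusion building blocks assumed in this case study: a $\chi^2$ distance between feature histograms (of the kind used to produce the match scores) and an SVM that fuses the outputs of the two updated classifiers. The function names and the RBF-kernel choice are illustrative assumptions, not the paper's exact pipeline.

```python
# Sketch (illustrative, not the paper's exact implementation): chi-square
# distance between two feature histograms, and an SVM that fuses the decision
# values of the two per-view classifiers into a final genuine/impostor call.
import numpy as np
from sklearn.svm import SVC

def chi_square_distance(h1, h2, eps=1e-10):
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    return 0.5 * float(np.sum((h1 - h2) ** 2 / (h1 + h2 + eps)))

def train_fusion_svm(dec1, dec2, labels, C=1.0):
    # dec1/dec2: decision values (or match scores) from classifier1/classifier2
    X = np.column_stack([dec1, dec2])
    return SVC(kernel="rbf", C=C).fit(X, np.asarray(labels))
```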
To analyze the performance on a large database, images from multiple face databases are combined to create a heterogeneous face database of 1833 subjects. The heterogeneous face database comprises face images with slight pose, expression, and illumination variations. Table 1 provides details about the constituent face databases used in this research. Every subject having six or more face image samples is selected from these databases. In all the experiments, two images per subject are used in the gallery and the remaining are used as probes. Though each constituent database has a large number of images per subject, images exhibiting large pose (> 30 degrees), extreme illumination conditions, and occlusion are ignored. Further, face images are geometrically normalized, and the size of each detected face is 196 × 224 pixels.

Table 1. Constituent face databases used in this research.

    Database              Number of subjects    Number of images
    AR [16]                      119                    714
    WVU multimodal [10]          270                   3482
    MBGC v.2 [1]                 446                   5468
    CAS-PEAL [12]                711                   5658
    CMU Multi-PIE [4]            287                   4828
    Total                       1833                  20150
3.1. Experimental Protocol

The experimental protocol is designed such that the classifiers are first trained on labeled training data and then variations due to new enrolments and probes are simultaneously learned using online learning and co-training. To update the biometric classifiers, a joint adapt-and-test strategy [17] is used, which allows for seamless adaptation and testing. The performance of the proposed framework is compared with batch/offline learning, online learning, and co-training. The following experiments are performed to analyze the performance of the proposed framework.

∙ For batch learning, the classifiers are trained on all 1833 subjects in batch mode.

∙ For online learning, the classifiers are initially trained on 600 randomly chosen subjects and then online learning is performed using the remaining 1233 subjects, one subject at a time.

∙ To evaluate the effectiveness of co-training, two experiments are performed.
  – In the first experiment, the two classifiers are trained on the (initial) 600 subjects; however, the gallery comprises 1833 subjects. Co-training is performed using the probes of all 1833 subjects, and this experiment is termed co-training-1.
  – In the second experiment, the classifiers are trained using all 1833 subjects in batch mode and co-training is performed using the probe images. This experiment is referred to as co-training-2.

The results are reported based on five-fold non-overlapping random cross validation, and verification accuracies are computed at 0.01% false accept rate (FAR).
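As an illustration of this operating point, a hedged sketch for reading off the verification (genuine accept) rate at a fixed 0.01% FAR from genuine and impostor score lists is given below; it assumes similarity-like scores where larger values indicate genuine comparisons.

```python
# Sketch: verification rate at a fixed false accept rate (here 0.01% = 1e-4),
# assuming similarity scores where larger means "more genuine".
import numpy as np

def verification_rate_at_far(genuine_scores, impostor_scores, far=1e-4):
    imp = np.sort(np.asarray(impostor_scores, dtype=float))
    # choose the threshold so that roughly a `far` fraction of impostor scores exceed it
    idx = int(np.ceil((1.0 - far) * len(imp))) - 1
    threshold = imp[idx]
    gen = np.asarray(genuine_scores, dtype=float)
    return float(np.mean(gen > threshold))
```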
3.2. Results and Analysis

Figure 4 shows the Receiver Operating Characteristic (ROC) curves for the multi-classifier face verification system. Table 2 summarizes the verification accuracies and computational times for the experiments. The key results and analysis are listed below:

∙ The ROC curves in Figure 4 show a modest improvement in the performance of the classifiers with the proposed classifier update framework. The framework improves the performance by at least 0.54% compared to batch learning, online learning, and co-training. As mentioned previously, the proposed framework provides a mechanism to seamlessly update the individual classifiers using labeled as well as unlabeled instances. Further, better classification performance is obtained by combining the decisions from the two classifiers (SVM-fusion), as shown in Figure 4(c).

∙ The proposed framework provides another benefit in terms of reducing the classifier training time. Table 2 shows that the framework reduces the training time to almost half the time required for batch learning while modestly improving the accuracy.

∙ It is observed that the classification performance of online learning is comparable to that of batch learning. However, online learning provides a great benefit by reducing the training time to one-third. Once the initial training is performed, the classifier is re-trained in a supervised manner using only the instances on which it makes an error and the previous support vectors.

∙ Co-training provides an improvement in verification accuracy over both batch learning and online learning because the classifiers trained on different scores update each other by providing pseudo labels for the
References (partial list)
[2] S. Abney. Bootstrapping.
[4] Multi-PIE.
[8] Incremental and Decremental Support Vector Machine Learning.
[9] Introduction to Semi-Supervised Learning.
[16] The AR face database.
Frequently Asked Questions

Q1. What contributions have the authors mentioned in the paper "On co-training online biometric classifiers"?

This paper presents a biometric classifier update algorithm in which the classifier decision boundary is updated using both labeled enrolment instances and unlabeled probe instances. The proposed co-training online classifier update algorithm is presented as a semi-supervised learning task and is applied to a face verification application.

As future work, the proposed framework can be extended to different stages of a biometric system that require regular updates. The authors also plan to incorporate the quality of the given gallery-probe pair in computing the confidence of prediction rather than making a decision based only on the distance from the hyperplane.

Q2. How many instances were used to update each classifier?

For online learning, during enrolment, classifier1 was updated using 22,145 instances and classifier2 was updated using 31,846 instances. For the proposed framework, classifier1 was updated on 34,086 instances and classifier2 was updated on 42,102 instances using co-training during probe verification.