
http://www.diva-portal.org
Preprint
This is the submitted version of a paper presented at 12th IAPR International Conference on
Biometrics, Crete, Greece, June 4-7, 2019.
Citation for the original published paper:
Hernandez-Diaz, K., Alonso-Fernandez, F., Bigun, J. (2019)
Cross Spectral Periocular Matching using ResNet Features
In: 2019 International Conference on Biometrics (ICB)
Biometrics (ICB), IAPR International Conference on
https://doi.org/10.1109/ICB45273.2019.8987303
N.B. When citing this work, cite the original published paper.
Permanent link to this version:
http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-40499

Cross Spectral Periocular Matching using ResNet Features
Kevin Hernandez-Diaz, Fernando Alonso-Fernandez, Josef Bigun
Halmstad University
{kevher, feralo, josef.bigun}@hh.se
Abstract
Periocular recognition has gained attention in recent years thanks to its high discrimination capability in less constrained scenarios than other ocular modalities. In this paper we propose a method for periocular verification under different light spectra using CNN features, with the particularity that the network has not been trained for this purpose. We use a ResNet-101 model pretrained for the ImageNet Large Scale Visual Recognition Challenge to extract features from the IIITD Multispectral Periocular Database. At each layer, the features are compared using the χ² distance and cosine similarity to carry out verification between images, achieving improvements in EER and in accuracy at 1% FAR of up to 63.13% and 24.79%, respectively, in comparison to previous works that employ the same database. In addition, we train a neural network to match the best CNN feature layer vector from each spectrum. With this procedure, we achieve improvements of up to 65% (EER) and 87% (accuracy at 1% FAR) in cross-spectral verification with respect to previous studies.
1. Introduction
In 2009, U. Park et al. [16] introduced the concept of periocular recognition, describing this area as the facial region in the immediate vicinity of the eye. It has emerged as a promising biometric trait due to its high discriminatory power and more relaxed acquisition requirements than its other ocular counterparts [20][14][22], as well as its resilience to aging [6] and expression changes [15], which makes it ideal for non-cooperative scenarios [7]. Moreover, it can be used not only as a sole biometric trait but also in combination with others, such as iris or face, to increase performance [18], if these are available.
Convolutional Neural Networks (CNNs) have gained great popularity since Krizhevsky et al. [9] won the 2012 ImageNet Large Scale Visual Recognition Challenge with AlexNet. They have become the state of the art in many pattern recognition applications [2], but since they rely on a huge amount of data to work properly, their use in biometric systems is still somewhat limited, with work mostly focused on face [17], iris [3] and soft biometrics [23].
Based on the study of Nguyen et al. [11], the authors in Hernandez-Diaz et al. [5] investigated the use of features from pre-trained off-the-shelf CNNs for periocular recognition, eliminating the necessity of designing and training new CNNs. In the present work, we further investigate the behaviour of these pre-trained CNN architectures in cross-spectral periocular scenarios. Cross-spectral comparison is gaining attention in iris [10] and face [12] recognition, as well as in periocular research [21]. The choice of a particular illumination is based on a trade-off between different factors, including the quality required or practical aspects, such as the use of near-infrared images in controlled conditions (e.g. border control) vs. visible images in uncontrolled environments (e.g. smartphones). It may also happen that images captured with one type of illumination need to be compared with legacy images captured in a different spectrum. Unfortunately, the performance of cross-spectral matching is significantly degraded in comparison with matching of images captured in the same spectrum [21].
The rest of the paper is organized as follows. In Section 2, we present our experimental framework, including the database employed, the pre-processing steps, and the protocol employed for cross-spectral periocular recognition using off-the-shelf CNN features. Section 3 presents the results obtained, both in same-sensor and cross-sensor scenarios. Finally, conclusions are given in Section 4.
2. Experimental Framework
2.1. Database and Preprocessing
Periocular databases with data in different spectra are a scarce resource [1]. In this work we use the IIITD Multispectral Periocular Database [22], the only available database that provides periocular images in three different spectra: Visible, Night Vision and Near Infrared (NIR). Visible images were captured using a Nikon SLR camera; Night Vision images were taken with a Sony HandyCAM in night vision mode; and NIR images were acquired with a Cogent iris scanner.

Figure 1. Samples of database images: a) original visible spectrum image, b) original night vision image, c) original NIR image, d) centered, masked, scaled and cropped visible image, e) gray-scaled and contrast-enhanced visible image, f) centered, masked, scaled and cropped night vision image, g) gray-scaled and contrast-enhanced night vision image, h) masked and cropped NIR image, i) contrast-enhanced NIR image.
Visible and Night Vision images were captured at a distance of 1.3 meters, while NIR images were taken from 15 centimeters. All images were taken in a controlled environment, with proper illumination for Visible and NIR images and very low illumination for Night Vision images. Some examples are shown in Figure 1 (top).
The database contains images from 62 persons, with 5 pictures per person in the Visible and Night Vision spectra containing both periocular areas, whereas in NIR there are 10 images per person, 5 for each eye. The resolutions are 601×301, 540×260 and 640×480 pixels for Visible, Night Vision and NIR images, respectively.
For this work, the iris and sclera radii were manually annotated and the irises masked to ensure that recognition relies on the periocular region. For Visible and Night Vision images, left and right eyes were centered, scaled so that their sclera radius equals the database mean, and cropped to a square with a side length of 4 times the mean sclera radius. NIR images were first cropped to a square with the same side length as the original image height. Two different cases were considered for this step: if the sclera center fell close to the left or right border, the cropping was shifted so that no extra padding was necessary; if the eye had enough separation from the borders, the cropping was made around the sclera center. No further normalization was applied, in order to avoid an excessive inclusion of artifacts or excessive cropping, due to the different acquisition technique used for this type of images.
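For illustration only, a minimal sketch of this geometric normalization for Visible/Night Vision images is given below, assuming the annotated sclera center and radius are available; the function name, the OpenCV-based resampling and the border handling are our assumptions, not the authors' implementation.

```python
import cv2

def normalize_periocular(img, cx, cy, sclera_r, mean_sclera_r):
    """Center on the sclera, rescale so the sclera radius matches the database
    mean, and crop a square of side 4 * mean sclera radius (illustrative only)."""
    scale = mean_sclera_r / sclera_r
    img = cv2.resize(img, None, fx=scale, fy=scale, interpolation=cv2.INTER_LINEAR)
    cx, cy = cx * scale, cy * scale
    half = int(round(2 * mean_sclera_r))  # half of the 4*r crop side
    # Replicate-pad so the crop never runs off the image (a simplification;
    # the paper avoids padding for NIR images by shifting the crop instead)
    img = cv2.copyMakeBorder(img, half, half, half, half, cv2.BORDER_REPLICATE)
    cx, cy = int(round(cx)) + half, int(round(cy)) + half
    return img[cy - half:cy + half, cx - half:cx + half]
```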
In addition, extra preprocessing techniques were tried and compared to find the best configuration for same-spectrum verification. For this purpose, all images were converted to gray-scale and contrast-enhanced with the Contrast-Limited Adaptive Histogram Equalization (CLAHE) algorithm [8]. We employ CLAHE since it is usually the preprocessing of choice for images of the eye region [19].
Finally, since ResNet-101 expects RGB input images of size 224×224, the pictures were resized and the gray-scale images replicated across three channels to fit the input size of the network. Figure 1 shows examples of pre- and post-processed images for each spectrum.
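A minimal sketch of this photometric preprocessing, assuming an OpenCV implementation (the paper does not specify one; the CLAHE clip limit and tile size below are illustrative defaults):

```python
import cv2
import numpy as np

def prepare_for_resnet(img_bgr, use_clahe=True):
    """Gray-scale conversion with optional CLAHE, then resize to 224x224 and
    replicate the single channel three times to match ResNet-101's input."""
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    if use_clahe:
        clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
        gray = clahe.apply(gray)
    gray = cv2.resize(gray, (224, 224), interpolation=cv2.INTER_LINEAR)
    return np.stack([gray, gray, gray], axis=-1)  # shape (224, 224, 3)
```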
2.2. Experimentation
We carry out verification experiments for periocular images in different spectra (Visible, Night Vision and NIR) with the method described in Hernandez-Diaz et al. [5], using the CNN that achieved the best results there. The CNN used in this paper is Matlab's ResNet-101 model [4]. The network is 101 layers deep and was trained on more than a million images from the ImageNet database to classify images into 1000 classes.
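The experiments use MATLAB's pretrained model; as a rough, non-authoritative equivalent, the activations of every layer of an ImageNet-pretrained ResNet-101 can be read out with forward hooks, e.g. in PyTorch (our illustration, not the authors' toolchain):

```python
import torch
from torchvision import models

# ImageNet-pretrained ResNet-101 used as a fixed, off-the-shelf feature extractor
net = models.resnet101(weights=models.ResNet101_Weights.IMAGENET1K_V1).eval()

activations = {}

def make_hook(name):
    def hook(module, inputs, output):
        activations[name] = output.detach().flatten(start_dim=1)  # one vector per image
    return hook

# Hook every leaf module (convolution, batch norm, ReLU, fully connected, ...).
# Note: the residual additions are functional in torchvision, so their outputs
# are not captured by module hooks in this simplified sketch.
for name, module in net.named_modules():
    if len(list(module.children())) == 0:
        module.register_forward_hook(make_hook(name))

with torch.no_grad():
    x = torch.randn(1, 3, 224, 224)  # a preprocessed periocular image would go here
    net(x)                           # 'activations' now maps layer name -> feature vector
```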
We fed the network with the post-processed images and extracted the output of every layer, including convolution, ReLU, batch normalization, fully connected and addition layers, as shown in Figure 2. Then, for each layer, we built a verification system that compares its output vectors using the simple similarity measures shown in Equations 1 and 2. Equation 1 is a modified χ² distance that takes the absolute value of the elements of the vectors and normalizes each vector so that its elements sum to 1; this is done because this distance is usually applied to normalized histograms, whose entries are always non-negative. Equation 2 is the cosine of the angle between the vectors. The goal is to find the best performing layer for each case.
$$\chi^2 = \sum_{i=1}^{n} \frac{(|x_i| - |y_i|)^2}{|x_i| + |y_i|} \qquad (1)$$

Figure 2. Extraction of the feature vectors from the CNN layers.
$$\cos(\alpha) = \frac{X^{T} Y}{\|X\| \, \|Y\|} \qquad (2)$$
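In plain NumPy, the two measures can be written as follows (our own rendering of Equations 1 and 2, including the absolute-value and sum-to-one normalization described above):

```python
import numpy as np

def chi2_distance(x, y, eps=1e-12):
    """Modified chi-squared distance of Eq. (1): absolute values, each vector
    normalized to sum to 1, as for a histogram."""
    x = np.abs(x); x = x / (x.sum() + eps)
    y = np.abs(y); y = y / (y.sum() + eps)
    return float(np.sum((x - y) ** 2 / (x + y + eps)))

def cosine_similarity(x, y, eps=1e-12):
    """Cosine between the two feature vectors, Eq. (2)."""
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y) + eps))
```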
In our experiments, we consider each eye as a different user, thus having 62 × 2 = 124 users, with 5 images per user and spectrum. For genuine scores in same-spectrum verification, we compare all vectors of the same user using combinations of two elements without repetition, giving a total of 124 users × 10 combinations = 1240 genuine scores. For impostor scores in the same spectrum, we compare the first image of every user with the second image of all other users, giving a total of 124 × 123 = 15252 impostor scores.

For genuine scores in cross-spectrum verification, we compare all images of each user in one spectrum against all images of the same user in the other spectrum, giving a total of 124 users × 25 combinations = 3100 genuine scores. For the impostor scores, we compare the first image of each user in one spectrum with the first image of the rest of the users in the other spectrum, giving a total of 124 × 123 = 15252 impostor scores.
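The score counts above follow directly from these pairing rules; a short sketch (user and image indexing are our own convention):

```python
from itertools import combinations

users, imgs_per_user = 124, 5

# Same-spectrum genuine pairs: all 2-combinations of each user's 5 images
genuine_same = [(u, a, b) for u in range(users)
                for a, b in combinations(range(imgs_per_user), 2)]  # 124*10 = 1240

# Same-spectrum impostor pairs: image 0 of each user vs image 1 of every other user
impostor_same = [(u, 0, v, 1) for u in range(users)
                 for v in range(users) if v != u]                   # 124*123 = 15252

# Cross-spectrum genuine pairs: every image of a user in spectrum A vs spectrum B
genuine_cross = [(u, a, b) for u in range(users)
                 for a in range(imgs_per_user)
                 for b in range(imgs_per_user)]                     # 124*25 = 3100
```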
To further improve cross-sensor verification, we trained a neural network classifier that takes as input the concatenation of the feature vectors from the best CNN layer of each spectrum being compared, and outputs a classification of the comparison as either genuine or impostor. The network used in this research consists of an input layer of 2 × 50176 = 100352 units (two flattened CNN feature vectors), a hidden layer of 200 units with a ReLU activation function, another hidden layer with 20 units, a softmax layer, and an output layer of size 2. We use SGDM, a minibatch size of 40, and the same number of genuine and impostor vectors in every batch during training.
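A hedged PyTorch sketch of this classifier follows; the layer sizes come from the text, while the learning rate and loss formulation are our assumptions (the paper trains the network in MATLAB with SGDM):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Input: two flattened best-layer CNN feature vectors of 50176 elements each
classifier = nn.Sequential(
    nn.Linear(2 * 50176, 200), nn.ReLU(),
    nn.Linear(200, 20),
    nn.Linear(20, 2),
    nn.Softmax(dim=1),  # softmax scores are later thresholded for verification
)

# SGD with momentum (SGDM), minibatch size 40; the learning rate is an assumption
optimizer = torch.optim.SGD(classifier.parameters(), lr=0.01, momentum=0.9)

def train_step(x, y):
    """x: (40, 100352) batch with equal genuine/impostor counts; y: (40,) labels {0, 1}."""
    optimizer.zero_grad()
    probs = classifier(x)
    loss = F.nll_loss(torch.log(probs + 1e-12), y)  # cross-entropy on softmax outputs
    loss.backward()
    optimizer.step()
    return loss.item()
```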
For neural network training, we used 5-fold cross-validation. Each partition consists of 20% of the whole database; samples were randomly assigned and repetitions between partitions were not allowed. Moreover, instead of directly classifying the test cases of each test partition, we extract the softmax output, concatenate all test partitions, and manually sweep the threshold value at which we decide whether the vectors are genuine or impostor. This way, we are able not only to calculate the EER and the accuracy at 1% FAR, but we also obtain a more consistent estimate of the neural network performance by considering all test partitions of each model as a whole.
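A small sketch of how the EER and the accuracy at 1% FAR can be read off the pooled softmax scores by sweeping a threshold (the thresholding convention and the use of the genuine accept rate as "accuracy" are our assumptions):

```python
import numpy as np

def eer_and_acc_at_far(genuine, impostor, target_far=0.01):
    """genuine/impostor: 1-D arrays of scores, higher meaning more likely genuine."""
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    far = np.array([(impostor >= t).mean() for t in thresholds])  # false accept rate
    frr = np.array([(genuine < t).mean() for t in thresholds])    # false reject rate
    i = np.argmin(np.abs(far - frr))
    eer = (far[i] + frr[i]) / 2
    j = np.argmax(far <= target_far)   # first threshold where FAR drops to 1%
    acc_at_far = 1.0 - frr[j]          # genuine accept rate at that operating point
    return eer, acc_at_far
```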
3. Results
3.1. Same Spectrum Verification
First of all, we carried out experiments with the features extracted at every layer of the CNN, using the χ² distance and cosine similarity, to find out at which layer the system performs best for same-spectrum verification. We consider two different cases per spectrum: for Visible and Night Vision we used masked, centered, scaled and cropped periocular images obtained from the originals, but also gray-scaled images with their contrast enhanced by the CLAHE algorithm. For NIR we used masked, cropped images, but also tried contrast-enhanced images. All these cases can be seen in Figure 1 (bottom).
3.1.1 Similarity between vectors
We report the EER at every layer of the CNN in Figure 3 (top). Due to space constraints, we only give results for the best cases of each spectrum. For Visible and Night Vision, they correspond to gray-scaled images with enhanced contrast. However, the best result in NIR is observed using the original gray-scaled images without contrast enhancement. It is worth noting that even though the CNN was trained using RGB images, the best results obtained are all for gray-scaled images. This can be due to fewer free variables in the input by having the same gray-scaled image replicated at every channel for a network that has not been trained for the task of periocular recognition.

Figure 3. EER (%) at every layer (χ² distance vs. cosine similarity) for same-spectrum verification (top: Visible (CLAHE), Night Vision (CLAHE), NIR) and cross-spectral verification (bottom: Visible (CLAHE) vs Night Vision (CLAHE), Visible (CLAHE) vs NIR, Night Vision (CLAHE) vs NIR).
From the results of Figure 3 (top), we can observe that the two measures perform similarly at each layer, with the χ² distance consistently giving slightly better results. We can also observe that each spectrum behaves differently. The Visible spectrum starts with a small EER and decreases more or less consistently until reaching its minimum at layer 85, with an EER of 2.95% using the χ² distance, showing similar results until almost the first half of the network. Then, its behaviour becomes irregular, with the EER increasing and decreasing every few layers. Night Vision has the most irregular performance of the three, with peaks along the layers of more than 10% difference. It reaches its best performance at layer 84, with an EER of 4.04% using cosine similarity, as shown in the graph. Finally, NIR shows the most stable performance. Except for some peaks at the beginning, it has a continuously decreasing EER, reaching its minimum almost at the end of the network, at layer 295, with an EER of 3.97% using the χ² distance.
The best result for each spectrum is shown in Table 1, where we give the EER at the best performing layer as well as the accuracy at 1% FAR. We also provide results from previous studies employing the same database [22]. It is worth noting that the previous results shown in Table 1 were obtained using both periocular zones together [22], which helps to improve performance during verification. Instead, here we consider each periocular zone as a different user, which makes the recognition somewhat harder. With the proposed method, we improve the previously reported results by up to 63.13% (EER) and 24.79% (accuracy at 1% FAR) in the Visible spectrum. In the Night Vision spectrum, we improve results by up to 42.30% (EER) and up to 13.67% (accuracy at 1% FAR). Finally, we increase accuracy by up to 1.65% in the NIR spectrum.
3.1.2 Neural Network
Once we found the best performing layer for each spectrum, we seek to reduce the EER further by training, on these features, a neural network capable of discerning genuine vectors from impostor vectors in a verification scenario. The results of these experiments are shown in Table 1 (5th column, top). We obtain EERs of 5.08%, 3.95% and 3.06%, which improve results by 36.5%, 43.57% and 12.57% in relative terms for Visible, Night Vision and NIR, respectively, in relation to those of Sharma et al. [22], achieving the best EER reported in this paper for the Night Vision and NIR spectra. Additionally, we

References

He, K., Zhang, X., Ren, S., Sun, J.: Deep Residual Learning for Image Recognition. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.

Krizhevsky, A., Sutskever, I., Hinton, G. E.: ImageNet Classification with Deep Convolutional Neural Networks. In: Advances in Neural Information Processing Systems (NIPS), 2012.

Parkhi, O. M., Vedaldi, A., Zisserman, A.: Deep Face Recognition. In: Proc. British Machine Vision Conference (BMVC), 2015.

Fadlullah, Z. M., Tang, F., Mao, B., Kato, N., et al.: State-of-the-Art Deep Learning: Evolving Machine Intelligence Toward Tomorrow's Intelligent Network Traffic Control Systems. IEEE Communications Surveys & Tutorials.

Heckbert, P. S. (ed.): Graphics Gems IV. Academic Press, 1994.
