Author

Heydi Méndez-Vázquez

Bio: Heydi Méndez-Vázquez is an academic researcher whose work focuses on facial recognition systems and face geometry. The author has an h-index of 12 and has co-authored 56 publications receiving 490 citations.


Papers
Journal ArticleDOI
TL;DR: This paper proposes a novel facial feature extraction method named Gabor ordinal measures (GOM), which integrates the distinctiveness of Gabor features and the robustness of ordinal measures as a promising solution for jointly handling inter-person similarity and intra-person variations in face images.
Abstract: Great progress has been achieved in face recognition in the last three decades. However, it is still challenging to characterize the identity-related features in face images. This paper proposes a novel facial feature extraction method named Gabor ordinal measures (GOM), which integrates the distinctiveness of Gabor features and the robustness of ordinal measures as a promising solution for jointly handling inter-person similarity and intra-person variations in face images. In the proposed method, different kinds of ordinal measures are derived from the magnitude, phase, real, and imaginary components of Gabor images, respectively, and are then jointly encoded as visual primitives in local regions. The statistical distributions of these visual primitives in face image blocks are concatenated into a feature vector, and linear discriminant analysis is further used to obtain a compact and discriminative feature representation. Finally, a two-stage cascade learning method and a greedy block selection method are used to train a strong classifier for face recognition. Extensive experiments on publicly available face image databases, such as FERET, AR, and the large-scale FRGC v2.0, demonstrate the state-of-the-art face recognition performance of GOM.
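As a rough illustration of the GOM pipeline (a minimal sketch, not the authors' code), the following computes Gabor magnitude responses, binarizes each with an ordinal greater-than comparison against its local mean, and concatenates per-block histograms of the joint codes. The kernel parameters, the local-mean comparison, and the block size are assumptions; the LDA compression and the cascade classifier are omitted.

```python
import numpy as np
from scipy.ndimage import uniform_filter
from scipy.signal import fftconvolve

def gabor_kernel(freq, theta, sigma=4.0, size=21):
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
    return envelope * np.exp(1j * 2 * np.pi * freq * xr)

def gom_descriptor(img, n_orient=8, freq=0.1, block=16):
    codes = np.zeros(img.shape, dtype=np.int32)
    for k in range(n_orient):
        mag = np.abs(fftconvolve(img, gabor_kernel(freq, k * np.pi / n_orient),
                                 mode="same"))
        # ordinal measure: is the response above its local average?
        codes |= (mag > uniform_filter(mag, size=9)).astype(np.int32) << k
    hists = []  # per-block histograms of the joint codes
    for i in range(0, img.shape[0] - block + 1, block):
        for j in range(0, img.shape[1] - block + 1, block):
            h, _ = np.histogram(codes[i:i + block, j:j + block],
                                bins=2 ** n_orient, range=(0, 2 ** n_orient))
            hists.append(h / h.sum())
    return np.concatenate(hists)  # LDA would then compress this vector
```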

133 citations

Proceedings ArticleDOI
04 Jun 2013
TL;DR: The proposed method not only jointly encodes local spatial and temporal information, but also extracts the most discriminative facial dynamic information while discarding spatio-temporal features related to intra-personal variations.
Abstract: Different studies have shown the benefits of using spatio-temporal information for video face recognition. However, most existing spatio-temporal representations do not capture the local discriminative information present in human faces. In this paper we introduce a new local spatio-temporal descriptor, based on structured ordinal features, for video face recognition. The proposed method not only jointly encodes local spatial and temporal information, but also extracts the most discriminative facial dynamic information while discarding spatio-temporal features related to intra-personal variations. In addition, a similarity measure based on a set of background samples is proposed for use with our descriptor and is shown to boost its performance. Extensive experiments conducted on the recent and challenging YouTube Faces database demonstrate the good performance of our proposal, achieving state-of-the-art results.
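A minimal sketch, in the spirit of the paper rather than its exact structured ordinal features, of a local spatio-temporal ordinal code: each interior voxel of a face-video volume is compared with its spatial neighbors in the same frame and with the co-located pixels in the previous and next frames, and the binary outcomes are packed into a code whose per-block histograms would form the descriptor.

```python
import numpy as np

def spatio_temporal_codes(video):  # video: (T, H, W) grayscale array
    c = video[1:-1, 1:-1, 1:-1]    # interior voxels
    neighbors = [
        video[1:-1, :-2, 1:-1], video[1:-1, 2:, 1:-1],  # up / down
        video[1:-1, 1:-1, :-2], video[1:-1, 1:-1, 2:],  # left / right
        video[:-2, 1:-1, 1:-1], video[2:, 1:-1, 1:-1],  # previous / next frame
    ]
    codes = np.zeros(c.shape, dtype=np.int32)
    for bit, n in enumerate(neighbors):
        codes |= (n >= c).astype(np.int32) << bit  # ordinal comparison
    return codes  # per-block histograms of these codes give the descriptor
```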

57 citations

Proceedings ArticleDOI
01 Oct 2019
TL;DR: Inspired by the state-of-the-art ShuffleNetV2 model, this paper presents a lightweight face architecture named ShuffleFaceNet, which introduces significant modifications to improve face recognition accuracy.
Abstract: The recent success of convolutional neural networks has led to the development of a variety of new effective and efficient architectures. However, few of them have been designed for the specific case of face recognition. Inspired by the state-of-the-art ShuffleNetV2 model, a lightweight face architecture is presented in this paper. The proposal, named ShuffleFaceNet, introduces significant modifications to improve face recognition accuracy. First, the Global Average Pooling layer is replaced by a Global Depth-wise Convolution layer, and a Parametric Rectified Linear Unit is used as the non-linear activation function. Under the same experimental conditions, ShuffleFaceNet achieves significantly higher accuracy than the original ShuffleNetV2 while maintaining the same speed and compact storage. In addition, extensive experiments conducted on three challenging benchmark face datasets show that our proposal outperforms not only state-of-the-art lightweight models but also very deep face recognition models.
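The two modifications named in the abstract can be sketched in a few lines of PyTorch. The channel count, the 7x7 input feature map, and the 128-d embedding below are assumptions, and the ShuffleNetV2 backbone that would precede this head is omitted.

```python
import torch
import torch.nn as nn

class EmbeddingHead(nn.Module):
    def __init__(self, channels=1024, spatial=7, emb_dim=128):
        super().__init__()
        # Global Depth-wise Convolution: a depthwise kernel covering the whole
        # feature map, i.e. a learned channel-wise weighted pooling instead of
        # the plain average that Global Average Pooling would compute
        self.gdconv = nn.Conv2d(channels, channels, kernel_size=spatial,
                                groups=channels, bias=False)
        self.act = nn.PReLU(channels)  # Parametric ReLU as the non-linearity
        self.fc = nn.Linear(channels, emb_dim)

    def forward(self, x):             # x: (N, channels, spatial, spatial)
        x = self.act(self.gdconv(x))  # -> (N, channels, 1, 1)
        return self.fc(x.flatten(1))  # -> (N, emb_dim) face embedding

head = EmbeddingHead()
print(head(torch.randn(2, 1024, 7, 7)).shape)  # torch.Size([2, 128])
```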

44 citations

Journal ArticleDOI
TL;DR: This paper studies the impact of lightweight face models on real applications and evaluates the performance of five recent lightweight architectures on five face recognition scenarios: image- and video-based face recognition, cross-factor and heterogeneous face recognition, as well as active authentication on mobile devices.
Abstract: This paper studies the impact of lightweight face models on real applications. Lightweight architectures proposed for face recognition are analyzed and evaluated in different scenarios. In particular, we evaluate the performance of five recent lightweight architectures on five face recognition scenarios: image- and video-based face recognition, cross-factor and heterogeneous face recognition, as well as active authentication on mobile devices. In addition, we show the shortcomings of using general-purpose lightweight models unchanged for specific face recognition tasks by assessing the performance of the original, unmodified versions of the lightweight models considered in our study. We also show that the inference time on different devices and the computational requirements of the lightweight architectures allow their use in real-time applications or on computationally limited platforms. In summary, this paper can serve as a baseline for selecting lightweight face architectures depending on the practical application at hand. It also provides insights into the remaining challenges and possible future research topics.
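The kind of inference-time and model-size measurement described above can be reproduced in spirit with a few lines of PyTorch. The model below (torchvision's shufflenet_v2_x1_0) is a stand-in assumption, not necessarily one of the five architectures evaluated in the paper.

```python
import time
import torch
from torchvision.models import shufflenet_v2_x1_0

model = shufflenet_v2_x1_0().eval()   # randomly initialized stand-in model
x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    for _ in range(10):               # warm-up runs
        model(x)
    t0 = time.perf_counter()
    for _ in range(100):
        model(x)
    dt = (time.perf_counter() - t0) / 100
params = sum(p.numel() for p in model.parameters())
print(f"{params / 1e6:.1f}M params, {dt * 1000:.1f} ms per image on CPU")
```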

30 citations

Journal ArticleDOI
TL;DR: An overview of describable visual attributes of the face, in particular the so-called soft biometrics (e.g., facial marks, gender, age, skin color, and other physical characteristics), is presented.
Abstract: The face is one of the most reliable and easy-to-acquire biometric features, widely used for the recognition of individuals. In controlled environments, facial recognition systems are highly effective; however, in real-world scenarios with varying lighting conditions, pose changes, facial expressions, occlusions, and low-resolution images/videos, the task of recognizing faces becomes significantly more complex. In this context, it has been shown that certain attributes can be retrieved with a reasonable probability of success and can usefully complement an inconclusive result from a biometric system. In this paper we present an overview of describable visual attributes of the face, in particular the so-called soft biometrics (e.g., facial marks, gender, age, skin color, and other physical characteristics). We review the core issues in this area: what soft biometrics are, which of them are most robust in video surveillance and other uncontrolled scenarios, how their representation and classification have been addressed in the literature, which datasets can be used for evaluation, which related problems remain unresolved, and what the possible ways to approach them are.

30 citations


Cited by
Proceedings ArticleDOI
23 Jun 2014
TL;DR: This work revisits both the alignment step and the representation step by employing explicit 3D face modeling in order to apply a piecewise affine transformation, and derive a face representation from a nine-layer deep neural network.
Abstract: In modern face recognition, the conventional pipeline consists of four stages: detect => align => represent => classify. We revisit both the alignment step and the representation step by employing explicit 3D face modeling in order to apply a piecewise affine transformation, and derive a face representation from a nine-layer deep neural network. This deep network involves more than 120 million parameters using several locally connected layers without weight sharing, rather than the standard convolutional layers. We therefore trained it on the largest facial dataset to date: an identity-labeled dataset of four million facial images belonging to more than 4,000 identities. The learned representations, coupling the accurate model-based alignment with the large facial database, generalize remarkably well to faces in unconstrained environments, even with a simple classifier. Our method reaches an accuracy of 97.35% on the Labeled Faces in the Wild (LFW) dataset, reducing the error of the current state of the art by more than 27% and closely approaching human-level performance.
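A locally connected layer, the ingredient the abstract highlights, differs from a convolution in that every output location has its own unshared filters. The numpy sketch below uses illustrative sizes and is not the paper's implementation.

```python
import numpy as np

def locally_connected(x, weights):
    # x: (H, W, Cin); weights: (Ho, Wo, k, k, Cin, Cout) - one filter bank
    # per output location, with no weight sharing across locations
    Ho, Wo, k, _, _, Cout = weights.shape
    out = np.empty((Ho, Wo, Cout))
    for i in range(Ho):
        for j in range(Wo):
            patch = x[i:i + k, j:j + k, :]  # local receptive field
            out[i, j] = np.tensordot(patch, weights[i, j], axes=3)
    return out

x = np.random.randn(10, 10, 3)
w = np.random.randn(8, 8, 3, 3, 3, 16)  # unshared 3x3 filters, 16 outputs
print(locally_connected(x, w).shape)    # (8, 8, 16)
```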

6,132 citations

Book ChapterDOI
E.R. Davies
01 Jan 1990
TL;DR: This chapter introduces the subject of statistical pattern recognition (SPR) by considering how features are defined and emphasizes that the nearest neighbor algorithm achieves error rates comparable with those of an ideal Bayes’ classifier.
Abstract: This chapter introduces the subject of statistical pattern recognition (SPR). It starts by considering how features are defined and emphasizes that the nearest neighbor algorithm achieves error rates comparable with those of an ideal Bayes’ classifier. The concepts of an optimal number of features, representativeness of the training data, and the need to avoid overfitting to the training data are stressed. The chapter shows that methods such as the support vector machine and artificial neural networks are subject to these same training limitations, although each has its advantages. For neural networks, the multilayer perceptron architecture and back-propagation algorithm are described. The chapter distinguishes between supervised and unsupervised learning, demonstrating the advantages of the latter and showing how methods such as clustering and principal components analysis fit into the SPR framework. The chapter also defines the receiver operating characteristic, which allows an optimum balance between false positives and false negatives to be achieved.
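The nearest neighbor rule the chapter emphasizes is simple enough to state in a few lines; a minimal numpy sketch with toy data follows.

```python
import numpy as np

def nearest_neighbor(train_x, train_y, test_x):
    # squared Euclidean distance between every test and training point
    d = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(-1)
    return train_y[d.argmin(axis=1)]  # label of the closest training point

train_x = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
train_y = np.array([0, 1, 1])
print(nearest_neighbor(train_x, train_y, np.array([[0.9, 0.8]])))  # [1]
```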

1,189 citations

Journal ArticleDOI
TL;DR: PCANet as discussed by the authors is a simple deep learning network for image classification which comprises only the very basic data processing components: cascaded principal component analysis (PCA), binary hashing, and block-wise histograms.
Abstract: In this work, we propose a very simple deep learning network for image classification which comprises only the most basic data processing components: cascaded principal component analysis (PCA), binary hashing, and block-wise histograms. In the proposed architecture, PCA is employed to learn multistage filter banks. It is followed by simple binary hashing and block histograms for indexing and pooling. This architecture is thus named the PCA network (PCANet) and can be designed and learned extremely easily and efficiently. For comparison and better understanding, we also introduce and study two simple variations of PCANet, namely RandNet and LDANet. They share the same topology as PCANet, but their cascaded filters are either selected randomly or learned from LDA. We have tested these basic networks extensively on many benchmark visual datasets for different tasks, such as LFW for face verification; MultiPIE, Extended Yale B, AR, and FERET for face recognition; and MNIST for handwritten digit recognition. Surprisingly, for all tasks, such a seemingly naive PCANet model is on par with state-of-the-art features, whether prefixed, highly hand-crafted, or carefully learned (by DNNs). Even more surprisingly, it sets new records for many classification tasks on the Extended Yale B, AR, and FERET datasets and on MNIST variations. Additional experiments on other public datasets also demonstrate the potential of PCANet to serve as a simple but highly competitive baseline for texture classification and object recognition.
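The first PCANet stage can be sketched directly from this description: the filters are the leading eigenvectors of zero-mean image patches, applied as a convolutional filter bank. The sketch below uses illustrative patch and filter counts and omits binary hashing, the block histograms, and the second stage.

```python
import numpy as np
from scipy.signal import fftconvolve

def learn_pca_filters(images, k=7, n_filters=8):
    patches = []
    for img in images:
        for i in range(img.shape[0] - k + 1):
            for j in range(img.shape[1] - k + 1):
                p = img[i:i + k, j:j + k].ravel()
                patches.append(p - p.mean())  # remove the patch mean
    X = np.array(patches)
    # leading eigenvectors of the patch covariance become the filters
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:n_filters].reshape(n_filters, k, k)

def pcanet_stage(img, filters):
    # apply the learned filter bank; a second stage would repeat the
    # procedure on these response maps
    return np.stack([fftconvolve(img, f, mode="same") for f in filters])
```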

1,157 citations

Journal ArticleDOI
TL;DR: Surprisingly, for all tasks, such a seemingly naive PCANet model is on par with the state-of-the-art features either prefixed, highly hand-crafted, or carefully learned [by deep neural networks (DNNs)].
Abstract: In this paper, we propose a very simple deep learning network for image classification that is based on very basic data processing components: 1) cascaded principal component analysis (PCA); 2) binary hashing; and 3) blockwise histograms. In the proposed architecture, the PCA is employed to learn multistage filter banks. This is followed by simple binary hashing and block histograms for indexing and pooling. This architecture is thus called the PCA network (PCANet) and can be extremely easily and efficiently designed and learned. For comparison and to provide a better understanding, we also introduce and study two simple variations of PCANet: 1) RandNet and 2) LDANet. They share the same topology as PCANet, but their cascaded filters are either randomly selected or learned from linear discriminant analysis. We have extensively tested these basic networks on many benchmark visual data sets for different tasks, including Labeled Faces in the Wild (LFW) for face verification; the MultiPIE, Extended Yale B, AR, Facial Recognition Technology (FERET) data sets for face recognition; and MNIST for hand-written digit recognition. Surprisingly, for all tasks, such a seemingly naive PCANet model is on par with the state-of-the-art features either prefixed, highly hand-crafted, or carefully learned [by deep neural networks (DNNs)]. Even more surprisingly, the model sets new records for many classification tasks on the Extended Yale B, AR, and FERET data sets and on MNIST variations. Additional experiments on other public data sets also demonstrate the potential of PCANet to serve as a simple but highly competitive baseline for texture classification and object recognition.

1,034 citations

Proceedings ArticleDOI
23 Jun 2014
TL;DR: The proposed DDML trains a deep neural network that learns a set of hierarchical nonlinear transformations to project face pairs into the same feature subspace, in which the distance of each positive face pair is less than a smaller threshold and that of each negative pair exceeds a larger threshold.
Abstract: This paper presents a new discriminative deep metric learning (DDML) method for face verification in the wild. Existing metric learning-based face verification methods aim to learn a Mahalanobis distance metric that simultaneously maximizes inter-class variations and minimizes intra-class variations. In contrast, the proposed DDML trains a deep neural network that learns a set of hierarchical nonlinear transformations to project face pairs into the same feature subspace, in which the distance of each positive face pair is less than a smaller threshold and that of each negative pair exceeds a larger threshold, so that discriminative information can be exploited in the deep network. Our method achieves very competitive face verification performance on the widely used LFW and YouTube Faces (YTF) datasets.
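The pairwise constraint DDML enforces can be written as a margin around a threshold. The numpy sketch below scores a single pair with a plain hinge in place of the paper's smoothed variant; the network that produces the embeddings is omitted and tau is an illustrative value.

```python
import numpy as np

def ddml_hinge(f1, f2, label, tau=2.0):
    # f1, f2: embeddings of a face pair; label: +1 same person, -1 different
    d2 = np.sum((f1 - f2) ** 2)
    # violated when a positive pair is farther than tau - 1,
    # or a negative pair is closer than tau + 1
    return max(0.0, 1.0 - label * (tau - d2))
```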

730 citations