Journal ArticleDOI

Memetically Optimized MCWLD for Matching Sketches With Digital Face Images

TL;DR: An automated algorithm to extract discriminating information from local regions of both sketches and digital face images is presented and yields better identification performance compared to existing face recognition algorithms and two commercial face recognition systems.
Abstract: One of the important cues in solving crimes and apprehending criminals is matching sketches with digital face images. This paper presents an automated algorithm to extract discriminating information from local regions of both sketches and digital face images. Structural information, along with minute details present in local facial regions, is encoded using the multiscale circular Weber's local descriptor (MCWLD). Further, an evolutionary memetic optimization algorithm is proposed to assign an optimal weight to every local facial region to boost the identification performance. Since forensic sketches or digital face images can be of poor quality, a preprocessing technique is used to enhance the quality of images and improve the identification performance. Comprehensive experimental evaluation on different sketch databases shows that the proposed algorithm yields better identification performance compared to existing face recognition algorithms and two commercial face recognition systems.
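To make the descriptor named in the abstract more concrete, the following is a minimal sketch of the differential excitation term of the standard Weber's local descriptor, computed at a single pixel. It assumes a plain 3x3 square neighbourhood and a grayscale image stored as a list of lists; the circular sampling, the multiple scales, and the region weighting described in the abstract are not reproduced, and the function name is hypothetical.

```python
import math


def differential_excitation(image, row, col):
    """Differential excitation of Weber's local descriptor at one interior pixel.

    Assumption: a plain 3x3 square neighbourhood is used here, not the
    circular multiscale sampling that the abstract describes.
    """
    center = image[row][col]
    # Sum the intensity differences between the 8 neighbours and the centre.
    diff_sum = 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            diff_sum += image[row + dr][col + dc] - center
    # Equivalent to atan(diff_sum / center) for a positive centre value;
    # atan2 also copes with a zero-valued centre pixel.
    return math.atan2(diff_sum, center)
```

A flat patch gives an excitation of zero, while a dark pixel surrounded by brighter neighbours gives a positive value, which is the relative-contrast behaviour the descriptor is built on.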
Citations
Journal ArticleDOI
TL;DR: It is suggested that the high-level features of Deep Convolutional Neural Networks trained in visual spectra images are potentially domain independent and can be used to encode faces sensed in different image domains.
Abstract: The task of Heterogeneous Face Recognition consists of matching face images that are sensed in different domains, such as sketches to photographs (visual spectra images), thermal images to photographs, or near-infrared images to photographs. In this paper, we suggest that the high-level features of Deep Convolutional Neural Networks trained in visual spectra images are potentially domain independent and can be used to encode faces sensed in different image domains. A generic framework for Heterogeneous Face Recognition is proposed by adapting the low-level features of Deep Convolutional Neural Networks in so-called Domain Specific Units. The adaptation using the Domain Specific Units allows the learning of shallow feature detectors specific for each new image domain. Furthermore, it handles its transformation to a generic face space shared between all image domains. Experiments carried out with four different face databases covering three different image domains show substantial improvements, in terms of recognition rate, surpassing the state-of-the-art for most of them. This work is made reproducible: all the source code, scores, and trained models of this approach are made publicly available.

63 citations

Journal ArticleDOI
TL;DR: In this paper, a novel superpixel-based face sketch–photo synthesis method is presented that estimates the face structures through image segmentation: face images are first segmented into superpixels, which are then dilated to enhance the compatibility of neighboring superpixels.
Abstract: Face sketch–photo synthesis technique has attracted growing attention in many computer vision applications, such as law enforcement and digital entertainment. Existing methods either simply perform the face sketch–photo synthesis on the holistic image or divide the face image into regular rectangular patches ignoring the inherent structure of the face image. In view of such situations, this paper presents a novel superpixel-based face sketch–photo synthesis method by estimating the face structures through image segmentation. In our proposed method, face images are first segmented into superpixels, which are then dilated to enhance the compatibility of neighboring superpixels. Each input face image induces a specific graphical structure modeled by Markov networks. We employ a two-stage synthesis process to learn the face structures through Markov networks constructed from two scales of dilation, respectively. Experiments on several public databases demonstrate that our proposed face sketch–photo synthesis method achieves superior performance compared with the state-of-the-art methods.

61 citations


Cites methods from "Memetically Optimized MCWLD for Mat..."

  • ...[24] used modified Weber’s local descriptor and memetic optimization and achieved an accuracy of 84....

    [...]

  • ...We demonstrate the effectiveness of the proposed method on three face sketch databases: 1) the Chinese University of Hong Kong (CUHK) face sketch (CUFS) database [5]; 2) the CUHK face sketch FERET (CUFSF) database [23]; and 3) the IIIT-D viewed sketch database [24], and show that the proposed S-FSPS method achieves superior performance compared with the state-of-the-art methods....

    [...]

  • ...Then, we validate the superior performance of our method compared with the state-of-the-art synthesis methods in terms of both qualitative and quantitative experiments on three public face sketch databases: 1) the CUFS database [5]; 2) the CUFSF database [23]; and 3) the IIIT-D viewed sketch database [24]....

    [...]

Journal ArticleDOI
TL;DR: This paper proposes the mutual component convolutional neural network (MC-CNN), a modal-invariant deep learning framework, to tackle the large modality discrepancy and insufficient training samples of HFR.
Abstract: Heterogeneous face recognition (HFR) aims to identify a person from different facial modalities, such as visible and near-infrared images. The main challenges of HFR lie in the large modality discrepancy and insufficient training samples. In this paper, we propose the mutual component convolutional neural network (MC-CNN), a modal-invariant deep learning framework, to tackle these two issues simultaneously. Our MC-CNN incorporates a generative module, i.e., the mutual component analysis (MCA), into modern deep CNNs by viewing MCA as a special fully connected (FC) layer. Based on deep features, this FC layer is designed to extract modal-independent hidden factors and is updated according to maximum likelihood analytic formulation instead of back propagation which prevents overfitting from limited data naturally. In addition, we develop an MCA loss to update the network for modal-invariant feature learning. Extensive experiments show that our MC-CNN outperforms several fine-tuned baseline models significantly. Our methods achieve the state-of-the-art performance on the CASIA NIR-VIS 2.0, CUHK NIR-VIS, and IIIT-D Sketch datasets.

58 citations


Cites background from "Memetically Optimized MCWLD for Mat..."

  • ...0 [32], CUHK VIS-NIR, IIIT-D Sketch [40], and CUHK Face Sketch (CUFS) [41]....

    [...]

  • ...We then conduct extensive experiments on CASIA NIR-VIS 2.0 [32], CUHK VIS-NIR, IIIT-D Sketch [40], and CUHK Face Sketch (CUFS) [41]....

    [...]

Proceedings ArticleDOI
12 Dec 2016
TL;DR: This paper addresses the memory problem head on by introducing a database of 400 forensic sketches created at different time-delays and builds a model to reverse the forgetting process, and shows that it is possible to systematically "un-forget" facial details.
Abstract: We investigate whether it is possible to improve the performance of automated facial forensic sketch matching by learning from examples of facial forgetting over time. Forensic facial sketch recognition is a key capability for law enforcement, but remains an unsolved problem. It is extremely challenging because there are three distinct contributors to the domain gap between forensic sketches and photos: the well-studied sketch-photo modality gap, and the less studied gaps due to (i) the forgetting process of the eye-witness and (ii) their inability to elucidate their memory. In this paper, we address the memory problem head on by introducing a database of 400 forensic sketches created at different time-delays. Based on this database we build a model to reverse the forgetting process. Surprisingly, we show that it is possible to systematically "un-forget" facial details. Moreover, it is possible to apply this model to dramatically improve forensic sketch recognition in practice: we achieve state-of-the-art results when matching 195 benchmark forensic sketches against corresponding photos and a 10,030 mugshot database.

57 citations


Cites background or methods from "Memetically Optimized MCWLD for Mat..."

  • ...The main sketch/photo databases are 159 pairs identified by [12], and 190 pairs in the IIITD database [2]....

    [...]

  • ...Motivated by this, the computer vision [12] and biometrics [2] fields have extensively studied sketch to photo face matching....

    [...]

  • ...In computer vision, facial sketch-photo matching has been studied extensively using a variety of approaches including invariant feature engineering [1, 2, 4, 12], crossmodal regression/synthesis [22, 23] and shared subspace learning [20]....

    [...]

  • ...The cross-modal sketch-photo gap is thus small, and viewed sketches are relatively easy to match – resulting in benchmark performance saturated at near-perfect [1, 2, 4, 12]....

    [...]

  • ...Later studies such as [2] improved these results, again combining feature engineering (Weber and Wavelet descriptors) plus the discriminative learning (genetic algorithms) strategy to maximise matching accuracy....

    [...]

Book ChapterDOI
01 Nov 2014
TL;DR: This paper investigates sketch-photo face matching and goes beyond the well-studied viewed sketches to tackle forensic sketches and caricatures where representations are often symbolic, and learns a facial attribute model independently in each domain that represents faces in terms of semantic properties.
Abstract: Matching face images across different modalities is a challenging open problem for various reasons, notably feature heterogeneity, and particularly in the case of sketch recognition – abstraction, exaggeration and distortion. Existing studies have attempted to address this task by engineering invariant features, or learning a common subspace between the modalities. In this paper, we take a different approach and explore learning a mid-level representation within each domain that allows faces in each modality to be compared in a domain invariant way. In particular, we investigate sketch-photo face matching and go beyond the well-studied viewed sketches to tackle forensic sketches and caricatures where representations are often symbolic. We approach this by learning a facial attribute model independently in each domain that represents faces in terms of semantic properties. This representation is thus more invariant to heterogeneity, distortions and robust to mis-alignment. Our intermediate level attribute representation is then integrated synergistically with the original low-level features using CCA. Our framework shows impressive results on cross-modal matching tasks using forensic sketches, and even more challenging caricature sketches. Furthermore, we create a new dataset with ≈59,000 attribute annotations for evaluation and to facilitate future research.

53 citations


Cites background from "Memetically Optimized MCWLD for Mat..."

  • ...Later studies such as [13] improved these results, again combining feature engineering (Weber and Wavelet descriptors) plus discriminative learning (genetic algorithms) strategy to maximize matching accuracy; while [16] followed up also with feature engineering (LBP) and discriminative learning (RS-LDA)....

    [...]

  • ...Alternatively, matrices W may also be learned by discriminative models [1, 8, 13] to maximize matching rate....

    [...]

  • ...In each case, strategies to bridge the cross-modal gap broadly break down into four categories: (i) those that learn a cross-modal mapping to synthesise one modality from the other, and then perform within-modality matching [10, 11], (ii) those that learn a common subspace where the two modalities are more comparable [12], (iii) those that learn discriminative models to maximise matching accuracy [1, 13], and (iv) those that engineer features which are simultaneously invariant to the details of each modality, while being variant to person identity [14, 4]....

    [...]

  • ...(1), where |·| indicates some distance metric such as L1, L2 [1] or χ² [13, 14]....

    [...]
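The last excerpt above names three distance metrics for comparing feature vectors. As a rough illustration, the sketch below implements each of them for two equal-length histograms in plain Python. The function names are hypothetical, and the chi-square form shown (without a leading factor of 1/2) is only one of several common conventions; the excerpt does not say which variant the cited works use.

```python
import math


def l1_distance(p, q):
    """Sum of absolute bin differences (Manhattan distance)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def l2_distance(p, q):
    """Euclidean distance between the two histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def chi_square_distance(p, q):
    """Chi-square distance: squared bin difference divided by the bin sum.

    Bins that are empty in both histograms are skipped to avoid 0/0.
    """
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)
```

Compared with L1 and L2, the chi-square form divides each squared difference by the bin total, so a given difference counts for more in sparsely populated bins than in heavily populated ones.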

References
Journal ArticleDOI
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.

46,906 citations

01 Jan 2011
TL;DR: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and consequently match distinctive invariant features from images that can then be used to reliably match objects in differing images.
Abstract: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and consequently match distinctive invariant features from images. These features can then be used to reliably match objects in differing images. The algorithm was first proposed by Lowe [12] and further developed to increase performance resulting in the classic paper [13] that served as foundation for SIFT which has played an important role in robotic and machine vision in the past decade.

14,708 citations


Additional excerpts

  • ...On the other hand, sparse descriptor such as Scale Invariant Feature Transform (SIFT ) [23] is based on interest point detection and computing the descriptor in the vicinity of detected interest points....

    [...]

Journal ArticleDOI
TL;DR: This work covers the continuous and discrete wavelet transforms, orthonormal bases of wavelets and multiresolution analysis, and the regularity and symmetry of compactly supported wavelets.
Abstract: Introduction; Preliminaries and notation; The what, why, and how of wavelets; The continuous wavelet transform; Discrete wavelet transforms: Frames; Time-frequency density and orthonormal bases; Orthonormal bases of wavelets and multiresolutional analysis; Orthonormal bases of compactly supported wavelets; More about the regularity of compactly supported wavelets; Symmetry for compactly supported wavelet bases; Characterization of functional spaces by means of wavelets; Generalizations and tricks for orthonormal wavelet bases; References; Indexes.

14,157 citations

Journal ArticleDOI
TL;DR: This paper presents a novel and efficient facial image representation based on local binary pattern (LBP) texture features that is assessed in the face recognition problem under different challenges.
Abstract: This paper presents a novel and efficient facial image representation based on local binary pattern (LBP) texture features. The face image is divided into several regions from which the LBP feature distributions are extracted and concatenated into an enhanced feature vector to be used as a face descriptor. The performance of the proposed method is assessed in the face recognition problem under different challenges. Other applications and several extensions are also discussed.

5,563 citations
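The region-wise LBP representation described in the abstract above can be sketched roughly as follows. This is a simplified illustration, assuming the basic 8-neighbour, 3x3 LBP operator, a regular grid of regions, and a grayscale image stored as a list of lists; the function names, the grid layout, and the handling of border pixels are assumptions, not details taken from the paper.

```python
def lbp_code(image, row, col):
    """Basic 8-bit LBP code of one interior pixel (3x3 neighbourhood)."""
    center = image[row][col]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image, top, left, height, width):
    """256-bin histogram of LBP codes inside one rectangular region.

    Pixels on the image border are skipped because they lack a full
    3x3 neighbourhood.
    """
    hist = [0] * 256
    for r in range(max(top, 1), min(top + height, len(image) - 1)):
        for c in range(max(left, 1), min(left + width, len(image[0]) - 1)):
            hist[lbp_code(image, r, c)] += 1
    return hist


def face_descriptor(image, grid_rows, grid_cols):
    """Concatenate the per-region LBP histograms into one feature vector."""
    region_h = len(image) // grid_rows
    region_w = len(image[0]) // grid_cols
    descriptor = []
    for gr in range(grid_rows):
        for gc in range(grid_cols):
            descriptor.extend(
                lbp_histogram(image, gr * region_h, gc * region_w,
                              region_h, region_w))
    return descriptor
```

Keeping one histogram per region, instead of a single histogram for the whole face, preserves coarse spatial layout: the same texture pattern around the eyes and around the mouth lands in different parts of the concatenated vector.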

01 Jan 1998

3,650 citations