Proceedings Article

SWAPPED! Digital face presentation attack detection via weighted local magnitude pattern

TL;DR: A novel database, termed the SWAPPED Digital Attack Video Face Database, is presented. It was prepared using Snapchat's application, which swaps/stitches two faces and creates videos, and it contains bonafide face videos and face-swapped videos of multiple subjects.
Abstract: Advancements in smartphone applications have empowered even non-technical users to perform sophisticated operations, such as face morphing, in a few taps. While such capabilities have positive effects, on the negative side, anyone can now digitally attack face (biometric) recognition systems. For example, Snapchat's face swapping feature can easily create “swapped” identities and circumvent a face recognition system. This research presents a novel database, termed the SWAPPED Digital Attack Video Face Database, prepared using Snapchat's application, which swaps/stitches two faces and creates videos. The database contains bonafide face videos and face-swapped videos of multiple subjects. Baseline face recognition experiments using a commercial system show over 90% rank-1 accuracy when attack videos are used as probes. As a second contribution, this research also presents a novel presentation attack detection algorithm based on a Weighted Local Magnitude Pattern feature descriptor, which outperforms several existing approaches.
Citations
Posted Content
TL;DR: This paper presents the first publicly available set of Deepfake videos generated from videos of the VidTIMIT database, and demonstrates that GAN-generated Deepfake videos are challenging for both face recognition systems and existing detection methods.
Abstract: It is becoming increasingly easy to automatically replace the face of one person in a video with the face of another person by using a pre-trained generative adversarial network (GAN). Recent public scandals, e.g., the faces of celebrities being swapped onto pornographic videos, call for automated ways to detect these Deepfake videos. To help develop such methods, in this paper we present the first publicly available set of Deepfake videos generated from videos of the VidTIMIT database. We used open source software based on GANs to create the Deepfakes, and we emphasize that training and blending parameters can significantly impact the quality of the resulting videos. To demonstrate this impact, we generated videos with low and high visual quality (320 videos each) using differently tuned parameter sets. We showed that state-of-the-art face recognition systems based on VGG and Facenet neural networks are vulnerable to Deepfake videos, with 85.62% and 95.00% false acceptance rates respectively, which means methods for detecting Deepfake videos are necessary. By considering several baseline approaches, we found that an audio-visual approach based on lip-sync inconsistency detection was not able to distinguish Deepfake videos. The best performing method, which is based on visual quality metrics and is often used in the presentation attack detection domain, resulted in an 8.97% equal error rate on high quality Deepfakes. Our experiments demonstrate that GAN-generated Deepfake videos are challenging for both face recognition systems and existing detection methods, and the further development of face swapping technology will make it even more so.
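The abstract above reports false acceptance rates and an equal error rate (EER). As a rough illustration of how such figures are typically derived from detector scores, here is a minimal Python sketch. It is not the authors' evaluation code: the function names, the example scores, and the convention that a higher score means "more likely genuine" are all assumptions made for this example.

```python
def error_rates(genuine, attack, threshold):
    """Return (FAR, FRR) when a sample is accepted if score >= threshold.

    FAR: fraction of attack samples wrongly accepted.
    FRR: fraction of genuine samples wrongly rejected.
    """
    far = sum(s >= threshold for s in attack) / len(attack)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, attack):
    """Approximate the EER by sweeping every observed score as a threshold
    and keeping the point where FAR and FRR are closest."""
    best = None
    for t in sorted(set(genuine) | set(attack)):
        far, frr = error_rates(genuine, attack, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical scores, purely for illustration.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
attack_scores = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine_scores, attack_scores)
```

With only a handful of scores the sweep is coarse; real evaluations usually interpolate between thresholds on much larger score sets.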

369 citations


Cites background or methods from "SWAPPED! Digital face presentation ..."

  • ...We also applied several baseline methods from presentation attack detection domain, by treating Deepfake videos as digital presentation attacks [8], including simple principal component analysis (PCA) and linear discriminant analysis...

    [...]

  • ...com/news/technology-42912529 responding to the public demand to detect face swapping technology, researchers are starting to work on databases and detection methods, including image and video data [6] generated with an older face swapping approach Face2Face [7] or videos collected using Snapchat application [8]....

    [...]

  • ...[8] and evaluated on the videos collected by the authors with Snapchat phone application....

    [...]

Journal Article
TL;DR: This article explores the creation and detection of deepfakes and provides an in-depth view as to how these architectures work and the current trends and advancements in this domain.
Abstract: Generative deep learning algorithms have progressed to a point where it is difficult to tell the difference between what is real and what is fake. In 2018, it was discovered how easy it is to use this technology for unethical and malicious applications, such as the spread of misinformation, impersonation of political leaders, and the defamation of innocent individuals. Since then, these “deepfakes” have advanced significantly. In this paper, we explore the creation and detection of deepfakes and provide an in-depth view of how these architectures work. The purpose of this survey is to provide the reader with a deeper understanding of (1) how deepfakes are created and detected, (2) the current trends and advancements in this domain, (3) the shortcomings of the current defense solutions, and (4) the areas which require further research and attention.

211 citations


Cites background from "SWAPPED! Digital face presentation ..."

  • ...Models which identify blurred content [94] are affected by noise and sharpening GANs [63, 71], and models which search for the boundary where the face was blended in [4, 8, 38, 81, 94, 163] do not work on deepfakes passed through refiner networks or those which output full frames (e....

    [...]

  • ...The authors of [4, 8, 38, 94, 163] used edge detectors, quality measures, and frequency analysis to detect artifacts in the pasted content and borders....

    [...]

Journal Article
TL;DR: A conceptual categorization and metrics for an evaluation of such methods are presented, followed by a comprehensive survey of relevant publications, and technical considerations and tradeoffs of the surveyed methods are discussed.
Abstract: Recently, researchers found that the intended generalizability of (deep) face recognition systems increases their vulnerability against attacks. In particular, the attacks based on morphed face images pose a severe security risk to face recognition systems. In the last few years, the topic of (face) image morphing and automated morphing attack detection has sparked the interest of several research laboratories working in the field of biometrics and many different approaches have been published. In this paper, a conceptual categorization and metrics for an evaluation of such methods are presented, followed by a comprehensive survey of relevant publications. In addition, technical considerations and tradeoffs of the surveyed methods are discussed along with open issues and challenges in the field.

191 citations


Cites methods from "SWAPPED! Digital face presentation ..."

  • ...[74] propose to train an SVM with Weighted Local Magnitude Pattern....

    [...]

Journal Article
TL;DR: This survey explores how deepfakes are created and detected, covering the current trends and advancements in this domain, the shortcomings of the current defense solutions, and the areas that require further research and attention.
Abstract: Generative deep learning algorithms have progressed to a point where it is difficult to tell the difference between what is real and what is fake. In 2018, it was discovered how easy it is to use this technology for unethical and malicious applications, such as the spread of misinformation, impersonation of political leaders, and the defamation of innocent individuals. Since then, these “deepfakes” have advanced significantly. In this article, we explore the creation and detection of deepfakes and provide an in-depth view as to how these architectures work. The purpose of this survey is to provide the reader with a deeper understanding of (1) how deepfakes are created and detected, (2) the current trends and advancements in this domain, (3) the shortcomings of the current defense solutions, and (4) the areas that require further research and attention.

163 citations

Journal Article
TL;DR: This paper attempts to unravel three aspects of the robustness of DNNs for face recognition: assessing the impact of deep architectures in terms of vulnerabilities to attacks, detecting the singularities by characterizing abnormal filter response behavior in the hidden layers of deep networks, and making corrections to the processing pipeline to alleviate the problem.
Abstract: Deep neural network (DNN) architecture based models have high expressive power and learning capacity. However, they are essentially a black box method since it is not easy to mathematically formulate the functions that are learned within its many layers of representation. Realizing this, many researchers have started to design methods to exploit the drawbacks of deep learning based algorithms, questioning their robustness and exposing their singularities. In this paper, we attempt to unravel three aspects related to the robustness of DNNs for face recognition: (i) assessing the impact of deep architectures for face recognition in terms of vulnerabilities to attacks, (ii) detecting the singularities by characterizing abnormal filter response behavior in the hidden layers of deep networks; and (iii) making corrections to the processing pipeline to alleviate the problem. Our experimental evaluation using multiple open-source DNN-based face recognition networks, and three publicly available face databases demonstrates that the performance of deep learning based face recognition algorithms can suffer greatly in the presence of such distortions. We also evaluate the proposed approaches on four existing quasi-imperceptible distortions: DeepFool, Universal adversarial perturbations, $l_2$, and Elastic-Net (EAD). The proposed method is able to detect both types of attacks with very high accuracy by suitably designing a classifier using the response of the hidden layers in the network. Finally, we present effective countermeasures to mitigate the impact of adversarial attacks and improve the overall robustness of DNN-based face recognition.

98 citations


Cites background from "SWAPPED! Digital face presentation ..."

  • ...However, Raghavendra et al. (2017) and Agarwal et al. (2017b) have prepared a database for multispectral spoofing and reported that even such systems are not immune to presentation attacks....

    [...]

References
Book
Vladimir Vapnik
01 Jan 1995
TL;DR: Covers the setting of the learning problem, consistency of learning processes, bounds on the rate of convergence of learning processes, controlling the generalization ability of learning processes, constructing learning algorithms, and what is important in learning theory.
Abstract: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?

40,147 citations


"SWAPPED! Digital face presentation ..." refers methods in this paper

  • ...• Figure 7 shows the SVM score distribution of bonafide and attack videos pertaining to the proposed feature, LBP, and LPQ. LBP histogram feature provides the second lowest ACER value both for video and frame based detection....

    [...]

  • ...The extracted feature vectors from training data are provided to the supervised Support Vector Machine (SVM) [26] classifier to learn the presentation attack detection model....

    [...]

Journal Article
01 Nov 1973
TL;DR: These results indicate that the easily computable textural features based on gray-tone spatial dependencies probably have a general applicability for a wide variety of image-classification applications.
Abstract: Texture is one of the important characteristics used in identifying objects or regions of interest in an image, whether the image be a photomicrograph, an aerial photograph, or a satellite image. This paper describes some easily computable textural features based on gray-tone spatial dependencies, and illustrates their application in category-identification tasks of three different kinds of image data: photomicrographs of five kinds of sandstones, 1:20 000 panchromatic aerial photographs of eight land-use categories, and Earth Resources Technology Satellite (ERTS) multispectral imagery containing seven land-use categories. We use two kinds of decision rules: one for which the decision regions are convex polyhedra (a piecewise linear decision rule), and one for which the decision regions are rectangular parallelepipeds (a min-max decision rule). In each experiment the data set was divided into two parts, a training set and a test set. Test set identification accuracy is 89 percent for the photomicrographs, 82 percent for the aerial photographic imagery, and 83 percent for the satellite imagery. These results indicate that the easily computable textural features probably have a general applicability for a wide variety of image-classification applications.
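To make "gray-tone spatial dependencies" concrete, the following Python sketch counts how often pairs of gray tones co-occur at a fixed pixel offset and derives one contrast-style feature from those counts. This is a simplified illustration, not the paper's exact formulation: the function names and the toy image are assumptions, and the counts here are directional (one offset only) rather than symmetric.

```python
from collections import Counter


def cooccurrence(image, dx=1, dy=0):
    """Count gray-tone pairs (a, b) where b sits at offset (dx, dy) from a."""
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    return counts


def contrast(counts):
    """Sum of (a - b)^2 weighted by the normalised frequency of each pair."""
    total = sum(counts.values())
    return sum(((a - b) ** 2) * n for (a, b), n in counts.items()) / total


# Toy 4x4 image with four gray tones (0..3), chosen only for illustration.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
glcm = cooccurrence(image)          # horizontal neighbours, one pixel apart
texture_contrast = contrast(glcm)   # larger when neighbouring tones differ more
```

A smooth region concentrates counts on pairs with equal tones and gives low contrast; sharp local variation pushes counts toward unequal pairs and raises it.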

20,442 citations


"SWAPPED! Digital face presentation ..." refers background in this paper

  • ...• It is interesting to observe that the combination of Haralick+RDWT, which yields lowest EER on physical spoofing database, provides the highest EER value of 25.6% in video based attack detection....

    [...]

  • ...Therefore, we have compared the performance of the proposed algorithm with seven different textural feature based algorithms: LBP* [18], Rotation Invariant Uniform LBP (RIULBP)* [20], Complete LBP (CLBP)* [14], Uniform LBP (ULBP)*, LPQ [21], BSIF [17], and Combination of Redundant Discrete Wavelet Transform (RDWT) [11] with Haralick [15] proposed in [1]....

    [...]

  • ...(CLBP)* [14], Uniform LBP (ULBP)*, LPQ [21], BSIF [17], and Combination of Redundant Discrete Wavelet Transform (RDWT) [11] with Haralick [15] proposed in [1]....

    [...]

Journal Article
TL;DR: A generalized gray-scale and rotation invariant operator presentation that allows for detecting the "uniform" patterns for any quantization of the angular space and for any spatial resolution and presents a method for combining multiple operators for multiresolution analysis.
Abstract: Presents a theoretically very simple, yet efficient, multiresolution approach to gray-scale and rotation invariant texture classification based on local binary patterns and nonparametric discrimination of sample and prototype distributions. The method is based on recognizing that certain local binary patterns, termed "uniform," are fundamental properties of local image texture and their occurrence histogram is proven to be a very powerful texture feature. We derive a generalized gray-scale and rotation invariant operator presentation that allows for detecting the "uniform" patterns for any quantization of the angular space and for any spatial resolution and presents a method for combining multiple operators for multiresolution analysis. The proposed approach is very robust in terms of gray-scale variations since the operator is, by definition, invariant against any monotonic transformation of the gray scale. Another advantage is computational simplicity as the operator can be realized with a few operations in a small neighborhood and a lookup table. Experimental results demonstrate that good discrimination can be achieved with the occurrence statistics of simple rotation invariant local binary patterns.
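The local binary pattern operator and the "uniform" pattern test described in this abstract can be sketched in a few lines of Python. This is a minimal illustration for the basic 3x3, 8-neighbour case only; the function names and the neighbour ordering are choices made for this example, and the multiresolution and rotation-invariant variants from the paper are not covered.

```python
def lbp_code(patch):
    """8-bit LBP code of a 3x3 patch: threshold the 8 neighbours at the centre."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left pixel.
    ring = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
            patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    bits = [1 if value >= center else 0 for value in ring]
    return sum(bit << i for i, bit in enumerate(bits))


def transitions(code, n_bits=8):
    """Number of 0/1 changes when the bit pattern is read circularly."""
    bits = [(code >> i) & 1 for i in range(n_bits)]
    return sum(bits[i] != bits[(i + 1) % n_bits] for i in range(n_bits))


def is_uniform(code, n_bits=8):
    """A pattern is 'uniform' when it has at most two circular transitions."""
    return transitions(code, n_bits) <= 2


# An edge-like patch: bright on top and right, dark on the bottom and left.
patch = [
    [9, 9, 9],
    [1, 5, 9],
    [1, 1, 1],
]
code = lbp_code(patch)
```

Because each neighbour is only compared with the centre, any monotonic increasing change of gray scale (for example, doubling every pixel value) leaves the code unchanged, which is the invariance the abstract refers to.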

14,245 citations


"SWAPPED! Digital face presentation ..." refers background or methods in this paper

  • ...[20] have reported that sometimes more than 90% of the texture surfaces are uniform....

    [...]

  • ...Therefore, we have compared the performance of the proposed algorithm with seven different textural feature based algorithms: LBP* [18], Rotation Invariant Uniform LBP (RIULBP)* [20], Complete LBP...

    [...]

Journal Article
TL;DR: This paper describes a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates. Implemented on a conventional desktop, face detection proceeds at 15 frames per second.
Abstract: This paper describes a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates. There are three key contributions. The first is the introduction of a new image representation called the “Integral Image” which allows the features used by our detector to be computed very quickly. The second is a simple and efficient classifier which is built using the AdaBoost learning algorithm (Freund and Schapire, 1995) to select a small number of critical visual features from a very large set of potential features. The third contribution is a method for combining classifiers in a “cascade” which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions. A set of experiments in the domain of face detection is presented. The system yields face detection performance comparable to the best previous systems (Sung and Poggio, 1998; Rowley et al., 1998; Schneiderman and Kanade, 2000; Roth et al., 2000). Implemented on a conventional desktop, face detection proceeds at 15 frames per second.
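The "Integral Image" named as the first contribution is commonly implemented as a summed-area table, which lets the sum over any rectangle be read with four lookups. The Python sketch below illustrates that idea under stated assumptions: the function names and the padding with an extra zero row and column are choices made for this example, not details taken from the paper.

```python
def integral_image(image):
    """Summed-area table with an extra zero row and column for easy lookups.

    table[r][c] holds the sum of all pixels in rows < r and columns < c.
    """
    rows, cols = len(image), len(image[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += image[r][c]
            table[r + 1][c + 1] = table[r][c + 1] + row_sum
    return table


def rect_sum(table, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1."""
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
table = integral_image(image)
lower_right = rect_sum(table, 1, 1, 3, 3)  # pixels 5, 6, 8, 9
```

Once the table is built in a single pass, each rectangle sum costs the same regardless of rectangle size, which is what makes rectangle-based features cheap to evaluate at many positions and scales.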

13,037 citations

Journal Article
TL;DR: This work describes a method for building models by learning patterns of variability from a training set of correctly annotated images that can be used for image search in an iterative refinement algorithm analogous to that employed by Active Contour Models (Snakes).

7,969 citations


"SWAPPED! Digital face presentation ..." refers methods in this paper

  • ...To make the change more accurate and precise, key point location of the facial features such as eye, mouth, and face boundary are detected using Active Shape Model (ASM) [5]....

    [...]