Posted Content

Generalized Facial Manipulation Detection with Edge Region Feature Extraction.

TL;DR: Zhang et al. as mentioned in this paper proposed a facial forensic framework that utilizes pixel-level color features appearing in the edge region of the whole image and includes a 3D-CNN classification model that interprets the extracted color features spatially and temporally.
Abstract: This paper presents a generalized and robust face manipulation detection method based on the edge region features appearing in images. Most contemporary face synthesis processes include color awkwardness reduction but damage the natural fingerprint in the edge region. In addition, these color correction processes do not proceed in the non-face background region. We also observe that the synthesis process does not consider the natural properties of the image appearing in the time domain. Considering these observations, we propose a facial forensic framework that utilizes pixel-level color features appearing in the edge region of the whole image. Furthermore, our framework includes a 3D-CNN classification model that interprets the extracted color features spatially and temporally. Unlike other existing studies, we conduct authenticity determination by considering all features extracted from multiple frames within one video. Through extensive experiments, including real-world scenarios to evaluate generalized detection ability, we show that our framework outperforms state-of-the-art facial manipulation detection technologies in terms of accuracy and robustness.
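As a rough illustration of the kind of edge-region, pixel-level color feature the abstract describes, the sketch below selects a band around Canny edges and summarizes per-channel color statistics inside it. The feature definition, thresholds, and band width are assumptions for illustration, not the authors' published formulation.

```python
import cv2
import numpy as np

def edge_region_color_features(frame_bgr, low=100, high=200, dilate_px=3):
    """Sample per-channel color statistics around edge pixels of one frame.
    Illustrative assumption only, not the paper's exact feature definition."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, low, high)                 # binary edge map
    kernel = np.ones((dilate_px, dilate_px), np.uint8)
    region = cv2.dilate(edges, kernel) > 0             # widen edges into a band
    pixels = frame_bgr[region].astype(np.float32)      # (N, 3) BGR values on the band
    # simple pixel-level color summary of the edge region
    return np.concatenate([pixels.mean(axis=0), pixels.std(axis=0)])

# Stacking such features over consecutive frames would give the kind of
# spatio-temporal input a 3D-CNN classifier could consume.
```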
Citations
Proceedings ArticleDOI
01 Jun 2022
TL;DR: In this paper, a key point-based activity recognition framework is presented that extracts complex static and movement-based features from key frames in videos; these features are used to predict a sequence of key-frame activities.
Abstract: We present a key point-based activity recognition framework, built upon pre-trained human pose estimation and facial feature detection models. Our method extracts complex static and movement-based features from key frames in videos, which are used to predict a sequence of key-frame activities. Finally, a merge procedure is employed to identify robust activity segments while ignoring outlier frame activity predictions. We analyze the different components of our framework via a wide array of experiments and draw conclusions with regards to the utility of the model and ways it can be improved. Results show our model is competitive, taking the 11th place out of 27 teams submitting to Track 3 of the 2022 AI City Challenge.
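The merge procedure the abstract mentions could, in a minimal form, group consecutive identical key-frame predictions into segments and discard short runs as outliers; the grouping rule and minimum run length below are illustrative assumptions, not the paper's exact procedure.

```python
from itertools import groupby

def merge_keyframe_predictions(labels, min_run=3):
    """Group consecutive identical key-frame activity labels into segments,
    dropping runs shorter than min_run as outlier predictions."""
    segments, start = [], 0
    for label, run in groupby(labels):
        length = len(list(run))
        if length >= min_run:
            segments.append((label, start, start + length - 1))
        start += length
    return segments

# Example: a single-frame outlier between two "walk" runs is ignored.
print(merge_keyframe_predictions(["walk"] * 5 + ["run"] + ["walk"] * 4))
```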

1 citation

Journal ArticleDOI
TL;DR: A novel triple complementary streams detector, TCSD, is proposed; it is designed to perceive depth information (DI), which is not utilized by previous methods, and includes two attention-based feature fusion modules that adaptively fuse information.
Abstract: Advancements in computer vision and deep learning have made it difficult to visually distinguish generated Deepfake media. While existing detection frameworks have achieved significant performance on challenging Deepfake datasets, these approaches consider only a single perspective. Moreover, a single view can neither cover complex scenarios nor fully exploit the correlation between multiple sources of information. In this paper, to mine a new view for Deepfake detection and exploit the correlation of the multi-view information contained in images, we propose a novel triple complementary streams detector, TCSD. Specifically, a novel depth estimator is first designed to perceive depth information (DI), which is not utilized by previous methods. Then, to supplement the depth information and obtain comprehensive forgery clues, we consider the incoherence between image foreground and background information (FBI) and the inconsistency between local and global information (LGI). In addition, an attention-based multi-scale feature extraction (MsFE) module is designed to perceive more complementary features from DI, FBI, and LGI. Finally, two attention-based feature fusion modules are proposed to adaptively fuse information. Extensive experimental results show that the proposed approach achieves state-of-the-art performance in detecting Deepfakes.
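A minimal sketch of adaptive, attention-based fusion over the three streams (DI, FBI, LGI) is shown below, assuming each stream has already been reduced to a fixed-length feature vector and is fused with learned softmax weights; the actual TCSD fusion modules are more elaborate.

```python
import torch
import torch.nn as nn

class StreamAttentionFusion(nn.Module):
    """Fuse three feature streams with learned attention weights.
    A generic illustration of adaptive fusion, not the paper's exact module."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)   # scores each stream's feature vector

    def forward(self, di, fbi, lgi):     # each input: (batch, dim)
        streams = torch.stack([di, fbi, lgi], dim=1)          # (batch, 3, dim)
        weights = torch.softmax(self.score(streams), dim=1)   # (batch, 3, 1)
        return (weights * streams).sum(dim=1)                 # (batch, dim)
```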

1 citation

Journal ArticleDOI
TL;DR: Li et al. as mentioned in this paper proposed a threshold classifier based on similarity scores obtained from a Deep Convolutional Neural Network (DCNN) trained for facial recognition; the scores are computed between faces extracted from questioned videos and reference material of the person depicted.
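A minimal sketch of such a threshold classifier, assuming L2-normalizable face embeddings are already available from a recognition DCNN; the cosine-similarity scoring, mean aggregation, and threshold value are illustrative assumptions.

```python
import numpy as np

def similarity_scores(query_embs, reference_embs):
    """Cosine similarities between face embeddings from a questioned video
    and reference embeddings of the depicted person."""
    q = query_embs / np.linalg.norm(query_embs, axis=1, keepdims=True)
    r = reference_embs / np.linalg.norm(reference_embs, axis=1, keepdims=True)
    return q @ r.T                      # (num_query, num_reference)

def is_manipulated(query_embs, reference_embs, threshold=0.4):
    """Flag the video when the mean similarity falls below a threshold.
    The decision rule and threshold are illustrative, not the paper's values."""
    return similarity_scores(query_embs, reference_embs).mean() < threshold
```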
Proceedings ArticleDOI
01 Jan 2023
TL;DR: Wang et al. as discussed by the authors proposed the Temporal Identity Inconsistency Network (TI2Net), a Deepfake detector that focuses on temporal identity inconsistency.
Abstract: In this paper, we propose the Temporal Identity Inconsistency Network (TI2Net), a Deepfake detector that focuses on temporal identity inconsistency. Specifically, TI2Net recognizes fake videos by capturing the dissimilarities of human faces among video frames of the same identity. Therefore, TI2Net is a reference-agnostic detector and can be used on unseen datasets. For a video clip of a given identity, identity information in all frames will first be encoded to identity vectors. TI2Net learns the temporal identity embedding from the temporal difference of the identity vectors. The temporal embedding, representing the identity inconsistency in the video clip, is finally used to determine the authenticity of the video clip. During training, TI2Net incorporates triplet loss to learn more discriminative temporal embeddings. We conduct comprehensive experiments to evaluate the performance of the proposed TI2Net. Experimental results indicate that TI2Net generalizes well to unseen manipulations and datasets with unseen identities. Besides, TI2Net also shows robust performance against compression and additive noise.
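A minimal sketch in the spirit of this design, assuming per-frame identity vectors from a face recognizer are already available: consecutive differences of the identity vectors are summarized by a recurrent encoder, and a triplet loss can be applied to the resulting temporal embedding. The layer sizes and encoder choice are assumptions, not the published TI2Net architecture.

```python
import torch
import torch.nn as nn

class TemporalIdentityEncoder(nn.Module):
    """Encode temporal identity inconsistency from per-frame identity vectors.
    A generic sketch, not the published TI2Net architecture."""
    def __init__(self, id_dim=512, emb_dim=128):
        super().__init__()
        self.rnn = nn.GRU(id_dim, emb_dim, batch_first=True)

    def forward(self, id_vectors):                      # (batch, frames, id_dim)
        diffs = id_vectors[:, 1:] - id_vectors[:, :-1]  # temporal identity differences
        _, hidden = self.rnn(diffs)                     # summarize the difference sequence
        return hidden[-1]                               # (batch, emb_dim) temporal embedding

# During training, a triplet loss can pull embeddings of same-label clips together.
triplet = nn.TripletMarginLoss(margin=1.0)
```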
References
Journal ArticleDOI
08 Dec 2014
TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
Abstract: We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to ½ everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
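The two-player game described above is commonly written as the following minimax objective, where p_data is the data distribution and p_z is the generator's input noise distribution:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] +
  \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

At the unique optimum referenced in the abstract, G reproduces the training data distribution and D(x) = 1/2 everywhere.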

38,211 citations

Journal ArticleDOI
TL;DR: There is a natural uncertainty principle between detection and localization performance, which are the two main goals, and with this principle a single operator shape is derived which is optimal at any scale.
Abstract: This paper describes a computational approach to edge detection. The success of the approach depends on the definition of a comprehensive set of goals for the computation of edge points. These goals must be precise enough to delimit the desired behavior of the detector while making minimal assumptions about the form of the solution. We define detection and localization criteria for a class of edges, and present mathematical forms for these criteria as functionals on the operator impulse response. A third criterion is then added to ensure that the detector has only one response to a single edge. We use the criteria in numerical optimization to derive detectors for several common image features, including step edges. On specializing the analysis to step edges, we find that there is a natural uncertainty principle between detection and localization performance, which are the two main goals. With this principle we derive a single operator shape which is optimal at any scale. The optimal detector has a simple approximate implementation in which edges are marked at maxima in gradient magnitude of a Gaussian-smoothed image. We extend this simple detector using operators of several widths to cope with different signal-to-noise ratios in the image. We present a general method, called feature synthesis, for the fine-to-coarse integration of information from operators at different scales. Finally we show that step edge detector performance improves considerably as the operator point spread function is extended along the edge.
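The simple approximate implementation mentioned in the abstract (edges marked at maxima of the gradient magnitude of a Gaussian-smoothed image, followed by hysteresis thresholding) corresponds to the standard Canny detector available in OpenCV; the blur size and thresholds below are arbitrary example values.

```python
import cv2

# Canny pipeline: Gaussian smoothing, gradient magnitude, non-maximum
# suppression along the gradient direction, and hysteresis thresholding.
image = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)
blurred = cv2.GaussianBlur(image, (5, 5), sigmaX=1.4)
edges = cv2.Canny(blurred, threshold1=50, threshold2=150)  # low/high hysteresis thresholds
cv2.imwrite("edges.png", edges)
```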

28,073 citations

Proceedings ArticleDOI
21 Jul 2017
TL;DR: DenseNet as mentioned in this paper proposes to connect each layer to every other layer in a feed-forward fashion, which can alleviate the vanishing gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters.
Abstract: Recent work has shown that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output. In this paper, we embrace this observation and introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with L layers have L connections—one between each layer and its subsequent layer—our network has L(L+1)/2 direct connections. For each layer, the feature-maps of all preceding layers are used as inputs, and its own feature-maps are used as inputs into all subsequent layers. DenseNets have several compelling advantages: they alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters. We evaluate our proposed architecture on four highly competitive object recognition benchmark tasks (CIFAR-10, CIFAR-100, SVHN, and ImageNet). DenseNets obtain significant improvements over the state-of-the-art on most of them, whilst requiring less memory and computation to achieve high performance. Code and pre-trained models are available at https://github.com/liuzhuang13/DenseNet.
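The L(L+1)/2 connectivity can be seen in a minimal dense block, where each layer receives the concatenation of all preceding feature maps. The sketch below is a generic PyTorch illustration of that connectivity, not the full DenseNet with transition layers.

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Minimal dense block: each layer takes the concatenation of all
    preceding feature maps as input and contributes growth_rate new maps."""
    def __init__(self, in_channels, growth_rate=32, num_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            channels = in_channels + i * growth_rate
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, growth_rate, kernel_size=3, padding=1, bias=False),
            ))

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            out = layer(torch.cat(features, dim=1))  # reuse all earlier feature maps
            features.append(out)
        return torch.cat(features, dim=1)
```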

27,821 citations

Posted Content
TL;DR: In this article, Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions based on adaptive estimates of lower-order moments, is introduced.
Abstract: We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.
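In standard notation, with g_t the stochastic gradient at step t, the adaptive moment estimates and the resulting parameter update are:

```latex
m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t, \qquad
v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^{2}

\hat{m}_t = \frac{m_t}{1 - \beta_1^{t}}, \qquad
\hat{v}_t = \frac{v_t}{1 - \beta_2^{t}}, \qquad
\theta_t = \theta_{t-1} - \frac{\alpha\, \hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}
```

The bias-corrected estimates compensate for the zero initialization of the moment accumulators, and the step size alpha, decay rates beta_1 and beta_2, and epsilon are the hyper-parameters noted above as typically requiring little tuning.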

23,486 citations

Proceedings ArticleDOI
François Chollet
21 Jul 2017
TL;DR: This work proposes a novel deep convolutional neural network architecture inspired by Inception, where Inception modules have been replaced with depthwise separable convolutions, and shows that this architecture, dubbed Xception, slightly outperforms Inception V3 on the ImageNet dataset, and significantly outperforms it on a larger image classification dataset.
Abstract: We present an interpretation of Inception modules in convolutional neural networks as being an intermediate step in-between regular convolution and the depthwise separable convolution operation (a depthwise convolution followed by a pointwise convolution). In this light, a depthwise separable convolution can be understood as an Inception module with a maximally large number of towers. This observation leads us to propose a novel deep convolutional neural network architecture inspired by Inception, where Inception modules have been replaced with depthwise separable convolutions. We show that this architecture, dubbed Xception, slightly outperforms Inception V3 on the ImageNet dataset (which Inception V3 was designed for), and significantly outperforms Inception V3 on a larger image classification dataset comprising 350 million images and 17,000 classes. Since the Xception architecture has the same number of parameters as Inception V3, the performance gains are not due to increased capacity but rather to a more efficient use of model parameters.
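The depthwise separable convolution described above (a per-channel depthwise convolution followed by a pointwise 1x1 convolution) can be written as a small PyTorch module; this is a generic sketch of the operation, not the exact layer configuration used in Xception.

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise convolution (one filter per input channel) followed by a
    pointwise 1x1 convolution that mixes channels."""
    def __init__(self, in_channels, out_channels, kernel_size=3, padding=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size,
                                   padding=padding, groups=in_channels, bias=False)
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))
```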

10,422 citations
