Open Access Proceedings Article (DOI)

ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis

TLDR
ForgeryNet, as discussed by the authors, is a large-scale dataset of 2.9 million images and 221,247 videos, covering manipulations (7 image-level approaches, 8 video-level approaches), perturbations (36 independent and further mixed perturbations), and annotations (6.3 million classification labels, 2.9 million manipulated-area annotations, and 221,247 temporal forgery segment labels).
Abstract
The rapid progress of photorealistic synthesis techniques has reached a critical point where the boundary between real and manipulated images starts to blur. Thus, benchmarking and advancing digital forgery analysis have become a pressing issue. However, existing face forgery datasets either have limited diversity or only support coarse-grained analysis. To counter this emerging threat, we construct the ForgeryNet dataset, an extremely large face forgery dataset with unified annotations in image- and video-level data across four tasks: 1) Image Forgery Classification, including two-way (real/fake), three-way (real/fake with identity-replaced forgery approaches/fake with identity-remained forgery approaches), and n-way (real and 15 respective forgery approaches) classification. 2) Spatial Forgery Localization, which segments the manipulated area of fake images compared to their corresponding real images. 3) Video Forgery Classification, which re-defines video-level forgery classification with manipulated frames in random positions. This task is important because attackers in the real world are free to manipulate any target frame. 4) Temporal Forgery Localization, which localizes the temporal segments that are manipulated. ForgeryNet is by far the largest publicly available deep face forgery dataset in terms of data scale (2.9 million images, 221,247 videos), manipulations (7 image-level approaches, 8 video-level approaches), perturbations (36 independent and further mixed perturbations), and annotations (6.3 million classification labels, 2.9 million manipulated-area annotations, and 221,247 temporal forgery segment labels). We perform extensive benchmarking and studies of existing face forensics methods and obtain several valuable observations. We hope that the scale, quality, and variety of our ForgeryNet dataset will foster further research and innovation in face forgery classification as well as spatial and temporal forgery localization.
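The three classification granularities in task 1 are nested: the n-way label determines the coarser two-way and three-way labels. A minimal sketch of that mapping follows; the assignment of approach IDs to the identity-replaced vs. identity-remained groups below is a placeholder assumption for illustration, not the dataset's actual split.

```python
# Illustrative sketch of ForgeryNet's hierarchical label scheme.
# n-way label: 0 = real, 1-15 = one of the forgery approaches.
# Which IDs are identity-replaced vs. identity-remained is hypothetical here.

ID_REPLACED = {1, 2, 3, 4, 5, 6, 7}            # hypothetical identity-replaced methods
ID_REMAINED = {8, 9, 10, 11, 12, 13, 14, 15}   # hypothetical identity-remained methods

def two_way(n_way_label: int) -> int:
    """Two-way task: 0 = real, 1 = fake."""
    return 0 if n_way_label == 0 else 1

def three_way(n_way_label: int) -> int:
    """Three-way task: 0 = real, 1 = identity-replaced fake, 2 = identity-remained fake."""
    if n_way_label == 0:
        return 0
    return 1 if n_way_label in ID_REPLACED else 2
```

Because the coarser labels are derivable from the n-way label, a single fine-grained annotation per sample suffices to supervise all three classification settings.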


Citations
Journal Article (DOI)

Deepfakes generation and detection: state-of-the-art, open challenges, countermeasures, and way forward

04 Jun 2022
TL;DR: In this article, the authors present a comprehensive review and detailed analysis of existing tools and machine learning (ML) based approaches for deepfake generation, and of the methodologies used to detect such manipulations in both audio and video.
Proceedings Article (DOI)

Leveraging Real Talking Faces via Self-Supervision for Robust Forgery Detection

TL;DR: This paper harnesses the natural correspondence between the visual and auditory modalities in real videos to learn temporally dense video representations that capture factors such as facial movements, expression, and identity, and suggests that leveraging natural and unlabelled videos is a promising direction for the development of more robust face forgery detectors.
Journal Article (DOI)

Countering Malicious DeepFakes: Survey, Battleground, and Horizon

TL;DR: This article provides a comprehensive overview and detailed analysis of research on DeepFake generation, DeepFake detection, and evasion of DeepFake detection, with more than 318 research papers carefully surveyed.
Journal Article (DOI)

ForgeryNIR: Deep Face Forgery and Detection in Near-Infrared Scenario

TL;DR: This paper presents an attempt at constructing a large-scale dataset for face forgery detection in the near-infrared modality and proposes a new forgery detection method based on knowledge distillation, named cross-modality knowledge distillation, which uses a teacher model pre-trained on visible-light (VIS) big data to guide a student model trained with a small amount of near-infrared (NIR) data.
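The cross-modality distillation idea summarized above can be illustrated with a generic knowledge-distillation loss. This is a minimal sketch of standard temperature-scaled distillation, not ForgeryNIR's exact formulation, and the function names are my own:

```python
import math

def softmax(logits, T=1.0):
    """Temperature-softened softmax over a list of logits."""
    scaled = [z / T for z in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions: the
    student (e.g., a small NIR-data model) is pushed toward the soft
    predictions of the teacher (e.g., a model pre-trained on VIS big data)."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))
```

The loss is zero when the student exactly matches the teacher's softened distribution and grows as the two diverge, which is what lets a data-poor student inherit the teacher's decision behavior.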
Proceedings Article (DOI)

Cross-Forgery Analysis of Vision Transformers and CNNs for Deepfake Image Detection

TL;DR: From the authors' experiments, it emerges that EfficientNetV2 tends to specialize, often obtaining better results on manipulation methods seen during training, while Vision Transformers exhibit a superior generalization ability that makes them more competent even on images generated with new methodologies.
References
Proceedings Article (DOI)

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously; their models won 1st place on the ILSVRC 2015 classification task.
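The core idea of residual learning can be sketched in a few lines: a block outputs F(x) + x, so when the optimal mapping is close to the identity the network only needs to drive the residual branch F toward zero, which keeps very deep stacks optimizable. A toy, framework-free sketch (not the paper's actual ResNet code):

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(x, 0.0) for x in v]

def residual_block(x, f):
    """y = relu(F(x) + x): the learnable branch f computes a residual
    on top of an identity shortcut, so 'do nothing' (f ≡ 0) is trivially
    representable no matter how many blocks are stacked."""
    fx = f(x)
    return relu([a + b for a, b in zip(fx, x)])

# With a zero residual branch, the block reduces to the identity (plus ReLU):
zero_branch = lambda v: [0.0] * len(v)
```

For example, `residual_block([1.0, -2.0, 3.0], zero_branch)` passes the input straight through the shortcut, yielding `[1.0, 0.0, 3.0]`.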
Book Chapter (DOI)

U-Net: Convolutional Networks for Biomedical Image Segmentation

TL;DR: Ronneberger et al. proposed a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently; it can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
Proceedings Article (DOI)

Xception: Deep Learning with Depthwise Separable Convolutions

TL;DR: This work proposes a novel deep convolutional neural network architecture inspired by Inception, where Inception modules have been replaced with depthwise separable convolutions, and shows that this architecture, dubbed Xception, slightly outperforms Inception V3 on the ImageNet dataset, and significantly outperforms it on a larger image classification dataset.
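The efficiency argument behind depthwise separable convolutions is easy to make concrete with a parameter count: a standard k x k convolution mixes space and channels jointly, while the separable version factors it into a per-channel spatial step followed by a 1 x 1 channel-mixing step. A small sketch (bias terms omitted):

```python
def conv_params(k: int, c_in: int, c_out: int) -> int:
    """Parameters of a standard k x k convolution: every output channel
    has a k x k filter over all input channels."""
    return k * k * c_in * c_out

def separable_params(k: int, c_in: int, c_out: int) -> int:
    """Depthwise separable factorization: one k x k filter per input
    channel (depthwise), then a 1 x 1 pointwise convolution that mixes
    channels."""
    return k * k * c_in + c_in * c_out
```

For a 3 x 3 convolution from 64 to 128 channels this gives 73,728 parameters for the standard form versus 8,768 for the separable form, roughly an 8x reduction at similar representational cost; Xception applies this factorization throughout the network.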
Proceedings Article

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

TL;DR: The authors propose a new scaling method that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective compound coefficient; the resulting EfficientNet-B7 achieves state-of-the-art accuracy on ImageNet while being 8.4x smaller and 6.1x faster at inference.
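The compound scaling rule can be sketched directly: a single coefficient phi scales depth, width, and resolution together via fixed base multipliers. The default alpha/beta/gamma values below are the ones reported in the EfficientNet paper, found by a small grid search under the constraint alpha * beta^2 * gamma^2 ≈ 2 (so that phi += 1 roughly doubles FLOPs):

```python
def compound_scale(phi: float, alpha: float = 1.2, beta: float = 1.1,
                   gamma: float = 1.15):
    """Return (depth, width, resolution) multipliers for scaling
    coefficient phi; all three dimensions grow jointly rather than
    scaling any single one in isolation."""
    return alpha ** phi, beta ** phi, gamma ** phi
```

At phi = 0 all multipliers are 1.0 (the EfficientNet-B0 baseline); larger phi values yield the B1-B7 family.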
Posted Content

Searching for MobileNetV3.

TL;DR: This paper starts the exploration of how automated search algorithms and network design can work together, harnessing complementary approaches to improve the overall state of the art of MobileNets.