Author

Michael Goebel

Bio: Michael Goebel is an academic researcher from the University of California, Santa Barbara. The author has contributed to research in topics: Deep learning & Pixel. The author has an h-index of 3 and has co-authored 11 publications receiving 24 citations.

Papers
Posted Content
TL;DR: A novel approach to detect, attribute and localize GAN generated images that combines image features with deep learning methods is proposed.
Abstract: Recent advances in Generative Adversarial Networks (GANs) have led to the creation of realistic-looking digital images that are difficult for humans or computers to identify as synthetic. GANs are used in a wide range of tasks, from modifying small attributes of an image (StarGAN [14]) and transferring attributes between image pairs (CycleGAN [91]) to generating entirely new images (ProGAN [36], StyleGAN [37], SPADE/GauGAN [64]). In this paper, we propose a novel approach that combines image features with deep learning methods to detect, attribute, and localize GAN-generated images. For every image, co-occurrence matrices are computed on neighboring pixels of the RGB channels in different directions (horizontal, vertical, and diagonal). A deep learning network is then trained on these features to detect, attribute, and localize GAN-generated/manipulated images. A large-scale evaluation of our approach on 5 GAN datasets comprising over 2.76 million images (ProGAN, StarGAN, CycleGAN, StyleGAN, and SPADE/GauGAN) shows promising results in detecting GAN-generated images.

23 citations
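The co-occurrence features the paper describes can be sketched in a few lines: for each RGB channel and each direction, count how often each pair of intensity values appears at a fixed pixel offset. This is a minimal numpy sketch of standard co-occurrence matrices, not the paper's exact preprocessing; the offsets, image size, and stacking order are illustrative assumptions.

```python
import numpy as np

def cooccurrence(channel, dy, dx, levels=256):
    # Count how often intensity i occurs at (y, x) while intensity j
    # occurs at (y + dy, x + dx); the result is a levels x levels matrix.
    h, w = channel.shape
    src = channel[:h - dy, :w - dx].ravel()
    dst = channel[dy:, dx:].ravel()
    mat = np.zeros((levels, levels), dtype=np.int64)
    np.add.at(mat, (src, dst), 1)  # unbuffered scatter-add of pair counts
    return mat

# One matrix per RGB channel and per direction (horizontal, vertical,
# diagonal); the stacked matrices would form the CNN's input tensor.
img = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
feats = np.stack([cooccurrence(img[..., c], dy, dx)
                  for c in range(3)
                  for dy, dx in [(0, 1), (1, 0), (1, 1)]])
# feats.shape == (9, 256, 256)
```

Each matrix sums to the number of valid pixel pairs for its offset, so the features are cheap to compute yet sensitive to the local pixel statistics that GAN upsampling tends to disturb.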

Journal ArticleDOI
TL;DR: This paper proposes several novel methods to predict if an image was captured at one of several noteworthy events, using a set of images from several recorded events such as storms, marathons, protests, and other large public gatherings.
Abstract: The authenticity of images posted on social media is an issue of growing concern. Many algorithms have been developed to detect manipulated images, but few have investigated the ability of deep neural network based approaches to verify the authenticity of image labels, such as event names. In this paper, we propose several novel methods to predict whether an image was captured at one of several noteworthy events. We use a set of images from recorded events such as storms, marathons, protests, and other large public gatherings. Two strategies for applying a pre-trained ImageNet network to event verification are presented, with two modifications for each strategy. The first method uses the features from the last convolutional layer of a pre-trained network as input to a classifier; we also consider the effect of tuning the convolutional weights of the pre-trained network to improve classification. The second method combines many features extracted at smaller scales and uses the output of a pre-trained network as the input to a second classifier. For both methods, we investigated several different classifiers and tested many different pre-trained networks. Our experiments demonstrate that both approaches are effective for event verification and image re-purposing detection. Classification at the global scale tends to marginally outperform our tested local methods, and fine-tuning the network further improves the results.

7 citations
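The first strategy, frozen pre-trained features feeding a separate classifier, can be sketched with a stand-in backbone. Everything here is hypothetical: a real pipeline would replace `backbone_features` with the last convolutional layer of an actual pre-trained ImageNet model (e.g. a frozen ResNet) and train a real classifier on the pooled vectors; the shapes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def backbone_features(images):
    # Stand-in for the last convolutional layer of a pre-trained
    # ImageNet network. Output: (N, C, H, W) feature maps.
    n = images.shape[0]
    return rng.standard_normal((n, 512, 7, 7))

def global_pool(fmaps):
    # Global average pooling: collapse each feature map to one number,
    # giving a fixed-length vector per image.
    return fmaps.mean(axis=(2, 3))

images = rng.standard_normal((4, 3, 224, 224))
feats = global_pool(backbone_features(images))  # (4, 512)
# feats would then train a shallow classifier (logistic regression,
# SVM, or small MLP) to predict the event label.
```

Fine-tuning, the paper's first modification, corresponds to also updating the backbone weights instead of freezing them, at the cost of needing more labeled event images.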

Proceedings ArticleDOI
01 Jan 2021
TL;DR: In this paper, a hybrid emission representation model is proposed that models the direct emission and absorption of heat by the skin and underlying blood vessels, yielding an information-rich feature representation of the face that a spatio-temporal network uses to reconstruct the initial systolic time interval.
Abstract: Precise measurement of physiological signals is critical for the effective monitoring of human vital signs. Recent developments in computer vision have demonstrated that signals such as pulse rate and respiration rate can be extracted from digital video of humans, increasing the possibility of contact-less monitoring. This paper presents a novel approach to obtaining physiological signals and classifying stress states from thermal video. The proposed network, "StressNet", features a hybrid emission representation model that models the direct emission and absorption of heat by the skin and underlying blood vessels. This results in an information-rich feature representation of the face, which is used by a spatio-temporal network to reconstruct the ISTI (Initial Systolic Time Interval: a measure of change in cardiac sympathetic activity that is considered a quantitative index of stress in humans). The reconstructed ISTI signal is fed into a stress-detection model to detect and classify the individual's stress state (i.e., stress or no stress). A detailed evaluation demonstrates that StressNet estimates the ISTI signal with 95% accuracy and detects stress with an average precision of 0.842.

6 citations

Posted Content
TL;DR: A novel approach to obtaining physiological signals and classifying stress states from thermal video is presented; it features a hybrid emission representation model that models the direct emission and absorption of heat by the skin and underlying blood vessels.
Abstract: Precise measurement of physiological signals is critical for the effective monitoring of human vital signs. Recent developments in computer vision have demonstrated that signals such as pulse rate and respiration rate can be extracted from digital video of humans, increasing the possibility of contact-less monitoring. This paper presents a novel approach to obtaining physiological signals and classifying stress states from thermal video. The proposed network, "StressNet", features a hybrid emission representation model that models the direct emission and absorption of heat by the skin and underlying blood vessels. This results in an information-rich feature representation of the face, which is used by a spatio-temporal network to reconstruct the ISTI (Initial Systolic Time Interval: a measure of change in cardiac sympathetic activity that is considered a quantitative index of stress in humans). The reconstructed ISTI signal is fed into a stress-detection model to detect and classify the individual's stress state (i.e., stress or no stress). A detailed evaluation demonstrates that StressNet estimates the ISTI signal with 95% accuracy and detects stress with an average precision of 0.842. The source code is available on GitHub.

5 citations

Posted Content
TL;DR: This paper develops two novel adversarial attacks on co-occurrence based GAN detectors, the first attacks to be presented against such a detector, and shows that this method can reduce accuracy from over 98% to less than 4%, with no knowledge of the deep learning model or weights.
Abstract: Improvements in Generative Adversarial Networks (GANs) have greatly reduced the difficulty of producing new, photo-realistic images with unique semantic meaning. With this rise in the ability to generate fake images comes demand to detect them. While numerous methods have been developed for this task, the majority remain vulnerable to adversarial attacks. In this paper, we develop two novel adversarial attacks on co-occurrence based GAN detectors. These are the first attacks to be presented against such a detector. We show that our method can reduce accuracy from over 98% to less than 4%, with no knowledge of the deep learning model or weights. Furthermore, accuracy can be reduced to 0% with full knowledge of the deep learning model details.

5 citations
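The abstract does not detail the attacks themselves, so as a generic illustration only (not the paper's method), here is the gradient-sign idea behind many white-box "full knowledge" attacks, applied to a toy linear detector whose gradient is known exactly:

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.standard_normal(64)   # toy linear detector: score = w @ x
x = rng.standard_normal(64)   # flattened "image"

# With full knowledge of the model, the gradient of the score with
# respect to the input is simply w, so stepping against its sign is
# guaranteed to lower the detector's score (an FGSM-style perturbation).
eps = 0.1
x_adv = x - eps * np.sign(w)

score_before = w @ x
score_after = w @ x_adv       # lower by exactly eps * sum(|w|)
```

Black-box attacks, which the paper reports are effective without any model access, instead probe the detector with queries or transfer perturbations from a substitute model; the abstract does not say which mechanism is used here.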


Cited by
Posted Content
TL;DR: This work shows that GAN-synthesized faces can be exposed through inconsistent corneal specular highlights between the two eyes, and describes an automatic method to extract and compare the corneal specular highlights of the two eyes.
Abstract: Sophisticated generative adversarial network (GAN) models are now able to synthesize highly realistic human faces that are visually difficult to discern from real ones. In this work, we show that GAN-synthesized faces can be exposed through inconsistent corneal specular highlights between the two eyes. The inconsistency is caused by the lack of physical/physiological constraints in the GAN models. We show that such artifacts exist widely in high-quality GAN-synthesized faces and further describe an automatic method to extract and compare corneal specular highlights from the two eyes. Qualitative and quantitative evaluations of our method suggest its simplicity and effectiveness in distinguishing GAN-synthesized faces.

37 citations
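One simple way to quantify the highlight (in)consistency the authors describe is to binarize each cornea crop and compare the resulting masks with intersection-over-union. The thresholding, crop sizes, and assumed alignment below are hypothetical simplifications for illustration, not the paper's actual extraction pipeline:

```python
import numpy as np

def highlight_mask(eye_patch, thresh=0.9):
    # Hypothetical simplification: treat the brightest pixels of a
    # grayscale, [0, 1]-normalized cornea crop as the specular highlight.
    return eye_patch > thresh

def iou(a, b):
    # Intersection over union of two boolean masks.
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

left = np.zeros((32, 32))
left[10:14, 10:14] = 1.0          # 4x4 highlight
right = np.zeros((32, 32))
right[10:14, 11:15] = 1.0         # same highlight, shifted one pixel

score = iou(highlight_mask(left), highlight_mask(right))
# score = 12/20 = 0.6; a low score between aligned left/right corneas
# suggests the physically inconsistent reflections typical of GAN faces.
```

A real pipeline would first detect the face, localize both corneas, and align the crops before any comparison; the threshold and similarity measure are design choices.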

Proceedings ArticleDOI
06 Jun 2021
TL;DR: This paper showed that GAN synthesized faces can be exposed with inconsistent corneal specular highlights between two eyes due to the lack of physical/physiological constraints in the GAN models.
Abstract: Sophisticated generative adversarial network (GAN) models are now able to synthesize highly realistic human faces that are visually difficult to discern from real ones. In this work, we show that GAN-synthesized faces can be exposed through inconsistent corneal specular highlights between the two eyes. The inconsistency is caused by the lack of physical/physiological constraints in the GAN models. We show that such artifacts exist widely in high-quality GAN-synthesized faces and further describe an automatic method to extract and compare corneal specular highlights from the two eyes. Qualitative and quantitative evaluations of our method suggest its simplicity and effectiveness in distinguishing GAN-synthesized faces.

30 citations

Posted Content
TL;DR: This article provides a comprehensive overview and detailed analysis of research on DeepFake generation, DeepFake detection, and the evasion of DeepFake detection, carefully surveying more than 191 research papers.
Abstract: The creation and manipulation of facial appearance via deep generative approaches, known as DeepFake, have achieved significant progress and enabled a wide range of benign and malicious applications. The evil side of this new technique motivates another popular line of study, DeepFake detection, which aims to identify fake faces among real ones. With the rapid development of DeepFake-related studies in the community, the two sides (DeepFake generation and detection) have formed a battleground relationship, pushing each other's improvements and inspiring new directions, e.g., the evasion of DeepFake detection. Nevertheless, an overview of this battleground and of the new direction is missing from recent surveys, owing to the rapid increase in related publications, which limits in-depth understanding of current trends and future work. To fill this gap, in this paper we provide a comprehensive overview and detailed analysis of research on DeepFake generation, DeepFake detection, and the evasion of DeepFake detection, with more than 191 research papers carefully surveyed. We present a taxonomy of DeepFake generation methods and a categorization of DeepFake detection methods, and, more importantly, we showcase the battleground between the two parties with detailed interactions between the adversaries (DeepFake generation) and the defenders (DeepFake detection). The battleground allows a fresh perspective on the latest landscape of DeepFake research and provides valuable analysis of the research challenges and opportunities, as well as research trends and directions, in the field of DeepFake generation and detection. We also design interactive diagrams (this http URL) to allow researchers to explore their own interests in popular DeepFake generators or detectors.

24 citations

Proceedings ArticleDOI
01 Jun 2021
TL;DR: This work explores the use of a semantically related task, emotion detection, for psychological stress detection that is equally competent but more explainable and human-like than a black-box model, and explores the use of multi-task learning as well as emotion-based language model fine-tuning.
Abstract: The problem of detecting psychological stress in online posts, and more broadly of detecting people in distress or in need of help, is a sensitive application for which the ability to interpret models is vital. Here, we present work exploring the use of a semantically related task, emotion detection, for psychological stress detection that is equally competent but more explainable and human-like than a black-box model. In particular, we explore the use of multi-task learning as well as emotion-based language model fine-tuning. With our emotion-infused models, we see results comparable to state-of-the-art BERT. Our analysis of the words used for prediction shows that our emotion-infused models mirror psychological components of stress.

24 citations
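The multi-task idea above, a shared encoder with separate emotion and stress heads so that the auxiliary emotion task shapes the shared representation, can be sketched with toy weights. Everything here (dimensions, the tanh encoder, six emotion classes) is a hypothetical stand-in for the BERT-based models in the paper:

```python
import numpy as np

rng = np.random.default_rng(2)

W_shared = rng.standard_normal((16, 8))   # shared text encoder (toy)
W_stress = rng.standard_normal((8, 2))    # head 1: stress / no stress
W_emotion = rng.standard_normal((8, 6))   # head 2: six emotion classes

x = rng.standard_normal(16)               # toy sentence embedding
h = np.tanh(x @ W_shared)                 # shared representation

stress_logits = h @ W_stress              # shape (2,)
emotion_logits = h @ W_emotion            # shape (6,)
# Training would sum both task losses, so the shared weights learn
# emotion-relevant features that make stress predictions interpretable.
```

The alternative the paper explores, emotion-based fine-tuning, would instead train the encoder on emotion data first and then adapt it to stress, rather than optimizing both heads jointly.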