Author

Dong-Keon Kim

Bio: Dong-Keon Kim is an academic researcher from Sungkyunkwan University. The author has contributed to research in topics: Pixel & Pattern recognition (psychology). The author has co-authored 3 publications.

Papers
Posted Content
TL;DR: The authors propose a facial forensic framework that utilizes pixel-level color features appearing in the edge region of the whole image, including a 3D-CNN classification model that interprets the extracted color features spatially and temporally.
Abstract: This paper presents a generalized and robust face manipulation detection method based on the edge region features appearing in images. Most contemporary face synthesis processes include color awkwardness reduction but damage the natural fingerprint in the edge region. In addition, these color correction processes do not proceed in the non-face background region. We also observe that the synthesis process does not consider the natural properties of the image appearing in the time domain. Considering these observations, we propose a facial forensic framework that utilizes pixel-level color features appearing in the edge region of the whole image. Furthermore, our framework includes a 3D-CNN classification model that interprets the extracted color features spatially and temporally. Unlike other existing studies, we conduct authenticity determination by considering all features extracted from multiple frames within one video. Through extensive experiments, including real-world scenarios to evaluate generalized detection ability, we show that our framework outperforms state-of-the-art facial manipulation detection technologies in terms of accuracy and robustness.
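The core idea, pixel-level color statistics restricted to edge regions, stacked over frames for a temporal model, can be sketched as follows. This is a minimal illustration, not the authors' code: the edge detector, threshold, and feature design are all simplified assumptions.

```python
import numpy as np

def edge_region_color_features(frame, threshold=30):
    """Mean per-channel color over the edge pixels of one frame.

    `frame` is an (H, W, 3) uint8 RGB image. Edges are approximated
    here with a crude gradient-magnitude threshold; the paper's actual
    edge extraction and feature design may differ.
    """
    gray = frame.astype(np.float32).mean(axis=2)
    # Gradient magnitude via forward differences (rough edge map).
    gy = np.abs(np.diff(gray, axis=0, prepend=gray[:1]))
    gx = np.abs(np.diff(gray, axis=1, prepend=gray[:, :1]))
    edge_mask = (gx + gy) > threshold
    if not edge_mask.any():
        return np.zeros(3)
    # Pixel-level color statistics restricted to the edge region.
    return frame[edge_mask].mean(axis=0)

def video_feature_stack(frames):
    """Stack per-frame edge-color features so a temporal classifier
    (a 3D-CNN in the paper) can read them across frames."""
    return np.stack([edge_region_color_features(f) for f in frames])
```

A real pipeline would keep spatial structure around each edge pixel rather than a single mean, which is what lets a 3D-CNN interpret the features both spatially and temporally.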

4 citations

Posted Content
TL;DR: In this article, a multi-head attention CNN model (MHAC) is proposed to predict inbound tourist changes in South Korea by considering external factors such as politics, disease, season, and attraction of Korean culture.
Abstract: Developing an accurate tourism forecasting model is essential for making desirable policy decisions for tourism management. Early studies on tourism management focus on discovering external factors related to tourism demand. Recent studies utilize deep learning in demand forecasting along with these external factors. They mainly use recursive neural network models such as LSTM and RNN for their frameworks. However, these models are not suitable for use in forecasting tourism demand. This is because tourism demand is strongly affected by changes in various external factors, and recursive neural network models have limitations in handling these multivariate inputs. We propose a multi-head attention CNN model (MHAC) for addressing these limitations. The MHAC uses 1D-convolutional neural network to analyze temporal patterns and the attention mechanism to reflect correlations between input variables. This model makes it possible to extract spatiotemporal characteristics from time-series data of various variables. We apply our forecasting framework to predict inbound tourist changes in South Korea by considering external factors such as politics, disease, season, and attraction of Korean culture. The performance results of extensive experiments show that our method outperforms other deep-learning-based prediction frameworks in South Korea tourism forecasting.
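The two stages the abstract describes, attention to capture correlations between input variables and a 1D convolution for temporal patterns, can be sketched in numpy. The projections, shapes, and toy data below are illustrative assumptions, not the MHAC implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_over_variables(x):
    """Scaled dot-product attention across the variable axis.

    `x` has shape (n_vars, n_steps): one row per external factor
    (e.g. politics, disease, season indices). Correlated variables
    re-weight each other, mirroring the MHAC's attention stage.
    Identity projections are used here purely for illustration.
    """
    d = x.shape[1]
    scores = x @ x.T / np.sqrt(d)          # (n_vars, n_vars) similarity
    return softmax(scores, axis=-1) @ x    # re-weighted variable series

def conv1d_temporal(x, kernel):
    """Valid-mode 1D convolution along time for each variable,
    standing in for the 1D-CNN that extracts temporal patterns."""
    return np.stack([np.convolve(row, kernel, mode="valid") for row in x])

# Toy pipeline: attention over variables, then temporal convolution.
series = np.random.default_rng(0).normal(size=(4, 12))
mixed = attention_over_variables(series)
features = conv1d_temporal(mixed, kernel=np.array([0.25, 0.5, 0.25]))
```

The point of this ordering is that each variable's temporal filter sees a mixture informed by the other external factors, which is what recursive models like LSTM handle poorly for multivariate inputs.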
Posted Content
TL;DR: The authors present a generalized and robust facial manipulation detection method based on color distribution analysis of the vertical edge region in a manipulated image, and design a neural network that detects face-manipulated images from these distinctive features in the facial boundary and background edges.
Abstract: In this work, we present a generalized and robust facial manipulation detection method based on color distribution analysis of the vertical region of the edge in a manipulated image. Most contemporary facial manipulation methods involve pixel correction procedures to reduce the awkwardness of pixel value differences along the facial boundary in a synthesized image. Because of this procedure, there are distinctive differences in the facial boundary between a face-manipulated image and an unforged natural image. Also, in the forged image, there should be distinctive and unnatural features in the gap distribution between the facial boundary and the background edge region, because the manipulation tends to damage the natural effect of lighting. We design a neural network that detects face-manipulated images using these distinctive features in the facial boundary and background edges. Our extensive experiments show that our method outperforms other existing face manipulation detection methods in detecting synthesized face images across various datasets, regardless of whether a dataset participated in training.

Cited by
Proceedings ArticleDOI
01 Jun 2022
TL;DR: In this paper, a key-point-based activity recognition framework is presented that extracts complex static and movement-based features from key frames in videos; these features are used to predict a sequence of key-frame activities.
Abstract: We present a key point-based activity recognition framework, built upon pre-trained human pose estimation and facial feature detection models. Our method extracts complex static and movement-based features from key frames in videos, which are used to predict a sequence of key-frame activities. Finally, a merge procedure is employed to identify robust activity segments while ignoring outlier frame activity predictions. We analyze the different components of our framework via a wide array of experiments and draw conclusions with regards to the utility of the model and ways it can be improved. Results show our model is competitive, taking the 11th place out of 27 teams submitting to Track 3 of the 2022 AI City Challenge.
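The merge procedure described above, turning noisy per-frame predictions into robust activity segments while ignoring outlier frames, can be sketched as a run-length merge. The minimum-length criterion below is an illustrative assumption; the paper's actual merge rule may differ:

```python
def merge_activity_segments(frame_preds, min_len=3):
    """Merge per-frame activity labels into (label, start, end) segments,
    dropping runs shorter than `min_len` as outlier predictions, then
    fusing adjacent segments that share a label."""
    segments = []
    start = 0
    for i in range(1, len(frame_preds) + 1):
        # A run ends at the last frame or where the label changes.
        if i == len(frame_preds) or frame_preds[i] != frame_preds[start]:
            if i - start >= min_len:
                segments.append((frame_preds[start], start, i))
            start = i
    # Dropping a short outlier run can leave two same-label neighbours;
    # fuse them into one segment.
    merged = []
    for seg in segments:
        if merged and merged[-1][0] == seg[0]:
            merged[-1] = (seg[0], merged[-1][1], seg[2])
        else:
            merged.append(seg)
    return merged
```

For example, a single mislabelled frame inside a long "walk" run is discarded and the surrounding segments are joined, which is exactly the robustness the abstract claims for the merge step.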

1 citation

Journal ArticleDOI
TL;DR: A novel triple complementary streams detector, namely TCSD is proposed, designed to perceive depth information (DI) which is not utilized by previous methods, and two attention-based feature fusion modules are proposed to adaptively fuse information.
Abstract: Advancements in computer vision and deep learning have made it difficult to visually distinguish generated Deepfake media. While existing detection frameworks have achieved significant performance on the challenging Deepfake datasets, these approaches consider a single perspective. More importantly, in urban scenes, neither can complex scenarios be covered by a single view, nor is the correlation between multiple sources of information well utilized. In this paper, to mine a new view for Deepfake detection and utilize the correlation of multi-view information contained in images, we propose a novel triple complementary streams detector, namely TCSD. Specifically, a novel depth estimator is first designed to perceive depth information (DI), which is not utilized by previous methods. Then, to supplement the depth information for obtaining comprehensive forgery clues, we consider the incoherence between image foreground and background information (FBI) and the inconsistency between local and global information (LGI). In addition, an attention-based multi-scale feature extraction (MsFE) module is designed to perceive more complementary features from DI, FBI and LGI. Finally, two attention-based feature fusion modules are proposed to adaptively fuse information. Extensive experiment results show that the proposed approach achieves state-of-the-art performance in Deepfake detection.

1 citation

Journal ArticleDOI
TL;DR: The authors propose a threshold classifier operating on similarity scores obtained from a Deep Convolutional Neural Network (DCNN) trained for facial recognition; the scores compare faces extracted from questioned videos with reference material of the person depicted.
Proceedings ArticleDOI
01 Jan 2023
TL;DR: The authors propose the Temporal Identity Inconsistency Network (TI2Net), a Deepfake detector that focuses on temporal identity inconsistency.
Abstract: In this paper, we propose the Temporal Identity Inconsistency Network (TI2Net), a Deepfake detector that focuses on temporal identity inconsistency. Specifically, TI2Net recognizes fake videos by capturing the dissimilarities of human faces among video frames of the same identity. Therefore, TI2Net is a reference-agnostic detector and can be used on unseen datasets. For a video clip of a given identity, identity information in all frames will first be encoded to identity vectors. TI2Net learns the temporal identity embedding from the temporal difference of the identity vectors. The temporal embedding, representing the identity inconsistency in the video clip, is finally used to determine the authenticity of the video clip. During training, TI2Net incorporates triplet loss to learn more discriminative temporal embeddings. We conduct comprehensive experiments to evaluate the performance of the proposed TI2Net. Experimental results indicate that TI2Net generalizes well to unseen manipulations and datasets with unseen identities. Besides, TI2Net also shows robust performance against compression and additive noise.
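The two building blocks named in the abstract, temporal differences of identity vectors and a triplet loss, can be sketched in numpy. The identity vectors would come from a pretrained face recognizer; everything below (shapes, margin, normalization) is an illustrative assumption, not the TI2Net implementation:

```python
import numpy as np

def temporal_identity_differences(id_vectors):
    """Frame-to-frame differences of L2-normalized identity vectors.

    `id_vectors` is (n_frames, d), one embedding per frame of the same
    identity. Large consecutive differences are the temporal identity
    inconsistency signal TI2Net exploits; a real video of one person
    should yield small differences.
    """
    v = id_vectors / np.linalg.norm(id_vectors, axis=1, keepdims=True)
    return np.diff(v, axis=0)              # (n_frames - 1, d)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard margin-based triplet loss on embedding vectors,
    as used during training to make temporal embeddings of real and
    fake clips more discriminative."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)
```

In this sketch the difference sequence would be summarized into a single temporal embedding (e.g. by a recurrent encoder) before being scored for authenticity; that encoder is the part the sketch omits.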