Author

Di Wen

Bio: Di Wen is an academic researcher from Case Western Reserve University. The author has contributed to research in topics: Face detection & Cluster analysis. The author has an h-index of 9 and has co-authored 24 publications receiving 728 citations. Previous affiliations of Di Wen include Michigan State University & Tsinghua University.

Papers
Journal Article
TL;DR: An efficient and rather robust face spoof detection algorithm based on image distortion analysis (IDA) that outperforms the state-of-the-art methods in spoof detection and highlights the difficulty in separating genuine and spoof faces, especially in cross-database and cross-device scenarios.
Abstract: Automatic face recognition is now widely used in applications ranging from deduplication of identity to authentication of mobile payment. This popularity of face recognition has raised concerns about face spoof attacks (also known as biometric sensor presentation attacks), where a photo or video of an authorized person’s face could be used to gain access to facilities or services. While a number of face spoof detection techniques have been proposed, their generalization ability has not been adequately addressed. We propose an efficient and rather robust face spoof detection algorithm based on image distortion analysis (IDA). Four different features (specular reflection, blurriness, chromatic moment, and color diversity) are extracted to form the IDA feature vector. An ensemble classifier, consisting of multiple SVM classifiers trained for different face spoof attacks (e.g., printed photo and replayed video), is used to distinguish between genuine (live) and spoof faces. The proposed approach is extended to multiframe face spoof detection in videos using a voting-based scheme. We also collect a face spoof database, MSU mobile face spoofing database (MSU MFSD), using two mobile devices (Google Nexus 5 and MacBook Air) with three types of spoof attacks (printed photo, replayed video with iPhone 5S, and replayed video with iPad Air). Experimental results on two public-domain face spoof databases (Idiap REPLAY-ATTACK and CASIA FASD), and the MSU MFSD database show that the proposed approach outperforms the state-of-the-art methods in spoof detection. Our results also highlight the difficulty in separating genuine and spoof faces, especially in cross-database and cross-device scenarios.
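The multiframe extension mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the voting rule, so plain majority voting over per-frame labels is assumed here; the function name `vote_video_label` and the tie-breaking choice are hypothetical and not taken from the paper.

```python
from collections import Counter


def vote_video_label(frame_labels, tie_label="spoof"):
    """Aggregate per-frame decisions ("live" or "spoof") into one video-level label.

    Assumption: simple majority voting. Ties fall back to `tie_label`;
    rejecting on a tie is a conservative, hypothetical choice.
    """
    if not frame_labels:
        raise ValueError("need at least one frame decision")
    counts = Counter(frame_labels)
    live, spoof = counts.get("live", 0), counts.get("spoof", 0)
    if live == spoof:
        return tie_label
    return "live" if live > spoof else "spoof"
```

In this sketch each per-frame label would come from the frame-level classifier; a weighted vote using classifier scores would be an equally plausible variant.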

716 citations

Proceedings Article
19 Oct 2009
TL;DR: A face clustering system which automatically groups photos into clusters, with each cluster containing photos of the same person, based on an advanced face recognition engine and a semi-supervised clustering approach is presented.
Abstract: People are often the most important subjects in photos, and the ability to find photos of a particular person easily and quickly in an image collection is highly desirable. In this paper, we present a face clustering system which automatically groups photos into clusters, with each cluster containing photos of the same person. This is done based on an advanced face recognition engine and a semi-supervised clustering approach. The system achieved good clustering accuracy when tested on different image sets and by different users. Moreover, features such as adding new images, face cluster navigation and face-based image retrieval are added that greatly improve the usability of the system. It also facilitates efficient manual manipulations of clustering results. On top of this technology, image navigation systems have been built, including the "face bubble" visualization, which provides a one-glance view of a photo collection and shows the relations among people.

28 citations

Journal Article
TL;DR: This paper analyzes nonlocally redundant and low-rank properties and provides quantifications of them in a data-driven and parametric way, respectively, obtaining the new measures of regional redundancy and nonlocal patch rank.
Abstract: In recent years, image priors based on nonlocal self-similarity and low-rank approximation have proven to be powerful tools for image restoration. Many restoration methods group similar patches as a matrix and recover the underlying low-rank structure from the corrupted matrix via rank minimization. However, both the nonlocally redundant and low-rank properties are highly content dependent, and whether they can faithfully characterize a wide range of natural images still remains unclear. In this paper, we analyze these two properties and provide quantifications of them in a data-driven and parametric way, respectively, obtaining the new measures of regional redundancy and nonlocal patch rank. Leveraging these priors leads to an adaptive image restoration method with content-awareness. In particular, our method iteratively removes outliers and recovers latent fine details. To handle outliers, we propose an adaptive low-rank and sparse matrix approximation algorithm to encourage the estimated nonlocal rank in the patch matrix. The guidance of regional redundancy further gives rise to the “denoise” quality. In the detail recovery step, we propose an adaptive joint kernel regression algorithm using the redundancy measure to determine the confidence of each regression group. It also bridges the gap between our online and offline dictionary learning schemes. Experiments on synthetic and real-world images show the efficacy of our method in image deblurring and super-resolution tasks, especially when subject to practical outliers such as rain drops.

26 citations

Proceedings Article
23 Jan 2004
TL;DR: The research of document digitization technology and its applications for constructing digital libraries in China are introduced and it is indicated that current technologies can greatly facilitate the mass-volume digitization labour in building digital library infrastructure.
Abstract: We introduce the research on document digitization technology and its applications for constructing digital libraries in China. We focus on two major objectives of document digitization technologies: performance and efficiency. Taking the most representative TH-OCR product as an example, the up-to-date research achievements on both kernel OCR technologies and peripheral technologies in China are presented. The kernel technologies include high performance multilingual (Chinese, Japanese, Korean and English) text recognition, layout analysis, understanding and reconstruction; the peripheral technologies include the network document digitization workflow and intelligent proofreading, which greatly improve the efficiency. TH-OCR applications produce two types of final output digital documents: one is the reconstructed electronic document with the full text and layout information of the original paper-based document, and the other is the multilevel document with the OCR output text layer under the image layer. Numerous applications indicate that current technologies can greatly facilitate the mass-volume digitization labour in building digital library infrastructure.

18 citations

Book Chapter
Yinan Na1, Di Wen1
21 Sep 2010
TL;DR: Experimental results show that the proposed text tracking algorithm is robust to different text forms, including multilingual captions, credits, and scene texts with shift, rotation and scale change, under complex backgrounds and changing lighting.
Abstract: Video text provides important clues for semantic-based video analysis, indexing and retrieval. Text tracking is performed to locate specific text information across video frames and to enhance text segmentation and recognition over time. This paper presents a multilingual video text tracking algorithm based on the extraction and tracking of Scale Invariant Feature Transform (SIFT) feature descriptions through video frames. SIFT features are extracted from video frames to establish correspondences between regions of interest across frames. Meanwhile, a global matching method using a geometric constraint is proposed to decrease false matches, which effectively improves the accuracy and stability of text tracking results. Based on the correct matches, the motion of text is estimated in adjacent frames and a match score of text is calculated to determine the Text Change Boundary (TCB). Experimental results on a large number of video frames show that the proposed text tracking algorithm is robust to different text forms, including multilingual captions, credits, and scene texts with shift, rotation and scale change, under complex backgrounds and changing lighting.
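The idea of pruning false matches with a geometric constraint can be sketched as follows. The abstract does not describe the constraint itself, so this example assumes the simplest case, a roughly common translation between adjacent frames, and discards matches whose displacement deviates from the median shift. The function name, the `tolerance` parameter, and the translation-only model are all hypothetical; rotation and scale change would need a richer motion model.

```python
from statistics import median


def filter_matches_by_displacement(matches, tolerance=5.0):
    """Keep keypoint matches whose displacement agrees with the median displacement.

    matches: list of ((x1, y1), (x2, y2)) pairs linking two adjacent frames.
    Assumption: the tracked text moves by a roughly common translation, so a
    match far from the median shift is treated as a false match.
    """
    if not matches:
        return []
    dxs = [q[0] - p[0] for p, q in matches]
    dys = [q[1] - p[1] for p, q in matches]
    median_dx, median_dy = median(dxs), median(dys)
    kept = []
    for (p, q), dx, dy in zip(matches, dxs, dys):
        # Accept the match only if both shift components are near the median.
        if abs(dx - median_dx) <= tolerance and abs(dy - median_dy) <= tolerance:
            kept.append((p, q))
    return kept
```

The median shift of the surviving matches could then serve as a crude estimate of text motion between the two frames.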

16 citations


Cited by
Journal Article
TL;DR: A brief introduction of SVMs is provided, many applications are described and challenges and trends are summarized, especially in some fields.

611 citations

Proceedings Article
18 Jun 2018
TL;DR: This paper argues the importance of auxiliary supervision to guide the learning toward discriminative and generalizable cues, and introduces a new face anti-spoofing database that covers a large range of illumination, subject, and pose variations.
Abstract: Face anti-spoofing is crucial to prevent face recognition systems from a security breach. Previous deep learning approaches formulate face anti-spoofing as a binary classification problem. Many of them struggle to grasp adequate spoofing cues and generalize poorly. In this paper, we argue the importance of auxiliary supervision to guide the learning toward discriminative and generalizable cues. A CNN-RNN model is learned to estimate the face depth with pixel-wise supervision, and to estimate rPPG signals with sequence-wise supervision. The estimated depth and rPPG are fused to distinguish live vs. spoof faces. Further, we introduce a new face anti-spoofing database that covers a large range of illumination, subject, and pose variations. Experiments show that our model achieves the state-of-the-art results on both intra- and cross-database testing.

502 citations

Book
01 Jan 2015
TL;DR: This edited volume is unique in focussing on the data analytical aspects of social networks in the internet scenario, rather than the traditional sociology-driven emphasis prevalent in the existing books, which do not focus on the unique data-intensive characteristics of online social networks.
Abstract: Social network analysis applications have experienced tremendous advances within the last few years due in part to increasing trends towards users interacting with each other on the internet. Social networks are organized as graphs, and the data on social networks takes on the form of massive streams, which are mined for a variety of purposes. Social Network Data Analytics covers an important niche in the social network analytics field. This edited volume, contributed by prominent researchers in this field, presents a wide selection of topics on social network data mining such as Structural Properties of Social Networks, Algorithms for Structural Discovery of Social Networks and Content Analysis in Social Networks. This book is also unique in focussing on the data analytical aspects of social networks in the internet scenario, rather than the traditional sociology-driven emphasis prevalent in the existing books, which do not focus on the unique data-intensive characteristics of online social networks. Emphasis is placed on simplifying the content so that students and practitioners benefit from this book. This book targets advanced level students and researchers concentrating on computer science as a secondary text or reference book. Data mining, database, information security, electronic commerce and machine learning professionals will find this book a valuable asset, as well as primary associations such as ACM, IEEE and Management Science.

497 citations

Journal Article
TL;DR: This paper introduces a novel and appealing approach for detecting face spoofing using a colour texture analysis that exploits the joint colour-texture information from the luminance and the chrominance channels by extracting complementary low-level feature descriptions from different colour spaces.
Abstract: Research on non-intrusive software-based face spoofing detection schemes has been mainly focused on the analysis of the luminance information of the face images, hence discarding the chroma component, which can be very useful for discriminating fake faces from genuine ones. This paper introduces a novel and appealing approach for detecting face spoofing using a colour texture analysis. We exploit the joint colour-texture information from the luminance and the chrominance channels by extracting complementary low-level feature descriptions from different colour spaces. More specifically, the feature histograms are computed over each image band separately. Extensive experiments on the three most challenging benchmark data sets, namely, the CASIA face anti-spoofing database, the replay-attack database, and the MSU mobile face spoof database, showed excellent results compared with the state of the art. More importantly, unlike most of the methods proposed in the literature, our proposed approach is able to achieve stable performance across all the three benchmark data sets. The promising results of our cross-database evaluation suggest that the facial colour texture representation is more stable in unknown conditions compared with its gray-scale counterparts.
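Computing feature histograms over each image band separately can be illustrated with a small sketch. It assumes YCbCr as the colour space (the abstract only says "different colour spaces") and substitutes plain intensity histograms for the texture descriptors, which the abstract does not name. The function names and the `bins` parameter are hypothetical.

```python
def _clamp_byte(value):
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, int(round(value))))


def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (BT.601 weights, as in JFIF)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return _clamp_byte(y), _clamp_byte(cb), _clamp_byte(cr)


def per_band_histograms(pixels, bins=8):
    """Return normalized Y, Cb and Cr histograms concatenated into one feature vector.

    pixels: non-empty list of (r, g, b) tuples with 8-bit values.
    Stand-in for a texture descriptor: each band is summarized by a plain
    intensity histogram, computed separately and then concatenated.
    """
    if not pixels:
        raise ValueError("need at least one pixel")
    hists = [[0] * bins for _ in range(3)]
    for r, g, b in pixels:
        for band, value in enumerate(rgb_to_ycbcr(r, g, b)):
            hists[band][value * bins // 256] += 1
    n = len(pixels)
    return [count / n for hist in hists for count in hist]
```

Keeping the bands separate preserves chroma information that a gray-scale histogram would discard, which is the motivation the abstract gives for colour-based analysis.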

449 citations

Proceedings Article
01 May 2017
TL;DR: This work introduces a new public face PAD database, OULU-NPU, aiming at evaluating the generalization of PAD methods in more realistic mobile authentication scenarios across three covariates: unknown environmental conditions, acquisition devices and presentation attack instruments.
Abstract: The vulnerabilities of face-based biometric systems to presentation attacks have finally been recognized, yet we still lack generalized software-based face presentation attack detection (PAD) methods that perform robustly in practical mobile authentication scenarios. This is mainly due to the fact that the existing public face PAD datasets are beginning to cover a variety of attack scenarios and acquisition conditions but their standard evaluation protocols do not encourage researchers to assess the generalization capabilities of their methods across these variations. In the present work, we introduce a new public face PAD database, OULU-NPU, aiming at evaluating the generalization of PAD methods in more realistic mobile authentication scenarios across three covariates: unknown environmental conditions (namely illumination and background scene), acquisition devices and presentation attack instruments (PAI). This publicly available database consists of 5940 videos corresponding to 55 subjects recorded in three different environments using high-resolution frontal cameras of six different smartphones. The high-quality print and video-replay attacks were created using two different printers and two different display devices. Each of the four unambiguously defined evaluation protocols introduces at least one previously unseen condition to the test set, which enables a fair comparison of the generalization capabilities of new and existing approaches. The baseline results using a color texture analysis based face PAD method demonstrate the challenging nature of the database.

416 citations