
Showing papers on "Face detection published in 2009"


Proceedings ArticleDOI
02 Sep 2009
TL;DR: This paper publishes a generative 3D shape and texture model, the Basel Face Model (BFM), demonstrates its application to several face recognition tasks, and releases a set of detailed recognition and reconstruction results on standard databases to allow complete algorithm comparisons.
Abstract: Generative 3D face models are a powerful tool in computer vision. They provide pose and illumination invariance by modeling the space of 3D faces and the imaging process. The power of these models comes at the cost of an expensive and tedious construction process, which has led the community to focus on more easily constructed but less powerful models. With this paper we publish a generative 3D shape and texture model, the Basel Face Model (BFM), and demonstrate its application to several face recognition tasks. We improve on previous models by offering higher shape and texture accuracy due to a better scanning device and fewer correspondence artifacts due to an improved registration algorithm. The same 3D face model can be fit to 2D or 3D images acquired under different situations and with different sensors using an analysis by synthesis method. The resulting model parameters separate pose, lighting, imaging and identity parameters, which facilitates invariant face recognition across sensors and data sets by comparing only the identity parameters. We hope that the availability of this registered face model will spur research in generative models. Together with the model we publish a set of detailed recognition and reconstruction results on standard databases to allow complete algorithm comparisons.

1,265 citations
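
The abstract above notes that recognition with the BFM reduces to comparing only the identity parameters of fitted models. A minimal sketch of that comparison, assuming each fit yields identity coefficient vectors for shape and texture and using cosine similarity as the distance measure (the variable names and the similarity measure are illustrative assumptions, not the paper's exact procedure):

```python
import numpy as np

def identity_similarity(alpha_a, beta_a, alpha_b, beta_b):
    """Compare two faces via their fitted identity parameters only.

    alpha_*: shape coefficient vectors, beta_*: texture coefficient vectors.
    Pose, lighting and imaging parameters are deliberately ignored.
    """
    a = np.concatenate([alpha_a, beta_a])
    b = np.concatenate([alpha_b, beta_b])
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy usage: two fits of the same person should score higher than fits of
# different people (random vectors stand in for real fitting results).
rng = np.random.default_rng(0)
fit1 = rng.normal(size=50), rng.normal(size=50)
fit2 = fit1[0] + 0.1 * rng.normal(size=50), fit1[1] + 0.1 * rng.normal(size=50)
print(identity_similarity(*fit1, *fit2))
```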


Proceedings ArticleDOI
01 Sep 2009
TL;DR: A new dataset, H3D, is built of annotations of humans in 2D photographs with 3D joint information, inferred using anthropometric constraints, to address the classic problems of detection, segmentation and pose estimation of people in images with a novel definition of a part, a poselet.
Abstract: We address the classic problems of detection, segmentation and pose estimation of people in images with a novel definition of a part, a poselet. We postulate two criteria: (1) it should be easy to find a poselet given an input image; (2) it should be easy to localize the 3D configuration of the person conditioned on the detection of a poselet. To permit this, we have built a new dataset, H3D, of annotations of humans in 2D photographs with 3D joint information, inferred using anthropometric constraints. This enables us to implement a data-driven search procedure for finding poselets that are tightly clustered in both 3D joint configuration space as well as 2D image appearance. The algorithm discovers poselets that correspond to frontal and profile faces, pedestrians, head and shoulder views, among others. Each poselet provides examples for training a linear SVM classifier which can then be run over the image in a multiscale scanning mode. The outputs of these poselet detectors can be thought of as an intermediate layer of nodes, on top of which one can run a second layer of classification or regression. We show how this permits detection and localization of torsos or keypoints such as left shoulder, nose, etc. Experimental results show that we obtain state of the art performance on people detection in the PASCAL VOC 2007 challenge, among other datasets. We are making publicly available both the H3D dataset as well as the poselet parameters for use by other researchers.

1,153 citations
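
The abstract above runs each poselet's linear SVM over the image in a multiscale scanning mode. A minimal sketch of that scanning step, assuming an already-trained weight vector `w` and bias `b` (hypothetical inputs), with raw normalized pixel windows standing in for the real descriptor:

```python
import numpy as np
import cv2

def scan_poselet(gray, w, b, win=(96, 64), step=8, scales=(1.0, 0.75, 0.5)):
    """Slide a linear classifier (score = w . x + b) over the image at
    several scales and return positively scored windows."""
    wh, ww = win
    hits = []
    for s in scales:
        img = cv2.resize(gray, None, fx=s, fy=s)
        for y in range(0, img.shape[0] - wh, step):
            for x in range(0, img.shape[1] - ww, step):
                patch = img[y:y + wh, x:x + ww].astype(np.float32).ravel()
                patch = (patch - patch.mean()) / (patch.std() + 1e-6)
                score = float(w @ patch + b)
                if score > 0:  # SVM decision threshold
                    hits.append((score, x / s, y / s, s))
    return sorted(hits, reverse=True)
```

The positive windows would then feed the "intermediate layer of nodes" on which the second classification or regression layer operates.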


Journal ArticleDOI
TL;DR: A discussion outlining the incentive for using face recognition, the applications of this technology, and some of the difficulties plaguing current systems with regard to this task has been provided.
Abstract: Face recognition presents a challenging problem in the field of image analysis and computer vision, and as such has received a great deal of attention over the last few years because of its many applications in various domains. Face recognition techniques can be broadly divided into three categories based on the face data acquisition methodology: methods that operate on intensity images; those that deal with video sequences; and those that require other sensory data such as 3D information or infra-red imagery. In this paper, an overview of some of the well-known methods in each of these categories is provided and some of the benefits and drawbacks of the schemes mentioned therein are examined. Furthermore, a discussion outlining the incentive for using face recognition, the applications of this technology, and some of the difficulties plaguing current systems with regard to this task has also been provided. This paper also mentions some of the most recent algorithms developed for this purpose and attempts to give an idea of the state of the art of face recognition technology.

751 citations


Journal ArticleDOI
TL;DR: The goal of this work was to systematically address the challenges of object detection and tracking through a common evaluation framework that permits a meaningful objective comparison of techniques, provides the research community with sufficient data for the exploration of automatic modeling techniques, encourages the incorporation of objective evaluation into the development process, and contributes useful lasting resources.
Abstract: Common benchmark data sets, standardized performance metrics, and baseline algorithms have demonstrated considerable impact on research and development in a variety of application domains. These resources provide both consumers and developers of technology with a common framework to objectively compare the performance of different algorithms and algorithmic improvements. In this paper, we present such a framework for evaluating object detection and tracking in video: specifically for face, text, and vehicle objects. This framework includes the source video data, ground-truth annotations (along with guidelines for annotation), performance metrics, evaluation protocols, and tools including scoring software and baseline algorithms. For each detection and tracking task and supported domain, we developed a 50-clip training set and a 50-clip test set. Each data clip is approximately 2.5 minutes long and has been completely spatially/temporally annotated at the I-frame level. Each task/domain, therefore, has an associated annotated corpus of approximately 450,000 frames. The scope of such annotation is unprecedented and was designed to begin to support the necessary quantities of data for robust machine learning approaches, as well as a statistically significant comparison of the performance of algorithms. The goal of this work was to systematically address the challenges of object detection and tracking through a common evaluation framework that permits a meaningful objective comparison of techniques, provides the research community with sufficient data for the exploration of automatic modeling techniques, encourages the incorporation of objective evaluation into the development process, and contributes useful lasting resources of a scale and magnitude that will prove to be extremely useful to the computer vision research community for years to come.

534 citations


Journal ArticleDOI
TL;DR: A critical survey of research on image-based face recognition across pose is provided, with existing methods classified into categories according to how they handle pose variations, and several promising directions for future research are suggested.

511 citations


Proceedings ArticleDOI
20 Jun 2009
TL;DR: This paper presents a unified framework for object detection, segmentation, and classification using regions using a generalized Hough voting scheme to generate hypotheses of object locations, scales and support, followed by a verification classifier and a constrained segmenter on each hypothesis.
Abstract: This paper presents a unified framework for object detection, segmentation, and classification using regions. Region features are appealing in this context because: (1) they encode shape and scale information of objects naturally; (2) they are only mildly affected by background clutter. Regions have not been popular as features due to their sensitivity to segmentation errors. In this paper, we start by producing a robust bag of overlaid regions for each image using Arbeláez et al., CVPR 2009. Each region is represented by a rich set of image cues (shape, color and texture). We then learn region weights using a max-margin framework. In detection and segmentation, we apply a generalized Hough voting scheme to generate hypotheses of object locations, scales and support, followed by a verification classifier and a constrained segmenter on each hypothesis. The proposed approach significantly outperforms the state of the art on the ETHZ shape database (87.1% average detection rate compared to Ferrari et al.'s 67.2%), and achieves competitive performance on the Caltech 101 database.

433 citations
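
The abstract above generates object hypotheses with a generalized Hough voting scheme in which regions vote for object location and scale. A minimal sketch of such an accumulator, assuming each region carries a learned weight, a predicted offset from its centroid to the object center, and a coarse scale bin (all placeholder inputs, not the paper's exact parameterization):

```python
import numpy as np

def hough_vote(regions, img_shape, n_scales=4):
    """Accumulate weighted votes for (x, y, scale) object hypotheses.

    Each region is (cx, cy, dx, dy, scale_idx, weight): centroid, predicted
    offset to the object center, a scale bin, and a learned weight.
    """
    H, W = img_shape
    acc = np.zeros((n_scales, H, W), dtype=np.float32)
    for cx, cy, dx, dy, s, wgt in regions:
        x, y = int(round(cx + dx)), int(round(cy + dy))
        if 0 <= x < W and 0 <= y < H:
            acc[s, y, x] += wgt
    # The strongest peak becomes a hypothesis handed to the verification
    # classifier and constrained segmenter described in the abstract.
    s, y, x = np.unravel_index(np.argmax(acc), acc.shape)
    return (x, y, s), acc
```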


Proceedings ArticleDOI
29 Sep 2009
TL;DR: The implementation exploits the inherent parallelism of ConvNets and takes full advantage of multiple hardware multiply-accumulate units on the FPGA, and can be used for low-power, lightweight embedded vision systems for micro-UAVs and other small robots.
Abstract: Convolutional Networks (ConvNets) are biologically-inspired hierarchical architectures that can be trained to perform a variety of detection, recognition and segmentation tasks. ConvNets have a feed-forward architecture consisting of multiple linear convolution filters interspersed with pointwise non-linear squashing functions. This paper presents an efficient implementation of ConvNets on a low-end DSP-oriented Field Programmable Gate Array (FPGA). The implementation exploits the inherent parallelism of ConvNets and takes full advantage of multiple hardware multiply-accumulate units on the FPGA. The entire system uses a single FPGA with an external memory module, and no extra parts. A network compiler was implemented in software, which takes a description of a trained ConvNet and compiles it into a sequence of instructions for the ConvNet Processor (CNP). A ConvNet face detection system was implemented and tested. Face detection on a 512 × 384 frame takes 100 ms (10 frames per second), which corresponds to an average performance of 3.4×10⁹ connections per second for this 340-million-connection network. The design can be used for low-power, lightweight embedded vision systems for micro-UAVs and other small robots.

376 citations
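
A quick consistency check of the throughput figures quoted in the abstract above; only the connection count and the per-frame time enter the calculation:

```python
connections_per_frame = 340e6   # "340-million-connection network"
frame_time_s = 0.100            # 100 ms per 512 x 384 frame
print(connections_per_frame / frame_time_s)  # 3.4e9 connections per second
print(1.0 / frame_time_s)                    # 10 frames per second
```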


Proceedings ArticleDOI
20 Jun 2009
TL;DR: This paper introduced contextual features that encapsulate the group structure locally (for each person in the group), and globally (the overall structure of the group) to accomplish a variety of tasks, such as demographic recognition, calculating scene and camera parameters, and even event recognition.
Abstract: In many social settings, images of groups of people are captured. The structure of this group provides meaningful context for reasoning about individuals in the group, and about the structure of the scene as a whole. For example, men are more likely to stand on the edge of an image than women. Instead of treating each face independently from all others, we introduce contextual features that encapsulate the group structure locally (for each person in the group) and globally (the overall structure of the group). This "social context" allows us to accomplish a variety of tasks, such as demographic recognition, calculating scene and camera parameters, and even event recognition. We perform human studies to show this context aids recognition of demographic information in images of strangers.

339 citations


Proceedings ArticleDOI
11 Apr 2009
TL;DR: A new liveness detection method for face recognition based on differences in optical flow fields generated by movements of two-dimensional planes and three-dimensional objects is proposed.
Abstract: It is a common spoof to use a photograph to fool a face recognition algorithm. In light of differences in optical flow fields generated by movements of two-dimensional planes and three-dimensional objects, we propose a new liveness detection method for face recognition. Under the assumption that the test region is a two-dimensional plane, we can obtain a reference field from the actual optical flow field data. The degree of difference between the two fields can then be used to distinguish between a three-dimensional face and a two-dimensional photograph. Empirical study shows that the proposed approach is both feasible and effective.

327 citations
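
The abstract above separates a planar photograph from a live face by how far the measured optical flow deviates from the flow a rigid 2-D plane would produce. A minimal sketch of that idea using OpenCV's dense Farnebäck flow and a least-squares affine fit as the planar reference field (the affine model and any decision threshold are simplifying assumptions, not the paper's exact formulation):

```python
import numpy as np
import cv2

def planarity_residual(prev_gray, next_gray):
    """Mean deviation of the measured flow from the best-fit planar
    (affine) flow field over the test region."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = flow.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    A = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)], axis=1)
    # Fit u(x, y) and v(x, y) as affine functions of pixel position.
    coef, *_ = np.linalg.lstsq(A, flow.reshape(-1, 2), rcond=None)
    residual = flow.reshape(-1, 2) - A @ coef
    return float(np.linalg.norm(residual, axis=1).mean())

# A large residual suggests 3-D (live face) motion; a small residual
# suggests a moving 2-D photograph. The threshold would be tuned on data.
```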


Journal ArticleDOI
01 Feb 2009
TL;DR: A solution for human tracking with a mobile robot is proposed that implements multisensor data fusion techniques based on the recognition of typical leg patterns extracted from laser scans; experiments show that robust human tracking can be performed within complex indoor environments.
Abstract: One of the fundamental issues for service robots is human-robot interaction. In order to perform such a task and provide the desired services, these robots need to detect and track people in the surroundings. In this paper, we propose a solution for human tracking with a mobile robot that implements multisensor data fusion techniques. The system utilizes a new algorithm for laser-based leg detection using the onboard laser range finder (LRF). The approach is based on the recognition of typical leg patterns extracted from laser scans, which are shown to also be very discriminative in cluttered environments. These patterns can be used to localize both static and walking persons, even when the robot moves. Furthermore, faces are detected using the robot's camera, and the information is fused to the legs' position using a sequential implementation of the unscented Kalman filter. The proposed solution is feasible for service robots with a similar device configuration and has been successfully implemented on two different mobile platforms. Several experiments illustrate the effectiveness of our approach, showing that robust human tracking can be performed within complex indoor environments.

304 citations


Patent
04 Sep 2009
TL;DR: In this article, a processor-based system operating according to digitally-embedded programming instructions performs a method including identifying a group of pixels corresponding to a face region within digital image data acquired by an image acquisition device.
Abstract: A processor-based system operating according to digitally-embedded programming instructions performs a method including identifying a group of pixels corresponding to a face region within digital image data acquired by an image acquisition device. A set of face analysis parameter values is extracted from said face region, including a faceprint associated with the face region. First and second reference faceprints are determined for a person using reference images captured respectively in predetermined face-portrait conditions and using ambient conditions. The faceprints are analyzed to determine a baseline faceprint and a range of variability from the baseline associated with the person. Results of the analyzing are stored and used in subsequent recognition of the person in a subsequent image acquired under ambient conditions.
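
The patent abstract above derives a baseline faceprint and a range of variability from two reference faceprints, then uses them for later recognition under ambient conditions. A minimal sketch under the simplifying assumptions that faceprints are fixed-length vectors, the baseline is their mean, and variability is measured with Euclidean distance (none of which the patent specifies):

```python
import numpy as np

def build_profile(faceprint_portrait, faceprint_ambient, margin=1.5):
    """Baseline faceprint plus an accepted range of variability."""
    baseline = (faceprint_portrait + faceprint_ambient) / 2.0
    variability = np.linalg.norm(faceprint_portrait - faceprint_ambient)
    return baseline, margin * variability

def matches_profile(faceprint, baseline, max_dist):
    """Accept a new faceprint if it falls within the stored variability."""
    return np.linalg.norm(faceprint - baseline) <= max_dist
```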

Proceedings ArticleDOI
01 Sep 2009
TL;DR: This work presents a system that combines a standard sliding-window detector tuned for a high recall, low-precision operating point with a fast post-processing stage that is able to remove additional false positives by incorporating domain-specific information not available to the sliding- window detector.
Abstract: The last two years have witnessed the introduction and rapid expansion of products based upon large, systematically-gathered, street-level image collections, such as Google Street View, EveryScape, and Mapjack. In the process of gathering images of public spaces, these projects also capture license plates, faces, and other information considered sensitive from a privacy standpoint. In this work, we present a system that addresses the challenge of automatically detecting and blurring faces and license plates for the purpose of privacy protection in Google Street View. Though some in the field would claim face detection is “solved”, we show that state-of-the-art face detectors alone are not sufficient to achieve the recall desired for large-scale privacy protection. In this paper we present a system that combines a standard sliding-window detector tuned for a high recall, low-precision operating point with a fast post-processing stage that is able to remove additional false positives by incorporating domain-specific information not available to the sliding-window detector. Using a completely automatic system, we are able to sufficiently blur more than 89% of faces and 94 – 96% of license plates in evaluation sets sampled from Google Street View imagery.

Proceedings ArticleDOI
20 Jun 2009
TL;DR: A character specific multiple kernel classifier which is able to learn the features best able to discriminate between the characters is reported, demonstrating significantly increased coverage and performance with respect to previous methods on this material.
Abstract: We investigate the problem of automatically labelling faces of characters in TV or movie material with their names, using only weak supervision from automatically-aligned subtitle and script text. Our previous work (Everingham et al. [8]) demonstrated promising results on the task, but the coverage of the method (proportion of video labelled) and generalization was limited by a restriction to frontal faces and nearest neighbour classification. In this paper we build on that method, extending the coverage greatly by the detection and recognition of characters in profile views. In addition, we make the following contributions: (i) seamless tracking, integration and recognition of profile and frontal detections, and (ii) a character specific multiple kernel classifier which is able to learn the features best able to discriminate between the characters. We report results on seven episodes of the TV series "Buffy the Vampire Slayer", demonstrating significantly increased coverage and performance with respect to previous methods on this material.

Book ChapterDOI
20 Jul 2009
TL;DR: Facial expression recognition is a process performed by humans or computers that consists of analyzing the motion of facial features and/or the changes in the appearance of facial features and classifying this information into some facial-expression-interpretative categories such as facial muscle activations.
Abstract: Facial expression recognition is a process performed by humans or computers, which consists of: 1. Locating faces in the scene (e.g., in an image; this step is also referred to as face detection), 2. Extracting facial features from the detected face region (e.g., detecting the shape of facial components or describing the texture of the skin in a facial area; this step is referred to as facial feature extraction), 3. Analyzing the motion of facial features and/or the changes in the appearance of facial features and classifying this information into some facial-expression-interpretative categories such as facial muscle activations like smile or frown, emotion (affect) categories like happiness or anger, attitude categories like (dis)liking or ambivalence, etc. (this step is also referred to as facial expression interpretation).
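
The three steps enumerated above (face detection, facial feature extraction, expression classification) map onto a conventional pipeline. A minimal sketch using OpenCV's bundled Haar-cascade frontal-face detector for step 1; the resized-pixel features and the SVM classifier are illustrative stand-ins, not a specific published system:

```python
import cv2
import numpy as np
from sklearn.svm import SVC

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_faces(gray):
    """Step 1: locate faces in the scene."""
    return face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

def extract_features(gray, box, size=(48, 48)):
    """Step 2: describe the detected face region (here: raw resized pixels)."""
    x, y, w, h = box
    face = cv2.resize(gray[y:y + h, x:x + w], size)
    return face.astype(np.float32).ravel() / 255.0

# Step 3: classify features into expression categories (e.g. happiness, anger).
# train_X / train_y would come from a labelled facial-expression dataset.
clf = SVC(kernel="linear")
# clf.fit(train_X, train_y)
# label = clf.predict([extract_features(gray, box)])
```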

Journal ArticleDOI
01 Feb 2009
TL;DR: This paper focuses on affective face and body display, proposes a method to automatically detect their temporal segments or phases, explores whether the detection of the temporal phases can effectively support recognition of affective states, and recognizes affective states based on phase synchronization/alignment.
Abstract: Psychologists have long explored mechanisms with which humans recognize other humans' affective states from modalities, such as voice and face display. This exploration has led to the identification of the main mechanisms, including the important role played in the recognition process by the modalities' dynamics. Constrained by the human physiology, the temporal evolution of a modality appears to be well approximated by a sequence of temporal segments called onset, apex, and offset. Stemming from these findings, computer scientists, over the past 15 years, have proposed various methodologies to automate the recognition process. We note, however, two main limitations to date. The first is that much of the past research has focused on affect recognition from single modalities. The second is that even the few multimodal systems have not paid sufficient attention to the modalities' dynamics: The automatic determination of their temporal segments, their synchronization to the purpose of modality fusion, and their role in affect recognition are yet to be adequately explored. To address this issue, this paper focuses on affective face and body display, proposes a method to automatically detect their temporal segments or phases, explores whether the detection of the temporal phases can effectively support recognition of affective states, and recognizes affective states based on phase synchronization/alignment. The experimental results obtained show the following: 1) affective face and body displays are simultaneous but not strictly synchronous; 2) explicit detection of the temporal phases can improve the accuracy of affect recognition; 3) recognition from fused face and body modalities performs better than that from the face or the body modality alone; and 4) synchronized feature-level fusion achieves better performance than decision-level fusion.

Patent
20 Jul 2009
TL;DR: In this article, a processor-based system operating according to digitally-embedded programming instructions includes a face detection module for identifying face regions within digital images, a normalization module generates a normalized version of the face region, and a face recognition module automatically extracts a set of face classifier parameter values from the normalized face region.
Abstract: A processor-based system operating according to digitally-embedded programming instructions includes a face detection module for identifying face regions within digital images. A normalization module generates a normalized version of the face region. A face recognition module automatically extracts a set of face classifier parameter values from the normalized face region that are referred to as a faceprint. A workflow module automatically compares the extracted faceprint to a database of archived faceprints previously determined to correspond to known identities. The workflow module determines based on the comparing whether the new faceprint corresponds to any of the known identities, and associates the new faceprint and normalized face region with a new or known identity within a database. A database module serves to archive data corresponding to the new faceprint and its associated parent image according to the associating by the workflow module within one or more digital data storage media.

Patent
Sergey Ioffe, Lance Williams, Dennis Strelow, Andrea Frome, Luc Vincent
31 Mar 2009
TL;DR: In this paper, a face detector is applied to detect a set of possible face regions in the image and an identity masker is used to process the detected face regions by identity masking techniques in order to obscure identities corresponding to the regions.
Abstract: A method and system of identity masking to obscure identities corresponding to face regions in an image is disclosed. A face detector is applied to detect a set of possible face regions in the image. Then an identity masker is used to process the detected face regions by identity masking techniques in order to obscure identities corresponding to the regions. For example, a detected face region can be blurred as if it is in motion by a motion blur algorithm, such that the blurred region can not be recognized as the original identity. Or the detected face region can be replaced by a substitute facial image by a face replacement algorithm to obscure the corresponding identity.

Journal ArticleDOI
TL;DR: This work shows empirically that facial identity information is conveyed largely via mechanisms tuned to horizontal visual structure, and shows that such structure affords computational advantages for face detection and decoding, including robustness to normal environmental image degradation.
Abstract: The structure of the human face allows it to signal a wide range of useful information about a person's gender, identity, mood, etc. We show empirically that facial identity information is conveyed largely via mechanisms tuned to horizontal visual structure. Specifically, observers perform substantially better at identifying faces that have been filtered to contain just horizontal information compared to any other orientation band. We then show, computationally, that horizontal structures within faces have an unusual tendency to fall into vertically co-aligned clusters compared with images of natural scenes. We call these clusters "bar codes" and propose that they have important computational properties. We propose that it is this property that makes faces "special" visual stimuli, because they are able to transmit information as a reliable spatial sequence: a highly constrained one-dimensional code. We show that such structure affords computational advantages for face detection and decoding, including robustness to normal environmental image degradation, but makes faces vulnerable to certain classes of transformation that change the sequence of bars, such as spatial inversion or contrast-polarity reversal.
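
The study above filters face images to retain a single orientation band, with horizontal structure carrying the most identity information. A minimal sketch of such an orientation-band filter in the Fourier domain; horizontally oriented image structure corresponds to energy near the vertical frequency axis, and the 20° half-bandwidth is an arbitrary illustrative choice rather than the study's filter:

```python
import numpy as np

def orientation_band_filter(gray, center_deg=90.0, half_width_deg=20.0):
    """Keep only Fourier components whose orientation lies within a band
    around center_deg (90 degrees ~ horizontal image structure)."""
    f = np.fft.fftshift(np.fft.fft2(gray.astype(np.float32)))
    h, w = gray.shape
    fy, fx = np.mgrid[-(h // 2):h - h // 2, -(w // 2):w - w // 2]
    angle = np.rad2deg(np.arctan2(fy, fx)) % 180.0   # orientation of each frequency
    keep = np.abs(angle - center_deg) <= half_width_deg
    keep[h // 2, w // 2] = True                      # keep the DC term
    return np.real(np.fft.ifft2(np.fft.ifftshift(f * keep)))
```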

Patent
10 Jun 2009
TL;DR: In this paper, a method of automatically establishing the correct orientation of an image using facial information is proposed, which is based on the exploitation of the inherent property of image recognition algorithms in general and face detection in particular.
Abstract: A method of automatically establishing the correct orientation of an image using facial information. This method is based on exploiting an inherent property of image recognition algorithms in general and face detection in particular: the recognition is based on criteria that are highly orientation-sensitive. By applying a detection algorithm to images in various orientations, or alternatively by rotating the classifiers, and comparing the number of faces successfully detected in each orientation, one may infer the most likely correct orientation. Such a method can be implemented as an automated or semi-automatic method to guide users in viewing, capturing, or printing images.
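
The patent abstract above determines image orientation by running a face detector at each candidate rotation and keeping the rotation that yields the most detections. A minimal sketch with OpenCV's stock frontal-face cascade; the particular detector and the simple "most faces wins" rule are illustrative assumptions:

```python
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

ROTATIONS = {
    0: None,
    90: cv2.ROTATE_90_CLOCKWISE,
    180: cv2.ROTATE_180,
    270: cv2.ROTATE_90_COUNTERCLOCKWISE,
}

def likely_orientation(gray):
    """Return the rotation (degrees) under which the most faces are found."""
    counts = {}
    for deg, flag in ROTATIONS.items():
        img = gray if flag is None else cv2.rotate(gray, flag)
        faces = face_cascade.detectMultiScale(img, 1.1, 5)
        counts[deg] = len(faces)
    return max(counts, key=counts.get), counts
```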

Journal ArticleDOI
TL;DR: An accurate and robust framework for detecting and segmenting faces, localizing landmarks, and achieving fine registration of face meshes based on the fitting of a facial model based on a 3-D Point Distribution Model that is fitted without relying on texture, pose, or orientation information is presented.
Abstract: We present an accurate and robust framework for detecting and segmenting faces, localizing landmarks, and achieving fine registration of face meshes based on the fitting of a facial model. This model is based on a 3-D Point Distribution Model (PDM) that is fitted without relying on texture, pose, or orientation information. Fitting is initialized using candidate locations on the mesh, which are extracted from low-level curvature-based feature maps. Face detection is performed by classifying the transformations between model points and candidate vertices based on the upper-bound of the deviation of the parameters from the mean model. Landmark localization is performed on the segmented face by finding the transformation that minimizes the deviation of the model from the mean shape. Face registration is obtained using prior anthropometric knowledge and the localized landmarks. The performance of face detection is evaluated on a database of faces and non-face objects where we achieve an accuracy of 99.6%. We also demonstrate face detection and segmentation on objects with different scale and pose. The robustness of landmark localization is evaluated with noisy data and by varying the number of shapes and model points used in the model learning phase. Finally, face registration is compared with the traditional Iterative Closest Point (ICP) method and evaluated through a face retrieval and recognition framework on the GavabDB dataset, where we achieve a recognition rate of 87.4% and a retrieval rate of 83.9%.

Proceedings ArticleDOI
22 Feb 2009
TL;DR: Hardware design techniques are described, including image scaling, integral image generation, a pipelined classifier, and parallel processing of multiple classifiers, which accelerate the processing speed of the face detection system.
Abstract: This paper presents a hardware architecture for a face detection system based on the AdaBoost algorithm using Haar features. We describe the hardware design techniques, including image scaling, integral image generation, a pipelined classifier, and parallel processing of multiple classifiers, used to accelerate the processing speed of the face detection system. We also discuss optimizations of the proposed architecture so that it can scale to configurable devices with varying resources. The proposed architecture for face detection has been designed using Verilog HDL and implemented in a Xilinx Virtex-5 FPGA. Its performance has been measured and compared with an equivalent software implementation. We show an approximately 35-fold increase in system performance over the equivalent software implementation.
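
A software reference for two of the stages listed above, integral-image generation and Haar-feature evaluation, which the hardware pipelines and parallelizes. This is the standard Viola-Jones construction shown for illustration, not the specific Verilog design:

```python
import numpy as np

def integral_image(gray):
    """ii[y, x] = sum of all pixels above and to the left (inclusive).
    A leading row/column of zeros makes box sums index cleanly."""
    ii = np.zeros((gray.shape[0] + 1, gray.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(gray, axis=0), axis=1)
    return ii

def box_sum(ii, x, y, w, h):
    """Sum of the w x h rectangle with top-left corner (x, y), in O(1)."""
    return int(ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x])

def two_rect_haar(ii, x, y, w, h):
    """A two-rectangle (left minus right) Haar feature, the kind evaluated
    by the AdaBoost weak classifiers."""
    half = w // 2
    return box_sum(ii, x, y, half, h) - box_sum(ii, x + half, y, half, h)
```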

Journal ArticleDOI
TL;DR: A graph matching method is utilized to build face-name association between a face affinity network and a name affinity network which are, respectively, derived from their own domains (video and script) and mined using social network analysis.
Abstract: Identification of characters in films, although very intuitive to humans, still poses a significant challenge to computer methods. In this paper, we investigate the problem of identifying characters in feature-length films using video and film script. Different from the state-of-the-art methods on naming faces in the videos, most of which used the local matching between a visible face and one of the names extracted from the temporally local video transcript, we attempt to do a global matching between names and clustered face tracks under the circumstances that there are not enough local name cues that can be found. The contributions of our work include: 1) A graph matching method is utilized to build face-name association between a face affinity network and a name affinity network which are, respectively, derived from their own domains (video and script). 2) An effective measure of face track distance is presented for face track clustering. 3) As an application, the relationship between characters is mined using social network analysis. The proposed framework is able to create a new experience on character-centered film browsing. Experiments are conducted on ten feature-length films and give encouraging results.

Patent
30 Jul 2009
TL;DR: In this article, a localized smoothing kernel is applied to luminance data corresponding to the sub-regions of the face image to generate an enhanced face image, which includes the original pixels in combination with pixels corresponding to one or more enhanced subregions.
Abstract: Sub-regions within a face image are identified to be enhanced by applying a localized smoothing kernel to luminance data corresponding to the sub-regions of the face image. An enhanced face image is generated including an enhanced version of the face that includes certain original pixels in combination with pixels corresponding to the one or more enhanced sub-regions of the face.

Journal ArticleDOI
TL;DR: A subregion-based framework that uses a Markov random field to model the statistical distribution and spatial coherence of face texture, which makes the approach not only robust to extreme lighting conditions, but also insensitive to partial occlusions.
Abstract: In this paper, we present a new method to modify the appearance of a face image by manipulating the illumination condition, when the face geometry and albedo information is unknown. This problem is particularly difficult when there is only a single image of the subject available. Recent research demonstrates that the set of images of a convex Lambertian object obtained under a wide variety of lighting conditions can be approximated accurately by a low-dimensional linear subspace using a spherical harmonic representation. Moreover, morphable models are statistical ensembles of facial properties such as shape and texture. In this paper, we integrate spherical harmonics into the morphable model framework by proposing a 3D spherical harmonic basis morphable model (SHBMM). The proposed method can represent a face under arbitrary unknown lighting and pose simply by three low-dimensional vectors, i.e., shape parameters, spherical harmonic basis parameters, and illumination coefficients, which are called the SHBMM parameters. However, when the image was taken under an extreme lighting condition, the approximation error can be large, thus making it difficult to recover albedo information. In order to address this problem, we propose a subregion-based framework that uses a Markov random field to model the statistical distribution and spatial coherence of face texture, which makes our approach not only robust to extreme lighting conditions, but also insensitive to partial occlusions. The performance of our framework is demonstrated through various experimental results, including the improved rates for face recognition under extreme lighting conditions.
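
The low-dimensional illumination model underlying the SHBMM above is the standard second-order spherical-harmonic approximation for a convex Lambertian surface: the intensity at a surface point p with unit normal n(p) and albedo ρ(p) is approximated by nine harmonic terms (shown schematically; the basis normalization constants are omitted here):

```latex
I(p) \;\approx\; \rho(p)\sum_{l=0}^{2}\;\sum_{m=-l}^{l} \ell_{lm}\, Y_{lm}\!\big(\mathbf{n}(p)\big)
```

The nine lighting coefficients ℓ_lm correspond to the "illumination coefficients" in the abstract, which together with the shape and spherical harmonic basis parameters form the SHBMM parameters.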

Patent
01 Apr 2009
TL;DR: In this article, a method and apparatus for creating and updating a facial image database from a collection of digital images is disclosed, where a set of detected faces from a digital image collection is stored in a database, along with data pertaining to them.
Abstract: A method and apparatus for creating and updating a facial image database from a collection of digital images is disclosed. A set of detected faces from a digital image collection is stored in a facial image database, along with data pertaining to them. At least one facial recognition template for each face in the first set is computed, and the images in the set are grouped according to the facial recognition template into similarity groups. Another embodiment is a naming tool for assigning names to a plurality of faces detected in a digital image collection. A facial image database stores data pertaining to facial images detected in images of a digital image collection. In addition, the naming tool may include a graphical user interface, a face detection module that detects faces in images of the digital image collection and stores data pertaining to the detected faces in the facial image database, a face recognition module that computes at least one facial recognition template for each facial image in the facial image database, and a similarity grouping module that groups facial images in the facial image database according to the respective templates such that similar facial images belong to one similarity group.

Proceedings ArticleDOI
20 Jun 2009
TL;DR: A novel method for synthesizing VIS images from NIR images based on learning the mappings between images of different spectra is proposed, which reduces the inter-spectral differences significantly, thus allowing effective matching between faces taken under different imaging conditions.
Abstract: This paper deals with a new problem in face recognition research, in which the enrollment and query face samples are captured under different lighting conditions. In our case, the enrollment samples are visual light (VIS) images, whereas the query samples are taken under near infrared (NIR) condition. It is very difficult to directly match the face samples captured under these two lighting conditions due to their different visual appearances. In this paper, we propose a novel method for synthesizing VIS images from NIR images based on learning the mappings between images of different spectra (i.e., NIR and VIS). In our approach, we reduce the inter-spectral differences significantly, thus allowing effective matching between faces taken under different imaging conditions. Face recognition experiments clearly show the efficacy of the proposed approach.

Patent
05 Jun 2009
Abstract: A method of tracking faces in an image stream with a digital image acquisition device includes receiving images from an image stream including faces, calculating corresponding integral images, and applying different subsets of face detection rectangles to the integral images to provide sets of candidate regions. The different subsets include candidate face regions of different sizes and/or locations within the images. The different candidate face regions from different images of the image stream are each tracked.

Proceedings ArticleDOI
20 Jun 2009
TL;DR: This paper develops a framework to measure the intensity of AU12 and AU6 in videos captured from infant-mother live face-to-face communications and shows significant agreement between a human FACS coder and the approach, making it an efficient method for automated measurement of the intensity of non-posed facial action units.
Abstract: This paper presents a framework to automatically measure the intensity of naturally occurring facial actions. Naturalistic expressions are non-posed spontaneous actions. The facial action coding system (FACS) is the gold standard technique for describing facial expressions, which are parsed as comprehensive, non-overlapping action units (AUs). AUs have intensities ranging from absent to maximal on a six-point metric (i.e., 0 to 5). Despite the efforts in recognizing the presence of non-posed action units, measuring their intensity has not been studied comprehensively. In this paper, we develop a framework to measure the intensity of AU12 (lip corner puller) and AU6 (cheek raising) in videos captured from infant-mother live face-to-face communications. AU12 and AU6 are among the most challenging cases of infants' expressions (e.g., low facial texture in an infant's face). One of the problems in facial image analysis is the large dimensionality of the visual data. Our approach for solving this problem is to utilize the spectral regression technique to project high-dimensional facial images into a low-dimensional space. The facial images represented in the low-dimensional space are then used to train support vector machine classifiers to predict the intensity of action units. Analysis of 18 minutes of captured video of non-posed facial expressions of several infants and mothers shows significant agreement between a human FACS coder and our approach, which makes it an efficient approach for automated measurement of the intensity of non-posed facial action units.
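
The pipeline described above projects face images to a low-dimensional space and trains SVMs to predict AU intensity on the 0-5 scale. A minimal sketch with scikit-learn, using PCA as a stand-in for the paper's spectral regression step (a simplifying substitution) and a support-vector classifier over the six intensity levels:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline

# X: flattened face images (n_samples, n_pixels); y: AU intensity labels 0..5.
# Random placeholders stand in for a labelled AU-intensity dataset.
rng = np.random.default_rng(0)
X = rng.random((120, 48 * 48))
y = rng.integers(0, 6, size=120)

model = make_pipeline(PCA(n_components=30), SVC(kernel="rbf"))
model.fit(X, y)
print(model.predict(X[:5]))   # predicted AU intensities for five samples
```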

Journal ArticleDOI
TL;DR: A comparison with the geometry-free bag-of-words model shows that geometrical information provided by the framework improves classification, and a comparison with support vector machines demonstrates that Bayesian classification results in superior performance.
Abstract: This paper presents a novel framework for detecting, localizing, and classifying faces in terms of visual traits, e.g., sex or age, from arbitrary viewpoints and in the presence of occlusion. All three tasks are embedded in a general viewpoint-invariant model of object class appearance derived from local scale-invariant features, where features are probabilistically quantified in terms of their occurrence, appearance, geometry, and association with visual traits of interest. An appearance model is first learned for the object class, after which a Bayesian classifier is trained to identify the model features indicative of visual traits. The framework can be applied in realistic scenarios in the presence of viewpoint changes and partial occlusion, unlike other techniques assuming data that are single viewpoint, upright, prealigned, and cropped from background distraction. Experimentation establishes the first result for sex classification from arbitrary viewpoints, an equal error rate of 16.3 percent, based on the color FERET database. The method is also shown to work robustly on faces in cluttered imagery from the CMU profile database. A comparison with the geometry-free bag-of-words model shows that geometrical information provided by our framework improves classification. A comparison with support vector machines demonstrates that Bayesian classification results in superior performance.

Journal ArticleDOI
TL;DR: This paper proposes and studies an approach for spatiotemporal face and gender recognition from videos using an extended set of volume LBP features and a boosting scheme, and assesses the promising performance of the LBP-based spatiotemporal representations for describing and analyzing faces in videos.