
Showing papers on "Face detection published in 2010"


Journal ArticleDOI
TL;DR: This review shows that, despite their apparent simplicity, the development of a general eye detection technique involves addressing many challenges, requires further theoretical developments, and is consequently of interest to many other problem domains in computer vision and beyond.
Abstract: Despite active research and significant progress in the last 30 years, eye detection and tracking remains challenging due to the individuality of eyes, occlusion, and variability in scale, location, and light conditions. Data on eye location and details of eye movements have numerous applications and are essential in face detection, biometric identification, and particular human-computer interaction tasks. This paper reviews current progress and state of the art in video-based eye detection and tracking in order to identify promising techniques as well as issues to be further addressed. We present a detailed review of recent eye models and techniques for eye detection and tracking. We also survey methods for gaze estimation and compare them based on their geometric properties and reported accuracies. This review shows that, despite their apparent simplicity, the development of a general eye detection technique involves addressing many challenges, requires further theoretical developments, and is consequently of interest to many other problem domains in computer vision and beyond.

1,514 citations


01 Jan 2010
TL;DR: A new data set of face images with more faces and more accurate annotations for face regions than in previous data sets is presented and two rigorous and precise methods for evaluating the performance of face detection algorithms are proposed.
Abstract: Despite the maturity of face detection research, it remains difficult to compare different algorithms for face detection. This is partly due to the lack of common evaluation schemes. Also, existing data sets for evaluating face detection algorithms do not capture some aspects of face appearances that are manifested in real-world scenarios. In this work, we address both of these issues. We present a new data set of face images with more faces and more accurate annotations for face regions than in previous data sets. We also propose two rigorous and precise methods for evaluating the performance of face detection algorithms. We report results of several standard algorithms on the new benchmark.

963 citations


01 Jun 2010
TL;DR: This technical report surveys the recent advances in face detection over the past decade, organizing the various techniques according to how they extract features and what learning algorithms are adopted.
Abstract: Face detection has been one of the most studied topics in the computer vision literature. In this technical report, we survey the recent advances in face detection over the past decade. The seminal Viola-Jones face detector is first reviewed. We then survey the various techniques according to how they extract features and what learning algorithms are adopted. It is our hope that by reviewing the many existing algorithms, we will see even better algorithms developed to solve this fundamental computer vision problem.
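The attentional cascade at the heart of the Viola-Jones detector can be sketched in a few lines. This is a hypothetical simplification: `stages` and the per-stage `score_fn` are illustrative names, and real stages are boosted sums of Haar-like feature responses rather than arbitrary functions.

```python
def cascade_detect(window, stages):
    """Attentional cascade sketch: each stage is a (score_fn, threshold) pair.
    A window is rejected at the first stage it fails, so the vast majority of
    non-face windows exit after only a few cheap tests."""
    for score_fn, threshold in stages:
        if score_fn(window) < threshold:
            return False  # early rejection: no later stage is evaluated
    return True  # survived every stage: accept as a face candidate
```

A window is only accepted if it passes every stage, which is what makes the cascade both fast on background regions and accurate on faces.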

607 citations


Journal ArticleDOI
TL;DR: A 3D aging modeling technique is proposed and it is shown how it can be used to compensate for the age variations to improve the face recognition performance.
Abstract: One of the challenges in automatic face recognition is to achieve temporal invariance. In other words, the goal is to come up with a representation and matching scheme that is robust to changes due to facial aging. Facial aging is a complex process that affects both the 3D shape of the face and its texture (e.g., wrinkles). These shape and texture changes degrade the performance of automatic face recognition systems. However, facial aging has not received substantial attention compared to other facial variations due to pose, lighting, and expression. We propose a 3D aging modeling technique and show how it can be used to compensate for the age variations to improve the face recognition performance. The aging modeling technique adapts view-invariant 3D face models to the given 2D face aging database. The proposed approach is evaluated on three different databases (i.e., FG-NET, MORPH, and BROWNS) using FaceVACS, a state-of-the-art commercial face recognition engine.

417 citations


Journal ArticleDOI
TL;DR: Results impose very serious constraints on the sorts of processing model that can be invoked and demonstrate that face-selective behavioral responses can be generated extremely rapidly.
Abstract: Previous work has demonstrated that the human visual system can detect animals in complex natural scenes very efficiently and rapidly. In particular, using a saccadic choice task, H. Kirchner and S. J. Thorpe (2006) found that when two images are simultaneously flashed in the left and right visual fields, saccades toward the side with an animal can be initiated in as little as 120-130 ms. Here we show that saccades toward human faces are even faster, with the earliest reliable saccades occurring in just 100-110 ms, and mean reaction times of roughly 140 ms. Intriguingly, it appears that these very fast saccades are not completely under instructional control, because when faces were paired with photographs of vehicles, fast saccades were still biased toward faces even when the subject was targeting vehicles. Finally, we tested whether these very fast saccades might only occur in the simple case where the images are presented left and right of fixation by showing they also occur when the images are presented above and below fixation. Such results impose very serious constraints on the sorts of processing model that can be invoked and demonstrate that face-selective behavioral responses can be generated extremely rapidly.

415 citations


Journal ArticleDOI
TL;DR: This paper proposes local Gabor XOR patterns (LGXP), which encodes the Gabor phase by using the local XOR pattern (LXP) operator, and introduces block-based Fisher's linear discriminant (BFLD) to reduce the dimensionality of the proposed descriptor and at the same time enhance its discriminative power.
Abstract: Gabor features have been known to be effective for face recognition. However, only a few approaches utilize the phase feature, and they usually perform worse than those using the magnitude feature. To investigate the potential of Gabor phase and its fusion with magnitude for face recognition, in this paper, we first propose local Gabor XOR patterns (LGXP), which encodes the Gabor phase by using the local XOR pattern (LXP) operator. Then, we introduce block-based Fisher's linear discriminant (BFLD) to reduce the dimensionality of the proposed descriptor and at the same time enhance its discriminative power. Finally, by using BFLD, we fuse local patterns of Gabor magnitude and phase for face recognition. We evaluate our approach on the FERET and FRGC 2.0 databases. In particular, we perform comparative experimental studies of different local Gabor patterns. We also make a detailed comparison of their combinations with BFLD, as well as the fusion of different descriptors by using BFLD. Extensive experimental results verify the effectiveness of our LGXP descriptor and also show that our fusion approach outperforms most of the state-of-the-art approaches.
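The LXP operator described above can be sketched as follows: quantize the Gabor phase into a few ranges, then XOR the center pixel's quantized phase with each neighbor's to form a binary code. This is a minimal sketch assuming 4 phase bins and an 8-neighborhood; the full LGXP descriptor additionally pools such codes into histograms over multiple Gabor scales and orientations.

```python
import numpy as np

def lxp_code(phase, y, x, bins=4):
    """Local XOR pattern at pixel (y, x) of a phase map (radians in [0, 2*pi)).
    A neighbor contributes a 1-bit iff its quantized phase differs from the
    center's quantized phase (i.e., their quantized values XOR to nonzero)."""
    q = np.floor(phase / (2 * np.pi / bins)).astype(int)  # quantize phase into bins
    center = q[y, x]
    # 8-neighborhood, clockwise from top-left
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for i, (dy, dx) in enumerate(offsets):
        bit = 1 if q[y + dy, x + dx] != center else 0
        code |= bit << i
    return code
```

A uniform phase patch yields code 0; any neighbor falling in a different phase bin sets the corresponding bit.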

390 citations


Journal ArticleDOI
TL;DR: This model represents faces in each age group by a hierarchical And-or graph, in which And nodes decompose a face into parts to describe details crucial for age perception and Or nodes represent large diversity of faces by alternative selections.
Abstract: In this paper, we present a compositional and dynamic model for face aging. The compositional model represents faces in each age group by a hierarchical And-Or graph, in which And nodes decompose a face into parts to describe details (e.g., hair, wrinkles, etc.) crucial for age perception and Or nodes represent the large diversity of faces by alternative selections. A face instance is then a traversal of the And-Or graph, i.e., a parse graph. Face aging is modeled as a Markov process on the parse graph representation. We learn the parameters of the dynamic model from a large annotated face data set, and the stochasticity of face aging is modeled explicitly in the dynamics. Based on this model, we propose a face aging simulation and prediction algorithm. Inversely, an automatic age estimation algorithm is also developed under this representation. We study two criteria to evaluate the aging results using human perception experiments: (1) the accuracy of simulation: whether the aged faces are perceived to be of the intended age group, and (2) preservation of identity: whether the aged faces are perceived as the same person. Quantitative statistical analysis validates the performance of our aging model and age estimation algorithm.

306 citations


Journal ArticleDOI
TL;DR: Experimental results show that the use of soft biometric traits is able to improve the face-recognition performance of a state-of-the-art commercial matcher.
Abstract: Soft biometric traits embedded in a face (e.g., gender and facial marks) are ancillary information and are not fully distinctive by themselves in face-recognition tasks. However, this information can be explicitly combined with face matching score to improve the overall face-recognition accuracy. Moreover, in certain application domains, e.g., visual surveillance, where a face image is occluded or is captured in off-frontal pose, soft biometric traits can provide even more valuable information for face matching or retrieval. Facial marks can also be useful to differentiate identical twins whose global facial appearances are very similar. The similarities found from soft biometrics can also be useful as a source of evidence in courts of law because they are more descriptive than the numerical matching scores generated by a traditional face matcher. We propose to utilize demographic information (e.g., gender and ethnicity) and facial marks (e.g., scars, moles, and freckles) for improving face image matching and retrieval performance. An automatic facial mark detection method has been developed that uses (1) the active appearance model for locating primary facial features (e.g., eyes, nose, and mouth), (2) the Laplacian-of-Gaussian blob detection, and (3) morphological operators. Experimental results based on the FERET database (426 images of 213 subjects) and two mugshot databases from the forensic domain (1,225 images of 671 subjects and 10,000 images of 10,000 subjects, respectively) show that the use of soft biometric traits is able to improve the face-recognition performance of a state-of-the-art commercial matcher.
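The explicit combination of a face-match score with soft-biometric similarities can be illustrated with a simple weighted-sum fusion. This is a hypothetical sketch: the sum rule and the weights are illustrative assumptions, not the paper's actual fusion scheme.

```python
def fuse_scores(face_score, soft_scores, w_face=0.8):
    """Weighted-sum score fusion: the primary face matcher dominates, and the
    remaining weight is split evenly across the soft-biometric similarities
    (demographics, facial-mark matches, etc.)."""
    w_soft = (1.0 - w_face) / len(soft_scores)
    return w_face * face_score + w_soft * sum(soft_scores)
```

For example, a strong face match with middling soft-biometric agreement still yields a high fused score, while the ancillary scores can break ties between near-identical primary scores (e.g., identical twins).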

239 citations


Journal ArticleDOI
TL;DR: A novel automatic framework to perform 3D face recognition using a simulated annealing-based approach for range image registration with the surface interpenetration measure (SIM), as similarity measure, in order to match two face images.
Abstract: This paper presents a novel automatic framework to perform 3D face recognition. The proposed method uses a simulated annealing-based approach (SA) for range image registration with the surface interpenetration measure (SIM), as similarity measure, in order to match two face images. The authentication score is obtained by combining the SIM values corresponding to the matching of four different face regions: circular and elliptical areas around the nose, forehead, and the entire face region. Then, a modified SA approach is proposed taking advantage of invariant face regions to better handle facial expressions. Comprehensive experiments were performed on the FRGC v2 database, the largest available database of 3D face images, composed of 4,007 images with different facial expressions. The experiments simulated both verification and identification systems, and the results were compared to those reported by state-of-the-art works. By using all of the images in the database, a verification rate of 96.5 percent was achieved at a false acceptance rate (FAR) of 0.1 percent. In the identification scenario, a rank-one accuracy of 98.4 percent was achieved. To the best of our knowledge, this is the highest rank-one score ever achieved for the FRGC v2 database when compared to results published in the literature.

213 citations


Proceedings ArticleDOI
13 Jun 2010
TL;DR: This work attempts to overcome challenges in the classification of images in many-category datasets using a model that combines sequential visual attention using fixations with sparse coding, with biologically-inspired filters acquired by unsupervised learning applied to natural image patches.
Abstract: Classification of images in many-category datasets has rapidly improved in recent years. However, systems that perform well on particular datasets typically have one or more limitations such as a failure to generalize across visual tasks (e.g., requiring a face detector or extensive retuning of parameters), insufficient translation invariance, inability to cope with partial views and occlusion, or significant performance degradation as the number of classes is increased. Here we attempt to overcome these challenges using a model that combines sequential visual attention using fixations with sparse coding. The model's biologically-inspired filters are acquired using unsupervised learning applied to natural image patches. Using only a single feature type, our approach achieves 78.5% accuracy on Caltech-101 and 75.2% on the 102 Flowers dataset when trained on 30 instances per class, and it achieves 92.7% accuracy on the AR Face database with 1 training instance per person. The same features and parameters are used across these datasets to illustrate its robust performance.

192 citations


Journal ArticleDOI
TL;DR: The results on the plastic surgery database suggest that this is an arduous research challenge: current state-of-the-art face recognition algorithms are unable to provide acceptable levels of identification performance, so a dedicated research effort is needed before future face recognition systems can address this important problem.
Abstract: Advancement and affordability are leading to the popularity of plastic surgery procedures. Facial plastic surgery can be reconstructive, to correct facial feature anomalies, or cosmetic, to improve the appearance. Both corrective as well as cosmetic surgeries alter the original facial information to a large extent, thereby posing a great challenge for face recognition algorithms. The contribution of this research is 1) preparing a face database of 900 individuals for plastic surgery, and 2) providing an analytical and experimental underpinning of the effect of plastic surgery on face recognition algorithms. The results on the plastic surgery database suggest that it is an arduous research challenge and the current state-of-the-art face recognition algorithms are unable to provide acceptable levels of identification performance. Therefore, it is imperative to initiate a research effort so that future face recognition systems will be able to address this important problem.

Journal ArticleDOI
TL;DR: It is demonstrated that high attack detection accuracy can be achieved by using Conditional Random Fields and high efficiency by implementing the Layered Approach and the proposed system is robust and is able to handle noisy data without compromising performance.
Abstract: Intrusion detection faces a number of challenges; an intrusion detection system must reliably detect malicious activities in a network and must perform efficiently to cope with the large amount of network traffic. In this paper, we address these two issues of accuracy and efficiency using Conditional Random Fields and a Layered Approach. We demonstrate that high attack detection accuracy can be achieved by using Conditional Random Fields and high efficiency by implementing the Layered Approach. Experimental results on the benchmark KDD '99 intrusion data set show that our proposed system based on Layered Conditional Random Fields outperforms other well-known methods such as decision trees and naive Bayes. The improvement in attack detection accuracy is very high, particularly for the U2R attacks (34.8 percent improvement) and the R2L attacks (34.5 percent improvement). Statistical tests also demonstrate higher confidence in detection accuracy for our method. Finally, we show that our system is robust and is able to handle noisy data without compromising performance.

Journal ArticleDOI
01 Oct 2010
TL;DR: The goal was to develop an automatic process to be embedded in a face recognition system using only depth information as input, and the segmentation approach combines edge detection, region clustering, and shape analysis to extract the face region.
Abstract: We present a methodology for face segmentation and facial landmark detection in range images. Our goal was to develop an automatic process to be embedded in a face recognition system using only depth information as input. To this end, our segmentation approach combines edge detection, region clustering, and shape analysis to extract the face region, and our landmark detection approach combines surface curvature information and depth relief curves to find the nose and eye landmarks. The experiments were performed using the two available versions of the Face Recognition Grand Challenge database and the BU-3DFE database, in order to validate our proposed methodology and its advantages for 3-D face recognition purposes. We present an analysis regarding the accuracy of our segmentation and landmark detection approaches. Our results compare favorably with state-of-the-art works published in the literature. We also performed an evaluation regarding the influence of the segmentation process in our 3-D face recognition system and analyzed the improvements obtained when applying landmark-based techniques to deal with facial expressions.

Proceedings ArticleDOI
13 Jun 2010
TL;DR: This paper proposes an algorithm to address the novel problem of human identity recognition over a set of unordered low quality aerial images by implementing a weighted voter-candidate formulation and identifies the candidate with the highest weighted vote as the target.
Abstract: Human identity recognition is an important yet under-addressed problem. Previous methods were strictly limited to high quality photographs, where the principal techniques heavily rely on body details such as face detection. In this paper, we propose an algorithm to address the novel problem of human identity recognition over a set of unordered low quality aerial images. Assuming a user was able to manually locate a target in some images of the set, we find the target in each of the remaining query images by implementing a weighted voter-candidate formulation. In the framework, every manually located target is a voter, and the set of humans in a query image are candidates. In order to locate the target, we detect and align blobs of voters and candidates. Consequently, we use PageRank to extract distinguishing regions, and then match multiple regions of a voter to multiple regions of a candidate using the Earth Mover's Distance (EMD). This generates a robust similarity measure between every voter-candidate pair. Finally, we identify the candidate with the highest weighted vote as the target. We tested our technique over several aerial image sets that we collected, along with publicly available sets, and have obtained promising results.

Journal ArticleDOI
TL;DR: The appearance-based approach to face detection has seen great advances in the last several years, but this approach has had limited success in providing an accurate and detailed description of the internal facial features, i.e., eyes, brows, nose, and mouth.
Abstract: The appearance-based approach to face detection has seen great advances in the last several years. In this approach, we learn the image statistics describing the texture pattern (appearance) of the object class we want to detect, e.g., the face. However, this approach has had limited success in providing an accurate and detailed description of the internal facial features, i.e., eyes, brows, nose, and mouth. In general, this is due to the limited information carried by the learned statistical model. While the face template is relatively rich in texture, facial features (e.g., eyes, nose, and mouth) do not carry enough discriminative information to tell them apart from all possible background images. We resolve this problem by adding the context information of each facial feature in the design of the statistical model. In the proposed approach, the context information defines the image statistics most correlated with the surroundings of each facial component. This means that when we search for a face or facial feature, we look for those locations which most resemble the feature yet are most dissimilar to its context. This dissimilarity with the context features forces the detector to gravitate toward an accurate estimate of the position of the facial feature. Learning to discriminate between feature and context templates is difficult, however, because the context and the texture of the facial features vary widely under changing expression, pose, and illumination, and may even resemble one another. We address this problem with the use of subclass divisions. We derive two algorithms to automatically divide the training samples of each facial feature into a set of subclasses, each representing a distinct construction of the same facial component (e.g., closed versus open eyes) or its context (e.g., different hairstyles). The first algorithm is based on a discriminant analysis formulation. The second algorithm is an extension of the AdaBoost approach. 
We provide extensive experimental results using still images and video sequences for a total of 3,930 images. We show that the results are almost as good as those obtained with manual detection.

Journal ArticleDOI
TL;DR: A framework for improving the quality of personal photos by using a person's favorite photographs as examples, using face detection to align faces between “good” and “bad” photos such that properties of the good examples can be used to correct a bad photo.
Abstract: We describe a framework for improving the quality of personal photos by using a person's favorite photographs as examples. We observe that the majority of a person's photographs include the faces of a photographer's family and friends and often the errors in these photographs are the most disconcerting. We focus on correcting these types of images and use common faces across images to automatically perform both global and face-specific corrections. Our system achieves this by using face detection to align faces between “good” and “bad” photos such that properties of the good examples can be used to correct a bad photo. These “personal” photos provide strong guidance for a number of operations and, as a result, enable a number of high-quality image processing operations. We illustrate the power and generality of our approach by presenting a novel deblurring algorithm, and we show corrections that perform sharpening, super-resolution, in-painting of over- and underexposed regions, and white-balancing.

Proceedings ArticleDOI
02 May 2010
TL;DR: This work presents a multi-GPU implementation of the Viola-Jones face detection algorithm that meets the performance of the fastest known FPGA implementation, and discusses the performance programming required to realize the design.
Abstract: Face detection is an important aspect for biometrics, video surveillance and human computer interaction. We present a multi-GPU implementation of the Viola-Jones face detection algorithm that meets the performance of the fastest known FPGA implementation. The GPU design offers far lower development costs, but the FPGA implementation consumes less power. We discuss the performance programming required to realize our design, and describe future research directions.

Proceedings ArticleDOI
29 Mar 2010
TL;DR: A new system for editing personal photo collections, inspired by search-and-replace editing for text, that builds on tools from computer vision for image matching to propagate local edits specified by the user in a single photo by matching the edited region across photos.
Abstract: We propose a new system for editing personal photo collections, inspired by search-and-replace editing for text. In our system, local edits specified by the user in a single photo (e.g., using the “clone brush” tool) can be propagated automatically to other photos in the same collection, by matching the edited region across photos. To achieve this, we build on tools from computer vision for image matching. Our experimental results on real photo collections demonstrate the feasibility and potential benefits of our approach.

Proceedings ArticleDOI
13 Jun 2010
TL;DR: A novel face representation in which a face is represented in terms of dense Scale Invariant Feature Transform (d-SIFT) and shape contexts of the face image and AdaBoost is adopted to select features and form a strong classifier to solve the problem of gender recognition.
Abstract: In this paper, we propose a novel face representation in which a face is represented in terms of dense Scale Invariant Feature Transform (d-SIFT) and shape contexts of the face image. The application of the representation in gender recognition has been investigated. There are four problems when applying SIFT to facial gender recognition. (1) There may be only a few keypoints that can be found in a face image due to missing texture and poorly illuminated faces; (2) The SIFT descriptors at the keypoints (which we call sparse SIFT) are distinctive, whereas alternative descriptors at non-keypoints (e.g., on a grid) could negatively impact accuracy; (3) A relatively large image size is required to obtain sufficient keypoints to support the matching; and (4) The matching assumes that the faces are properly registered. This paper addresses these difficulties using a combination of SIFT descriptors and shape contexts of face images. Instead of extracting descriptors around interest points only, local feature descriptors are extracted at regular image grid points that allow for a dense description of the face images. In addition, the global shape contexts of the face images are fused with the dense SIFT to improve the accuracy. AdaBoost is adopted to select features and form a strong classifier. The proposed approach is then applied to solve the problem of gender recognition. The experimental results on a large set of faces showed that the proposed method can achieve high accuracies even for faces that are not aligned.

Patent
05 Apr 2010
TL;DR: Within a digital acquisition device with a built-in flash unit, the exposure of an acquired digital image is perfected using face detection: groups of pixels that correspond to faces are identified within the acquired image, and the image attributes corresponding to those groups are determined and analyzed.
Abstract: Within a digital acquisition device with a built-in flash unit, the exposure of an acquired digital image is perfected using face detection in the acquired image. Groups of pixels that correspond to plural images of faces are identified within a digitally acquired image, and image attributes corresponding to the groups of pixels are determined. An analysis is performed of the corresponding attributes of the groups of pixels. It is then determined whether to activate the built-in flash unit based on the analysis. An intensity of the built-in flash unit is determined based on the analysis. Alternatively, based on similar analysis, a digital simulation of the fill flash is performed on the image.

Journal ArticleDOI
01 Sep 2010-Cortex
TL;DR: These cases provide evidence for familial transmission of high-level visual recognition deficits, with normal intermediate-level form vision, in parents and daughters from one family.

01 Jan 2010
TL;DR: This paper presents a novel adaptive algorithm to detect the center of the pupil in frontal view faces that employs the Viola-Jones face detector to find the approximate location of the face in an image.
Abstract: This paper presents a novel adaptive algorithm to detect the center of the pupil in frontal view faces. This algorithm first employs the Viola-Jones face detector to find the approximate location of the face in an image. The knowledge of the face structure is exploited to detect the eye region. The histogram of the detected region is calculated and its CDF is employed to extract the eyelids and iris region in an adaptive way. The center of this region is considered as the pupil center. The experimental results show 91% accuracy in detecting the pupil center.
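The adaptive CDF-based extraction step can be sketched roughly as follows: threshold the darkest fraction of pixels in the eye region via the intensity CDF, then take the centroid of the surviving pixels as the pupil center. This is a hypothetical simplification of the idea; `dark_fraction` is an illustrative parameter, not one taken from the paper.

```python
import numpy as np

def pupil_center(eye_region, dark_fraction=0.05):
    """Estimate the pupil center in a grayscale (uint8) eye region:
    pick an intensity threshold from the CDF so that roughly the darkest
    `dark_fraction` of pixels survive, then return their centroid (y, x)."""
    hist = np.bincount(eye_region.ravel(), minlength=256)
    cdf = np.cumsum(hist) / eye_region.size
    thresh = int(np.searchsorted(cdf, dark_fraction))  # adaptive dark threshold
    ys, xs = np.nonzero(eye_region <= thresh)
    if len(ys) == 0:  # degenerate case: fall back to the globally darkest pixels
        ys, xs = np.nonzero(eye_region == eye_region.min())
    return ys.mean(), xs.mean()
```

Because the threshold comes from the region's own CDF, the same fraction of pixels is selected regardless of overall brightness, which is what makes the extraction adaptive.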

01 Jan 2010
TL;DR: Two new encoding schemes for representation of the intensity function in a local neighborhood are presented, which are complementary to the standard local binary patterns (LBPs) and preserve an important property of the LBP, the invariance to monotonic transformations of the intensity.
Abstract: The paper presents two new encoding schemes for representation of the intensity function in a local neighborhood. The encoding produces binary codes, which are complementary to the standard local binary patterns (LBPs). Both new schemes preserve an important property of the LBP, the invariance to monotonic transformations of the intensity. Moreover, one of the schemes possesses invariance to gray scale inversion. The utility of the new encodings is demonstrated in the framework of AdaBoost learning. The new LBP encoding schemes were tested on the face detection, car detection and gender recognition problems using the CMU-MIT frontal face dataset, the UIUC Car dataset and the FERET dataset respectively. Experimental results show that the proposed encoding methods improve both the accuracy and the speed of the final classifier. In all tested tasks, a combination of the encoding schemes outperforms the original one. No LBP encoding scheme dominates; the relative importance of the schemes is problem-specific.
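For reference, the standard LBP that these new schemes complement compares each of a pixel's 8 neighbors against the center; since only the sign of each comparison matters, the code is invariant to any monotonic transformation of the intensities. A minimal single-pixel sketch:

```python
import numpy as np

def lbp8(img, y, x):
    """Standard 8-neighbor local binary pattern at (y, x): each neighbor
    contributes a 1-bit iff it is >= the center pixel. Only comparison signs
    are used, hence invariance to monotonic intensity transformations."""
    center = img[y, x]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for i, (dy, dx) in enumerate(offsets):
        if img[y + dy, x + dx] >= center:
            code |= 1 << i
    return code
```

Applying any increasing map to the image, e.g. `2 * img + 1`, leaves every code unchanged, which is the invariance property the paper's new encodings are designed to preserve.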

Proceedings ArticleDOI
05 Jul 2010
TL;DR: A simple and fast motion history image based method classifies dynamic hand gestures suitable for controlling most home appliances; experiments demonstrated the feasibility of the proposed system.
Abstract: Hand gesture recognition based man-machine interfaces have been developed vigorously in recent years. Due to the effect of lighting and complex backgrounds, most visual hand gesture recognition systems work only under restricted environments. An adaptive skin color model based on face detection is utilized to detect skin color regions such as hands. To classify the dynamic hand gestures, we developed a simple and fast motion history image based method. Four groups of Haar-like directional patterns were trained as classifiers for the up, down, left, and right hand gestures. Together with the fist and waving hand gestures, six hand gestures were defined in total. In general, this set is suitable for controlling most home appliances. Five persons performing 250 hand gestures at near, medium, and far distances in front of the web camera were tested. Experimental results show that the accuracy is 94.1% on average and the processing time is 3.81 ms per frame. These results demonstrate the feasibility of the proposed system.
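A motion history image of the kind used here can be maintained with a simple per-frame update: pixels where motion is detected are stamped with the maximum value, and all other pixels decay toward zero, so recent motion appears bright and older motion fades. This is a hedged sketch; `tau` and `decay` are illustrative parameters, not values from the paper.

```python
import numpy as np

def update_mhi(mhi, motion_mask, tau=255, decay=32):
    """One MHI time step: moving pixels are set to tau, all others are
    decremented by `decay` (clamped at zero), producing a fading motion trail."""
    mhi = np.where(motion_mask, tau, np.maximum(mhi.astype(int) - decay, 0))
    return mhi.astype(np.uint8)
```

Calling `update_mhi` once per frame with a binary motion mask (e.g., from frame differencing inside the skin-colored region) yields an image whose intensity gradient encodes the direction of recent motion, which directional classifiers can then consume.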

Proceedings ArticleDOI
13 Jun 2010
TL;DR: This work proposes a method that extends the integral image to do fast integration over the interior of any polygon that is not necessarily rectilinear, and applies it to Viola and Jones' object detection framework, in which it is shown that the extended feature set improves object detection's performance.
Abstract: The integral image is typically used for fast integrating a function over a rectangular region in an image. We propose a method that extends the integral image to do fast integration over the interior of any polygon that is not necessarily rectilinear. The integration time of the method is fast, independent of the image resolution, and only linear to the polygon's number of vertices. We apply the method to Viola and Jones' object detection framework, in which we propose to improve classical Haar-like features with polygonal Haar-like features. We show that the extended feature set improves object detection's performance. The experiments are conducted in three domains: frontal face detection, fixed-pose hand detection, and rock detection for Mars' surface terrain assessment.
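The classical rectangular case that this paper generalizes works as follows: precompute a summed-area table once, after which the sum over any axis-aligned rectangle costs four table lookups, independent of the rectangle's size. This is a minimal sketch of the standard integral image only, not of the polygonal extension.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero border row/column:
    ii[y, x] = sum of img[:y, :x]."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def rect_sum(ii, top, left, bottom, right):
    """Sum over img[top:bottom, left:right] in O(1): four lookups, as used to
    evaluate rectangular Haar-like features at any scale."""
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]
```

Haar-like features are differences of such rectangle sums, which is why the integral image makes Viola-Jones style detection fast at every scale.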

Proceedings ArticleDOI
03 Dec 2010
TL;DR: This work evaluates random forest based skin detection and compares it to Bayesian network, Multilayer Perceptron, SVM, AdaBoost, Naive Bayes and RBF network, and shows that with the IHLS color space, the random forest approach outperforms other approaches.
Abstract: Skin detection is used in applications ranging from face detection, tracking body parts and hand gesture analysis, to retrieval and blocking objectionable content. For robust skin segmentation and detection, we investigate color classification based on random forest. A random forest is a statistical framework with a very high generalization accuracy and quick training times. The random forest approach is used with the IHLS color space for raw pixel based skin detection. We evaluate random forest based skin detection and compare it to Bayesian network, Multilayer Perceptron, SVM, AdaBoost, Naive Bayes and RBF network. Results on a database of 8991 images with manually annotated pixel-level ground truth show that with the IHLS color space, the random forest approach outperforms other approaches. We also show the effect of increasing the number of trees grown for random forest. With fewer trees we get faster training times and with 10 trees we get the highest F-score.
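The per-pixel classification setup described above can be approximated with scikit-learn's RandomForestClassifier. The IHLS conversion and real annotated data are omitted here, so the Gaussian clusters below are purely illustrative stand-ins for skin and non-skin pixel features:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
# Illustrative stand-ins for per-pixel IHLS features (hue, luminance, saturation).
skin = rng.normal([30.0, 150.0, 80.0], 10.0, size=(500, 3))
non_skin = rng.normal([120.0, 80.0, 40.0], 10.0, size=(500, 3))
X = np.vstack([skin, non_skin])
y = np.array([1] * 500 + [0] * 500)   # 1 = skin, 0 = non-skin

# 10 trees, matching the configuration the paper reports as its best F-score.
clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)
print(clf.predict([[30.0, 150.0, 80.0], [120.0, 80.0, 40.0]]))
```

Classifying each pixel of an image independently this way is what makes the approach usable as a preprocessing step for face detection and gesture analysis.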

Journal ArticleDOI
TL;DR: A novel search technique which uses a hierarchical model and a mutual information gain heuristic to efficiently prune the search space when localizing faces in images is provided.
Abstract: We provide a novel search technique which uses a hierarchical model and a mutual information gain heuristic to efficiently prune the search space when localizing faces in images. We show exponential gains in computation over traditional sliding window approaches, while keeping similar performance levels.

Journal ArticleDOI
TL;DR: A new subspace learning method, called uncorrelated discriminant nearest feature line analysis (UDNFLA), for face recognition using the NFL metric to seek a feature subspace such that the within-class feature line (FL) distances are minimized and between-class FL distances are maximized simultaneously in the reduced subspace.
Abstract: We propose in this letter a new subspace learning method, called uncorrelated discriminant nearest feature line analysis (UDNFLA), for face recognition. Motivated by the facts that the existing nearest feature line (NFL) metric can effectively characterize the geometrical information of face samples, and that uncorrelated features are desirable for many pattern analysis applications, we propose using the NFL metric to seek a feature subspace such that the within-class feature line (FL) distances are minimized and the between-class FL distances are maximized simultaneously in the reduced subspace, and we impose an uncorrelated constraint to make the extracted features statistically uncorrelated. Experimental results on two widely used face databases demonstrate the efficacy of the proposed method.
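The feature line (FL) distance at the heart of NFL-based methods is simply the distance from a query point to the line through two same-class prototypes. A minimal sketch of that computation:

```python
import numpy as np

def feature_line_distance(q, x1, x2):
    """Distance from query q to the feature line through prototypes x1 and x2:
    project q onto the line and measure the residual."""
    d = x2 - x1
    t = np.dot(q - x1, d) / np.dot(d, d)   # projection parameter along the FL
    p = x1 + t * d                          # foot of the projection
    return np.linalg.norm(q - p)

x1, x2 = np.array([0.0, 0.0]), np.array([2.0, 0.0])
q = np.array([1.0, 1.0])
print(feature_line_distance(q, x1, x2))    # 1.0: q sits one unit off the line
```

UDNFLA then seeks a linear projection that shrinks such within-class FL distances while stretching between-class ones, under an uncorrelatedness constraint; that optimization is beyond this sketch.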

Journal ArticleDOI
TL;DR: A new location-based saliency map which is generated based on camera motion parameters is combined with other saliency maps generated using features such as color contrast, object motion and face detection to determine the ROIs.
Abstract: In this paper we propose a system for the analysis of user-generated video (UGV). UGV often has a rich camera motion structure that is created at the time the video is recorded by the person taking the video, i.e., the "camera person." We exploit this structure by defining a new concept known as camera view for temporal segmentation of UGV. The segmentation provides a video summary with unique properties that is useful in applications such as video annotation. Camera motion is also a powerful feature for the identification of keyframes and regions of interest (ROIs), since it is an indicator of the camera person's interests in the scene and can also attract the viewers' attention. We propose a new location-based saliency map that is generated from camera motion parameters. This map is combined with other saliency maps generated using features such as color contrast, object motion, and face detection to determine the ROIs. In order to evaluate our methods we conducted several user studies. A subjective evaluation indicated that our system produces results that are consistent with viewers' preferences. We also examined the effect of camera motion on human visual attention through an eye tracking experiment. The results showed a high dependency between the distribution of the viewers' fixation points and the direction of camera movement, which is consistent with our location-based saliency map.
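One simple way to realize the map-combination step described above is a weighted linear fusion of normalized maps. The paper does not publish its exact fusion rule, so the normalization and equal weights below are assumptions for illustration:

```python
import numpy as np

def combine_saliency(maps, weights):
    """Weighted linear fusion of min-max normalized saliency maps
    (an assumed combination rule, not the paper's exact formula)."""
    fused = np.zeros_like(maps[0], dtype=float)
    for m, w in zip(maps, weights):
        span = m.max() - m.min()
        norm = (m - m.min()) / span if span > 0 else np.zeros_like(m, dtype=float)
        fused += w * norm
    return fused / sum(weights)

location = np.array([[0.0, 2.0], [0.0, 0.0]])   # e.g. camera-motion-based map
face     = np.array([[0.0, 0.0], [4.0, 0.0]])   # e.g. face-detection-based map
fused = combine_saliency([location, face], [1.0, 1.0])
print(fused)   # peaks of 0.5 wherever one of the two maps was maximal
```

Thresholding the fused map would then yield candidate ROIs for keyframe selection.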

Journal ArticleDOI
TL;DR: This article proposes a mixed saliency map model based on Itti's model and face detection, and shows that weighting regional differences by saliency enhances image quality assessment performance on the full subsets of the TID2008 database.
Abstract: Region saliency has not been fully considered in most previous image quality assessment models. In this article, the contribution of each region to the global quality measure of an image is weighted with variable weights computed as a function of its saliency. In salient regions, the differences between the distorted and original images are emphasized, as if the observer were viewing the difference image through a magnifying glass. Here a mixed saliency map model based on Itti's model and face detection is proposed. Both low-level features, including intensity, color, and orientation, and high-level features such as faces are used in the mixed model. Differences in salient regions are then given more importance and thus contribute more to the image quality score. The experiments conducted on the 1700 distorted images of the TID2008 database show that the performance of the image quality assessment on full subsets is enhanced.
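The saliency weighting described above amounts to scaling each pixel's error by its normalized saliency before pooling. The weighted MSE below is a minimal sketch of that idea, not the article's exact metric:

```python
import numpy as np

def saliency_weighted_mse(ref, dist, saliency):
    """MSE in which each pixel's squared error is scaled by its
    normalized saliency, so salient regions dominate the score."""
    w = saliency / saliency.sum()
    return np.sum(w * (ref.astype(float) - dist.astype(float)) ** 2)

ref = np.zeros((2, 2))
dist = ref.copy()
dist[0, 0] = 10.0                                    # a single distorted pixel
salient_here = np.array([[4.0, 1.0], [1.0, 1.0]])    # distortion in a salient region
salient_away = np.array([[1.0, 1.0], [1.0, 4.0]])    # distortion in a flat region
print(saliency_weighted_mse(ref, dist, salient_here) >
      saliency_weighted_mse(ref, dist, salient_away))   # True
```

The same distortion scores worse when it lands on a salient region, which is exactly the behavior the weighting is meant to produce.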