
Showing papers on "Object-class detection published in 2009"


Journal ArticleDOI
TL;DR: A discussion outlining the incentive for using face recognition, the applications of this technology, and some of the difficulties plaguing current systems with regard to this task has been provided.
Abstract: Face recognition presents a challenging problem in the field of image analysis and computer vision, and as such has received a great deal of attention over the last few years because of its many applications in various domains. Face recognition techniques can be broadly divided into three categories based on the face data acquisition methodology: methods that operate on intensity images; those that deal with video sequences; and those that require other sensory data such as 3D information or infra-red imagery. In this paper, an overview of some of the well-known methods in each of these categories is provided and some of the benefits and drawbacks of the schemes mentioned therein are examined. Furthermore, a discussion outlining the incentive for using face recognition, the applications of this technology, and some of the difficulties plaguing current systems with regard to this task has also been provided. This paper also mentions some of the most recent algorithms developed for this purpose and attempts to give an idea of the state of the art of face recognition technology.

751 citations


Book
20 Apr 2009
TL;DR: This book and the accompanying website focus on template matching, a subset of object recognition techniques of wide applicability, which has proved to be particularly effective for face recognition applications.
Abstract: The detection and recognition of objects in images is a key research topic in the computer vision community. Within this area, face recognition and interpretation has attracted increasing attention owing to the possibility of unveiling human perception mechanisms, and for the development of practical biometric systems. This book and the accompanying website focus on template matching, a subset of object recognition techniques of wide applicability, which has proved to be particularly effective for face recognition applications. Using examples from face processing tasks throughout the book to illustrate more general object recognition approaches, Roberto Brunelli: examines the basics of digital image formation, highlighting points critical to the task of template matching; presents basic and advanced template matching techniques, targeting grey-level images, shapes and point sets; discusses recent pattern classification paradigms from a template matching perspective; illustrates the development of a real face recognition system; explores the use of advanced computer graphics techniques in the development of computer vision algorithms. Template Matching Techniques in Computer Vision is primarily aimed at practitioners working on the development of systems for effective object recognition such as biometrics, robot navigation, multimedia retrieval and landmark detection. It is also of interest to graduate students undertaking studies in these areas.

721 citations


Proceedings ArticleDOI
20 Jun 2009
TL;DR: It is demonstrated that Hough forests improve the results of the Hough-transform object detection significantly and achieve state-of-the-art performance for several classes and datasets.
Abstract: We present a method for the detection of instances of an object class, such as cars or pedestrians, in natural images. Similarly to some previous works, this is accomplished via generalized Hough transform, where the detections of individual object parts cast probabilistic votes for possible locations of the centroid of the whole object; the detection hypotheses then correspond to the maxima of the Hough image that accumulates the votes from all parts. However, whereas the previous methods detect object parts using generative codebooks of part appearances, we take a more discriminative approach to object part detection. Towards this end, we train a class-specific Hough forest, which is a random forest that directly maps the image patch appearance to the probabilistic vote about the possible location of the object centroid. We demonstrate that Hough forests improve the results of the Hough-transform object detection significantly and achieve state-of-the-art performance for several classes and datasets.
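The centroid-voting scheme described above can be sketched as follows (a hypothetical toy example, not the authors' Hough-forest implementation; the part detections and vote offsets are made up for illustration):

```python
# Generalized-Hough centroid voting: each detected part casts weighted
# votes for possible object-centroid locations; detection hypotheses are
# the maxima of the accumulated Hough image.

def hough_vote(parts, vote_offsets, height, width):
    """parts: list of (y, x) part detections.
    vote_offsets: dict part_index -> list of (dy, dx, weight) votes."""
    acc = [[0.0] * width for _ in range(height)]
    for idx, (y, x) in enumerate(parts):
        for dy, dx, w in vote_offsets[idx]:
            cy, cx = y + dy, x + dx
            if 0 <= cy < height and 0 <= cx < width:
                acc[cy][cx] += w
    # detection hypothesis = argmax of the Hough image
    return max((acc[r][c], r, c) for r in range(height) for c in range(width))

# Two parts that agree on the centroid (5, 5):
parts = [(3, 5), (7, 5)]
offsets = {0: [(2, 0, 1.0)], 1: [(-2, 0, 1.0)]}
print(hough_vote(parts, offsets, 10, 10))  # (2.0, 5, 5)
```

In the real method, the offsets and weights come from leaf statistics of the trained forest rather than a fixed table.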

518 citations


Journal ArticleDOI
TL;DR: The comparative analysis of various image edge detection methods is presented, and it is shown that Canny's edge detection algorithm performs better than all the other operators under almost all scenarios.
Abstract: Edges characterize boundaries and are therefore of prime importance in image processing. Edge detection filters out useless data, noise and frequencies while preserving the important structural properties in an image. Since edge detection is in the forefront of image processing for object detection, it is crucial to have a good understanding of edge detection methods. In this paper the comparative analysis of various image edge detection methods is presented. The evidence for the best detector type is judged by studying the edge maps relative to each other through statistical evaluation. Upon this evaluation, an edge detection method can be employed to characterize edges to represent the image for further analysis and implementation. It has been shown that Canny's edge detection algorithm performs better than all the other operators under almost all scenarios.
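The gradient step shared by the operators compared in such studies can be sketched as follows (a simplified illustration only; full Canny adds Gaussian smoothing, non-maximum suppression and hysteresis thresholding on top of this):

```python
# Sobel gradient-magnitude edge response, written out in pure Python.

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(img):
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

# A vertical step edge between columns 1 and 2 produces strong responses
# along that boundary and zero elsewhere in the interior:
img = [[0, 0, 9, 9]] * 4
mag = sobel_magnitude(img)
```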

200 citations


Proceedings Article
07 Dec 2009
TL;DR: This work proposes a hierarchical region-based approach to joint object detection and image segmentation that simultaneously reasons about pixels, regions and objects in a coherent probabilistic model and gives a single unified description of the scene.
Abstract: Object detection and multi-class image segmentation are two closely related tasks that can be greatly improved when solved jointly by feeding information from one task to the other [10, 11]. However, current state-of-the-art models use a separate representation for each task making joint inference clumsy and leaving the classification of many parts of the scene ambiguous. In this work, we propose a hierarchical region-based approach to joint object detection and image segmentation. Our approach simultaneously reasons about pixels, regions and objects in a coherent probabilistic model. Pixel appearance features allow us to perform well on classifying amorphous background classes, while the explicit representation of regions facilitate the computation of more sophisticated features necessary for object detection. Importantly, our model gives a single unified description of the scene—we explain every pixel in the image and enforce global consistency between all random variables in our model. We run experiments on the challenging Street Scene dataset [2] and show significant improvement over state-of-the-art results for object detection accuracy.

179 citations


Patent
30 Jul 2009
TL;DR: In this article, a localized smoothing kernel is applied to luminance data corresponding to the sub-regions of the face image to generate an enhanced face image, which includes the original pixels in combination with pixels corresponding to one or more enhanced subregions.
Abstract: Sub-regions within a face image are identified to be enhanced by applying a localized smoothing kernel to luminance data corresponding to the sub-regions of the face image. An enhanced face image is generated including an enhanced version of the face that includes certain original pixels in combination with pixels corresponding to the one or more enhanced sub-regions of the face.

143 citations


Journal ArticleDOI
TL;DR: This work shows how histogram-based image descriptors can be combined with a boosting classifier to provide a state-of-the-art object detector, introduces a weak learner for multi-valued histogram features, and shows how to overcome problems of limited training sets.
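The boosting loop underlying such a detector can be sketched as follows (a generic AdaBoost with scalar threshold stumps, purely for illustration; the paper's weak learner operates on multi-valued histogram features instead):

```python
import math

def train_adaboost(xs, ys, rounds=3):
    """xs: scalar features, ys: labels in {-1, +1}."""
    n = len(xs)
    w = [1.0 / n] * n
    stumps = []  # (threshold, polarity, alpha)
    for _ in range(rounds):
        best = None
        for thr in sorted(set(xs)):
            for pol in (1, -1):
                err = sum(wi for wi, x, y in zip(w, xs, ys)
                          if (pol if x >= thr else -pol) != y)
                if best is None or err < best[0]:
                    best = (err, thr, pol)
        err, thr, pol = best
        err = min(max(err, 1e-9), 1 - 1e-9)   # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        stumps.append((thr, pol, alpha))
        # re-weight: boost the misclassified samples
        w = [wi * math.exp(-alpha * y * (pol if x >= thr else -pol))
             for wi, x, y in zip(w, xs, ys)]
        s = sum(w)
        w = [wi / s for wi in w]
    return stumps

def predict(stumps, x):
    score = sum(a * (p if x >= t else -p) for t, p, a in stumps)
    return 1 if score >= 0 else -1

stumps = train_adaboost([1, 2, 8, 9], [-1, -1, 1, 1])
print(predict(stumps, 0), predict(stumps, 10))  # -1 1
```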

127 citations


Proceedings ArticleDOI
01 Sep 2009
TL;DR: A novel feature extraction scheme which computes implicit ‘soft segmentations’ of image regions into foreground/background and yields stronger object/background edges than gray-scale gradient alone, suppresses textural and shading variations, and captures local coherence of object appearance.
Abstract: We investigate the problem of pedestrian detection in still images. Sliding window classifiers, notably using the Histogram-of-Gradient (HOG) features proposed by Dalal and Triggs are the state-of-the-art for this task, and we base our method on this approach. We propose a novel feature extraction scheme which computes implicit ‘soft segmentations’ of image regions into foreground/background. The method yields stronger object/background edges than gray-scale gradient alone, suppresses textural and shading variations, and captures local coherence of object appearance. The main contributions of our work are: (i) incorporation of segmentation cues into object detection; (ii) integration with classifier learning cf. a post-processing filter; (iii) high computational efficiency. We report results on the INRIA person detection dataset, achieving state-of-the-art results considerably exceeding those of the original HOG detector. Preliminary results for generic object detection on the PASCAL VOC2006 dataset also show substantial improvements in accuracy.
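The core HOG building block on which the method above is based, an orientation histogram of an image patch, can be sketched as follows (a toy version; real HOG adds overlapping blocks, vote interpolation and block-wise contrast normalization):

```python
import math

def orientation_histogram(patch, bins=9):
    """Unsigned gradient-orientation histogram (0..180 degrees) of a
    patch, weighted by gradient magnitude, using central differences."""
    h, w = len(patch), len(patch[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            mag = math.hypot(gx, gy)
            ang = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(ang / 180.0 * bins) % bins] += mag
    return hist

# A horizontal intensity ramp puts all gradient energy in the 0-degree bin:
patch = [[x * 10 for x in range(5)] for _ in range(5)]
hist = orientation_histogram(patch)
```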

100 citations


Proceedings ArticleDOI
20 Jun 2009
TL;DR: A multiscale local descriptor-based face representation that constrains the quantization regions to be localized not just in feature space but also in image space, allowing us to achieve an implicit elastic matching for face images.
Abstract: We present a new approach to robust pose-variant face recognition, which exhibits excellent generalization ability even across completely different datasets due to its weak dependence on data. Most face recognition algorithms assume that the face images are very well-aligned. This assumption is often violated in real-life face recognition tasks, in which face detection and rectification have to be performed automatically prior to recognition. Although great improvements have been made in face alignment recently, significant pose variations may still occur in the aligned faces. We propose a multiscale local descriptor-based face representation to mitigate this issue. First, discriminative local image descriptors are extracted from a dense set of multiscale image patches. The descriptors are expanded by their spatial locations. Each expanded descriptor is quantized by a set of random projection trees. The final face representation is a histogram of the quantized descriptors. The location expansion constrains the quantization regions to be localized not just in feature space but also in image space, allowing us to achieve an implicit elastic matching for face images. Our experiments on challenging face recognition benchmarks demonstrate the advantages of the proposed approach for handling large pose variations, as well as its superb generalization ability.
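The quantization idea can be sketched as follows (a hedged simplification: a flat set of random hyperplanes stands in for the paper's random projection trees, and all names here are illustrative). Appending the patch location to the descriptor is what localizes the quantization in image space as well as feature space:

```python
import random

def make_hyperplanes(dim, n_planes, seed=0):
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]

def quantize(vec, planes):
    """Hash a vector to a bin id via the signs of random projections."""
    code = 0
    for i, p in enumerate(planes):
        if sum(a * b for a, b in zip(p, vec)) >= 0:
            code |= 1 << i
    return code

def face_histogram(descriptors, planes):
    """descriptors: list of (descriptor_vector, x, y); the location is
    appended before quantization."""
    hist = {}
    for desc, x, y in descriptors:
        code = quantize(list(desc) + [x, y], planes)
        hist[code] = hist.get(code, 0) + 1
    return hist

planes = make_hyperplanes(6, 4)          # 4-D descriptors + (x, y)
descs = [([1.0, 0.5, 0.2, 0.1], 3, 4),
         ([1.0, 0.5, 0.2, 0.1], 3, 4),   # identical patch, same bin
         ([0.0, 9.0, 0.0, 0.0], 20, 20)]
hist = face_histogram(descs, planes)
```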

93 citations


Patent
Jay Yagnik1, Ming Zhao1
14 Jul 2009
TL;DR: In this article, a method of identifying faces in a video includes the stages of generating face tracks from input video streams, selecting key face images for each face track, clustering the face tracks to generate face clusters, creating face models from the face clusters and correlating face models with a face model database.
Abstract: Methods and systems for automated annotation of persons in video content are disclosed. In one embodiment, a method of identifying faces in a video includes the stages of: generating face tracks from input video streams; selecting key face images for each face track; clustering the face tracks to generate face clusters; creating face models from the face clusters; and correlating face models with a face model database. In another embodiment, a system for identifying faces in a video includes a face model database having face entries with face models and corresponding names, and a video face identifier module. In yet another embodiment, the system for identifying faces in a video can also have a face model generator.

91 citations


Journal ArticleDOI
01 Jan 2009
TL;DR: An integrated system for emotion detection is presented, taking into account the fact that emotions are most widely represented with eye and mouth expressions; the proposed system uses color images and consists of three modules.
Abstract: This paper presents an integrated system for emotion detection. In this research effort, we have taken into account the fact that emotions are most widely represented with eye and mouth expressions. The proposed system uses color images and consists of three modules. The first module implements skin detection, using Markov random field models for image segmentation and skin detection. A set of several colored images with human faces has been used as the training set. A second module is responsible for eye and mouth detection and extraction. This module uses the HLV color space of the specified eye and mouth regions. The third module detects the emotions pictured in the eyes and mouth, using edge detection and measuring the gradient of the eye and mouth regions. The paper provides results from the system's application, along with proposals for further research.

Proceedings ArticleDOI
20 Jun 2009
TL;DR: Experiments show that the adaptive contour feature outperforms several well-known existing features due to the stronger discriminative power rooted in its flexibility and adaptability in describing an object contour element.
Abstract: In this paper, a novel feature named the adaptive contour feature (ACF) is proposed for human detection and segmentation. This feature consists of a chain of granules in an oriented granular space (OGS) that is learnt via the AdaBoost algorithm. Three operations are defined on the OGS to mine object contour features and feature co-occurrences automatically. A heuristic learning algorithm is proposed to generate an ACF that at the same time defines a weak classifier for human detection or segmentation. Experiments on two open datasets show that the ACF outperforms several well-known existing features due to the stronger discriminative power rooted in its flexibility and adaptability in describing an object contour element.

Proceedings ArticleDOI
20 Jun 2009
TL;DR: This paper shows the applicability of a well known, local- feature based object detector for the case of people detection in thermal data and shows how this local-feature based detector can be used to recognize specific object parts, i.e., body parts of detected people.
Abstract: One of the main challenges in computer vision is the automatic detection of specific object classes in images. Recent advances of object detection performance in the visible spectrum encourage the application of these approaches to data beyond the visible spectrum. In this paper, we show the applicability of a well known, local-feature based object detector for the case of people detection in thermal data. We adapt the detector to the special conditions of infrared data and show the specifics relevant for feature based object detection. For that, we employ the SURF feature detector and descriptor that is well suited for infrared data. We evaluate the performance of our adapted object detector in the task of person detection in different real-world scenarios where people occur at multiple scales. Finally, we show how this local-feature based detector can be used to recognize specific object parts, i.e., body parts of detected people.

Proceedings ArticleDOI
02 Sep 2009
TL;DR: This paper achieves face tracking by combining a new scale-invariant Kalman filter with a kernel-based tracking algorithm, and an Earth Mover's Distance-based K-NN classification discriminates true face trajectories from false ones.
Abstract: Vision-based people counting systems have wide potential applications including video surveillance and public resources management. Most works in the literature rely on moving object detection and tracking, assuming that all moving objects are people. In this paper, we present our people counting approach based on face detection, tracking and trajectory classification. While we use a standard face detector, we achieve face tracking by combining a new scale-invariant Kalman filter with a kernel-based tracking algorithm. From each potential face trajectory, an angle histogram of neighboring points is then extracted. Finally, an Earth Mover's Distance-based K-NN classification discriminates true face trajectories from false ones. Evaluated on a video dataset of more than 160 potential people trajectories, our approach achieves an accuracy of up to 93%.
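The filtering component can be illustrated with a plain 1-D constant-velocity Kalman filter on a face-center coordinate (a sketch only; the paper's scale-invariant formulation and the kernel-based tracker are omitted):

```python
def kalman_track(zs, q=1e-3, r=1.0):
    """Filter noisy position measurements zs with a constant-velocity
    model. State = [position, velocity]; the 2x2 covariance is written
    out by hand to stay dependency-free."""
    x, v = zs[0], 0.0                        # state estimate
    p00, p01, p10, p11 = 1.0, 0.0, 0.0, 1.0  # state covariance
    out = []
    for z in zs:
        # predict with constant-velocity model (dt = 1)
        x, v = x + v, v
        p00, p01, p10, p11 = (p00 + p01 + p10 + p11 + q,
                              p01 + p11,
                              p10 + p11,
                              p11 + q)
        # correct with the measured position z
        s = p00 + r                          # innovation variance
        k0, k1 = p00 / s, p10 / s            # Kalman gain
        resid = z - x
        x, v = x + k0 * resid, v + k1 * resid
        p00, p01, p10, p11 = ((1 - k0) * p00, (1 - k0) * p01,
                              p10 - k1 * p00, p11 - k1 * p01)
        out.append(x)
    return out

# Track a face center moving at constant velocity; the estimate locks on:
track = kalman_track([float(i) for i in range(10)])
```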

Patent
02 Dec 2009
TL;DR: In this article, a motion detection method, apparatus and system are disclosed in the present invention, which relates to the video image processing field, which can effectively overcome the influence of the background on motion detection and the problem of object "conglutination" to avoid false detection, thereby accomplishing object detection in complex scenes with a high precision.
Abstract: A motion detection method, apparatus and system are disclosed in the present invention, which relates to the video image processing field. The present invention can effectively overcome the influence of the background on motion detection and the problem of object “conglutination” to avoid false detection, thereby accomplishing object detection in complex scenes with a high precision. The motion detection method disclosed in embodiments of the present invention comprises: acquiring detection information of the background scene and detection information of the current scene, wherein the current scene is a scene comprising an object(s) to be detected and the same background scene; and calculating the object(s) to be detected according to the detection information of the background scene and the detection information of the current scene. The present invention is applicable to any scenes where moving objects need to be detected, e.g., automatic passenger flow statistical systems in railway, metro and bus sectors, and is particularly applicable to detection and calibration of objects in places where brightness varies greatly.
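The underlying comparison of background-scene and current-scene information can be sketched as plain background subtraction (a minimal illustration of the core idea, not the patented method, which uses richer per-scene detection information):

```python
# Mark pixels whose absolute difference from the background exceeds a
# threshold; the surviving pixels form the detected moving object(s).

def detect_foreground(background, current, threshold=25):
    return [[1 if abs(c - b) > threshold else 0
             for b, c in zip(brow, crow)]
            for brow, crow in zip(background, current)]

bg = [[10] * 4 for _ in range(3)]
cur = [row[:] for row in bg]
cur[1][2] = 200                      # a "moving object" pixel
mask = detect_foreground(bg, cur)
```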

Book ChapterDOI
04 Jun 2009
TL;DR: This paper presents a new method, called face analogy, in the analysis-by-synthesis framework, for heterogeneous face mapping, that is, transforming face images from one type to another, and thereby performingheterogeneous face matching.
Abstract: Face images captured in different spectral bands, e.g., in visual (VIS) and near infrared (NIR), are said to be heterogeneous. Although a person's face looks different in heterogeneous images, it should be classified as being from the same individual. In this paper, we present a new method, called face analogy, in the analysis-by-synthesis framework, for heterogeneous face mapping, that is, transforming face images from one type to another, and thereby performing heterogeneous face matching. Experiments show promising results.

01 Jan 2009
TL;DR: A novel technique for eye detection using color and morphological image processing that is found to be highly efficient and accurate for detecting eyes in frontal face images.
Abstract: Eye detection is required in many applications like eye-gaze tracking, iris detection, video conferencing, auto-stereoscopic displays, face detection and face recognition. This paper proposes a novel technique for eye detection using color and morphological image processing. It is observed that eye regions in an image are characterized by low illumination, high-density edges and high contrast as compared to other parts of the face. The proposed method is based on the assumption that a frontal face image (full frontal) is available. Firstly, the skin region is detected using a color-based training algorithm and a six-sigma technique operating on RGB, HSV and NTSC scales. Further analysis involves morphological processing using boundary region detection and detection of the light source reflection by an eye, commonly known as an eye dot. This gives a finite number of eye candidates from which noise is subsequently removed. The technique is found to be highly efficient and accurate for detecting eyes in frontal face images.
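A rule-based RGB skin classifier of the kind such pipelines start from can be sketched as follows (a widely used daylight heuristic, not the paper's trained six-sigma classifier operating across RGB, HSV and NTSC):

```python
def is_skin_rgb(r, g, b):
    """Classic RGB skin heuristic for daylight illumination."""
    return (r > 95 and g > 40 and b > 20 and
            max(r, g, b) - min(r, g, b) > 15 and
            abs(r - g) > 15 and r > g and r > b)

def skin_mask(img):
    """Binary mask over an image given as rows of (r, g, b) tuples."""
    return [[1 if is_skin_rgb(*px) else 0 for px in row] for row in img]

img = [[(220, 170, 140), (40, 40, 40)],     # skin tone vs. dark gray
       [(200, 150, 120), (90, 140, 200)]]   # skin tone vs. sky blue
```

Morphological cleanup and eye-dot detection would then run only inside the masked region.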

Journal ArticleDOI
TL;DR: An effective, efficient face liveness detection method is presented, which uses physiological motion, detected by estimating eye blinks from a captured video sequence with an eye contour extraction algorithm, and discriminates a live human face from a photograph of the registered person's face to increase the reliability of the face recognition system.

Proceedings ArticleDOI
01 Dec 2009
TL;DR: This paper designs and develops optimized parallel implementations of face detection and tracking algorithms on graphics processors using the Compute Unified Device Architecture (CUDA), a C-based programming model from NVIDIA.
Abstract: Processing of human faces finds application in various domains like law enforcement and surveillance, entertainment (interactive video games), information security, smart cards etc. Several of these applications are interactive and require reliable and fast face processing. A generic face processing system may comprise face detection, recognition, tracking and rendering. In this paper, we develop a GPU-accelerated real-time and robust face processing system that does face detection and tracking. Face detection is done by adapting the Viola and Jones algorithm that is based on the Adaboost learning system. For robust tracking of faces across real-life illumination conditions, we leverage the algorithm proposed by Thota and others, which combines the strengths of Adaboost and an image-based parametric illumination model. We design and develop optimized parallel implementations of these algorithms on graphics processors using the Compute Unified Device Architecture (CUDA), a C-based programming model from NVIDIA. We evaluate our face processing system using both static image databases and live frames captured from a firewire camera under realistic conditions. Our experimental results indicate that our parallel face detector and tracker achieve much greater detection speeds than existing work, while maintaining accuracy. We also demonstrate that our tracking system is robust to extreme illumination conditions.

Posted Content
TL;DR: A biometric system of face detection and recognition in color images based on skin color information and fuzzy classification is presented and it is shown that Gabor coefficients are more powerful than geometric distances.
Abstract: We present in this paper a biometric system of face detection and recognition in color images. The face detection technique is based on skin color information and fuzzy classification. A new algorithm is proposed in order to detect automatically face features (eyes, mouth and nose) and extract their corresponding geometrical points. These fiducial points are described by sets of wavelet components which are used for recognition. To achieve the face recognition, we use neural networks and we study their performance for different inputs. We compare the two types of features used for recognition: geometric distances and Gabor coefficients, which can be used either independently or jointly. This comparison shows that Gabor coefficients are more powerful than geometric distances. We show with experimental results how the high recognition ratio makes our system an effective tool for automatic face detection and recognition.

Journal ArticleDOI
TL;DR: In this paper, a dynamical face model based on a combination of Hidden Markov Models is presented and the underlying architecture closely recalls the neural patterns activated in the perception of moving faces.

Journal ArticleDOI
TL;DR: The proposed approach for the detection of faces in three dimensional scenes is tolerant against partial occlusions produced by the presence of any kind of object and can be used to improve the robustness of all those systems requiring a face detection stage in non-controlled scenarios.
Abstract: This paper presents an innovative approach for the detection of faces in three dimensional scenes. The method is tolerant against partial occlusions produced by the presence of any kind of object. The detection algorithm uses invariant properties of the surfaces to segment salient facial features, namely the eyes and the nose. At least two facial features must be clearly visible in order to perform face detection. Candidate faces are then registered using an ICP (Iterative Correspondent Point) based approach aimed to avoid those samples which belong to the occluding objects. The final face versus non-face discrimination is computed by a Gappy PCA (GPCA) classifier which is able to classify candidate faces using only those regions of the surface which are considered to be non-occluded. The algorithm has been tested using the UND database obtaining 100% of correct detection and only one false alarm. The database has been then processed with an artificial occlusions generator producing realistic acquisitions that emulate unconstrained scenarios. A rate of 89.8% of correct detections shows that 3D data is particularly suited for handling occluding objects. The results have been also verified on a small test set containing real world occlusions obtaining 90.4% of correctly detected faces. The proposed approach can be used to improve the robustness of all those systems requiring a face detection stage in non-controlled scenarios.

Patent
31 Aug 2009
TL;DR: In this article, a face tracker identifies face regions within a series of one or more relatively low resolution reference images, and predicts face region within a main digital image, each including at least one eye.
Abstract: An image acquisition device includes a flash and optical system for capturing digital images. A face tracker identifies face regions within a series of one or more relatively low resolution reference images, and predicts face regions within a main digital image. A face analyzer determines one or more partial face regions within the one or more face regions each including at least one eye. A red-eye filter modifies an area within the main digital image indicative of a red-eye phenomenon based on an analysis of one or more partial face regions within the one or more face regions identified and predicted by the face tracker.

Patent
27 Jul 2009
TL;DR: In this paper, a fake face detection method using range information was proposed, which includes detecting face range information and face features from an input face image; matching the face image with the range information; and distinguishing the fake face by analyzing the matched range information.
Abstract: A fake-face detection method using range information includes: detecting face range information and face features from an input face image; matching the face image with the range information; and distinguishing a fake face by analyzing the matched range information.

Proceedings ArticleDOI
01 Jan 2009
TL;DR: A novel feature called Haar Local Binary Pattern (HLBP) feature for fast and reliable face detection, particularly in adverse imaging conditions, which compares bin values of Local Binary pattern histograms calculated over two adjacent image subregions.
Abstract: Face detection is the first step in many visual processing systems like face recognition, emotion recognition and lip reading. In this paper, we propose a novel feature called Haar Local Binary Pattern (HLBP) feature for fast and reliable face detection, particularly in adverse imaging conditions. This binary feature compares bin values of Local Binary Pattern histograms calculated over two adjacent image subregions. These subregions are similar to those in the Haar masks, hence the name of the feature. They capture the region-specific variations of local texture patterns and are boosted using AdaBoost in a framework similar to that proposed by Viola and Jones. Preliminary results obtained on several standard databases show that it competes well with other face detection systems, especially in adverse illumination conditions.
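The Local Binary Pattern code from which the HLBP histograms are built can be sketched as follows (a standard 8-bit LBP only; the Haar-like comparison of histogram bins over two adjacent subregions is omitted):

```python
def lbp_code(img, y, x):
    """8-bit LBP: compare each of the 8 neighbours (clockwise from the
    top-left) against the centre pixel; each comparison sets one bit."""
    c = img[y][x]
    neighbours = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                  (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dy, dx) in enumerate(neighbours):
        if img[y + dy][x + dx] >= c:
            code |= 1 << bit
    return code

img = [[9, 9, 9],
       [1, 5, 1],
       [1, 1, 1]]
print(lbp_code(img, 1, 1))  # 7: only the three top neighbours are >= 5
```

Histograms of these codes over image subregions are what the proposed Haar-like comparisons operate on.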

Patent
08 Jun 2009
TL;DR: In this paper, context-driven assisted face recognition tagging (CDAFRT) tools can access face images associated with a photo gallery to identify individual face images at a specified probability.
Abstract: The described implementations relate to assisted face recognition tagging of digital images, and specifically to context-driven assisted face recognition tagging. In one case, context-driven assisted face recognition tagging (CDAFRT) tools can access face images associated with a photo gallery. The CDAFRT tools can perform context-driven face recognition to identify individual face images at a specified probability. In such a configuration, the probability that the individual face images are correctly identified can be higher than attempting to identify individual face images in isolation.

Patent
Kenji Matsuo1, Kazunori Matsumoto1
15 Sep 2009
TL;DR: The apparatus for registering face identification features can eliminate time and effort for manually retrieving and preparing face images by being provided with a face image retrieving portion 11 for retrieving face image of a person via a network using the person's name as a keyword.
Abstract: The apparatus for registering face identification features can eliminate time and effort for manually retrieving and preparing face images by being provided with a face image retrieving portion 11 for retrieving a face image of a person via a network using the person's name as a keyword, a face feature extracting portion 12 for extracting features, which greatly influence identification of a person, from the face images retrieved by the face image retrieving portion 11, and a celebrity name, face image and feature database 13 for registering the face images retrieved by the face image retrieving portion 11 and the face features extracted by the face feature extracting portion 12 in a state where they are associated with the person's names.

Patent
12 Jan 2009
TL;DR: A face image processing method is applied to an electronic device, such that the electronic device can perform a face detection to a digital image to obtain a face image in the digital image automatically, and perform a skin color detection to the face image to exclude non-skin features such as eyes, eyeglasses, eyebrows, a moustache, a mouth and nostrils on the face images as mentioned in this paper.
Abstract: A face image processing method is applied to an electronic device, such that the electronic device can perform face detection on a digital image to obtain a face image automatically, and perform skin color detection on the face image to exclude non-skin features such as eyes, eyeglasses, eyebrows, a moustache, a mouth and nostrils. A skin mask is formed over the area of the face image belonging to skin color, and finally a filtering process is applied to the area of the face image corresponding to the skin mask to filter high-frequency, mid-frequency and low-frequency noises of an abnormal skin color, so as to quickly remove blemishes and dark spots existing in that area of the face image.

Journal ArticleDOI
TL;DR: A survey of 3D reconstruction methods used for generating the 3D appearance of a face using either a single or multiple 2D images captured with ordinary equipment such as digital cameras and camcorders is provided.
Abstract: The use of 3D data in face image processing applications has received considerable attention during the last few years. A major issue for the implementation of 3D face processing systems is the accurate and real time acquisition of 3D faces using low cost equipment. In this paper we provide a survey of 3D reconstruction methods used for generating the 3D appearance of a face using either a single or multiple 2D images captured with ordinary equipment such as digital cameras and camcorders. In this context we discuss various issues pertaining to the general problem of 3D face reconstruction such as the existence of suitable 3D face databases, correspondence of 3D faces, feature detection, deformable 3D models and typical assumptions used during the reconstruction process. Different approaches to the problem of 3D reconstruction are presented and for each category the most important advantages and disadvantages are outlined. In particular we describe example-based methods, stereo methods, video-based methods and silhouette-based methods. The issue of performance evaluation of 3D face reconstruction algorithms, the state of the art and future trends are also discussed.

Proceedings ArticleDOI
01 Dec 2009
TL;DR: An improved method of obtaining the background image based on common regions is adopted, and the introduction of a smoothing coefficient avoids abrupt changes in the current threshold.
Abstract: Moving object detection is a very important research topic in computer vision and video processing. The process of moving object detection based on background extraction is divided into two steps: background extraction and moving object detection. An improved method of obtaining the background image based on common regions is adopted. The basic idea is to capture a series of video pictures of the scene at regular intervals; each picture is divided into m*m blocks, and the expectation and variance of each block are calculated to describe the vector information of the region. A new threshold acquisition method is put forward for extracting the moving object: the arithmetic mean of the original iterative method is replaced by a weighted mean, and since the average gray level of the foreground is higher than that of the background, the threshold is also increased to some extent. The introduction of a smoothing coefficient avoids abrupt changes in the current threshold. The experiments show that the scheme can realize moving object detection effectively and with high definition.
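The classic iterative threshold selection that the paper modifies can be sketched as follows (the paper replaces the arithmetic means below with weighted means and smooths the threshold across frames; this is the unmodified baseline):

```python
def iterative_threshold(pixels, eps=0.5):
    """Isodata-style threshold: start at the global mean, then repeatedly
    set the threshold to the midpoint of the foreground and background
    means until it stabilizes."""
    t = sum(pixels) / len(pixels)
    while True:
        fg = [p for p in pixels if p > t]
        bg = [p for p in pixels if p <= t]
        if not fg or not bg:
            return t
        new_t = 0.5 * (sum(fg) / len(fg) + sum(bg) / len(bg))
        if abs(new_t - t) < eps:
            return new_t
        t = new_t

# Bimodal gray values: background near 20, foreground near 200.
pixels = [18, 20, 22, 19, 21, 198, 200, 202]
t = iterative_threshold(pixels)  # settles at the midpoint of the means
```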