
Showing papers on "Object-class detection published in 2007"


Proceedings ArticleDOI
26 Dec 2007
TL;DR: This paper describes face data as resulting from a generative model which incorporates both within-individual and between-individual variation, and calculates the likelihood that the differences between face images are entirely due to within-individual variability.
Abstract: Many current face recognition algorithms perform badly when the lighting or pose of the probe and gallery images differ. In this paper we present a novel algorithm designed for these conditions. We describe face data as resulting from a generative model which incorporates both within-individual and between-individual variation. In recognition we calculate the likelihood that the differences between face images are entirely due to within-individual variability. We extend this to the non-linear case where an arbitrary face manifold can be described and noise is position-dependent. We also develop a "tied" version of the algorithm that allows explicit comparison across quite different viewing conditions. We demonstrate that our model produces state-of-the-art results for (i) frontal face recognition and (ii) face recognition under varying pose.
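The within/between-individual decomposition above can be illustrated with a toy one-dimensional version of the model: treat an observed feature as identity plus noise, with between-individual variance B and within-individual variance W, and compare the likelihood that two observations share an identity against the likelihood that they do not. This is only an illustrative sketch of the likelihood-ratio idea, not the paper's full non-linear or "tied" algorithm; all variable names are our own.

```python
import math

def log_gauss2(x1, x2, v, c):
    """Log-density of a zero-mean bivariate Gaussian with equal variances v
    and covariance c, evaluated at (x1, x2)."""
    det = v * v - c * c
    # inverse of [[v, c], [c, v]] is (1/det) * [[v, -c], [-c, v]]
    q = (v * x1 * x1 - 2 * c * x1 * x2 + v * x2 * x2) / det
    return -0.5 * (q + math.log(det) + 2 * math.log(2 * math.pi))

def same_person_llr(x1, x2, between_var, within_var):
    """Log-likelihood ratio: shared identity vs independent identities."""
    v = between_var + within_var
    ll_same = log_gauss2(x1, x2, v, between_var)  # shared identity => covariance B
    ll_diff = log_gauss2(x1, x2, v, 0.0)          # independent identities => no covariance
    return ll_same - ll_diff
```

A positive ratio favours "same person"; similar features of a pair pull the ratio up, dissimilar ones push it down.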

1,099 citations


Journal ArticleDOI
TL;DR: An active near infrared (NIR) imaging system is presented that is able to produce face images of good quality regardless of the visible lighting in the environment, and it is shown that the resulting face images encode intrinsic information of the face, subject only to a monotonic transform in the gray tone.
Abstract: Most current face recognition systems are designed for indoor, cooperative-user applications. However, even in thus-constrained applications, most existing systems, academic and commercial, are compromised in accuracy by changes in environmental illumination. In this paper, we present a novel solution for illumination invariant face recognition for indoor, cooperative-user applications. First, we present an active near infrared (NIR) imaging system that is able to produce face images of good quality regardless of the visible lighting in the environment. Second, we show that the resulting face images encode intrinsic information of the face, subject only to a monotonic transform in the gray tone; based on this, we use local binary pattern (LBP) features to compensate for the monotonic transform, thus deriving an illumination invariant face representation. Then, we present methods for face recognition using NIR images; statistical learning algorithms are used to extract the most discriminative features from a large pool of invariant LBP features and construct a highly accurate face matching engine. Finally, we present a system that is able to achieve accurate and fast face recognition in practice, in which a method is provided to deal with specular reflections of active NIR lights on eyeglasses, a critical issue in active NIR image-based face recognition. Extensive, comparative results are provided to evaluate the imaging hardware, the face and eye detection algorithms, and the face recognition algorithms and systems, with respect to various factors, including illumination, eyeglasses, time lapse, and ethnic groups.
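The claimed invariance of LBP to monotonic gray-tone transforms is easy to verify with a minimal sketch. The basic 3×3 LBP operator below (our own simplified implementation, not the paper's code) thresholds each pixel's eight neighbours at the centre value, so any strictly increasing remapping of intensities leaves the codes unchanged.

```python
import numpy as np

def lbp_codes(img):
    """Basic 3x3 LBP: one bit per neighbour, set when neighbour >= centre."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    centre = img[1:h - 1, 1:w - 1]
    out = np.zeros((h - 2, w - 2), dtype=int)
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        out |= (neigh >= centre).astype(int) << bit
    return out
```

Since only the ordering of intensities matters, `lbp_codes(img)` and `lbp_codes(3 * img + 7)` are identical, which is exactly the monotonic-transform invariance the abstract relies on.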

598 citations


Journal ArticleDOI
TL;DR: A series of innovative methods are proposed to construct a high-performance rotation invariant multiview face detector, including the width-first-search (WFS) tree detector structure, the vector boosting algorithm for learning vector-output strong classifiers, the domain-partition-based weak learning method, the sparse feature in granular space, and the heuristic search for sparse feature selection.
Abstract: Rotation invariant multiview face detection (MVFD) aims to detect faces with arbitrary rotation-in-plane (RIP) and rotation-off-plane (ROP) angles in still images or video sequences. MVFD is crucial as the first step in automatic face processing for general applications, since face images are seldom upright and frontal unless they are taken cooperatively. In this paper, we propose a series of innovative methods to construct a high-performance rotation invariant multiview face detector, including the width-first-search (WFS) tree detector structure, the vector boosting algorithm for learning vector-output strong classifiers, the domain-partition-based weak learning method, the sparse feature in granular space, and the heuristic search for sparse feature selection. As a result, our multiview face detector achieves low computational complexity, broad detection scope, and high detection accuracy on both standard testing sets and real-life images.

411 citations


Proceedings ArticleDOI
26 Dec 2007
TL;DR: The alignment method improves performance on a face recognition task, both over unaligned images and over images aligned with a face alignment algorithm specifically developed for and trained on hand-labeled face images.
Abstract: Many recognition algorithms depend on careful positioning of an object into a canonical pose, so the position of features relative to a fixed coordinate system can be examined. Currently, this positioning is done either manually or by training a class-specialized learning algorithm with samples of the class that have been hand-labeled with parts or poses. In this paper, we describe a novel method to achieve this positioning using poorly aligned examples of a class with no additional labeling. Given a set of unaligned examplars of a class, such as faces, we automatically build an alignment mechanism, without any additional labeling of parts or poses in the data set. Using this alignment mechanism, new members of the class, such as faces resulting from a face detector, can be precisely aligned for the recognition process. Our alignment method improves performance on a face recognition task, both over unaligned images and over images aligned with a face alignment algorithm specifically developed for and trained on hand-labeled face images. We also demonstrate its use on an entirely different class of objects (cars), again without providing any information about parts or pose to the learning algorithm.

375 citations


Proceedings ArticleDOI
Zhan Chaohui1, Duan Xiao-hui1, Xu Shuoyu1, Song Zheng1, Luo Min1 
22 Aug 2007
TL;DR: An improved algorithm based on frame difference and edge detection is presented for moving object detection that achieves a high recognition rate and a high detection speed, giving it broad market prospects.
Abstract: Moving object detection is very important in intelligent surveillance. In this paper, an improved algorithm based on frame difference and edge detection is presented for moving object detection. First, it detects the edges of each pair of consecutive frames with a Canny detector and computes the difference between the two edge images. Then, it divides the edge difference image into several small blocks and decides whether each is a moving area by comparing its number of non-zero pixels to a threshold. Finally, it performs block-connected component labeling to get the smallest rectangle that contains the moving object. Experimental results show the improved algorithm overcomes the shortcomings of the frame difference method: it has a high recognition rate and a high detection speed, giving it broad market prospects.
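The block-voting step can be sketched as follows, assuming the Canny edge maps of the two frames are already available as binary arrays (edge detection itself is omitted; the block size and pixel threshold are illustrative, not the paper's values):

```python
import numpy as np

def moving_object_box(edges_prev, edges_curr, block=4, min_pixels=3):
    """Difference two binary edge maps, vote per block, and return the
    smallest rectangle (y0, x0, y1, x1) covering the moving blocks."""
    diff = np.logical_xor(edges_prev, edges_curr)
    h, w = diff.shape
    moving = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            # a block is "moving" when enough edge pixels changed inside it
            if diff[y:y + block, x:x + block].sum() >= min_pixels:
                moving.append((y, x))
    if not moving:
        return None
    ys = [y for y, _ in moving]
    xs = [x for _, x in moving]
    return (min(ys), min(xs), max(ys) + block, max(xs) + block)
```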

236 citations


Journal ArticleDOI
TL;DR: The system described in this paper has been designed taking into account the temporal coherence contained in a video stream in order to build a robust detector, which goes beyond traditional face detection approaches normally designed for still images.

191 citations


Proceedings ArticleDOI
01 Oct 2007
TL;DR: This paper provides a comprehensive classification of the collision detection literature into two phases: broad-phase and narrow-phase.
Abstract: A process of determining whether two or more bodies are making contact at one or more points is called collision detection or intersection detection. Collision detection is an inseparable part of computer graphics, surgical simulation, and robotics. There are a variety of methods for collision detection; we will review some of the most common ones. Algorithms for contact determination can be grouped into two general parts: broad-phase and narrow-phase. This paper provides a comprehensive classification of the collision detection literature into these two phases. Moreover, we have attempted to explain some of the existing algorithms that are not easy to interpret. We have also tried to keep sections self-explanatory without sacrificing depth of coverage.
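A minimal broad-phase example, using the common axis-aligned bounding box (AABB) overlap test to prune candidate pairs before any exact narrow-phase work runs (an illustrative sketch, not code from the survey itself):

```python
def aabb_overlap(a, b):
    """a, b = (xmin, ymin, xmax, ymax); True if the boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def broad_phase(boxes):
    """All-pairs AABB test: returns index pairs to hand to the narrow phase."""
    pairs = []
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if aabb_overlap(boxes[i], boxes[j]):
                pairs.append((i, j))
    return pairs
```

Real broad phases replace the O(n²) loop with sweep-and-prune or spatial hashing, but the contract is the same: output a superset of the truly colliding pairs, cheaply.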

187 citations


Proceedings ArticleDOI
17 Jun 2007
TL;DR: An object class detection approach which fully integrates the complementary strengths offered by shape matchers, and can localize object boundaries accurately, while needing no segmented examples for training (only bounding-boxes).
Abstract: We present an object class detection approach which fully integrates the complementary strengths offered by shape matchers. Like an object detector, it can learn class models directly from images, and localize novel instances in the presence of intra-class variations, clutter, and scale changes. Like a shape matcher, it finds the accurate boundaries of the objects, rather than just their bounding-boxes. This is made possible by 1) a novel technique for learning a shape model of an object class given images of example instances; 2) the combination of Hough-style voting with a non-rigid point matching algorithm to localize the model in cluttered images. As demonstrated by an extensive evaluation, our method can localize object boundaries accurately, while needing no segmented examples for training (only bounding-boxes).
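The Hough-style voting step can be sketched as follows: each matched codebook feature casts a vote for the object centre through its learned offset, and the accumulator peak gives the detection hypothesis (a simplified illustration; the paper's non-rigid point matching stage is omitted, and the data layout is our own):

```python
import numpy as np

def hough_vote(matches, img_shape):
    """matches: list of ((y, x) feature location, (dy, dx) learned offset to
    the object centre). Each match votes for one centre; the accumulator
    peak is the strongest object-centre hypothesis."""
    acc = np.zeros(img_shape, dtype=int)
    for (y, x), (dy, dx) in matches:
        cy, cx = y + dy, x + dx
        if 0 <= cy < img_shape[0] and 0 <= cx < img_shape[1]:
            acc[cy, cx] += 1
    peak = np.unravel_index(np.argmax(acc), acc.shape)
    return peak, acc[peak]
```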

158 citations


Proceedings ArticleDOI
17 Jun 2007
TL;DR: It is found that it is possible to get detection performance in IR images that is comparable to state-of-the-art results for visible spectrum images, and that the two domains share many features, likely originating from the silhouettes, in spite of the starkly different appearances of the two modalities.
Abstract: Use of IR images is advantageous for many surveillance applications where systems must operate around the clock and external illumination is not always available. We investigate methods derived from visible spectrum analysis for the task of human detection. Two feature classes (edgelets and HOG features) and two classification models (AdaBoost and SVM cascade) are extended to IR images. We find that it is possible to achieve detection performance in IR images that is comparable to state-of-the-art results for visible spectrum images. It is also shown that the two domains share many features, likely originating from the silhouettes, in spite of the starkly different appearances of the two modalities.
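A single HOG cell reduces to a magnitude-weighted histogram of unsigned gradient orientations, which is why the feature transfers between modalities: silhouette gradients look similar in IR and visible imagery. Below is a minimal sketch of one cell's histogram (our own simplified version with an illustrative bin count and normalisation, not the authors' implementation):

```python
import numpy as np

def hog_cell_histogram(patch, bins=9):
    """Unsigned gradient-orientation histogram for one cell,
    weighted by gradient magnitude and L1-normalised."""
    patch = np.asarray(patch, dtype=float)
    gy, gx = np.gradient(patch)
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0  # unsigned orientation in [0, 180)
    idx = (ang / (180.0 / bins)).astype(int) % bins
    hist = np.zeros(bins)
    for b in range(bins):
        hist[b] = mag[idx == b].sum()
    return hist / (hist.sum() + 1e-9)
```

On a patch containing a vertical step edge, all the gradient energy lands in the horizontal-orientation bin, as expected.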

152 citations


Patent
31 Jan 2007
TL;DR: In this paper, a 3D face reconstruction technique using 2D images, such as photographs of a face, is described, where prior face knowledge or a generic face is used to extract sparse 3D information from the images and to identify image pairs.
Abstract: A 3D face reconstruction technique using 2D images, such as photographs of a face, is described. Prior face knowledge or a generic face is used to extract sparse 3D information from the images and to identify image pairs. Bundle adjustment is carried out to determine more accurate 3D camera positions, image pairs are rectified, and dense 3D face information is extracted without using the prior face knowledge. Outliers are removed, e.g., by using tensor voting. A 3D surface is extracted from the dense 3D information and surface detail is extracted from the images.

152 citations


Proceedings ArticleDOI
26 Dec 2007
TL;DR: This work proposes a multi-resolution framework inspired by human visual search for general object detection that produced better performance for pedestrian detection than state-of-the-art methods, and was faster during both training and testing.
Abstract: We propose a multi-resolution framework inspired by human visual search for general object detection. Different resolutions are represented using a coarse-to-fine feature hierarchy. During detection, the lower resolution features are initially used to reject the majority of negative windows at relatively low cost, leaving a relatively small number of windows to be processed in higher resolutions. This enables the use of computationally more expensive higher resolution features to achieve high detection accuracy. We applied this framework on Histograms of Oriented Gradient (HOG) features for object detection. Our multi-resolution detector produced better performance for pedestrian detection than state-of-the-art methods (Dalal and Triggs, 2005), and was faster during both training and testing. Testing our method on motorbikes and cars from the VOC database revealed similar improvements in both speed and accuracy, suggesting that our approach is suitable for realtime general object detection applications.
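The coarse-to-fine rejection idea can be sketched generically: cheap low-resolution scores prune most candidate windows before the expensive high-resolution scores run on the survivors. The score functions and thresholds below are placeholders, not the paper's actual features:

```python
def cascade_detect(windows, stages):
    """stages: list of (score_fn, threshold) pairs, cheapest first.
    A window survives only if every stage accepts it; most windows
    are rejected early, so later (costly) stages see few candidates."""
    survivors = windows
    for score, thresh in stages:
        survivors = [w for w in survivors if score(w) >= thresh]
        if not survivors:
            break
    return survivors
```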

Book ChapterDOI
18 Dec 2007
TL;DR: The proposed approach consists of two parts: object detection and the use of a fall model, which uses an adaptive background subtraction method to detect a moving object and mark it with its minimum-bounding box.
Abstract: In this paper, we present an approach for human fall detection, which has important applications in the field of safety and security. The proposed approach consists of two parts: object detection and the use of a fall model. We use an adaptive background subtraction method to detect a moving object and mark it with its minimum-bounding box. The fall model uses a set of extracted features to analyze, detect and confirm a fall. We implement a two-state finite state machine (FSM) to continuously monitor people and their activities. Experimental results show that our method can detect most of the possible types of single human falls quite accurately.
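The two-state FSM can be sketched on a single illustrative feature, the bounding-box width/height ratio, which rises sharply when a person falls; the threshold and confirmation count below are assumptions for illustration, not the paper's values:

```python
def detect_fall(aspect_ratios, fall_thresh=1.3, confirm_frames=3):
    """Two-state FSM: UPRIGHT -> FALLEN once the bounding-box
    width/height ratio stays above fall_thresh for
    confirm_frames consecutive frames."""
    state, run = "UPRIGHT", 0
    for r in aspect_ratios:
        if state == "UPRIGHT":
            run = run + 1 if r > fall_thresh else 0
            if run >= confirm_frames:
                state = "FALLEN"
    return state
```

Requiring several consecutive frames is what "confirms" a fall and filters out transient detections such as bending down.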

Proceedings ArticleDOI
17 Jun 2007
TL;DR: This paper proposes an approach for object class localization which goes beyond bounding boxes, as it also determines the outline of the object, and directly generates, evaluates and clusters shape masks.
Abstract: This paper proposes an approach for object class localization which goes beyond bounding boxes, as it also determines the outline of the object. Unlike most current localization methods, our approach does not require any hypothesis parameter space to be defined. Instead, it directly generates, evaluates and clusters shape masks. Thus, the presented framework produces more informative results for object class localization. For example, it easily learns and detects possible object viewpoints and articulations, which are often well characterized by the object outline. We evaluate the proposed approach on the challenging natural-scene Graz-02 object classes dataset. The results demonstrate the extended localization capabilities of our method.

Journal ArticleDOI
TL;DR: A stereo system for the detection of pedestrians using far-infrared cameras that exploits three different detection approaches: warm area detection, edge-based detection, and disparity computation.

Proceedings ArticleDOI
22 Oct 2007
TL;DR: The experimental results show good face detection performance and average authentication rates of 82% for small-sized faces and 96% for faces of 80×80 pixels; the results are very promising and demonstrate the feasibility of face authentication on mobile phones.
Abstract: Computer vision applications for mobile phones are gaining increasing attention due to several practical needs resulting from the popularity of digital cameras in today's mobile phones. In this work, we consider the task of face detection and authentication in mobile phones and experimentally analyze a face authentication scheme using Haar-like features with AdaBoost for face and eye detection, and a local binary pattern (LBP) approach for face authentication. For comparison, another approach to face detection using skin color for fast processing is also considered and implemented. Despite the limited CPU and memory capabilities of today's mobile phones, our experimental results show good face detection performance and average authentication rates of 82% for small-sized faces (40×40 pixels) and 96% for faces of 80×80 pixels. The system runs at 2 frames per second for images of 320×240 pixels. The obtained results are very promising and demonstrate the feasibility of face authentication on mobile phones. Directions for further enhancing the performance of the system are also discussed.
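The Haar-like features used for face and eye detection are rectangle-sum differences, which an integral image makes O(1) per feature regardless of rectangle size; that is what makes this approach viable on a phone-class CPU. A minimal sketch (our own, using a simple two-rectangle feature):

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero border, so any rectangle sum
    costs four lookups."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=float)
    ii[1:, 1:] = np.cumsum(np.cumsum(np.asarray(img, dtype=float), axis=0), axis=1)
    return ii

def rect_sum(ii, y, x, h, w):
    """Sum of img[y:y+h, x:x+w] from the integral image ii."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def haar_two_rect(ii, y, x, h, w):
    """Two-rectangle Haar feature: left half minus right half (w even)."""
    return rect_sum(ii, y, x, h, w // 2) - rect_sum(ii, y, x + w // 2, h, w // 2)
```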

Proceedings ArticleDOI
26 Dec 2007
TL;DR: A novel object class detection method based on 3D object modeling that establishes spatial connections between multiple 2D training views by mapping them directly to the surface of 3D model.
Abstract: In this paper, a novel object class detection method based on 3D object modeling is presented. Instead of using a complicated mechanism for relating multiple 2D training views, the proposed method establishes spatial connections between these views by mapping them directly to the surface of 3D model. The 3D shape of an object is reconstructed by using a homographic framework from a set of model views around the object and is represented by a volume consisting of binary slices. Features are computed in each 2D model view and mapped to the 3D shape model using the same homographic framework. To generalize the model for object class detection, features from supplemental views are also considered. A codebook is constructed from all of these features and then a 3D feature model is built. Given a 2D test image, correspondences between the 3D feature model and the testing view are identified by matching the detected features. Based on the 3D locations of the corresponding features, several hypotheses of viewing planes can be made. The one with the highest confidence is then used to detect the object using feature location matching. Performance of the proposed method has been evaluated by using the PASCAL VOC challenge dataset and promising results are demonstrated.

Patent
19 Mar 2007
TL;DR: In this article, a method of modifying the viewing parameters of digital images using face detection is presented, achieving desired spatial parameters based on one or more sub-groups of pixels that correspond to facial features of the face.
Abstract: A method of modifying the viewing parameters of digital images uses face detection to achieve desired spatial parameters based on one or more sub-groups of pixels that correspond to one or more facial features of the face. Such methods may be used for animating still images and for automating and streamlining applications such as the creation of slide shows and screen savers from images containing faces.

Journal ArticleDOI
01 Oct 2007
TL;DR: A face mosaicing scheme that generates a composite face image during enrollment based on the evidence provided by frontal and semiprofile face images of an individual is described.
Abstract: Mosaicing entails the consolidation of information represented by multiple images through the application of a registration and blending procedure. We describe a face mosaicing scheme that generates a composite face image during enrollment based on the evidence provided by frontal and semiprofile face images of an individual. Face mosaicing obviates the need to store multiple face templates representing multiple poses of a user's face image. In the proposed scheme, the side profile images are aligned with the frontal image using a hierarchical registration algorithm that exploits neighborhood properties to determine the transformation relating the two images. Multiresolution splining is then used to blend the side profiles with the frontal image, thereby generating a composite face image of the user. A texture-based face recognition technique that is a slightly modified version of the C2 algorithm proposed by Serre et al. is used to compare a probe face image with the gallery face mosaic. Experiments conducted on three different databases indicate that face mosaicing, as described in this paper, offers significant benefits by accounting for the pose variations that are commonly observed in face images.

Patent
12 Jun 2007
TL;DR: In this article, an active appearance model (AAM) is applied including an interchannel-decorrelated color space and one or more parameters of the model are matched to the image.
Abstract: A face detection and/or recognition method includes acquiring a digital color image. An active appearance model (AAM) is applied, including an interchannel-decorrelated color space. One or more parameters of the model are matched to the image. Face detection results based on the matching, and/or different results incorporating the face detection result, are communicated.

Patent
23 Jul 2007
TL;DR: In this paper, face detection is performed within a second window at a second location, wherein the second location is determined based on the confidence level obtained from face detection within a first window.
Abstract: A method of detecting a face in an image includes performing face detection within a first window of the image at a first location. A confidence level is obtained from the face detection indicating a probability of the image including a face at or in the vicinity of the first location. Face detection is then performed within a second window at a second location, wherein the second location is determined based on the confidence level.

Patent
10 Apr 2007
TL;DR: In this article, an electronic camera includes an image pickup device, a memory, face detecting section, a face recognizing section, and an object specifying section, where the face detection section detects face areas in a shooting image plane based on the image signal and extracts characterizing points of faces of objects from the face areas.
Abstract: An electronic camera includes an image pickup device, a memory, a face detecting section, a face recognizing section, and an object specifying section. The image pickup device photo-electrically converts an image of an object into an electric signal and generates an image signal. The memory records registration data representing the characterizing points of faces to be recognized. The face detecting section detects face areas in the shooting image plane based on the image signal and extracts the characterizing points of the faces of objects from the face areas. The face recognizing section determines whether or not the face areas are recognizing targets based on the characterizing-point data corresponding to the face areas and on the registration data. The object specifying section specifies as the main object the recognized target nearest to the electronic camera.

Proceedings ArticleDOI
10 Sep 2007
TL;DR: The goal was to develop an automatic process to be embedded in a face recognition system, using only range images as input; the approach combines traditional image segmentation techniques for face segmentation and detects facial features by combining an adapted method for 2D facial feature extraction with surface curvature information.
Abstract: This paper presents our methodology for face and facial feature detection to improve 3D face recognition in the presence of facial expression variation. Our goal was to develop an automatic process to be embedded in a face recognition system, using only range images as input. To do that, our approach combines traditional image segmentation techniques for face segmentation and detects facial features by combining an adapted method for 2D facial feature extraction with surface curvature information. The experiments were performed on a large, well-known face image database available in the Biometric Experimentation Environment (BEE), comprising 4,950 images. The results confirm that our method is efficient for the proposed application.

Patent
Jay Yagnik1
29 Nov 2007
TL;DR: A method identifies a named entity, retrieves images associated with the named entity, and uses a face detection algorithm to detect faces in the retrieved images.
Abstract: A method includes identifying a named entity, retrieving images associated with the named entity, and using a face detection algorithm to perform face detection on the retrieved images to detect faces in the retrieved images. At least one representative face image from the retrieved images is identified, and the representative face image is used to identify one or more additional images representing the at least one named entity.

Proceedings ArticleDOI
17 Jun 2007
TL;DR: This work presents a generative object model that can scale from a general object class model to a more specific object-instance model, allowing class instances to be detected and individual object instances to be distinguished reliably.
Abstract: Object class detection in scenes of realistic complexity remains a challenging task in computer vision. Most recent approaches focus on a single, general model for object class detection. However, particularly in the context of image sequences, it may be advantageous to adapt the general model to a more object-instance-specific model in order to detect that particular object reliably within the image sequence. In this work we present a generative object model that can scale from a general object class model to a more specific object-instance model. This allows us to detect class instances as well as to distinguish between individual object instances reliably. We experimentally evaluate the performance of the proposed system on both still images and image sequences.

Proceedings ArticleDOI
17 Jun 2007
TL;DR: This method seeks to balance the skewness of the labels presented to the weak classifiers, allowing them to be trained more equally, and introduces an extra constraint when propagating the weights of the data points from one weak classifier to another, allowing the algorithm to converge faster.
Abstract: We present an integrated framework for online learning of asymmetric boosted classifiers, applicable to object detection problems. In particular, our method seeks to balance the skewness of the labels presented to the weak classifiers, allowing them to be trained more equally. In online learning, we introduce an extra constraint when propagating the weights of the data points from one weak classifier to another, allowing the algorithm to converge faster. Compared with the Online Boosting algorithm recently applied to object detection problems, we observed an increase of about 0-10% in accuracy and a gain of about 5-30% in learning speed.
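One plausible reading of "balancing the skewness of the labels" is to renormalise sample weights so each class carries equal total mass before a weak classifier is trained; the sketch below is our interpretation for illustration, not necessarily the authors' exact scheme:

```python
import numpy as np

def balance_weights(w, y):
    """Rescale boosting sample weights so positives (y == 1) and
    negatives (y == 0) each carry half the total mass, however
    skewed the label distribution is."""
    w = np.asarray(w, dtype=float)
    y = np.asarray(y)
    out = w.copy()
    for label in (0, 1):
        m = (y == label)
        s = out[m].sum()
        if s > 0:
            out[m] *= 0.5 / s
    return out
```

In a detection setting with one positive per thousand negatives, this stops the weak learner from trivially predicting "negative" everywhere.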

Proceedings ArticleDOI
22 Aug 2007
TL;DR: A novel real-time human detection system based on Viola's face detection framework and Histograms of Oriented Gradients (HOG) features is presented, which keeps both the discriminative power of HOG features for human detection and the real- time property of Viola’s face Detection framework.
Abstract: In this paper, a novel real-time human detection system based on Viola's face detection framework and Histograms of Oriented Gradients (HOG) features is presented. Each bin of the histogram is treated as a feature and used as the basic building element of the cascade classifier. The system keeps both the discriminative power of HOG features for human detection and the real-time property of Viola's face detection framework. Experiments on Daimler Chrysler pedestrian benchmark data set and INRIA human database demonstrate that this framework is more powerful than Viola's object detection framework on human detection.

Proceedings ArticleDOI
05 Sep 2007
TL;DR: The approach is inspired by human visual cognition processes and builds upon a multi-tier video tracking paradigm whose main layers are the spatially based "peripheral tracking", loosely corresponding to peripheral vision, and the object-based "vision tunnels" for focused attention and analysis of tracked objects.
Abstract: This paper presents an approach to detect stationary foreground objects in naturally busy surveillance video scenes with several moving objects. Our approach is inspired by human visual cognition processes and builds upon a multi-tier video tracking paradigm whose main layers are the spatially based "peripheral tracking", loosely corresponding to peripheral vision, and the object-based "vision tunnels" for focused attention and analysis of tracked objects. Humans allocate their attention to different aspects of objects and scenes based on a defined task. In our model, a specific processing layer corresponding to allocation of attention is used for detection of objects that become stationary. The static object layer, a natural extension of this framework, detects and maintains the stationary foreground objects by using the moving object and scene information from the Peripheral Tracker and Scene Description layers. Simple event detection modules then use the enduring stationary objects to determine events such as parked vehicles or abandoned bags.
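The static-object layer's core test can be sketched as a persistence check on tracked centroids: an object becomes a stationary-foreground candidate once it stops moving for a minimum number of frames. The thresholds and data layout here are our own assumptions for illustration:

```python
def stationary_objects(tracks, max_move=2.0, min_frames=5):
    """tracks: {object_id: [(x, y), ...]} per-frame centroids.
    An object counts as stationary if its last min_frames positions
    all stay within max_move of the first of those positions."""
    stationary = []
    for oid, pts in tracks.items():
        if len(pts) < min_frames:
            continue
        x0, y0 = pts[-min_frames]
        if all((x - x0) ** 2 + (y - y0) ** 2 <= max_move ** 2
               for x, y in pts[-min_frames:]):
            stationary.append(oid)
    return stationary
```

Event modules like "abandoned bag" then fire on objects that remain in this list long enough.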

Patent
03 Sep 2007
TL;DR: In this paper, a video object segmentation method takes advantage of edge and color features in conjunction with edge detection and change detection to improve the accuracy of video segmentation for rainy situations.
Abstract: A video object segmentation method takes advantage of edge and color features, in conjunction with edge detection and change detection, to improve the accuracy of video object segmentation in rainy conditions. The video object segmentation method of the present invention includes analyzing HSI color information of the initially extracted objects to obtain features of the moving object; performing edge detection to obtain edges of the moving object, reducing confusion between raindrops and moving objects against a rainy dynamic background; performing object region detection to generate an accurate object mask, solving the uncovered-background and still-object problems; and employing a bounding-box-based matching method to handle reflections of the moving object on the rain-soaked ground.

Journal ArticleDOI
TL;DR: New features based on anisotropic Gaussian filters for detecting frontal faces in complex images for face detection in face recognition or facial expression analysis are proposed.

Book ChapterDOI
27 Aug 2007
TL;DR: This work proposes to overcome the pose problem by automatically reconstructing a 3D face model from multiple non-frontal frames in a video, generating a frontal view from the derived 3D model, and using a commercial 2D face recognition engine to recognize the synthesized frontal view.
Abstract: Face recognition in video has gained wide attention due to its role in designing surveillance systems. One of the main advantages of video over still frames is that evidence accumulation over multiple frames can provide better face recognition performance. However, surveillance videos are generally of low resolution, containing faces mostly in non-frontal poses. Consequently, face recognition in video poses serious challenges to state-of-the-art face recognition systems. Use of 3D face models has been suggested as a way to compensate for low resolution, poor contrast, and non-frontal pose. We propose to overcome the pose problem by automatically (i) reconstructing a 3D face model from multiple non-frontal frames in a video, (ii) generating a frontal view from the derived 3D model, and (iii) using a commercial 2D face recognition engine to recognize the synthesized frontal view. A factorization-based structure-from-motion algorithm is used for 3D face reconstruction. The proposed scheme has been tested on CMU's Face In Action (FIA) video database with 221 subjects. Experimental results show a 40% improvement in matching performance as a result of using the 3D models.