
Showing papers on "Object-class detection published in 2006"


Proceedings ArticleDOI
17 Jun 2006
TL;DR: This paper investigates the application of the SIFT approach in the context of face authentication, and proposes and tests different matching schemes using the BANCA database and protocol, showing promising results.
Abstract: Several pattern recognition and classification techniques have been applied to the biometrics domain. Among them, an interesting technique is the Scale Invariant Feature Transform (SIFT), originally devised for object recognition. Although SIFT features have emerged as very powerful image descriptors, their use in the face analysis context has never been systematically investigated. This paper investigates the application of the SIFT approach in the context of face authentication. In order to determine the real potential and applicability of the method, different matching schemes are proposed and tested using the BANCA database and protocol, showing promising results.

386 citations
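The paper's matching schemes are not given as code, but a minimal sketch of SIFT-based matching between an enrolled face image and a probe image, using OpenCV's SIFT implementation and Lowe's ratio test, might look as follows. The file names and the acceptance threshold are illustrative assumptions, not values from the paper, and SIFT availability depends on the OpenCV build.

```python
# Hypothetical sketch: SIFT keypoint matching between two face images.
# File names and the acceptance threshold are illustrative assumptions.
import cv2

def count_sift_matches(path_a, path_b, ratio=0.75):
    """Count ratio-test-filtered SIFT matches between two face images."""
    img_a = cv2.imread(path_a, cv2.IMREAD_GRAYSCALE)
    img_b = cv2.imread(path_b, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    _, des_a = sift.detectAndCompute(img_a, None)
    _, des_b = sift.detectAndCompute(img_b, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    pairs = [p for p in matcher.knnMatch(des_a, des_b, k=2) if len(p) == 2]
    # Lowe's ratio test: keep a match only if it is clearly better than the runner-up.
    good = [m for m, n in pairs if m.distance < ratio * n.distance]
    return len(good)

if __name__ == "__main__":
    score = count_sift_matches("enrolled_face.png", "probe_face.png")  # illustrative paths
    print("accept" if score > 20 else "reject")  # threshold chosen arbitrarily
```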


Book ChapterDOI
07 May 2006
TL;DR: The BFM detector is able to represent and detect object classes principally defined by their shape, rather than their appearance, and to achieve this with less supervision (such as the number of training images).
Abstract: The objective of this work is the detection of object classes, such as airplanes or horses. Instead of using a model based on salient image fragments, we show that object class detection is also possible using only the object's boundary. To this end, we develop a novel learning technique to extract class-discriminative boundary fragments. In addition to their shape, these “codebook” entries also determine the object's centroid (in the manner of Leibe et al. [19]). Boosting is used to select discriminative combinations of boundary fragments (weak detectors) to form a strong “Boundary-Fragment-Model” (BFM) detector. The generative aspect of the model is used to determine an approximate segmentation. We demonstrate the following results: (i) the BFM detector is able to represent and detect object classes principally defined by their shape, rather than their appearance; and (ii) in comparison with other published results on several object classes (airplanes, cars-rear, cows) the BFM detector is able to exceed previous performances, and to achieve this with less supervision (such as the number of training images).

376 citations
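The centroid-voting role of the codebook entries can be illustrated with a small Hough-style sketch: each matched boundary fragment casts a vote for the object centroid it predicts, and votes are accumulated to form a detection hypothesis. This is a simplified illustration only, not the authors' boosted Boundary-Fragment-Model detector, and the fragment data are made up.

```python
# Simplified illustration of centroid voting by matched boundary fragments
# (not the full boosted BFM detector; the match data are synthetic).
import numpy as np

def vote_for_centroid(matches, image_shape, cell=8):
    """Each match is (x, y, dx, dy): fragment location plus the centroid
    offset stored with its codebook entry."""
    h, w = image_shape
    acc = np.zeros((h // cell + 1, w // cell + 1))
    for x, y, dx, dy in matches:
        cx, cy = x + dx, y + dy
        if 0 <= cx < w and 0 <= cy < h:
            acc[int(cy) // cell, int(cx) // cell] += 1
    iy, ix = np.unravel_index(np.argmax(acc), acc.shape)
    return (ix * cell, iy * cell), acc.max()

matches = [(40, 60, 25, 10), (90, 65, -25, 5), (60, 30, 5, 40), (100, 100, -60, -80)]
centroid, votes = vote_for_centroid(matches, (200, 200))
print(centroid, votes)
```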


Journal ArticleDOI
TL;DR: The performance of a detection algorithm is illustrated intuitively by performance graphs which present object level precision and recall depending on constraints on detection quality, and a representative single performance value is computed from the graphs.
Abstract: Evaluation of object detection algorithms is a non-trivial task: a detection result is usually evaluated by comparing the bounding box of the detected object with the bounding box of the ground truth object. The commonly used precision and recall measures are computed from the overlap area of these two rectangles. However, these measures have several drawbacks: they don't give intuitive information about the proportion of the correctly detected objects and the number of false alarms, and they cannot be accumulated across multiple images without creating ambiguity in their interpretation. Furthermore, quantitative and qualitative evaluation is often mixed resulting in ambiguous measures. In this paper we propose a new approach which tackles these problems. The performance of a detection algorithm is illustrated intuitively by performance graphs which present object level precision and recall depending on constraints on detection quality. In order to compare different detection algorithms, a representative single performance value is computed from the graphs. The influence of the test database on the detection performance is illustrated by performance/generality graphs. The evaluation method can be applied to different types of object detection algorithms. It has been tested on different text detection algorithms, among which are the participants of the ICDAR 2003 text detection competition.

353 citations
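The overlap-based matching that the commonly used measures rely on can be sketched as follows: detections are matched to ground-truth boxes by area overlap, and precision/recall are computed from the match counts. This is a generic sketch of the conventional scheme the paper criticises, not the authors' proposed object-level measures; the overlap threshold is a typical value chosen for illustration.

```python
# Generic sketch of overlap-based detection evaluation (the conventional scheme),
# not the object-level measures proposed in the paper. Boxes are (x1, y1, x2, y2).
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / float(area(a) + area(b) - inter) if inter else 0.0

def precision_recall(detections, ground_truth, thresh=0.5):
    matched, tp = set(), 0
    for d in detections:
        best = max(range(len(ground_truth)),
                   key=lambda i: iou(d, ground_truth[i]), default=None)
        if best is not None and best not in matched and iou(d, ground_truth[best]) >= thresh:
            matched.add(best)
            tp += 1
    precision = tp / len(detections) if detections else 1.0
    recall = tp / len(ground_truth) if ground_truth else 1.0
    return precision, recall

print(precision_recall([(10, 10, 50, 50), (70, 70, 90, 90)], [(12, 8, 48, 52)]))
```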


Dissertation
17 Jul 2006
TL;DR: This thesis introduces grids of locally normalised Histograms of Oriented Gradients (HOG) as descriptors for object detection in static images and proposes descriptors based on oriented histograms of differential optical flow to detect moving humans in videos.
Abstract: This thesis targets the detection of humans and other object classes in images and videos. Our focus is on developing robust feature extraction algorithms that encode image regions as high-dimensional feature vectors that support high accuracy object/non-object decisions. To test our feature sets we adopt a relatively simple learning framework that uses linear Support Vector Machines to classify each possible image region as an object or as a non-object. The approach is data-driven and purely bottom-up, using low-level appearance and motion vectors to detect objects. As a test case we focus on person detection, as people are one of the most challenging object classes with many applications, for example in film and video analysis, pedestrian detection for smart cars and video surveillance. Nevertheless we do not make any strong class-specific assumptions, and the resulting object detection framework also gives state-of-the-art performance for many other classes including cars, motorbikes, cows and sheep. This thesis makes four main contributions. Firstly, we introduce grids of locally normalised Histograms of Oriented Gradients (HOG) as descriptors for object detection in static images. The HOG descriptors are computed over dense and overlapping grids of spatial blocks, with image gradient orientation features extracted at fixed resolution and gathered into a high-dimensional feature vector. They are designed to be robust to small changes in image contour locations and directions, and significant changes in image illumination and colour, while remaining highly discriminative for overall visual form. We show that unsmoothed gradients, fine orientation voting, moderately coarse spatial binning, strong normalisation and overlapping blocks are all needed for good performance. Secondly, to detect moving humans in videos, we propose descriptors based on oriented histograms of differential optical flow. These are similar to static HOG descriptors, but instead of image gradients, they are based on local differentials of dense optical flow. They encode the noisy optical flow estimates into robust feature vectors in a manner that is robust to the overall camera motion. Several variants are proposed, some capturing motion boundaries while others encode the relative motions of adjacent image regions. Thirdly, we propose a general method based on kernel density estimation for fusing multiple overlapping detections, which takes into account the number of detections, their confidence scores and the scales of the detections. Lastly, we present work in progress on a parts-based approach to person detection that first detects local body parts like heads, torsos and legs and then fuses them to create a global person detector.

340 citations
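A rough idea of how a HOG descriptor for a detection window is assembled (gradient orientation voting per cell, block-level normalisation, concatenation into one feature vector) can be sketched with scikit-image, whose hog function follows the same general recipe. The parameter values below are typical choices for illustration, not necessarily the thesis's exact settings.

```python
# Illustrative HOG feature extraction for a 64x128 detection window using scikit-image;
# parameters are typical values, not necessarily the thesis's exact settings.
import numpy as np
from skimage.feature import hog

window = np.random.rand(128, 64)  # stand-in for a grayscale person-sized window
features = hog(
    window,
    orientations=9,           # fine orientation voting
    pixels_per_cell=(8, 8),   # moderately coarse spatial binning
    cells_per_block=(2, 2),   # overlapping blocks
    block_norm="L2-Hys",      # strong normalisation
)
print(features.shape)  # one high-dimensional vector per window, fed to a linear SVM
```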


Proceedings ArticleDOI
17 Jun 2006
TL;DR: This paper addresses the problem of detecting and segmenting partially occluded objects of a known category by defining a part labelling which densely covers the object and imposing asymmetric local spatial constraints on these labels to ensure the consistent layout of parts whilst allowing for object deformation.
Abstract: This paper addresses the problem of detecting and segmenting partially occluded objects of a known category. We first define a part labelling which densely covers the object. Our Layout Consistent Random Field (LayoutCRF) model then imposes asymmetric local spatial constraints on these labels to ensure the consistent layout of parts whilst allowing for object deformation. Arbitrary occlusions of the object are handled by avoiding the assumption that the whole object is visible. The resulting system is both efficient to train and to apply to novel images, due to a novel annealed layout-consistent expansion move algorithm paired with a randomised decision tree classifier. We apply our technique to images of cars and faces and demonstrate state-of-the-art detection and segmentation performance even in the presence of partial occlusion.

318 citations
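The layout-consistency idea, that neighbouring pixels should carry part labels compatible with the object's part grid, can be illustrated with a toy check on a dense part-label map. This is only an informal illustration of the constraint with made-up labels and a hypothetical 4-wide part grid; it is not the LayoutCRF model or its annealed expansion-move inference.

```python
# Toy illustration of a layout-consistency constraint on dense part labels:
# a pixel's right-hand neighbour should carry the same part label or the part
# immediately to its right on the object's part grid (background = -1 is always allowed).
# Informal illustration only, not the LayoutCRF model or its inference.
import numpy as np

GRID_W = 4  # parts indexed row-major on a hypothetical 4-wide part grid

def horizontally_consistent(left, right):
    if left == -1 or right == -1:
        return True
    same_row = left // GRID_W == right // GRID_W
    return right in (left, left + 1) and same_row

labels = np.array([[0, 0, 1, 2],
                   [4, 5, 5, 6],
                   [4, 5, 7, 6]])   # the 5->7 and 7->6 transitions violate the layout

violations = [(r, c) for r in range(labels.shape[0])
              for c in range(labels.shape[1] - 1)
              if not horizontally_consistent(labels[r, c], labels[r, c + 1])]
print(violations)
```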


Book ChapterDOI
07 May 2006
TL;DR: An extensive experimental evaluation on detecting five diverse object classes over hundreds of images demonstrates that the proposed method works in very cluttered images, allows for scale changes and considerable intra-class shape variation, is robust to interrupted contours, and is computationally efficient.
Abstract: We propose a method for object detection in cluttered real images, given a single hand-drawn example as model. The image edges are partitioned into contour segments and organized in an image representation which encodes their interconnections: the Contour Segment Network. The object detection problem is formulated as finding paths through the network resembling the model outlines, and a computationally efficient detection technique is presented. An extensive experimental evaluation on detecting five diverse object classes over hundreds of images demonstrates that our method works in very cluttered images, allows for scale changes and considerable intra-class shape variation, is robust to interrupted contours, and is computationally efficient.

317 citations


Proceedings ArticleDOI
17 Jun 2006
TL;DR: The Implicit Shape Model for object class detection is combined with the multi-view specific object recognition system of Ferrari et al. to detect object instances from arbitrary viewpoints.
Abstract: We present a novel system for generic object class detection. In contrast to most existing systems which focus on a single viewpoint or aspect, our approach can detect object instances from arbitrary viewpoints. This is achieved by combining the Implicit Shape Model for object class detection proposed by Leibe and Schiele with the multi-view specific object recognition system of Ferrari et al. After learning single-view codebooks, these are interconnected by so-called activation links, obtained through multi-view region tracks across different training views of individual object instances. During recognition, these integrated codebooks work together to determine the location and pose of the object. Experimental results demonstrate the viability of the approach and compare it to a bank of independent single-view detectors.

268 citations


Proceedings ArticleDOI
17 Jun 2006
TL;DR: The performance of the proposed multi-object class detection approach is competitive with state-of-the-art approaches dedicated to a single object class recognition problem.
Abstract: In this paper we propose an approach capable of simultaneous recognition and localization of multiple object classes using a generative model. A novel hierarchical representation allows us to represent individual images as well as various object classes in a single, scale and rotation invariant model. The recognition method is based on a codebook representation where appearance clusters built from edge based features are shared among several object classes. A probabilistic model allows for reliable detection of various objects in the same image. The approach is highly efficient due to fast clustering and matching methods capable of dealing with millions of high dimensional features. The system shows excellent performance on several object categories over a wide range of scales, in-plane rotations, background clutter, and partial occlusions. The performance of the proposed multi-object class detection approach is competitive with state-of-the-art approaches dedicated to a single object class recognition problem.

266 citations
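The codebook construction step, clustering local appearance features so that clusters (codewords) can be shared among several object classes, can be sketched with standard k-means clustering. The paper uses its own fast clustering of edge-based features, so the sketch below is only a schematic stand-in with made-up data.

```python
# Schematic stand-in for codebook construction by clustering local features;
# the paper uses its own fast clustering of edge-based features, not k-means on random data.
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)
features = rng.normal(size=(10000, 64))            # local descriptors from all training classes
class_of_feature = rng.integers(0, 3, size=10000)  # which object class each feature came from

codebook = MiniBatchKMeans(n_clusters=256, random_state=0).fit(features)
assignments = codebook.predict(features)

# A cluster (appearance codeword) is "shared" if features from several classes fall into it.
shared = sum(len(set(class_of_feature[assignments == k])) > 1 for k in range(256))
print(f"{shared} of 256 codewords are shared among classes")
```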


Proceedings ArticleDOI
17 Jun 2006
TL;DR: This work presents an approach based on representing humans as an assembly of four body parts and on detecting the body parts in single frames, which makes the method insensitive to camera motion.
Abstract: Tracking of humans in videos is important for many applications. A major source of difficulty in this task is inter-human or scene occlusion. We present an approach based on representing humans as an assembly of four body parts and on detecting the body parts in single frames, which makes the method insensitive to camera motion. The responses of the body part detectors and a combined human detector provide the "observations" used for tracking. Trajectory initialization and termination are both fully automatic and rely on the confidences computed from the detection responses. An object is tracked by data association if its corresponding detection response can be found; otherwise it is tracked by a mean-shift-style tracker. Our method can track humans with both inter-object and scene occlusions. The system is evaluated on three sets of videos and compared with a previous method.

263 citations
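The tracking logic described, associate each track with a detection response when one can be found and otherwise fall back to a mean-shift-style tracker, can be sketched as a simple greedy association loop. The overlap threshold and the mean_shift_track placeholder are assumptions for illustration, not the authors' actual implementation.

```python
# Simplified sketch of tracking-by-detection with a fallback tracker.
# The overlap threshold and mean_shift_track placeholder are illustrative assumptions.
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def mean_shift_track(track_box, frame):
    # Placeholder: a real system would run a mean-shift/appearance tracker here.
    return track_box

def update_tracks(tracks, detections, frame, min_overlap=0.3):
    """tracks: {track_id: box}; detections: list of boxes (x1, y1, x2, y2)."""
    unused = list(detections)
    for tid, box in tracks.items():
        best = max(unused, key=lambda d: iou(box, d), default=None)
        if best is not None and iou(box, best) >= min_overlap:
            tracks[tid] = best                           # tracked by data association
            unused.remove(best)
        else:
            tracks[tid] = mean_shift_track(box, frame)   # fallback tracker
    return tracks, unused  # unused detections can initialise new trajectories

print(update_tracks({1: (10, 10, 50, 100)}, [(12, 14, 52, 104), (200, 20, 240, 110)], frame=None))
```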


Journal ArticleDOI
TL;DR: This work presents an innovative method that combines a feature-based approach with a holistic one for three-dimensional (3D) face detection, which has been tested, with good results, on some 150 3D faces acquired by a laser range scanner.

246 citations


Journal ArticleDOI
TL;DR: A circle detection method based on genetic algorithms that encodes three edge points as the chromosome of a candidate circle in the edge image of the scene, detecting circles with sub-pixel accuracy on both synthetic and natural images.
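The core encoding can be illustrated without the full genetic algorithm: a candidate chromosome is three edge points, the circle through them is computed in closed form, and fitness can be taken as the fraction of points sampled on that circle that fall on edge pixels. The sketch below shows only these two ingredients on synthetic data; the genetic search itself is omitted.

```python
# Circle through three points plus an edge-coverage fitness, the two ingredients of the
# GA encoding described above; the genetic search is omitted and the data are synthetic.
import numpy as np

def circle_from_points(p1, p2, p3):
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-9:
        return None  # collinear points define no circle
    ux = ((x1**2 + y1**2) * (y2 - y3) + (x2**2 + y2**2) * (y3 - y1) + (x3**2 + y3**2) * (y1 - y2)) / d
    uy = ((x1**2 + y1**2) * (x3 - x2) + (x2**2 + y2**2) * (x1 - x3) + (x3**2 + y3**2) * (x2 - x1)) / d
    return ux, uy, np.hypot(x1 - ux, y1 - uy)

def fitness(circle, edge_map, samples=100):
    cx, cy, r = circle
    t = np.linspace(0, 2 * np.pi, samples, endpoint=False)
    xs = np.round(cx + r * np.cos(t)).astype(int)
    ys = np.round(cy + r * np.sin(t)).astype(int)
    h, w = edge_map.shape
    ok = (xs >= 0) & (xs < w) & (ys >= 0) & (ys < h)
    return edge_map[ys[ok], xs[ok]].mean() if ok.any() else 0.0

edges = np.zeros((100, 100))
t = np.linspace(0, 2 * np.pi, 400)
edges[np.round(50 + 20 * np.sin(t)).astype(int), np.round(50 + 20 * np.cos(t)).astype(int)] = 1.0
c = circle_from_points((70, 50), (30, 50), (50, 70))
print(c, fitness(c, edges))
```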

Journal ArticleDOI
TL;DR: It is shown experimentally that smoothing the face trajectories leads to a significant reduction of false detections compared to the static detector without the presented tracking extension, which improves both the speed and the accuracy of the system.

Proceedings ArticleDOI
20 Aug 2006
TL;DR: This paper proposes a license plate detection algorithm using both global statistical features and local Haar-like features, which makes the final classifier invariant to the brightness, color, size and position of license plates.
Abstract: This paper proposes a license plate detection algorithm using both global statistical features and local Haar-like features. Classifiers using global statistical features are first constructed through simple learning procedures. Using these classifiers, more than 70% of the background area can be excluded from further training or detection. The AdaBoost learning algorithm is then used to build the remaining classifiers based on selected local Haar-like features. Combining the classifiers using the global features and the local features, we obtain a cascade classifier. The classifiers based on global features decrease the complexity of the system. They are followed by the classifiers based on local Haar-like features, which make the final classifier invariant to the brightness, color, size and position of license plates. An encouraging detection rate is achieved in the experiments.
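The two-stage structure, cheap classifiers on global statistical features that reject most background windows followed by a boosted classifier over local Haar-like features, can be sketched as a simple cascade. The feature functions and thresholds below are crude placeholders for illustration, not the paper's actual features or trained classifiers.

```python
# Sketch of a two-stage cascade: cheap global-feature stages reject most background
# windows before a Haar-like-feature stage runs. Features here are crude placeholders.
import numpy as np

def global_stats(window):
    # e.g. horizontal-gradient mean and variance over the whole window
    gx = np.abs(np.diff(window, axis=1))
    return np.array([gx.mean(), gx.var()])

def haar_like(window):
    # a single two-rectangle Haar-like feature: left half minus right half
    h, w = window.shape
    return window[:, : w // 2].mean() - window[:, w // 2 :].mean()

def cascade_classify(window, global_thresholds=(0.05, 0.01), haar_threshold=0.0):
    stats = global_stats(window)
    if stats[0] < global_thresholds[0] or stats[1] < global_thresholds[1]:
        return False  # rejected cheaply by the global-feature stage
    # In the paper this stage is a boosted combination of many Haar-like features;
    # here a single hand-set threshold stands in for it.
    return haar_like(window) > haar_threshold

window = np.random.rand(20, 60)  # stand-in for a candidate plate region
print(cascade_classify(window))
```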


Patent
Setsuo Tokoro1, Jun Tsuchida1
31 Oct 2006
TL;DR: In this article, an object detection device, including an imaging unit mounted on a movable body, calculates an image displacement of a partial image between two images captured by the imaging unit at different times, and performs detection processing to detect an object in an image based on at least the image displacement.
Abstract: An object detection device, including: an imaging unit (400) that is mounted on a movable body; an object detection unit (201) that calculates an image displacement of a partial image between two images captured by the imaging unit (400) at different times, and performs detection processing to detect an object in an image based on at least the image displacement; and a control unit (201) that changes a manner of performing the detection processing based on a position in the image in a lateral direction of the movable body.

Proceedings Article
01 Jan 2006
TL;DR: This paper presents a practical and scalable method to efficiently detect many adult-content images, specifically pornographic images, in a search engine that covers a large fraction of the images on the WWW.
Abstract: As more people start using the Internet and more content is placed online, the chances that individuals will encounter inappropriate or unwanted adult-oriented content increase. This paper presents a practical and scalable method to efficiently detect many adult-content images, specifically pornographic images. We currently use this system in a search engine that covers a large fraction of the images on the WWW. For each image, face detection is applied and a number of summary features are computed; the results are then fed to a support vector machine for classification. The results show that a significant fraction of adult-content images can be detected.
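The per-image pipeline, compute a handful of summary features (including face-detection output) and feed them to a support vector machine, is easy to sketch with scikit-learn. The specific features, labels and data below are fabricated stand-ins, not the paper's.

```python
# Sketch of the summary-features-to-SVM pipeline with fabricated stand-in features
# (e.g. skin-pixel fraction, number of detected faces, largest-face area fraction).
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.random((500, 3))              # one row of summary features per image
y = (X[:, 0] > 0.6).astype(int)       # toy labels: 1 = adult content, 0 = benign

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X[:400], y[:400])
print("held-out accuracy:", clf.score(X[400:], y[400:]))
```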

Patent
22 Nov 2006
TL;DR: In this paper, a face detection section detects a face area inside a shooting screen based on the moving image data, and a controlling section adjusts shooting parameters of the shooting optical system depending on the position of the detected face area.
Abstract: An image sensor of an electronic camera photoelectrically converts a subject image obtained by a shooting optical system to generate an image signal. An image processing section generates face registration image data and moving image data. A face detecting section detects a face area inside a shooting screen based on the moving image data. A controlling section adjusts shooting parameters of the shooting optical system depending on the position of the detected face area. A face image generating section cuts out an image of the face area to generate face image data. A face recognizing data generating section extracts feature points of the face of a captured person from a part of the face area of the face registration image data and generates face recognizing data. A recording section records the face recognizing data or face image data.

Journal ArticleDOI
TL;DR: An improved face region extraction algorithm and a light-dot detection algorithm are proposed for better eye detection performance.

Journal ArticleDOI
TL;DR: This paper presents a novel face detection method applying discriminating feature analysis (DFA) and a support vector machine (SVM), which achieves a 98.2% correct face detection rate with two false detections.

Journal ArticleDOI
TL;DR: This survey presents a brief analysis of single camera object detection and tracking methods and gives a comparison of their computational complexities.
Abstract: In this survey, we present a brief analysis of single camera object detection and tracking methods. We also give a comparison of their computational complexities. These methods are designed to accurately perform under difficult conditions such as erratic motion, drastic illumination change, and noise contamination.

Patent
Yunqian Ma1, Qian Yu1, Isaac Cohen1
21 Nov 2006
TL;DR: In this article, a detection system fusing motion detection and object detection for various applications such as tracking, identification, and so forth is proposed, where model information is developed and used to reduce the false alarm rate.
Abstract: A detection system fusing motion detection and object detection for various applications such as tracking, identification, and so forth. In an application framework, model information may be developed and used to reduce the false alarm rate. With a background model, motion likelihood for each pixel, for instance, of a surveillance image, may be acquired. With a target model, object likelihood for each pixel of the image may also be acquired. By joining these two likelihood distributions, detection accuracy may be significantly improved over the use of just one likelihood distribution for detection in applications such as tracking.
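The fusion idea, combine a per-pixel motion likelihood from a background model with a per-pixel object likelihood from a target model, can be illustrated with a toy joint-likelihood map. Multiplying and thresholding the two maps is one plausible reading of "joining these two likelihood distributions", not necessarily the patent's exact formulation, and the likelihood maps here are synthetic.

```python
# Toy per-pixel fusion of a motion likelihood and an object likelihood.
# Multiplying and thresholding is one plausible reading of the fusion step,
# not necessarily the patent's exact formulation; the maps are synthetic.
import numpy as np

rng = np.random.default_rng(1)
motion_likelihood = rng.random((120, 160))   # from a background model
object_likelihood = rng.random((120, 160))   # from a target/appearance model

joint = motion_likelihood * object_likelihood
detections = joint > 0.8                     # arbitrary threshold for the sketch

print("pixels flagged by motion alone :", int((motion_likelihood > 0.8).sum()))
print("pixels flagged by joint fusion :", int(detections.sum()))
```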

Proceedings ArticleDOI
17 Jun 2006
TL;DR: This paper proposes a novel nonparametric method that first extracts discriminating local models by clustering face images according to a distance measure on the appearance manifold, and shows that the method significantly outperforms other methods.
Abstract: Recently, there has been a flurry of research on face recognition based on multiple images or shots from either a video sequence or an image set. This paper is also such an attempt in multiple-shot face recognition. Specifically, we propose a novel nonparametric method that first extracts discriminating local models via clustering. We apply a hierarchical distance-based clustering procedure according to some distance measure on the appearance manifold to cluster similar face images together. Based on the local models extracted, we then construct the intrapersonal and extrapersonal subspaces. Given a new test image, the angle between the projections of the image onto the two subspaces is used as a distance measure for classification. Since a test example contains multiple face images in multiple-shot face recognition, the final classification combines the classification decisions of all individual test images via a majority voting scheme. We compare our method empirically with some previous methods based on a database of video sequences of human faces, showing that our method significantly outperforms other methods.
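The final decision stage, classify each test image by comparing it against the intrapersonal and extrapersonal subspaces and then take a majority vote over all images of the test example, can be sketched as follows. The subspaces here are random stand-ins and the angle-based rule is a simplified reading of the paper's distance measure.

```python
# Sketch of per-image subspace-angle classification followed by majority voting.
# The subspaces are random stand-ins; the rule is a simplified reading of the paper.
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)
d, k = 64, 5
intra_basis = np.linalg.qr(rng.normal(size=(d, k)))[0]   # intrapersonal subspace (stand-in)
extra_basis = np.linalg.qr(rng.normal(size=(d, k)))[0]   # extrapersonal subspace (stand-in)

def angle_to_subspace(x, basis):
    proj = basis @ (basis.T @ x)
    cos = np.linalg.norm(proj) / (np.linalg.norm(x) + 1e-12)
    return np.arccos(np.clip(cos, 0.0, 1.0))

def classify_image(x):
    # "same person" if the vector lies closer to the intrapersonal subspace
    return "same" if angle_to_subspace(x, intra_basis) < angle_to_subspace(x, extra_basis) else "different"

test_images = [rng.normal(size=d) for _ in range(7)]     # multiple shots of one probe
votes = Counter(classify_image(x) for x in test_images)
print(votes.most_common(1)[0][0], dict(votes))
```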

Proceedings ArticleDOI
02 Mar 2006
TL;DR: Experimental results show that the proposed architecture can detect faces with the same accuracy as the software implementation, on real-time video at a frame rate of 52 frames per second.
Abstract: Face detection is a very important application in the field of machine vision. In this paper, we present a scalable parallel architecture which performs face detection using the AdaBoost algorithm. Experimental results show that the proposed architecture can detect faces with the same accuracy as the software implementation, on real-time video at a frame rate of 52 frames per second.

Proceedings ArticleDOI
07 Jun 2006
TL;DR: By studying face geometry, this work is able to determine which type of facial expression has been carried out, thus building an expression classifier which is capable of recognizing faces with different expressions.
Abstract: Face recognition is one of the most intensively studied topics in computer vision and pattern recognition. Facial expression, which changes face geometry, usually has an adverse effect on the performance of a face recognition system. On the other hand, face geometry is a useful cue for recognition. Taking these into account, we utilize the idea of separating geometry and texture information in a face image and model the two types of information by projecting them into separate PCA spaces which are specially designed to capture the distinctive features among different individuals. Subsequently, the texture and geometry attributes are re-combined to form a classifier which is capable of recognizing faces with different expressions. Finally, by studying face geometry, we are able to determine which type of facial expression has been carried out, thus building an expression classifier. Numerical validations of the proposed method are given.
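A schematic of the split into separate geometry and texture PCA spaces, with the two cues recombined into one recognition decision, might look like this. The vectors are random stand-ins and the equal weighting of the two distances is an assumption, not the paper's tuned classifier.

```python
# Schematic of separate PCA spaces for geometry and texture, recombined for recognition.
# Data are random stand-ins and the equal weighting of distances is an assumption.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_train = 50
geometry = rng.normal(size=(n_train, 40))    # e.g. stacked landmark coordinates
texture = rng.normal(size=(n_train, 200))    # e.g. shape-normalised pixel intensities

pca_geom = PCA(n_components=10).fit(geometry)
pca_text = PCA(n_components=20).fit(texture)
gallery_g = pca_geom.transform(geometry)
gallery_t = pca_text.transform(texture)

def recognise(g_vec, t_vec):
    g = pca_geom.transform(g_vec[None])[0]
    t = pca_text.transform(t_vec[None])[0]
    dist = (np.linalg.norm(gallery_g - g, axis=1)
            + np.linalg.norm(gallery_t - t, axis=1))   # recombine the two cues
    return int(np.argmin(dist))                        # index of the best-matching identity

print(recognise(geometry[3] + 0.01 * rng.normal(size=40),
                texture[3] + 0.01 * rng.normal(size=200)))
```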

Book ChapterDOI
Stan Z. Li1, Rufeng Chu1, Meng Ao1, Lun Zhang1, Ran He1 
05 Jan 2006
TL;DR: In this article, a real-time face recognition system for cooperative user applications is presented; it is based on a local feature representation, and statistical learning is applied to learn the most effective features and classifiers for building the face detection and recognition engines.
Abstract: In this paper, we present a highly accurate, real-time face recognition system for cooperative user applications. The novelties are: (1) a novel design of camera hardware, and (2) a learning based procedure for effective face and eye detection and recognition with the resulting imagery. The hardware minimizes environmental lighting and delivers face images with frontal lighting. This avoids many problems in subsequent face processing to a great extent. The face detection and recognition algorithms are based on a local feature representation. Statistical learning is applied to learn the most effective features and classifiers for building the face detection and recognition engines. The novel imaging system and the detection and recognition engines are integrated into a powerful face recognition system. Evaluated in a real-world user scenario, a condition that is harder than a technology evaluation such as the Face Recognition Vendor Tests (FRVT), the system has demonstrated excellent accuracy, speed and usability.


Proceedings ArticleDOI
05 Jul 2006
TL;DR: The face detection system presented in this paper is a hybrid of known algorithms: a skin detection algorithm first specifies all skin locations in the image, facial features are then extracted, and a verification step is applied to ensure that the extracted features are facial features.
Abstract: Human face detection is concerned with finding the location and size of every human face in a given image. Face detection plays a very important role in the field of human-computer interaction. It represents the first step in fully automatic face recognition, facial feature detection, and expression recognition. There are many techniques used in face detection, each with its advantages and disadvantages. The face detection system presented in this paper is a hybrid of known algorithms. The first stage of the proposed method applies a skin detection algorithm to specify all skin locations in the image. Second, face features like eyes, mouth and nose are extracted. Finally, a verification step is applied to ensure that the extracted features are facial features. In experiments on images having upright frontal faces with any background, our system has achieved high detection rates and low false positives.
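The first stage, a pixel-wise skin-colour test that restricts the search to skin regions, can be sketched with a commonly used YCbCr rule. The Cb/Cr thresholds below are a frequently quoted heuristic and the input file name is illustrative; they are not necessarily the values used in this paper.

```python
# Pixel-wise skin detection in YCbCr as a first stage of face detection.
# The Cb/Cr thresholds are a commonly quoted heuristic, not necessarily this paper's values.
import numpy as np
import cv2

def skin_mask(bgr_image):
    ycrcb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2YCrCb)
    cr, cb = ycrcb[:, :, 1], ycrcb[:, :, 2]
    mask = (cr >= 133) & (cr <= 173) & (cb >= 77) & (cb <= 127)
    return mask.astype(np.uint8) * 255   # candidate skin regions for later feature checks

if __name__ == "__main__":
    img = cv2.imread("input.jpg")        # illustrative file name
    if img is not None:
        cv2.imwrite("skin_mask.png", skin_mask(img))
```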

Proceedings ArticleDOI
20 Aug 2006
TL;DR: A novel and robust eye location algorithm is proposed, based on a low level, context free generalized symmetry transform, that can be used to improve detection results.
Abstract: A novel and robust eye location algorithm is proposed in this paper. The algorithm is based on a low level, context free generalized symmetry transform. Once the regions of interest are detected, characteristics of eyes can be used to improve detection results. The algorithm is tested using 1460 face images of the BioID database and 2730 images of the BANCA database. A fully automatic face verification system has also been developed using this eye location algorithm. The system was one of the top performers in the 2004 International Face Verification Competition.

Proceedings ArticleDOI
09 Jul 2006
TL;DR: A fast face detection algorithm that is effective on facial variations such as dark/bright illumination, closed eyes, an open mouth, a half-profile face, and pseudo faces, and can also discriminate correctly between cartoon and human faces.
Abstract: Human face detection plays an important role in many applications such as video surveillance, face recognition, and face image database management. This paper describes a fast face detection algorithm with accurate results. We use lighting compensation to improve the performance of the color-based scheme, and reduce the computational complexity of the feature-based scheme. Our method is effective on facial variations such as dark/bright illumination, closed eyes, an open mouth, a half-profile face, and pseudo faces. It is worth stressing that our algorithm can also discriminate correctly between cartoon and human faces. The experimental results show that our approach can process a frame in 111 ms with a 92.3% detection rate.

Journal Article
TL;DR: An up-to-date survey on the history and state-of-the-art face recognition research is presented, systematically classifying face recognition methods into several categories, and expatiates on the evolution of the recent algorithms used to deal with the illumination variation problem and the pose variation problem.
Abstract: Due to various applications in the areas of pattern recognition, image processing, computer vision, cognitive science, etc., face recognition has drawn much attention in recent years. This paper presents an up-to-date survey on the history and state-of-the-art face recognition research, systematically classifying face recognition methods into several categories. Furthermore, this paper expatiates on the evolution of the recent algorithms which are used to deal with the illumination variation problem and the pose variation problem. In addition, several major issues for further exploration are also pointed out at the end of this paper.