
Showing papers on "Object-class detection published in 2004"


Journal ArticleDOI
TL;DR: In this paper, a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates is described. Running on a conventional desktop, the detector processes images at 15 frames per second.
Abstract: This paper describes a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates. There are three key contributions. The first is the introduction of a new image representation called the “Integral Image” which allows the features used by our detector to be computed very quickly. The second is a simple and efficient classifier which is built using the AdaBoost learning algorithm (Freund and Schapire, 1995) to select a small number of critical visual features from a very large set of potential features. The third contribution is a method for combining classifiers in a “cascade” which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions. A set of experiments in the domain of face detection is presented. The system yields face detection performance comparable to the best previous systems (Sung and Poggio, 1998; Rowley et al., 1998; Schneiderman and Kanade, 2000; Roth et al., 2000). Implemented on a conventional desktop, face detection proceeds at 15 frames per second.
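
The "Integral Image" representation can be sketched in a few lines: each entry holds the sum of all pixels above and to the left of it, after which any rectangular pixel sum costs at most four array lookups regardless of the rectangle's size. A minimal sketch of the idea in Python (an illustration, not the authors' implementation):

```python
import numpy as np

def integral_image(img):
    """Cumulative sum over rows then columns: ii[y, x] = sum of img[:y+1, :x+1]."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom+1, left:right+1] from four integral-image lookups."""
    total = ii[bottom, right]
    if top > 0:
        total -= ii[top - 1, right]
    if left > 0:
        total -= ii[bottom, left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]
    return total
```

Haar-like features are then differences of such rectangle sums, which is what makes evaluating the detector at every scale and position cheap.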

13,037 citations


Book
01 Dec 2004
TL;DR: This book covers low-level through application-level vision processing, including the three-dimensional world, the perspective n-point problem, motion, invariants and their applications, and the need for speed in real-time electronic hardware systems.
Abstract: Introduction - vision, the challenge. Part 1 Low-level processing: images and imaging operations; basic image filtering operations; thresholding techniques; locating objects via their edges; binary shape analysis; boundary pattern analysis. Part 2 Intermediate-level processing: line detection; circle detection; the Hough transform and its nature; ellipse detection; hole detection; polygon and corner detection. Part 3 Application level processing: abstract pattern matching techniques; the three-dimensional world; tackling the perspective n-point problem; motion; invariants and their applications; automated visual inspection; statistical pattern recognition; biologically inspired recognition schemes; texture; image acquisition; the need for speed - real-time electronic hardware systems. Part 4 Perspectives on vision: machine vision, art or science?

1,198 citations


Journal ArticleDOI
TL;DR: A learning-based approach to the problem of detecting objects in still, gray-scale images that makes use of a sparse, part-based representation is developed and a critical evaluation of the approach under the proposed standards is presented.
Abstract: We study the problem of detecting objects in still, gray-scale images. Our primary focus is the development of a learning-based approach to the problem that makes use of a sparse, part-based representation. A vocabulary of distinctive object parts is automatically constructed from a set of sample images of the object class of interest; images are then represented using parts from this vocabulary, together with spatial relations observed among the parts. Based on this representation, a learning algorithm is used to automatically learn to detect instances of the object class in new images. The approach can be applied to any object with distinguishable parts in a relatively fixed spatial configuration; it is evaluated here on difficult sets of real-world images containing side views of cars, and is seen to successfully detect objects in varying conditions amidst background clutter and mild occlusion. In evaluating object detection approaches, several important methodological issues arise that have not been satisfactorily addressed in the previous work. A secondary focus of this paper is to highlight these issues, and to develop rigorous evaluation standards for the object detection problem. A critical evaluation of our approach under the proposed standards is presented.

970 citations


Journal ArticleDOI
TL;DR: It is shown that an efficient face detection system does not require any costly local preprocessing before classification of image areas, and provides very high detection rate with a particularly low level of false positives, demonstrated on difficult test sets, without requiring the use of multiple networks for handling difficult cases.
Abstract: In this paper, we present a novel face detection approach based on a convolutional neural architecture, designed to robustly detect highly variable face patterns, rotated up to ±20 degrees in the image plane and turned up to ±60 degrees, in complex real world images. The proposed system automatically synthesizes simple problem-specific feature extractors from a training set of face and nonface patterns, without making any assumptions or using any hand-made design concerning the features to extract or the areas of the face pattern to analyze. The face detection procedure acts like a pipeline of simple convolution and subsampling modules that treat the raw input image as a whole. We therefore show that an efficient face detection system does not require any costly local preprocessing before classification of image areas. The proposed scheme provides very high detection rate with a particularly low level of false positives, demonstrated on difficult test sets, without requiring the use of multiple networks for handling difficult cases. We present extensive experimental results illustrating the efficiency of the proposed approach on difficult test sets and including an in-depth sensitivity analysis with respect to the degrees of variability of the face patterns.

610 citations


Proceedings ArticleDOI
25 Aug 2004
TL;DR: In this article, an effective live face detection algorithm is presented based on the analysis of Fourier spectra of a single face image or face image sequences using structure and movement information of live face.
Abstract: Biometrics is a rapidly developing technology for identifying a person based on his or her physiological or behavioral characteristics. To ensure the correctness of authentication, the biometric system must be able to detect and reject the use of a copy of a biometric instead of the live biometric. This function is usually termed "liveness detection". This paper describes a new method for live face detection. Using structure and movement information of the live face, an effective live face detection algorithm is presented. In contrast to existing approaches, which concentrate on the measurement of 3D depth information, this method is based on the analysis of Fourier spectra of a single face image or face image sequences. Experimental results show that the proposed method has an encouraging performance.
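
The Fourier-spectra cue can be illustrated with a toy descriptor: a flat printed photograph of a face tends to contain less high-frequency energy than a live face, so the fraction of spectral energy away from the low-frequency center is one simple liveness feature. A sketch under that assumption (the disc radius and any decision threshold are illustrative, not the paper's):

```python
import numpy as np

def high_frequency_ratio(gray, radius_frac=0.25):
    """Fraction of spectral energy outside a low-frequency disc.

    A flat reproduction loses fine detail, so its ratio is typically lower
    than a live face's; thresholding this ratio is one simple liveness cue.
    """
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(gray))) ** 2
    h, w = spectrum.shape
    cy, cx = h // 2, w // 2  # DC component sits here after fftshift
    yy, xx = np.ogrid[:h, :w]
    low = (yy - cy) ** 2 + (xx - cx) ** 2 <= (radius_frac * min(h, w)) ** 2
    total = spectrum.sum()
    return spectrum[~low].sum() / total if total > 0 else 0.0
```

A detail-rich (sharp) face crop scores higher than a blurred or uniform one, which is the direction of the paper's single-image cue; the sequence-based cue would additionally track frame-to-frame variation.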

417 citations


Proceedings ArticleDOI
18 Dec 2004
TL;DR: A novel face detection approach using improved local binary patterns (ILBP) as the facial representation; ILBP encodes both local shape and texture information instead of raw grayscale values and is robust to illumination variation.
Abstract: In this paper, we present a novel face detection approach using improved local binary patterns (ILBP) as the facial representation. The ILBP feature improves on the LBP feature by encoding both local shape and texture information rather than raw grayscale values, and it is robust to illumination variation. We model the face and non-face classes using multivariate Gaussian models and classify them under a Bayesian framework. Extensive experiments show that the proposed method has an encouraging performance.
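
The underlying LBP operator is easy to sketch: each neighbour of a pixel is thresholded against the centre and the resulting bits form a code that is invariant to monotonic gray-scale changes. The ILBP variant reportedly also includes the centre pixel, thresholding all nine values against the local mean; only the basic 3x3 operator is shown here (a sketch, not the paper's code):

```python
import numpy as np

def lbp_3x3(img):
    """Basic 8-neighbour LBP over a 2D array.

    Each neighbour is compared with the centre pixel and the resulting bits
    are packed into a code in [0, 255] per interior pixel.
    """
    h, w = img.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.uint16)
    center = img[1:-1, 1:-1]
    # window offsets, clockwise from top-left, one bit each
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[dy:dy + h - 2, dx:dx + w - 2]
        codes |= (neighbour >= center).astype(np.uint16) << bit
    return codes
```

Because only the ordering of pixel values matters, any strictly increasing gray-scale transform leaves the codes unchanged, which is the illumination robustness the abstract refers to.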

358 citations


Proceedings ArticleDOI
28 Sep 2004
TL;DR: A new method is presented for detecting triangular, square and octagonal road signs efficiently and robustly using the symmetric nature of these shapes, together with the pattern of edge orientations exhibited by equiangular polygons with a known number of sides to establish possible shape centroid locations.
Abstract: A new method is presented for detecting triangular, square and octagonal road signs efficiently and robustly. The method uses the symmetric nature of these shapes, together with the pattern of edge orientations exhibited by equiangular polygons with a known number of sides, to establish possible shape centroid locations in the image. This approach is invariant to in-plane rotation and returns the location and size of the shape detected. Results on still images show a detection rate of over 95%. The method is efficient enough for real-time applications, such as on-board-vehicle sign detection.

331 citations


Proceedings ArticleDOI
17 May 2004
TL;DR: This paper presents a frequency analysis-based method for instantaneous estimation of class separability, without the need for any training, and builds detectors for the most promising candidates, their receiver operating characteristics confirming the estimates.
Abstract: Vision-based hand gesture interfaces require fast and extremely robust hand detection. Here, we study view-specific hand posture detection with an object recognition method proposed by Viola and Jones. Training with this method is computationally very expensive, prohibiting the evaluation of many hand appearances for their suitability to detection. In this paper, we present a frequency analysis-based method for instantaneous estimation of class separability, without the need for any training. We built detectors for the most promising candidates, their receiver operating characteristics confirming the estimates. Next, we found that classification accuracy increases with a more expressive feature type. Lastly, we show that further optimization of training parameters yields additional detection rate improvements. In summary, we present a systematic approach to building an extremely robust hand appearance detector, providing an important step towards easily deployable and reliable vision-based hand gesture interfaces.

320 citations


Proceedings ArticleDOI
27 Jun 2004
TL;DR: A novel discriminative feature space which is efficient not only for face detection but also for recognition is introduced, and the same facial representation can be efficiently used for both detection and recognition.
Abstract: We introduce a novel discriminative feature space which is efficient not only for face detection but also for recognition. The face representation is based on local binary patterns (LBP) and consists of encoding both local and global facial characteristics into a compact feature histogram. The proposed representation is invariant with respect to monotonic gray scale transformations and can be derived in a single scan through the image. Considering the derived feature space, a second-degree polynomial kernel SVM classifier was trained to detect frontal faces in gray scale images. Experimental results using several complex images show that the proposed approach performs favorably compared to the state-of-the-art methods. Additionally, experiments with detecting and recognizing low-resolution faces from video sequences were carried out, demonstrating that the same facial representation can be efficiently used for both detection and recognition.

293 citations


Journal ArticleDOI
TL;DR: A theory of appearance-based object recognition from light-fields is developed, which leads directly to an algorithm for face recognition across pose that uses as many images of the face as are available, from one upwards.
Abstract: Arguably the most important decision to be made when developing an object recognition algorithm is selecting the scene measurements or features on which to base the algorithm. In appearance-based object recognition, the features are chosen to be the pixel intensity values in an image of the object. These pixel intensities correspond directly to the radiance of light emitted from the object along certain rays in space. The set of all such radiance values over all possible rays is known as the plenoptic function or light-field. In this paper, we develop a theory of appearance-based object recognition from light-fields. This theory leads directly to an algorithm for face recognition across pose that uses as many images of the face as are available, from one upwards. All of the pixels, whichever image they come from, are treated equally and used to estimate the (eigen) light-field of the object. The eigen light-field is then used as the set of features on which to base recognition, analogously to how the pixel intensities are used in appearance-based face and object recognition.

292 citations


Proceedings ArticleDOI
23 Aug 2004
TL;DR: An improved multi-scale corner detector with a dynamic region of support, based on the curvature scale space (CSS) technique, which uses an adaptive local curvature threshold instead of the single global threshold of the original and enhanced CSS methods.
Abstract: Corners play an important role in object identification methods used in machine vision and image processing systems. Single-scale feature detection finds it hard to detect both fine and coarse features at the same time. On the other hand, multi-scale feature detection is inherently able to solve this problem. This paper proposes an improved multi-scale corner detector with dynamic region of support, which is based on curvature scale space (CSS) technique. The proposed detector first uses an adaptive local curvature threshold instead of a single global threshold as in the original and enhanced CSS methods. Second, the angles of corner candidates are checked in a dynamic region of support for eliminating falsely detected corners. The proposed method has been evaluated over a number of images and compared with some popular corner detectors. The results showed that the proposed method offers a robust and effective solution to images containing widely different size features.
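
The curvature that CSS-style detectors threshold can be computed from a sampled contour with finite differences; the paper applies this after Gaussian smoothing at multiple scales and with an adaptive local threshold, both omitted in this sketch:

```python
import numpy as np

def contour_curvature(x, y):
    """Signed curvature of a sampled contour from finite differences:
    k = (x' y'' - y' x'') / (x'^2 + y'^2)^(3/2).

    CSS methods evaluate this after convolving the contour with Gaussians
    of increasing width and keep maxima that survive across scales; the
    smoothing and the adaptive threshold are not reproduced here.
    """
    dx, dy = np.gradient(x), np.gradient(y)
    ddx, ddy = np.gradient(dx), np.gradient(dy)
    return (dx * ddy - dy * ddx) / (dx ** 2 + dy ** 2) ** 1.5
```

On a circle of radius r traversed counter-clockwise the formula returns approximately 1/r everywhere, while a true corner shows up as a sharp curvature peak.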

Journal ArticleDOI
TL;DR: A novel algorithm for face detection is developed by combining the Eigenface and SVM methods; it runs almost as fast as the Eigenface method alone while achieving significantly improved detection accuracy.

Proceedings ArticleDOI
27 Jun 2004
TL;DR: A cascaded method for object detection whose first stage uses a novel "feature-centric" evaluation that re-uses feature evaluations across multiple candidate windows, achieving both computational efficiency and accuracy.
Abstract: We describe a cascaded method for object detection. This approach uses a novel organization of the first cascade stage called "feature-centric" evaluation which re-uses feature evaluations across multiple candidate windows. We minimize the cost of this evaluation through several simplifications: (1) localized lighting normalization, (2) representation of the classifier as an additive model and (3) discrete-valued features. Such a method also incorporates a unique feature representation. The early stages in the cascade use simple fast feature evaluations and the later stages use more complex discriminative features. In particular, we propose features based on sparse coding and ordinal relationships among filter responses. This combination of cascaded feature-centric evaluation with features of increasing complexity achieves both computational efficiency and accuracy. We describe object detection experiments on ten objects including faces and automobiles. These results include 97% recognition at equal error rate on the UIUC image database for car detection.
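
The cascade structure common to this and related detectors reduces to early rejection: a candidate window passes through increasingly expensive stages and is discarded at the first stage whose score falls below its threshold, so the many easy negatives cost only a stage or two. A generic sketch (the stage functions and thresholds are placeholders, not the paper's features):

```python
def cascade_detect(window, stages):
    """Run a window through a list of (score_fn, threshold) stages.

    Rejects at the first stage whose score falls below its threshold, so
    background windows exit early while promising windows get the full,
    more discriminative later stages.
    """
    for score_fn, threshold in stages:
        if score_fn(window) < threshold:
            return False
    return True
```

In the paper's organization the first stage is additionally "feature-centric" (features are evaluated once per image location and shared across overlapping windows), which this per-window sketch does not capture.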

Proceedings ArticleDOI
17 May 2004
TL;DR: An efficient 2D-to-3D integrated face reconstruction approach is introduced to reconstruct a personalized 3D face model from a single frontal face image with neutral expression and normal illumination and the synthesized virtual faces significantly improve the accuracy of face recognition with variant PIE.
Abstract: An analysis-by-synthesis framework for face recognition with variant pose, illumination and expression (PIE) is proposed in this paper. First, an efficient 2D-to-3D integrated face reconstruction approach is introduced to reconstruct a personalized 3D face model from a single frontal face image with neutral expression and normal illumination. Then, realistic virtual faces with different PIE are synthesized based on the personalized 3D face to characterize the face subspace. Finally, face recognition is conducted based on these representative virtual faces. Compared with other related works, this framework has the following advantages: 1) only one single frontal face is required for face recognition, which avoids the burdensome enrollment work; 2) the synthesized face samples provide the capability to conduct recognition under difficult conditions like complex PIE; and 3) the proposed 2D-to-3D integrated face reconstruction approach is fully automatic and more efficient. The extensive experimental results show that the synthesized virtual faces significantly improve the accuracy of face recognition with variant PIE.

Proceedings ArticleDOI
23 Aug 2004
TL;DR: This survey focuses on face recognition using three-dimensional data, either alone or in combination with two-dimensional intensity images, and identifies challenges involved in developing more accurate three-dimensional face recognition.
Abstract: The vast majority of face recognition research has focused on the use of two-dimensional intensity images, and is covered in existing survey papers. This survey focuses on face recognition using three-dimensional data, either alone or in combination with two-dimensional intensity images. Challenges involved in developing more accurate three-dimensional face recognition are identified.

Proceedings ArticleDOI
23 Aug 2004
TL;DR: This work describes a procedure for constructing a database of 3D face models and matching this database to 2.5D face scans which are captured from different views, using coordinate system invariant properties of the facial surface.
Abstract: The performance of face recognition systems that use two-dimensional (2D) images is dependent on consistent conditions such as lighting, pose and facial expression. We are developing a multi-view face recognition system that utilizes three-dimensional (3D) information about the face to make the system more robust to these variations. This work describes a procedure for constructing a database of 3D face models and matching this database to 2.5D face scans which are captured from different views, using coordinate system invariant properties of the facial surface. 2.5D is a simplified 3D (x, y, z) surface representation that contains at most one depth value (z direction) for every point in the (x, y) plane. A robust similarity metric is defined for matching, based on an iterative closest point (ICP) registration process. Results are given for matching a database of 18 3D face models with 113 2.5D face scans.
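
The ICP registration at the heart of the matching step alternates two operations: find, for each source point, its nearest neighbour in the target set, then solve in closed form (via SVD) for the rigid transform that best aligns the matched pairs. A small dense-matrix sketch under those standard definitions (the paper's control-point selection and robust similarity metric are not reproduced):

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst (Kabsch)."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # avoid reflections
    D = np.diag([1.0] * (src.shape[1] - 1) + [d])
    R = Vt.T @ D @ U.T
    return R, c_dst - R @ c_src

def icp(src, dst, iters=20):
    """Align src to dst by alternating nearest-neighbour matching with the
    closed-form rigid fit; returns the transformed source points."""
    cur = src.copy()
    for _ in range(iters):
        # brute-force nearest neighbour in dst for every current point
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        matched = dst[d2.argmin(axis=1)]
        R, t = best_rigid_transform(cur, matched)
        cur = cur @ R.T + t
    return cur
```

The residual distance after convergence then serves as the (mis)match score between a 2.5D scan and a stored 3D model.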

Proceedings ArticleDOI
27 Jun 2004
TL;DR: This work presents a framework that learns the classifier online with automatically labeled data for the specific case of detecting moving objects from video with an online learner based on the Winnow algorithm.
Abstract: Object detection with a learned classifier has been applied successfully to difficult tasks such as detecting faces and pedestrians. Systems using this approach usually learn the classifier offline with manually labeled training data. We present a framework that learns the classifier online with automatically labeled data for the specific case of detecting moving objects from video. Motion information is used to automatically label training examples collected directly from the live detection task video. An online learner based on the Winnow algorithm incrementally trains a task-specific classifier with these examples. Since learning occurs online and without manual help, it can continue in parallel with detection and adapt the classifier over time. The framework is demonstrated on a person detection task for an office corridor scene. In this task, we use background subtraction to automatically label training examples. After the initial manual effort of implementing the labeling method, the framework runs by itself on the scene video stream to gradually train an accurate detector.
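
The Winnow algorithm driving the online learner is a mistake-driven multiplicative-update rule over binary features: on a false negative the weights of active features are multiplied up (promotion), on a false positive they are divided down (demotion). A minimal sketch (the feature extraction and the background-subtraction labeler sit outside this class):

```python
class Winnow:
    """Winnow online learner over d binary features.

    Predicts positive when the weighted sum of active features reaches the
    threshold; on a mistake, weights of active features are multiplied or
    divided by alpha, so it adapts from a stream of labeled examples.
    """
    def __init__(self, d, alpha=2.0):
        self.w = [1.0] * d
        self.theta = d / 2.0
        self.alpha = alpha

    def predict(self, x):
        return sum(wi for wi, xi in zip(self.w, x) if xi) >= self.theta

    def update(self, x, label):
        pred = self.predict(x)
        if pred != label:
            factor = self.alpha if label else 1.0 / self.alpha
            self.w = [wi * factor if xi else wi
                      for wi, xi in zip(self.w, x)]
        return pred
```

Because updates are cheap and incremental, training can run in parallel with detection on the live video stream, which is what lets the framework adapt the classifier over time.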

Patent
Nobuo Higaki1, Takamichi Shimada1
20 Dec 2004
TL;DR: In this paper, the authors propose a moving object detection apparatus that generates distance information for a moving object, detects the object's motion, determines the object distance, and detects the object image area and the object contour from a video image containing them.
Abstract: The present invention detects a moving object by generating the distance information of the moving object, detecting the object motion, determining the object distance, and detecting the object image area and the object contour from the video image that includes them. It provides a moving object detection apparatus to carry out such detection, as well as detecting the contour of a specific moving object by locating the center of the moving object with high precision.

Journal ArticleDOI
TL;DR: Zhang et al. propose a novel three-step face detection approach, adopting a simple-to-complex strategy, to address the problem of automatic human face detection from images in surveillance and biometric applications.
Abstract: Automatic human face detection from images in surveillance and biometric applications is a challenging task due to variations in image background, view, illumination, articulation, and facial expression. We propose a novel three-step face detection approach to address this problem. The approach adopts a simple-to-complex strategy. First, a linear-filtering algorithm is applied to enhance detection performance by removing most nonface-like candidates rapidly. Second, a boosting chain algorithm is adopted to combine the boosting classifiers into a hierarchical "chain" structure. By utilizing the inter-layer discriminative information, this algorithm achieves higher efficiency than traditional approaches. Last, a postfiltering algorithm, consisting of image preprocessing, a support vector machine filter, and a color filter, is applied to refine the final prediction. As only a few candidate windows remain in the final stage, this algorithm greatly improves detection accuracy at small computational cost. Compared with conventional approaches, this three-step approach is shown to be more effective and capable of handling more pose variations. Moreover, together with a two-level hierarchical in-plane pose estimator, a rapid multiview face detector is built. Experimental results demonstrate a significant performance improvement of the proposed approach over others.

Dissertation
01 Jan 2004
TL;DR: A smart visual surveillance system with real-time moving object detection, classification and tracking capabilities is presented, which operates on both color and gray scale video imagery from a stationary camera.
Abstract: Moving Object Detection, Tracking and Classification for Smart Video Surveillance. M.S. thesis by Yiğithan Dedeoğlu, Computer Engineering; supervisor: Assist. Prof. Dr. Uğur Güdükbay; August 2004. Video surveillance has long been in use to monitor security-sensitive areas such as banks, department stores, highways, crowded public places and borders. Advances in computing power, the availability of large-capacity storage devices and high-speed network infrastructure paved the way for cheaper multi-sensor video surveillance systems. Traditionally, the video outputs are processed online by human operators and are usually saved to tapes for later use only after a forensic event. The increase in the number of cameras in ordinary surveillance systems overloaded both the human operators and the storage devices with high volumes of data and made it infeasible to ensure proper monitoring of sensitive areas for long times. In order to filter out redundant information generated by an array of cameras, and to increase the response time to forensic events, assisting the human operators with identification of important events in video by the use of "smart" video surveillance systems has become a critical requirement. Making video surveillance systems "smart" requires fast, reliable and robust algorithms for moving object detection, classification, tracking and activity analysis. In this thesis, a smart visual surveillance system with real-time moving object detection, classification and tracking capabilities is presented. The system operates on both color and gray-scale video imagery from a stationary camera. It can handle object detection in indoor and outdoor environments and under changing illumination conditions. The classification algorithm makes use of the shape of the detected objects and temporal tracking results to successfully categorize objects into pre-defined classes like human, human group and vehicle. The system is also able to detect the natural phenomenon of fire in various scenes reliably. The proposed tracking algorithm successfully tracks video objects even in full occlusion cases. In addition to these, some important needs of a robust ...

Patent
12 Nov 2004
TL;DR: In this article, a face detection and recognition system is presented that images a scene at different bands of the infrared spectrum and uses weighted subtraction and thresholding to distinguish human skin in a sensed image.
Abstract: A face detection and recognition system having several arrays imaging a scene at different bands of the infrared spectrum. The system may use weighted subtracting and thresholding to distinguish human skin in a sensed image. A feature selector may locate a face in the image. The face may be framed or the image cropped with a frame or border to incorporate essentially only the face. The border may be superimposed on an image direct from an imaging array. A sub-image containing the face may be extracted from within the border and compared with a database of face information to attain recognition of the face. A level of recognition of the face may be established. Infrared lighting may be used as needed to illuminate the scene.

Proceedings ArticleDOI
17 May 2004
TL;DR: A skin detection approach using neighborhood information is proposed to improve robustness under real-world conditions such as changing lighting and complex backgrounds containing surfaces and objects with skin-like colors.
Abstract: Skin detection is employed in tasks like face detection and tracking, naked people detection, hand detection and tracking, people retrieval in databases and Internet, etc. However, skin detection is not robust enough for dealing with some real-world conditions, like changing lighting conditions and complex background containing surfaces and objects with skin-like colors. This situation can be improved by incorporating context information in the skin detection process. For this reason in this article a skin detection approach that uses neighborhood information is proposed. A pixel will belong to the skin class only if a direct neighbor does. This idea is implemented through a diffusion process. Two new algorithms implementing these ideas are described and compared with state-of-the-art skin detection algorithms.
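
The neighborhood rule (a pixel is kept as skin only if a direct neighbour is) can be realized as a hysteresis-style diffusion over a per-pixel skin-probability map: confidently skin-colored pixels seed the process and the label then spreads to ambiguous neighbours. A sketch under that reading of the abstract (the two thresholds are illustrative, not the paper's):

```python
from collections import deque

def diffuse_skin(prob, seed_t=0.8, grow_t=0.4):
    """Hysteresis-style diffusion over a skin-probability map.

    Pixels above seed_t start the diffusion; the skin label then spreads to
    4-connected neighbours whose probability exceeds the lower grow_t, so an
    ambiguous pixel is kept only if it touches confident skin.
    """
    h, w = len(prob), len(prob[0])
    skin = [[False] * w for _ in range(h)]
    queue = deque((y, x) for y in range(h) for x in range(w)
                  if prob[y][x] >= seed_t)
    for y, x in queue:
        skin[y][x] = True
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and not skin[ny][nx] \
                    and prob[ny][nx] >= grow_t:
                skin[ny][nx] = True
                queue.append((ny, nx))
    return skin
```

An isolated skin-like pixel (for example, a wooden surface matching skin color) never reaches a confident seed and is therefore dropped, which is exactly the false-positive mode the context information targets.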

Proceedings ArticleDOI
23 Aug 2004
TL;DR: Analysis of the in-plane rotational robustness of the Viola-Jones object detection method when used for hand appearance detection finds that randomly rotating the training data within the determined bounds allows for detection rates about one order of magnitude better than those trained on strictly aligned data.
Abstract: The research described in this paper analyzes the in-plane rotational robustness of the Viola-Jones object detection method when used for hand appearance detection. We determine the rotational bounds for training and detection that achieve undiminished performance without an increase in classifier complexity. The result, up to 15° total, differs from the method's performance on faces (30° total). We found that randomly rotating the training data within these bounds allows for detection rates about one order of magnitude better than those obtained with training on strictly aligned data. The implications of these results affect both savings in training costs and the increased naturalness and comfort of vision-based hand gesture interfaces.

Proceedings ArticleDOI
14 Jun 2004
TL;DR: A real-time system that detects moving crowds in a video sequence by analyzing crowd motion patterns in the spatio-temporal domain, with an efficient implementation that runs in real time.
Abstract: We present a real-time system that detects moving crowd in a video sequence. Crowd detection differs from pedestrian detection in that we assume that no individual pedestrian can be properly segmented in the image. We propose a scheme that looks at the motion patterns of crowd in the spatio-temporal domain and give an efficient implementation that can detect crowd in real-time. In our experiments we detected crowd at distances of up to 70 m.

Proceedings ArticleDOI
20 Sep 2004
TL;DR: A multi-primitive skin-tone and edge-based detection module embedded in a tracking module for efficient and robust face detection and tracking and a continuous density HMM based pose estimation is developed for an accurate estimate of the face orientation motions.
Abstract: Robust human face analysis has been recognized as a crucial part in intelligent systems. In this paper, we present the development of a computational framework for robust detection, tracking, and pose estimation of faces captured by video arrays. We discuss the development of a multi-primitive skin-tone and edge-based detection module embedded in a tracking module for efficient and robust face detection and tracking. A continuous density HMM based pose estimation is developed for an accurate estimate of the face orientation motions. Experimental evaluations of these algorithms suggest the validity of the proposed framework and its computational modules.

Proceedings ArticleDOI
24 Oct 2004
TL;DR: The multiple object tracking method keeps a graph structure where it maintains multiple hypotheses about the number and the trajectories of the objects in the video, and integrates object detection and tracking tightly.
Abstract: In this paper we describe a method for tracking multiple objects whose number is unknown and varies during tracking. Based on preliminary results of object detection in each image which may have missing and/or false detection, the multiple object tracking method keeps a graph structure where it maintains multiple hypotheses about the number and the trajectories of the objects in the video. The image information drives the process of extending and pruning the graph, and determines the best hypothesis to explain the video. While the image-based object detection makes a local decision, the tracking process confirms and validates the detection through time, therefore, it can be regarded as temporal detection which makes a global decision across time. The multiple object tracking method gives feedbacks which are predictions of object locations to the object detection module. Therefore, the method integrates object detection and tracking tightly. The most possible hypothesis provides the multiple object tracking result. The experimental results are presented.

Proceedings ArticleDOI
23 Aug 2004
TL;DR: A new algorithm for eyes detection is proposed that uses iris geometrical information for determining in the whole image the region candidate to contain an eye, and then the symmetry for selecting the couple of eyes.
Abstract: The problem of eye detection in face images is very important for a large number of applications ranging from face recognition to gaze tracking. In this paper, we propose a new algorithm for eyes detection that uses iris geometrical information for determining in the whole image the region candidate to contain an eye, and then the symmetry for selecting the couple of eyes. The novelty of this work is that the algorithm works on complex images without constraints on the background, skin color segmentation and so on. Different experiments, carried out on images of subjects with different eyes colors, some of them wearing glasses, demonstrate the effectiveness and robustness of the proposed algorithm.

Patent
12 Aug 2004
TL;DR: A face detector comprising a face detecting section 110 that roughly selects face candidates from an input image by template matching and verifies them with a support vector machine, a non-face judging section 120 that detects and removes candidates judged to be non-faces, and a skin tracker section 114 that tracks the face region after the non-face judgment.
Abstract: An object detector, an object detecting method and a robot can reduce detection errors (detecting wrong objects) without increasing the volume of computation needed to detect the right object. A face detector 101 comprises a face detecting section 110 that operates like a conventional face detecting section, roughly selecting face candidates from an input image by template matching and then verifying them by means of a support vector machine for face recognition; a non-face judging section 120 that detects candidates judged to be non-faces and removes them from the face candidates selected by the face detecting section 110; and a skin tracker section 114 for tracking the face region after the non-face judgment. The non-face judging section 120 judges a face candidate to be a non-face and removes it when the distance between the face detector and the face as computed from the input image differs greatly from the distance measured by a distance sensor, when the color variance of the face candidate is small, when the occupancy ratio of the skin color region is large, or when the size of the face region changes greatly after a predetermined time has elapsed.
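The four non-face rejection cues listed above can be sketched as a simple rule check. All thresholds and field names here are hypothetical; the patent does not give concrete values.

```python
def is_non_face(candidate, measured_distance,
                max_dist_gap=0.5, min_color_var=100.0,
                max_skin_ratio=0.9, max_size_change=0.5):
    """Return True if a face candidate should be rejected as a non-face.

    Rejection cues (thresholds illustrative): the image-derived distance
    disagrees with the sensor-measured distance; the candidate's color
    variance is too low (e.g. a poster or flat surface); the skin-color
    region occupies too much of the candidate; or the region's size has
    changed too much over a fixed time interval.
    """
    if abs(candidate["assumed_distance"] - measured_distance) > max_dist_gap:
        return True
    if candidate["color_variance"] < min_color_var:
        return True
    if candidate["skin_ratio"] > max_skin_ratio:
        return True
    if candidate["size_change"] > max_size_change:
        return True
    return False
```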

Patent
07 Dec 2004
TL;DR: In this article, a computer-aided image comparison, evaluation and retrieval system compares objects and object clusters, or images, prior to object definition/detection, and scores to suspected biological, medical, chemical, physical or clinical condition may be performed based on retrieved objects or images and their relative similarities to the unknown.
Abstract: A computer-aided image comparison, evaluation and retrieval system compares objects and object clusters, or images. User-controlled or automatic filtering to enhance object features may be performed prior to object definition/detection. The query image may be displayed substantially continuously during the image filtering and object definition processes. Scoring against a suspected biological, medical, chemical, physical or clinical condition may be performed based on retrieved objects or images and their relative similarities to the unknown.

Proceedings ArticleDOI
27 Sep 2004
TL;DR: A machine learning approach for visual object detection and recognition that processes images rapidly and achieves high detection and recognition rates, demonstrated on human-robot interaction; the system detects and recognises faces at 10.9 frames per second.
Abstract: This paper describes a machine learning approach for visual object detection and recognition which is capable of processing images rapidly and achieving high detection and recognition rates. This framework is demonstrated on, and in part motivated by, the task of human-robot interaction. There are three main parts to this framework. The first is face detection, used as a preprocessing stage for the second, the recognition of the face of the person interacting with the robot; the third is hand detection. The detection technique is based on Haar-like features introduced by Viola et al. and later improved by Lienhart et al. Eigenimages and PCA are used in the recognition stage of the system. Used in real-time human-robot interaction applications, the system is able to detect and recognise faces at 10.9 frames per second on a Pentium IV 2.2 GHz machine equipped with a USB camera.
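The integral image underlying the Haar-like features of Viola et al. (the same "Integral Image" introduced in the first paper of this listing) can be sketched as a standard summed-area table; variable names here are ours. Once built, any rectangular pixel sum, and hence any Haar-like feature, costs at most four lookups regardless of rectangle size.

```python
def integral_image(img):
    """Summed-area table: ii[y][x] = sum of img over rows 0..y, cols 0..x.
    img is a list of equal-length rows of numbers."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y][x] = row_sum + (ii[y - 1][x] if y > 0 else 0)
    return ii

def rect_sum(ii, x0, y0, x1, y1):
    """Sum of the image over the inclusive rectangle [x0..x1] x [y0..y1],
    computed from the integral image with four lookups."""
    total = ii[y1][x1]
    if x0 > 0:
        total -= ii[y1][x0 - 1]
    if y0 > 0:
        total -= ii[y0 - 1][x1]
    if x0 > 0 and y0 > 0:
        total += ii[y0 - 1][x0 - 1]
    return total
```

A two-rectangle Haar-like feature is then just `rect_sum` of one region minus `rect_sum` of the adjacent region.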