
Showing papers on "Object-class detection" published in 2005


Proceedings ArticleDOI
15 Oct 2005
TL;DR: It is shown that the direct 3D counterparts to commonly used 2D interest point detectors are inadequate, and an alternative is proposed, and a recognition algorithm based on spatio-temporally windowed data is devised.
Abstract: A common trend in object recognition is to detect and leverage the use of sparse, informative feature points. The use of such features makes the problem more manageable while providing increased robustness to noise and pose variation. In this work we develop an extension of these ideas to the spatio-temporal case. For this purpose, we show that the direct 3D counterparts to commonly used 2D interest point detectors are inadequate, and we propose an alternative. Anchoring off of these interest points, we devise a recognition algorithm based on spatio-temporally windowed data. We present recognition results on a variety of datasets including both human and rodent behavior.
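For readers who want the flavor of such a detector, here is a minimal NumPy/SciPy sketch of one plausible construction: a quadrature pair of 1D temporal Gabor filters applied after spatial Gaussian smoothing, whose squared response peaks at periodic motion. The parameter defaults and function name are illustrative, not the authors' settings.

import numpy as np
from scipy.ndimage import gaussian_filter, convolve1d

def spatio_temporal_response(video, sigma=2.0, tau=3.0):
    # video: (T, H, W) grayscale clip; sigma smooths each frame spatially,
    # tau sets the temporal scale of the quadrature Gabor pair
    omega = 4.0 / tau
    t = np.arange(-int(3 * tau), int(3 * tau) + 1)
    h_ev = -np.cos(2 * np.pi * t * omega) * np.exp(-t ** 2 / tau ** 2)
    h_od = -np.sin(2 * np.pi * t * omega) * np.exp(-t ** 2 / tau ** 2)
    smoothed = np.stack([gaussian_filter(f, sigma) for f in video])
    ev = convolve1d(smoothed, h_ev, axis=0)    # temporal filtering only
    od = convolve1d(smoothed, h_od, axis=0)
    return ev ** 2 + od ** 2   # local maxima are candidate interest points

R = spatio_temporal_response(np.random.rand(40, 64, 64))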

2,699 citations


Proceedings ArticleDOI
20 Jun 2005
TL;DR: The core of the method is the combination of local and global cues via probabilistic top-down segmentation, which allows object hypotheses to be examined and compared with high precision down to the pixel level; qualitative and quantitative results on a large data set confirm that pedestrians are reliably detected even in crowded scenes.
Abstract: In this paper, we address the problem of detecting pedestrians in crowded real-world scenes with severe overlaps. Our basic premise is that this problem is too difficult for any type of model or feature alone. Instead, we present an algorithm that integrates evidence in multiple iterations and from different sources. The core part of our method is the combination of local and global cues via probabilistic top-down segmentation. Altogether, this approach allows examining and comparing object hypotheses with high precision down to the pixel level. Qualitative and quantitative results on a large data set confirm that our method is able to reliably detect pedestrians in crowded scenes, even when they overlap and partially occlude each other. In addition, the flexible nature of our approach allows it to operate on very small training sets.

952 citations


Journal ArticleDOI
TL;DR: An object detection scheme with three innovations over existing approaches: the background is modeled as a single probability density over a joint domain-range representation, temporal persistence is used as a detection criterion, and the posterior function is maximized efficiently by finding the minimum cut of a capacitated graph.
Abstract: Accurate detection of moving objects is an important precursor to stable tracking or recognition. In this paper, we present an object detection scheme that has three innovations over existing approaches. First, the model of the intensities of image pixels as independent random variables is challenged and it is asserted that useful correlation exists in the intensities of spatially proximal pixels. This correlation is exploited to sustain high levels of detection accuracy in the presence of dynamic backgrounds. By using a nonparametric density estimation method over a joint domain-range representation of image pixels, multimodal spatial uncertainties and complex dependencies between the domain (location) and range (color) are directly modeled. We propose a model of the background as a single probability density. Second, temporal persistence is proposed as a detection criterion. Unlike previous approaches to object detection, which detect objects by building adaptive models of only the background, the foreground is also modeled to augment the detection of objects (without explicit tracking), since objects detected in the preceding frame contain substantial evidence for detection in the current frame. Finally, the background and foreground models are used competitively in a MAP-MRF decision framework, stressing spatial context as a condition for detecting interesting objects, and the posterior function is maximized efficiently by finding the minimum cut of a capacitated graph. Experimental validation of the proposed method is performed and presented on a diverse set of dynamic scenes.
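The joint domain-range model can be made concrete with a short sketch: each background observation is a 5-vector (x, y, r, g, b), and a product Gaussian kernel scores how well a new pixel fits the background density. Bandwidths and the foreground threshold below are illustrative assumptions, not the paper's values.

import numpy as np

def background_likelihood(pixel, bg_samples, bandwidth):
    # pixel: 5-vector (x, y, r, g, b); bg_samples: (N, 5) recent
    # background observations; product Gaussian kernel density estimate
    d = (pixel - bg_samples) / bandwidth
    k = np.exp(-0.5 * np.sum(d * d, axis=1))
    return k.mean() / (np.prod(bandwidth) * (2 * np.pi) ** 2.5)

bg = np.column_stack([np.random.rand(1000, 2) * [640, 480],  # locations
                      np.random.rand(1000, 3)])              # colors
h = np.array([8.0, 8.0, 0.05, 0.05, 0.05])   # spatial / color bandwidths
p = background_likelihood(np.array([320, 240, 0.2, 0.3, 0.4]), bg, h)
foreground = p < 1e-6    # low background likelihood: foreground candidate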

685 citations


Proceedings ArticleDOI
17 Oct 2005
TL;DR: A real-time event detector is constructed for each action of interest by learning a cascade of filters based on volumetric features that efficiently scans video sequences in space and time; on a standard database of human activities it achieves performance comparable to a current interest-point-based human activity recognizer.
Abstract: This paper studies the use of volumetric features as an alternative to popular local descriptor approaches for event detection in video sequences. Motivated by the recent success of similar ideas in object detection on static images, we generalize the notion of 2D box features to 3D spatio-temporal volumetric features. This general framework enables us to do real-time video analysis. We construct a real-time event detector for each action of interest by learning a cascade of filters based on volumetric features that efficiently scans video sequences in space and time. This event detector recognizes actions that are traditionally problematic for interest point methods - such as smooth motions where insufficient space-time interest points are available. Our experiments demonstrate that the technique accurately detects actions on real-world sequences and is robust to changes in viewpoint, scale and action speed. We also adapt our technique to the related task of human action classification and confirm that it achieves performance comparable to a current interest point based human activity recognizer on a standard database of human activities.
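The volumetric features rest on a 3D analogue of the integral image, so any spatio-temporal box sum costs eight lookups; a feature is then a difference of adjacent boxes. A minimal NumPy sketch under those assumptions (the exact feature set and cascade training are omitted):

import numpy as np

def integral_video(video):
    # cumulative sums along t, y, x: the 3D analogue of the integral
    # image; the zero border simplifies the lookups below
    iv = video.cumsum(0).cumsum(1).cumsum(2)
    return np.pad(iv, ((1, 0), (1, 0), (1, 0)))

def box_sum(iv, t0, t1, y0, y1, x0, x1):
    # sum of video[t0:t1, y0:y1, x0:x1] by 3D inclusion-exclusion
    return (iv[t1, y1, x1] - iv[t0, y1, x1] - iv[t1, y0, x1] - iv[t1, y1, x0]
            + iv[t0, y0, x1] + iv[t0, y1, x0] + iv[t1, y0, x0] - iv[t0, y0, x0])

v = np.random.rand(32, 48, 48)                      # (T, H, W) clip
iv = integral_video(v)
# one illustrative volumetric feature: difference of two temporal halves
feat = box_sum(iv, 0, 8, 0, 24, 0, 24) - box_sum(iv, 8, 16, 0, 24, 0, 24)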

616 citations


Journal ArticleDOI
TL;DR: In this article, a two-step detection/tracking method is proposed to deal with the nonrigid nature of human appearance on the road, where the detection phase is performed by a support vector machine (SVM) with size-normalized pedestrian candidates and the tracking phase is a combination of Kalman filter prediction and mean shift tracking.
Abstract: This paper presents a method for pedestrian detection and tracking using a single night-vision video camera installed on the vehicle. To deal with the nonrigid nature of human appearance on the road, a two-step detection/tracking method is proposed. The detection phase is performed by a support vector machine (SVM) with size-normalized pedestrian candidates and the tracking phase is a combination of Kalman filter prediction and mean shift tracking. The detection phase is further strengthened by information obtained by a road-detection module that provides key information for pedestrian validation. Experimental comparisons (e.g., grayscale SVM recognition versus binary SVM recognition and entire-body detection versus upper-body detection) have been carried out to illustrate the feasibility of our approach.
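For the tracking phase, here is a minimal constant-velocity Kalman filter of the kind commonly paired with mean shift: the predicted position would centre the mean shift search window, and the mean shift result would feed back as the measurement. Noise levels q and r are illustrative, and mean shift itself is omitted.

import numpy as np

F = np.array([[1, 0, 1, 0],        # constant-velocity model,
              [0, 1, 0, 1],        # state = (x, y, vx, vy)
              [0, 0, 1, 0],
              [0, 0, 0, 1]], float)

def predict(state, cov, q=1.0):
    state = F @ state
    cov = F @ cov @ F.T + q * np.eye(4)
    return state, cov

def update(state, cov, meas, r=2.0):
    H = np.eye(2, 4)               # only (x, y) is observed
    S = H @ cov @ H.T + r * np.eye(2)
    K = cov @ H.T @ np.linalg.inv(S)
    state = state + K @ (meas - H @ state)
    cov = (np.eye(4) - K @ H) @ cov
    return state, cov

state, cov = np.array([100.0, 50.0, 0.0, 0.0]), np.eye(4) * 10.0
state, cov = predict(state, cov)                 # centre the search window
state, cov = update(state, cov, np.array([103.0, 51.0]))  # mean shift result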

431 citations


Proceedings ArticleDOI
17 Oct 2005
TL;DR: The major contributions are the application of boosted local contour-based features for object detection in a partially supervised learning framework, and an efficient new boosting procedure for simultaneously selecting features and estimating per-feature parameters.
Abstract: We present a novel categorical object detection scheme that uses only local contour-based features. A two-stage, partially supervised learning architecture is proposed: a rudimentary detector is learned from a very small set of segmented images and applied to a larger training set of un-segmented images; the second stage bootstraps these detections to learn an improved classifier while explicitly training against clutter. The detectors are learned with a boosting algorithm which creates a location-sensitive classifier using a discriminative set of features from a randomly chosen dictionary of contour fragments. We present results that are very competitive with other state-of-the-art object detection schemes and show robustness to object articulations, clutter, and occlusion. Our major contributions are the application of boosted local contour-based features for object detection in a partially supervised learning framework, and an efficient new boosting procedure for simultaneously selecting features and estimating per-feature parameters.

349 citations


Proceedings ArticleDOI
T. Mita, Toshimitsu Kaneko, O. Hori
17 Oct 2005
TL;DR: Experimental results show that the proposed joint Haar-like feature for detecting faces in images yields higher classification performance than Viola and Jones' detector, which uses a single feature for each weak classifier.
Abstract: In this paper, we propose a new distinctive feature, called the joint Haar-like feature, for detecting faces in images. This is based on co-occurrence of multiple Haar-like features. Feature co-occurrence, which captures the structural similarities within the face class, makes it possible to construct an effective classifier. The joint Haar-like feature can be calculated very fast and has robustness against addition of noise and change in illumination. A face detector is learned by stagewise selection of the joint Haar-like features using AdaBoost. A small number of distinctive features achieve both computational efficiency and accuracy. Experimental results with 5,676 face images and 30,000 nonface images show that our detector yields higher classification performance than Viola and Jones' detector, which uses a single feature for each weak classifier. Given the same number of features, our method reduces the error by 37%. Our detector is 2.6 times as fast as Viola and Jones' detector to achieve the same performance.
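The co-occurrence idea admits a compact sketch: binarize several Haar-like responses and pack the bits into a single index into a learned lookup table, so one weak classifier sees a joint pattern rather than a single threshold. The thresholds, parities, and LUT contents below are placeholders; the AdaBoost selection stage is omitted.

import numpy as np

def joint_index(responses, thresholds, parities):
    # binarize each Haar-like response, then pack the bits into one integer
    bits = (parities * (responses - thresholds) > 0).astype(int)
    return int(bits @ (1 << np.arange(bits.size)))

def weak_classifier(responses, thresholds, parities, lut):
    # lut[z]: confidence learned for joint pattern z, e.g. a log-likelihood
    # ratio of face vs. non-face occurrences of that pattern in training
    return lut[joint_index(responses, thresholds, parities)]

resp = np.array([0.7, -0.2, 0.4])      # three Haar-like feature responses
thr, par = np.zeros(3), np.ones(3)     # placeholder parameters
lut = np.random.randn(8)               # 2^3 possible joint patterns
score = weak_classifier(resp, thr, par, lut)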

331 citations


Proceedings ArticleDOI
10 Oct 2005
TL;DR: This paper presents a method for fully automatic detection of 20 facial feature points in images of expressionless faces using Gabor-feature-based boosted classifiers, with GentleBoost templates built from both gray level intensities and Gabor wavelet features.
Abstract: Locating facial feature points in images of faces is an important stage for numerous facial image interpretation tasks. In this paper we present a method for fully automatic detection of 20 facial feature points in images of expressionless faces using Gabor-feature-based boosted classifiers. The method adopts a fast and robust face detection algorithm, an adapted version of the original Viola-Jones face detector. The detected face region is then divided into 20 relevant regions of interest, each of which is examined further to predict the location of the facial feature points. The proposed facial feature point detection method uses individual feature patch templates to detect points in the relevant region of interest. These feature models are GentleBoost templates built from both gray level intensities and Gabor wavelet features. When tested on the Cohn-Kanade database, the method achieved an average recognition rate of 93%.

304 citations


Proceedings ArticleDOI
20 Jun 2005
TL;DR: The impact of eye locations on face recognition accuracy is studied and an automatic technique for eye detection is introduced; the resulting face recognition performance is shown to be comparable to that obtained using manually given eye positions.
Abstract: The accuracy of face alignment affects the performance of a face recognition system. Since face alignment is usually conducted using eye positions, an accurate eye localization algorithm is therefore essential for accurate face recognition. In this paper, we first study the impact of eye locations on face recognition accuracy, and then introduce an automatic technique for eye detection. The performance of our automatic eye detection technique is subsequently validated using the FRGC 1.0 database. The validation shows that our eye detector has an overall 94.5% eye detection rate, with the detected eyes very close to the manually provided eye positions. In addition, the face recognition performance based on the automatic eye detection is shown to be comparable to that achieved using manually given eye positions.

237 citations


Proceedings ArticleDOI
18 Apr 2005
TL;DR: A feature detection system for real-time identification of lines, circles and people's legs from laser range data is developed, and a new method suitable for arc/circle detection is proposed: the Inscribed Angle Variance (IAV).
Abstract: A feature detection system has been developed for real-time identification of lines, circles and people's legs from laser range data. A new method suitable for arc/circle detection is proposed: the Inscribed Angle Variance (IAV). Lines are detected using a recursive line fitting method. The people-leg detection is based on geometrical relations. The system was implemented as a plugin driver in Player, a mobile robot server. Results on real data are presented to verify the effectiveness of the proposed algorithms in indoor environments with moving objects.
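The IAV test admits a compact sketch: by the inscribed angle theorem, every interior point of a circular arc sees the chord between the group's extreme points under the same angle, so a small standard deviation of those angles signals an arc or circle. The acceptance threshold would be tuned in practice; none is hard-coded below.

import numpy as np

def inscribed_angle_stats(points):
    # angle at each interior point subtended by the segment between the
    # group's two extreme points; constant on a circular arc
    p_first, p_last = points[0], points[-1]
    angles = []
    for p in points[1:-1]:
        a, b = p_first - p, p_last - p
        c = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        angles.append(np.arccos(np.clip(c, -1.0, 1.0)))
    angles = np.array(angles)
    return angles.mean(), angles.std()   # low std suggests an arc/circle

theta = np.linspace(0.2, 2.0, 30)        # points sampled from a unit circle
pts = np.c_[np.cos(theta), np.sin(theta)]
mean_angle, std_angle = inscribed_angle_stats(pts)   # std is ~0 here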

205 citations


Book ChapterDOI
TL;DR: The main purpose of this overview is to describe recent 3D face recognition algorithms; 3D models hold more information about the face, such as surface information, that can be used for face recognition or subject discrimination.
Abstract: Much research in face recognition has dealt with the challenge of the great variability in head pose, lighting intensity and direction, facial expression, and aging. The main purpose of this overview is to describe recent 3D face recognition algorithms. In the last few years, more and more 2D face recognition algorithms have been improved and tested on less-than-perfect images. However, 3D models hold more information about the face, such as surface information, that can be used for face recognition or subject discrimination. Another major advantage is that 3D face recognition is pose invariant. A disadvantage of most presented 3D face recognition methods is that they still treat the human face as a rigid object. This means that the methods are not capable of handling facial expressions. Although 2D face recognition still seems to outperform the 3D face recognition methods, it is expected that this will change in the near future.

Proceedings ArticleDOI
31 Aug 2005
TL;DR: Experimental results demonstrate that the proposed approach can efficiently be used as an automatic text detection system that is robust to font size, font color, background complexity and language.
Abstract: In this paper, an algorithm is proposed for detecting text in images and video frames. It proceeds in three steps: edge detection, text candidate detection and text refinement. First, edge detection is applied to obtain four edge maps in the horizontal, vertical, up-right, and up-left directions. Second, features are extracted from the four edge maps to represent the texture property of text, and the k-means algorithm is applied to detect the initial text candidates. Finally, the text areas are identified by empirical rule analysis and refined through projection profile analysis. Experimental results demonstrate that the proposed approach can efficiently be used as an automatic text detection system that is robust to font size, font color, background complexity and language.
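A rough NumPy rendering of the first two steps, assuming simple 3x3 directional kernels and per-block mean edge strength as the texture feature (the paper's exact kernels, block size, and feature definition are not given here):

import numpy as np
from scipy.ndimage import convolve
from scipy.cluster.vq import kmeans2

# illustrative kernels for horizontal, vertical, up-right, up-left edges
KERNELS = [np.array(k, float) for k in (
    [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],
    [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],
    [[0, 1, 1], [-1, 0, 1], [-1, -1, 0]],
    [[1, 1, 0], [1, 0, -1], [0, -1, -1]],
)]

def text_candidates(gray, block=16, k=2):
    # mean edge strength per block in four directions, then k-means
    maps = [np.abs(convolve(gray, kern)) for kern in KERNELS]
    h, w = gray.shape
    feats, coords = [], []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            feats.append([m[y:y + block, x:x + block].mean() for m in maps])
            coords.append((y, x))
    _, labels = kmeans2(np.array(feats), k, minit='++')
    return coords, labels   # the denser-edge cluster holds text candidates

coords, labels = text_candidates(np.random.rand(128, 128))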

Proceedings ArticleDOI
20 Jun 2005
TL;DR: The face recognition grand challenge dataset is used to evaluate hierarchical graph matching (HGM), a universal approach to 2D and 3D face recognition; HGM yields the best results presented at the recent FRGC workshop, 2D face recognition is significantly more accurate than 3D face recognition, and the fusion of both modalities leads to a further improvement over the 2D results.
Abstract: The extension of 2D image-based face recognition methods with respect to 3D shape information and the fusion of both modalities is one of the main topics in the recent development of facial recognition. In this paper we discuss different strategies and their expected benefit for the fusion of 2D and 3D face recognition. The face recognition grand challenge (FRGC) provides for the first time ever a public benchmark dataset of a suitable size to evaluate the accuracy of both 2D and 3D face recognition. We use this benchmark to evaluate hierarchical graph matching (HGM), a universal approach to 2D and 3D face recognition, and demonstrate the benefit of different fusion strategies. The results show that HGM yields the best results presented at the recent FRGC workshop, that 2D face recognition is significantly more accurate than 3D face recognition and that the fusion of both modalities leads to a further improvement of the 2D results.

Proceedings ArticleDOI
20 Jun 2005
TL;DR: Two algorithms for detecting face anchor points in the context of face verification are presented, one for frontal images and one for arbitrary pose, demonstrating the challenges in 3D face recognition under arbitrary pose and expression.
Abstract: This paper outlines methods to detect key anchor points in 3D face scanner data. These anchor points can be used to estimate the pose and then match the test image to a 3D face model. We present two algorithms for detecting face anchor points in the context of face verification: one for frontal images and one for arbitrary pose. We achieve 99% success in finding anchor points in frontal images and 86% success in scans with large variations in pose and changes in expression. These results demonstrate the challenges in 3D face recognition under arbitrary pose and expression. We are currently working on robust fitting algorithms to localize the anchor points more precisely for arbitrary pose images.

Proceedings ArticleDOI
20 Jun 2005
TL;DR: Using a non-parametric density estimation method over a joint domain-range representation of image pixels, multi-modal spatial uncertainties and complex dependencies between the domain and range are directly modeled, and temporal persistence is proposed as a detection criterion.
Abstract: Detecting moving objects using stationary cameras is an important precursor to many activity recognition, object recognition and tracking algorithms. In this paper, three innovations are presented over existing approaches. First, the model of the intensities of image pixels as independently distributed random variables is challenged and it is asserted that useful correlation exists in the intensities of spatially proximal pixels. This correlation is exploited to sustain high levels of detection accuracy in the presence of nominal camera motion and dynamic textures. By using a non-parametric density estimation method over a joint domain-range representation of image pixels, multi-modal spatial uncertainties and complex dependencies between the domain (location) and range (color) are directly modeled. Second, temporal persistence is proposed as a detection criterion. Unlike previous approaches to object detection which detect objects by building adaptive models of only the background, the foreground is also modeled to augment the detection of objects (without explicit tracking) since objects detected in a preceding frame contain substantial evidence for detection in a current frame. Third, the background and foreground models are used competitively in a MAP-MRF decision framework, stressing spatial context as a condition of pixel-wise labeling, and the posterior function is maximized efficiently using graph cuts. Experimental validation of the proposed method is presented on a diverse set of dynamic scenes.

01 Jan 2005
TL;DR: In this paper, a framework for image object-based change detection is proposed which breaks down the n-dimensional problem into two main aspects, geometry and thematic content; these can be associated with the question: did a certain classified object change geometrically, class-wise, or both?
Abstract: With the advent of high resolution satellite imagery and airborne digital camera data, approaches that include contextual information are more commonly utilized. One way to include spatial dimensions in image analysis is to identify relatively homogeneous regions and to treat them as objects. Although segmentation is not a new concept, the number of image-segmentation-based applications has recently increased significantly. Concurrently, new methodological challenges arise. Standard change detection and accuracy assessment techniques mainly rely on statistically assessing individual pixels. Such assessments are not satisfactory for image objects, which exhibit shape, boundary, homogeneity or topological information. These additional dimensions of information describing real-world objects have to be assessed in multitemporal object-based image analysis. In this paper, problems associated with multitemporal object recognition are identified and a framework for image object-based change detection is suggested. For simplicity, this framework breaks down the n-dimensional problem into two main aspects, geometry and thematic content. These two aspects can be associated with the following questions: did a certain classified object change geometrically, class-wise, or both? When can we identify an object in one data set as being the same object in another data set? Do we need user-defined or application-specific thresholds for geometric overlap, shape-area relations, centroid movements, etc.? This paper elucidates some specific challenges to change detection of objects and incorporates GIS functionality into image analysis.
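The geometric side of those questions can be phrased over raster object masks, with the overlap threshold being exactly the kind of user-defined parameter the paper asks about. A sketch under those assumptions:

import numpy as np

def object_change(mask_t1, mask_t2, class_t1, class_t2, min_overlap=0.5):
    # masks: boolean rasters of one classified object at two dates;
    # min_overlap is a user-defined threshold, as the paper discusses
    inter = np.logical_and(mask_t1, mask_t2).sum()
    union = np.logical_or(mask_t1, mask_t2).sum()
    iou = inter / union if union else 0.0
    c1 = np.array(np.nonzero(mask_t1)).mean(axis=1)   # centroid (row, col)
    c2 = np.array(np.nonzero(mask_t2)).mean(axis=1)
    return {
        "same_object": iou >= min_overlap,     # identity across dates
        "geometric_change": 0.0 < iou < 1.0,   # any imperfect overlap here;
                                               # shape-area relations would
                                               # refine this in practice
        "class_change": class_t1 != class_t2,
        "centroid_shift": float(np.linalg.norm(c1 - c2)),
    }

m1 = np.zeros((64, 64), bool); m1[10:30, 10:30] = True
m2 = np.zeros((64, 64), bool); m2[14:34, 12:32] = True
report = object_change(m1, m2, "building", "building")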

Proceedings ArticleDOI
17 Oct 2005
TL;DR: This work explores a hybrid generative/discriminative approach using 'Fisher kernels' by Jaakkola and Haussler (1999) which retains most of the desirable properties of generative methods, while increasing the classification performance through a discriminative setting.
Abstract: Learning models for detecting and classifying object categories is a challenging problem in machine vision. While discriminative approaches to learning and classification have, in principle, superior performance, generative approaches provide many useful features, one of which is the ability to naturally establish explicit correspondence between model components and scene features - this, in turn, allows for the handling of missing data and unsupervised learning in clutter. We explore a hybrid generative/discriminative approach using 'Fisher kernels' by Jaakkola and Haussler (1999) which retains most of the desirable properties of generative methods, while increasing the classification performance through a discriminative setting. Furthermore, we demonstrate how this kernel framework can be used to combine different types of features and models into a single classifier. Our experiments, conducted on a number of popular benchmarks, show strong performance improvements over the corresponding generative approach and are competitive with the best results reported in the literature.
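The Fisher-kernel construction is generic: map each example to the gradient of the generative model's log-likelihood with respect to its parameters, then take inner products of those score vectors. A single 1D Gaussian stands in below for the richer generative models used in the paper.

import numpy as np

def fisher_score(x, mu, sigma):
    # gradient of log N(x; mu, sigma^2) with respect to (mu, sigma)
    d_mu = (x - mu) / sigma ** 2
    d_sigma = ((x - mu) ** 2 - sigma ** 2) / sigma ** 3
    return np.array([d_mu, d_sigma])

def fisher_kernel(x1, x2, mu, sigma):
    # inner product of Fisher scores; the information matrix is taken
    # as the identity, a common simplification
    return fisher_score(x1, mu, sigma) @ fisher_score(x2, mu, sigma)

x = np.array([0.3, 1.8, -0.7])
K = np.array([[fisher_kernel(a, b, mu=0.0, sigma=1.0) for b in x] for a in x])

The resulting kernel matrix can then be handed to any discriminative learner such as an SVM, which is where the hybrid's discriminative gains come from.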

Patent
21 Jan 2005
TL;DR: A display arrangement comprises an image display device having two or more sets of images for display, a camera directed towards positions adopted by users viewing the display, and a face detector for detecting human faces in images captured by the camera, the face detector being arranged to detect faces in at least two face categories as discussed by the authors.
Abstract: A display arrangement comprises an image display device having two or more sets of images for display; a camera directed towards positions adopted by users viewing the display; a face detector for detecting human faces in images captured by the camera, the face detector being arranged to detect faces in at least two face categories; and means, responsive to the frequency of detection of categories of faces by the face detector at one or more different periods, for selecting a set of images to be displayed on the image display device at that time of day.

Proceedings ArticleDOI
12 Dec 2005
TL;DR: This paper compares the two recently developed systems ARTag and ARToolkit Plus on their reliability, detection rates, and immunity to lighting and occlusion.
Abstract: Fiducial marker systems are systems of unique patterns and computer vision algorithms that help solve the correspondence problem, automatically finding features in different camera images that belong to the same object point in the world. Fiducial marker systems consist of patterns that are mounted in the environment and automatically detected in digital images using an accompanying detection algorithm, useful for augmented reality (AR), robot navigation, 3D modeling, and other applications. This paper compares the two recently developed systems ARTag and ARToolkit Plus on their reliability, detection rates, and immunity to lighting and occlusion. Processing in fiducial systems is defined in two stages, unique feature detection and verification/identification. The systems are compared with respect to these stages, and experimental results are shown.

Proceedings ArticleDOI
20 Jun 2005
TL;DR: 3D face recognition has lately been attracting ever increasing attention and this paper complements other reviews in the face biometrics area by focusing on the sensor technology, and by detailing the efforts in 3D face modelling and 3D assisted 2D face matching.
Abstract: 3D face recognition has lately been attracting ever increasing attention. In this paper we review the full spectrum of 3D face processing technology, from sensing to recognition. The review covers 3D face modelling, 3D to 3D and 3D to 2D registration, 3D based recognition and 3D assisted 2D based recognition. The fusion of 2D and 3D modalities is also addressed. The paper complements other reviews in the face biometrics area by focusing on the sensor technology, and by detailing the efforts in 3D face modelling and 3D assisted 2D face matching.

Proceedings ArticleDOI
20 Jun 2005
TL;DR: This paper presents an effective method to automatically extract the ROI of the facial surface, which mainly depends on automatic detection of the facial bilateral symmetry plane and localization of the nose tip, and builds a reference plane through the nose tip for calculating relative depth values.
Abstract: This paper addresses 3D face recognition from facial shape. First, we present an effective method to automatically extract the ROI of the facial surface, which mainly depends on automatic detection of the facial bilateral symmetry plane and localization of the nose tip. Then we build a reference plane through the nose tip for calculating the relative depth values. Considering the non-rigid property of the facial surface, the ROI is triangulated and parameterized into an isomorphic 2D planar circle, attempting to preserve the intrinsic geometric properties. At the same time the relative depth values are also mapped. Finally we perform eigenface on the mapped relative depth image. The entire scheme is insensitive to pose variation. The experiment using the FRGC database v1.0 obtains a rank-1 identification score of 95%, which outperforms the PCA baseline method by 4% and demonstrates the effectiveness of our algorithm.

Proceedings ArticleDOI
20 Jun 2005
TL;DR: This paper describes a system for pedestrian detection in stereo infrared images based on three different underlying approaches: warm area detection, edge-based detection, and v-disparity computation.
Abstract: This paper describes a system for pedestrian detection in stereo infrared images. The system is based on three different underlying approaches: warm area detection, edge-based detection, and v-disparity computation. Stereo is also used for computing the distance and size of detected objects. A final validation process is performed using head morphological and thermal characteristics. Neither temporal correlation nor motion cues are used in this processing. The developed system has been implemented on an experimental vehicle equipped with two infrared cameras and preliminarily tested in different situations.

Patent
Dalong Jiang, Hong-Jiang Zhang, Lei Zhang, Shuicheng Yan, Yuxiao Hu
29 Apr 2005
TL;DR: In this article, a method and system is provided for generating 3D images of faces from 2D images, for generating 2D images of the faces under different image conditions from the 3D images, and for recognizing a 2D image of a target face based on the generated 2D images.
Abstract: A method and system for generating 3D images of faces from 2D images, for generating 2D images of the faces under different image conditions from the 3D images, and for recognizing a 2D image of a target face based on the generated 2D images is provided. The recognition system provides a 3D model of a face that includes a 3D image of a standard face under a standard image condition and parameters indicating variations of an individual face from the standard face. To generate the 3D image of a face, the recognition system inputs a 2D image of the face under a standard image condition. The recognition system then calculates parameters that map the points of the 2D image to the corresponding points of a 2D image of the standard face. The recognition system uses these parameters with the 3D model to generate 3D images of the face under different image conditions.

Proceedings ArticleDOI
15 Oct 2005
TL;DR: A novel on-line conservative learning framework for an object detection system that uses reconstructive and discriminative classifiers in an iterative co-training fashion to arrive at increasingly better object detectors.
Abstract: We present a novel on-line conservative learning framework for an object detection system. All algorithms operate in an on-line mode; in particular, we also present a novel on-line AdaBoost method. The basic idea is to start with a very simple object detection system and to exploit a huge amount of unlabeled video data by being very conservative in selecting training examples. The key idea is to use reconstructive and discriminative classifiers in an iterative co-training fashion to arrive at increasingly better object detectors. We demonstrate the framework on a surveillance task where we learn person detectors that are tested on two surveillance video sequences. We start with a simple moving object classifier and proceed with incremental PCA (on shape and appearance) as a reconstructive classifier, which in turn generates a training set for a discriminative on-line AdaBoost classifier.
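A self-contained skeleton of the conservative loop (recon_score and disc_update are placeholder names, not the authors' API): the reconstructive model labels only the detections it is very sure about, and those feed the discriminative on-line booster.

import numpy as np

def conservative_cotrain(patches, recon_score, disc_update, accept=0.95):
    # recon_score / disc_update stand in for the reconstructive model
    # (e.g. incremental PCA) and the on-line boosted classifier
    for patch in patches:
        conf = recon_score(patch)      # high = well explained as "person"
        if conf > accept:
            disc_update(patch, +1)     # confident positive example
        elif conf < 1.0 - accept:
            disc_update(patch, -1)     # confident negative example
        # anything in between stays unlabeled: the conservative part

# toy stand-ins: reconstruction confidence from a fixed 8-component basis
basis = np.linalg.qr(np.random.randn(64, 8))[0]
recon = lambda p: 1.0 - np.linalg.norm(p - basis @ (basis.T @ p)) / np.linalg.norm(p)
updates = []
conservative_cotrain([np.random.rand(64) for _ in range(20)],
                     recon, lambda p, y: updates.append(y))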

Proceedings ArticleDOI
17 Oct 2005
TL;DR: A closely coupled object detection and segmentation algorithm is proposed that enhances both processes in a cooperative and iterative manner, improving both segmentation and detection.
Abstract: We propose a closely coupled object detection and segmentation algorithm for enhancing both processes in a cooperative and iterative manner. Figure-ground segmentation reduces the effect of background clutter on template matching; the matched template provides shape constraints on segmentation. More precisely, we estimate the probability of each pixel belonging to the foreground by a weighted sum of the estimates based on shape and color alone. The weight on the shape-based estimate is related to the probability that a familiar object is present and is updated dynamically so that we enforce shape constraints only where the object is present. Experiments on detecting people in images of cluttered scenes demonstrate that the proposed algorithm improves both segmentation and detection. More accurate object boundaries are extracted; higher object detection rates and lower false alarm rates are achieved than performing the two processes separately or sequentially.
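The per-pixel combination reads directly as a weighted sum; the rule below is a plausible reading of the abstract (weighting the shape cue by the probability that a familiar object is present), not the authors' exact formula.

import numpy as np

def foreground_probability(p_shape, p_color, p_object):
    # weight the shape-based estimate by the probability that a familiar
    # object is present; elsewhere the color model dominates
    return p_object * p_shape + (1.0 - p_object) * p_color

p_shape = np.random.rand(48, 32)   # estimate from the matched template
p_color = np.random.rand(48, 32)   # estimate from the color model alone
p_fg = foreground_probability(p_shape, p_color, p_object=0.8)
mask = p_fg > 0.5                  # figure-ground segmentation

Iterating (segment, re-match the template, update p_object, re-segment) is what couples the two processes.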

Patent
Jan Erik Solem, Fredrik Kahl
11 Aug 2005
TL;DR: In this article, a statistical shape model is used to recover 3D shapes from a 2D representation of the 3D object and compare the recovered 3D shape with known 3D to 2D representations of at least one object of the object class.
Abstract: A method, device, system, and computer program for object recognition of a 3D object of a certain object class using a statistical shape model for recovering 3D shapes from a 2D representation of the 3D object and comparing the recovered 3D shape with known 3D to 2D representations of at least one object of the object class.

Journal ArticleDOI
TL;DR: A probabilistic approach to image orientation detection via confidence-based integration of low-level and semantic cues within a Bayesian framework is developed; it is an attempt to bridge the gap between computer and human vision systems and is applicable to other problems involving semantic scene content understanding.
Abstract: Automatic image orientation detection for natural images is a useful, yet challenging research topic. Humans use scene context and semantic object recognition to identify the correct image orientation. However, it is difficult for a computer to perform the task in the same way because current object recognition algorithms are extremely limited in their scope and robustness. As a result, existing orientation detection methods were built upon low-level vision features such as spatial distributions of color and texture. Discrepant detection rates have been reported for these methods in the literature. We have developed a probabilistic approach to image orientation detection via confidence-based integration of low-level and semantic cues within a Bayesian framework. Our current accuracy is 90 percent for unconstrained consumer photos, impressive given the findings of a psychophysical study conducted recently. The proposed framework is an attempt to bridge the gap between computer and human vision systems and is applicable to other problems involving semantic scene content understanding.

Proceedings ArticleDOI
17 Oct 2005
TL;DR: A new boosting paradigm to achieve detection of events in video that has the capability to improve weak classifiers by allowing them to use previous history in evaluating the current frame and a learning mechanism built into the boosting paradigm which allows event level decisions to be made.
Abstract: This paper contributes a new boosting paradigm to achieve detection of events in video. Previous boosting paradigms in vision focus on single frame detection and do not scale to video events. Thus new concepts need to be introduced to address questions such as determining if an event has occurred, localizing the event, handling the same action performed at different speeds, incorporating previous classifier responses into the current decision, and using temporal consistency of the data to aid detection and recognition. The proposed method has the capability to improve weak classifiers by allowing them to use previous history in evaluating the current frame. A learning mechanism built into the boosting paradigm is also given which allows event-level decisions to be made. This is contrasted with previous work in boosting which uses limited higher level temporal reasoning and essentially makes object detection decisions at the frame level. Our approach makes extensive use of the temporal continuity of video at the classifier and detector levels. We also introduce a relevant set of activity features. Features are evaluated at multiple zoom levels to improve detection. We show results for a system that is able to recognize 11 actions.

Patent
28 Mar 2005
TL;DR: In this paper, the image object is recognized using pose-specific object recognizers that use the outputs of pose-specific object detectors and the fused output of those detectors.
Abstract: Methods for image processing for detecting and recognizing an image object include detecting an image object using pose-specific object detectors, and performing fusion of the outputs from the pose-specific object detectors. The image object is recognized using pose-specific object recognizers that use outputs from the pose-specific object detectors and the fused output of the pose-specific object detectors; and by performing fusion of the outputs of the pose-specific object recognizers to recognize the image object.

Proceedings ArticleDOI
05 Jan 2005
TL;DR: This paper presents a method utilizing the registered 2D color and range image of a face to automatically identify the eyes, nose, and mouth, focusing on the 2D color information so that the algorithm runs as fast as possible.
Abstract: As interest in 3D face recognition increases, so does the importance of the initial alignment problem. In this paper we present a method utilizing the registered 2D color and range image of a face to automatically identify the eyes, nose, and mouth. These features are important for initially aligning faces in both standard 2D and 3D face recognition algorithms. For the algorithm to run as fast as possible, we focus on the 2D color information. This allows the algorithm to run in approximately 4 seconds on a 640×480 image with registered range data. On a database of 1,500 images the algorithm achieved a facial feature detection rate of 99.6%, with 0.4% of the images skipped due to hair obstructing the face.