
Showing papers on "3D single-object recognition published in 2006"


Journal ArticleDOI
TL;DR: A novel representation for three-dimensional objects in terms of local affine-invariant descriptors of their images and the spatial relationships between the corresponding surface patches is introduced, allowing the acquisition of true 3D affine and Euclidean models from multiple unregistered images, as well as their recognition in photographs taken from arbitrary viewpoints.
Abstract: This article introduces a novel representation for three-dimensional (3D) objects in terms of local affine-invariant descriptors of their images and the spatial relationships between the corresponding surface patches. Geometric constraints associated with different views of the same patches under affine projection are combined with a normalized representation of their appearance to guide matching and reconstruction, allowing the acquisition of true 3D affine and Euclidean models from multiple unregistered images, as well as their recognition in photographs taken from arbitrary viewpoints. The proposed approach does not require a separate segmentation stage, and it is applicable to highly cluttered scenes. Modeling and recognition results are presented.

458 citations


Proceedings ArticleDOI
17 Jun 2006
TL;DR: This paper investigates the application of the SIFT approach in the context of face authentication, and proposes and tests different matching schemes using the BANCA database and protocol, showing promising results.
Abstract: Several pattern recognition and classification techniques have been applied to the biometrics domain. Among them, an interesting technique is the Scale Invariant Feature Transform (SIFT), originally devised for object recognition. Although SIFT features have emerged as very powerful image descriptors, their use in the face analysis context has never been systematically investigated. This paper investigates the application of the SIFT approach in the context of face authentication. In order to determine the real potential and applicability of the method, different matching schemes are proposed and tested using the BANCA database and protocol, showing promising results.

386 citations
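A core step in SIFT-based matching schemes like the one above is Lowe's nearest/second-nearest-neighbour ratio test, which discards ambiguous correspondences. A minimal pure-Python sketch (the function name and toy 2-D descriptors are ours for illustration; real SIFT descriptors are 128-dimensional):

```python
import math

def match_ratio_test(desc_a, desc_b, ratio=0.8):
    """Match two descriptor sets with Lowe's ratio test.

    desc_a, desc_b: lists of equal-length feature vectors (toy tuples here).
    A match (i, j) is kept only if the nearest neighbour of desc_a[i] is
    significantly closer than its second-nearest neighbour.
    """
    matches = []
    for i, a in enumerate(desc_a):
        # Distances from descriptor a to every descriptor in desc_b
        dists = sorted((math.dist(a, b), j) for j, b in enumerate(desc_b))
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((i, dists[0][1]))
    return matches
```

The ratio test rejects a match when a second candidate is nearly as close as the best one, which is exactly the situation where repetitive texture or clutter makes the correspondence unreliable.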


Proceedings ArticleDOI
17 Jun 2006
TL;DR: The Implicit Shape Model for object class detection is combined with the multi-view specific object recognition system of Ferrari et al. to detect object instances from arbitrary viewpoints.
Abstract: We present a novel system for generic object class detection. In contrast to most existing systems which focus on a single viewpoint or aspect, our approach can detect object instances from arbitrary viewpoints. This is achieved by combining the Implicit Shape Model for object class detection proposed by Leibe and Schiele with the multi-view specific object recognition system of Ferrari et al. After learning single-view codebooks, these are interconnected by so-called activation links, obtained through multi-view region tracks across different training views of individual object instances. During recognition, these integrated codebooks work together to determine the location and pose of the object. Experimental results demonstrate the viability of the approach and compare it to a bank of independent single-view detectors.

268 citations


Proceedings ArticleDOI
17 Jun 2006
TL;DR: The performance of the proposed multi-object class detection approach is competitive with state-of-the-art approaches dedicated to a single object class recognition problem.
Abstract: In this paper we propose an approach capable of simultaneous recognition and localization of multiple object classes using a generative model. A novel hierarchical representation allows us to represent individual images as well as various object classes in a single, scale- and rotation-invariant model. The recognition method is based on a codebook representation where appearance clusters built from edge-based features are shared among several object classes. A probabilistic model allows for reliable detection of various objects in the same image. The approach is highly efficient due to fast clustering and matching methods capable of dealing with millions of high-dimensional features. The system shows excellent performance on several object categories over a wide range of scales, in-plane rotations, background clutter, and partial occlusions. The performance of the proposed multi-object class detection approach is competitive with state-of-the-art approaches dedicated to a single object class recognition problem.

266 citations


Book ChapterDOI
TL;DR: This chapter describes a system for constructing 3D metric models from multiple images taken with an uncalibrated handheld camera, recognizing these models in new images, and precisely solving for object pose.
Abstract: Many applications of 3D object recognition, such as augmented reality or robotic manipulation, require an accurate solution for the 3D pose of the recognized objects. This is best accomplished by building a metrically accurate 3D model of the object and all its feature locations, and then fitting this model to features detected in new images. In this chapter, we describe a system for constructing 3D metric models from multiple images taken with an uncalibrated handheld camera, recognizing these models in new images, and precisely solving for object pose. This is demonstrated in an augmented reality application where objects must be recognized, tracked, and superimposed on new images taken from arbitrary viewpoints without perceptible jitter. This approach not only provides for accurate pose, but also allows for integration of features from multiple training images into a single model that provides for more reliable recognition.

196 citations


Journal ArticleDOI
TL;DR: In this paper, a novel object recognition approach based on affine invariant regions is presented, which actively counters the problems related to the limited repeatability of the region detectors, and the difficulty of matching, in the presence of large amounts of background clutter and particularly challenging viewing conditions.
Abstract: We present a novel object recognition approach based on affine invariant regions. It actively counters the problems related to the limited repeatability of the region detectors, and the difficulty of matching, in the presence of large amounts of background clutter and particularly challenging viewing conditions. After producing an initial set of matches, the method gradually explores the surrounding image areas, recursively constructing more and more matching regions, increasingly farther from the initial ones. This process covers the object with matches, and simultaneously separates the correct matches from the wrong ones. Hence, recognition and segmentation are achieved at the same time. The approach includes a mechanism for capturing the relationships between multiple model views and exploiting these for integrating the contributions of the views at recognition time. This is based on an efficient algorithm for partitioning a set of region matches into groups lying on smooth surfaces. Integration is achieved by measuring the consistency of configurations of groups arising from different model views. Experimental results demonstrate the strength of the approach in dealing with extensive clutter, dominant occlusion, and large scale and viewpoint changes. Non-rigid deformations are explicitly taken into account, and the approximate contours of the object are produced. All presented techniques can extend any viewpoint-invariant feature extractor.

186 citations


Journal ArticleDOI
TL;DR: A method for automatically obtaining object representations suitable for retrieval from generic video shots that includes associating regions within a single shot to represent a deforming object and an affine factorization method that copes with motion degeneracy.
Abstract: We describe a method for automatically obtaining object representations suitable for retrieval from generic video shots. The object representation consists of an association of frame regions. These regions provide exemplars of the object's possible visual appearances. Two ideas are developed: (i) associating regions within a single shot to represent a deforming object; (ii) associating regions from the multiple visual aspects of a 3D object, thereby implicitly representing 3D structure. For the association we exploit temporal continuity (tracking) and wide baseline matching of affine covariant regions. In the implementation there are three areas of novelty: First, we describe a method to repair short gaps in tracks. Second, we show how to join tracks across occlusions (where many tracks terminate simultaneously). Third, we develop an affine factorization method that copes with motion degeneracy. We obtain tracks that last throughout the shot, without requiring a 3D reconstruction. The factorization method is used to associate tracks into object-level groups, with common motion. The outcome is that separate parts of an object that are not simultaneously visible (such as the front and back of a car, or the front and side of a face) are associated together. In turn this enables object-level matching and recognition throughout a video. We illustrate the method on the feature film "Groundhog Day." Examples are given for the retrieval of deforming objects (heads, walking people) and rigid objects (vehicles, locations).

162 citations


Journal ArticleDOI
TL;DR: A recognition framework based on the concept of the so-called generic learning is introduced as an attempt to boost the performance of traditional appearance-based recognition solutions in the one training sample application scenario.

108 citations


Book ChapterDOI
13 May 2006
TL;DR: An instance based machine learning algorithm and system for real-time object classification and human action recognition which can help to build intelligent surveillance systems are presented.
Abstract: In this paper we present an instance-based machine learning algorithm and system for real-time object classification and human action recognition which can help to build intelligent surveillance systems. The proposed method makes use of object silhouettes to classify objects and the actions of humans present in a scene monitored by a stationary camera. An adaptive background subtraction model is used for object segmentation. A template-matching-based supervised learning method is adopted to classify objects into classes such as human, human group, and vehicle, and human actions into predefined classes such as walking, boxing, and kicking, by making use of object silhouettes.

104 citations
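The adaptive background subtraction step described above can be sketched as a running-average model over grayscale frames. This is a simplified stand-in for whatever model the authors actually use; the function names, the learning rate, and the threshold are ours:

```python
def update_background(bg, frame, alpha=0.05):
    """Running-average background model: bg <- (1 - alpha)*bg + alpha*frame.

    bg and frame are 2-D lists of grayscale values; alpha controls how
    quickly the background adapts to scene changes.
    """
    return [[(1 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
            for brow, frow in zip(bg, frame)]

def silhouette(bg, frame, thresh=30):
    """Binary foreground mask: 1 where the frame differs from the
    background by more than the threshold, else 0."""
    return [[1 if abs(f - b) > thresh else 0 for b, f in zip(brow, frow)]
            for brow, frow in zip(bg, frame)]
```

The resulting binary silhouettes are what a template-matching classifier would then compare against stored class exemplars.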


Journal ArticleDOI
TL;DR: A new scheme that merges color- and shape-invariant information for object recognition and is robust against changing illumination, camera viewpoint, object pose, and noise is proposed.
Abstract: In this paper, we propose a new scheme that merges color- and shape-invariant information for object recognition. To obtain robustness against photometric changes, color-invariant derivatives are computed first. Color invariance is an important aspect of any object recognition scheme, as color changes considerably with the variation in illumination, object pose, and camera viewpoint. These color invariant derivatives are then used to obtain similarity invariant shape descriptors. Shape invariance is equally important as, under a change in camera viewpoint and object pose, the shape of a rigid object undergoes a perspective projection on the image plane. Then, the color and shape invariants are combined in a multidimensional color-shape context which is subsequently used as an index. As the indexing scheme makes use of a color-shape invariant context, it provides a highly discriminative information cue robust against varying imaging conditions. The matching function of the color-shape context allows for fast recognition, even in the presence of object occlusion and clutter. Experimental results show that the method recognizes rigid objects with high accuracy in 3-D complex scenes and is robust against changing illumination, camera viewpoint, object pose, and noise.

97 citations
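As a toy illustration of the kind of photometric invariance such schemes build on (the paper itself uses color-invariant derivatives; this sketch only shows the simpler normalized-rgb chromaticity invariant):

```python
def normalized_rgb(pixel):
    """Map an (R, G, B) pixel to intensity-normalized chromaticity (r, g, b).

    Under a uniform scaling of illumination intensity,
    (R, G, B) -> (s*R, s*G, s*B), the chromaticities are unchanged --
    a basic photometric invariant.
    """
    r, g, b = pixel
    s = r + g + b
    if s == 0:
        return (0.0, 0.0, 0.0)
    return (r / s, g / s, b / s)
```

Doubling the brightness of a pixel leaves its chromaticity untouched, which is why descriptors built on such quantities survive illumination changes.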


Proceedings ArticleDOI
20 Aug 2006
TL;DR: A new DBN model structure with state duration to model human interacting activities based on dynamic Bayesian network is proposed, which combines the global features with local ones harmoniously.
Abstract: Activity recognition is significant in intelligent surveillance. In this paper, we present a novel approach to the recognition of interacting activities based on dynamic Bayesian network (DBN). In this approach the features representing the object motion are divided into two classes: global features and local features, which are at two different spatial scales. Global features describe object motion at a large spatial scale and relations between objects or between the object and environment, and local ones represent the motion details of objects of interest. We propose a new DBN model structure with state duration to model human interacting activities. This DBN model structure combines the global features with local ones harmoniously. The effectiveness of this novel approach is demonstrated by experiment.

Patent
29 Aug 2006
TL;DR: In this article, a mobile device is used to electronically capture image data of a real-world object; the image data is used to identify information related to the real-world object, and that information is used to interact with software to control an aspect of an electronic game or a second device local to the mobile device.
Abstract: Systems and methods of interacting with a virtual space, in which a mobile device is used to electronically capture image data of a real-world object, the image data is used to identify information related to the real-world object, and the information is used to interact with software to control at least one of: (a) an aspect of an electronic game; and (b) a second device local to the mobile device. Contemplated systems and methods can be used for gaming, in which the image data can be used to identify a name of the real-world object, to classify it, to identify it as a player in the game, or to identify it as a goal object or as otherwise having some other value in the game.

Proceedings ArticleDOI
07 Jun 2006
TL;DR: By studying face geometry, this work is able to determine which type of facial expression has been carried out, thus building an expression classifier which is capable of recognizing faces with different expressions.
Abstract: Face recognition is one of the most intensively studied topics in computer vision and pattern recognition. Facial expression, which changes face geometry, usually has an adverse effect on the performance of a face recognition system. On the other hand, face geometry is a useful cue for recognition. Taking these into account, we utilize the idea of separating geometry and texture information in a face image and model the two types of information by projecting them into separate PCA spaces which are specially designed to capture the distinctive features among different individuals. Subsequently, the texture and geometry attributes are re-combined to form a classifier which is capable of recognizing faces with different expressions. Finally, by studying face geometry, we are able to determine which type of facial expression has been carried out, thus building an expression classifier. Numerical validations of the proposed method are given.
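Projecting into separate PCA spaces, as described above, rests on extracting principal components. A minimal pure-Python sketch of finding the leading component of 2-D samples by power iteration (illustrative only; the real system would operate on high-dimensional texture and geometry vectors):

```python
def leading_component(samples, iters=100):
    """Leading principal component of 2-D samples via power iteration
    on the 2x2 covariance matrix (a pure-Python PCA sketch)."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    centered = [(x - mx, y - my) for x, y in samples]
    # Entries of the 2x2 covariance matrix
    cxx = sum(x * x for x, _ in centered) / n
    cxy = sum(x * y for x, y in centered) / n
    cyy = sum(y * y for _, y in centered) / n
    v = (1.0, 0.0)
    for _ in range(iters):
        # Multiply by the covariance matrix, then renormalize
        w = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
        norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
        v = (w[0] / norm, w[1] / norm)
    return v
```

Projecting each mean-centered sample onto the components found this way gives the low-dimensional coordinates that a PCA-space classifier compares.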

Journal ArticleDOI
TL;DR: The application of this particular visual pattern recognition (ViPR) technology to a variety of robotics applications: object recognition, navigation, manipulation, and human-machine interaction is described.
Abstract: Recent advances in computer vision have given rise to a robust and invariant visual pattern recognition technology that is based on extracting a set of characteristic features from an image. Such features are obtained with the scale invariant feature transform (SIFT), which represents the variations in brightness of the image around the point of interest. Recognition performed with these features has been shown to be quite robust in realistic settings. This paper describes the application of this particular visual pattern recognition (ViPR) technology to a variety of robotics applications: object recognition, navigation, manipulation, and human-machine interaction. The paper also describes the technology in more detail and presents a business case for visual pattern recognition in the field of robotics and automation.

Book ChapterDOI
Stan Z. Li, Rufeng Chu, Meng Ao, Lun Zhang, Ran He
05 Jan 2006
TL;DR: A highly accurate, real-time face recognition system for cooperative user applications is presented; it is based on a local feature representation, and statistical learning is applied to learn the most effective features and classifiers for building the face detection and recognition engines.
Abstract: In this paper, we present a highly accurate, real-time face recognition system for cooperative user applications. The novelties are: (1) a novel design of the camera hardware, and (2) a learning-based procedure for effective face and eye detection and recognition with the resulting imagery. The hardware minimizes the influence of environmental lighting and delivers face images with frontal lighting, which avoids many problems in subsequent face processing. The face detection and recognition algorithms are based on a local feature representation. Statistical learning is applied to learn the most effective features and classifiers for building the face detection and recognition engines. The novel imaging system and the detection and recognition engines are integrated into a powerful face recognition system. Evaluated in a real-world user scenario, a condition harder than a technology evaluation such as the Face Recognition Vendor Tests (FRVT), the system has demonstrated excellent accuracy, speed, and usability.


Proceedings ArticleDOI
01 Jan 2006
TL;DR: A new method for providing insensitivity to expression variation in range images based on Log-Gabor Templates is presented; by decomposing a single image of a subject into 147 observations, it allows high accuracy even in the presence of occlusions, distortions, and facial expressions.
Abstract: The use of Three-Dimensional (3D) data allows new facial recognition algorithms to overcome factors such as pose and illumination variations which have plagued traditional 2D face recognition. In this paper a new method for providing insensitivity to expression variation in range images based on Log-Gabor Templates is presented. By decomposing a single image of a subject into 147 observations, the reliance of the algorithm upon any particular part of the face is relaxed, allowing high accuracy even in the presence of occlusions, distortions, and facial expressions. Using the 3D database collected by the University of Notre Dame for the Face Recognition Grand Challenge (FRGC), benchmarking results are presented showing superior performance of the proposed method. Comparisons showing the relative strength of the algorithm against two commercial and two academic 3D face recognition algorithms are also presented.

Journal ArticleDOI
TL;DR: The formulation of a probabilistic appearance-based face recognition approach is extended to work with multiple images and video sequences and it is shown that regardless of the algorithm used, the recognition results improve considerably when one uses a video sequence rather than a single still.

Book ChapterDOI
07 May 2006
TL;DR: A fully automatic recognition system based on the proposed method and an extensive evaluation on 171 individuals and over 1300 video sequences with extreme illumination, pose and head motion variation that consistently demonstrated a nearly perfect recognition rate is described.
Abstract: In spite of over two decades of intense research, illumination and pose invariance remain prohibitively challenging aspects of face recognition for most practical applications. The objective of this work is to recognize faces using video sequences both for training and recognition input, in a realistic, unconstrained setup in which lighting, pose and user motion pattern have a wide variability and face images are of low resolution. In particular there are three areas of novelty: (i) we show how a photometric model of image formation can be combined with a statistical model of generic face appearance variation, learnt offline, to generalize in the presence of extreme illumination changes; (ii) we use the smoothness of geodesically local appearance manifold structure and a robust same-identity likelihood to achieve invariance to unseen head poses; and (iii) we introduce an accurate video sequence “reillumination” algorithm to achieve robustness to face motion patterns in video. We describe a fully automatic recognition system based on the proposed method and an extensive evaluation on 171 individuals and over 1300 video sequences with extreme illumination, pose and head motion variation. On this challenging data set our system consistently demonstrated a nearly perfect recognition rate (over 99.7%), significantly outperforming state-of-the-art commercial software and methods from the literature.

Patent
John Winn, Jamie Shotton
21 Sep 2006
TL;DR: In this article, a conditional random field is used to force a global part labeling which is substantially layout-consistent and a part label map is inferred from this, which can be used to estimate belief distributions over parts for each image element of a test image.
Abstract: During a training phase we learn parts of images which assist in the object detection and recognition task. A part is a densely represented area of an image of an object to which we assign a unique label. Parts contiguously cover an image of an object to give a part label map for that object. The parts do not necessarily correspond to semantic object parts. During the training phase a classifier is learnt which can be used to estimate belief distributions over parts for each image element of a test image. A conditional random field is used to force a global part labeling which is substantially layout-consistent and a part label map is inferred from this. By recognizing parts we enable object detection and recognition even for partially occluded objects, for multiple-objects of different classes in the same scene, for unstructured and structured objects and allowing for object deformation.

Proceedings ArticleDOI
22 Nov 2006
TL;DR: A video surveillance system aimed at the automatic identification of events of interest, especially of abandoned and stolen objects in a guarded indoor environment, which combines three phases of data processing: object extraction, object recognition and tracking, and decision about actions.
Abstract: This paper describes a video surveillance system aimed at the automatic identification of events of interest, especially abandoned and stolen objects in a guarded indoor environment. The implemented system combines three phases of data processing: object extraction, object recognition and tracking, and decisions about actions. Extracted objects are classified as "human" or "non-human" and as static or dynamic; an event of interest follows from a split between a "human" and a static "non-human" object, and the static "non-human" object is then analyzed to discriminate between abandoned and stolen objects.

Proceedings ArticleDOI
Masahiro Tomono
01 Oct 2006
TL;DR: This paper proposes a framework to integrate dense shape and recognition features into an object model, and shows that an object map of a room was built successfully using the proposed object models.
Abstract: This paper presents a method of object map building using object models created from image sequences captured by a single camera. An object map is a highly structured map, built by placing 3-D object models on the floor plane according to object recognition results. To increase the efficiency of object map building, we propose a framework that integrates dense shape and recognition features into an object model. Experimental results show that an object map of a room was built successfully using the proposed object models.

Proceedings ArticleDOI
14 Jun 2006
TL;DR: A complete object recognition system, based on a 3D laser scanner, reliable contour extraction with floor interpretation, feature extraction using a new, fast eigen-CSS method, and a supervised learning algorithm is proposed.
Abstract: This paper presents a novel object recognition approach based on range images. Due to its insensitivity to illumination, range data is well suited for reliable silhouette extraction. Silhouette or contour descriptions are good sources of information for object recognition. We propose a complete object recognition system, based on a 3D laser scanner, reliable contour extraction with floor interpretation, feature extraction using a new, fast eigen-CSS method, and a supervised learning algorithm. The recognition system was successfully tested on range images acquired with a mobile robot, and the results are compared to standard techniques, i.e., geometric features, Hu and Zernike moments, the border signature method and the angular radial transformation. An evaluation using the receiver operating characteristic analysis completes this paper. The eigen-CSS method has proved to be comparable in detection performance to the top competitors, yet faster than the best one by an order of magnitude in feature extraction time.

Proceedings ArticleDOI
01 Oct 2006
TL;DR: A SLAM scheme based on visual object recognition, rather than mere scene matching, is proposed for home environments without artificial landmarks; experiments show that the final pose error remained bounded after 50 minutes of battery-run-out autonomous navigation.
Abstract: Reliable data association is crucial to localization and map building for mobile robot applications. For that reason, many mobile robots tend to choose vision-based SLAM solutions. In this paper, a SLAM scheme based on visual object recognition, not just scene matching, in a home environment is proposed without using artificial landmarks. For the object-based SLAM, the following algorithms are suggested: 1) a novel local invariant feature extraction combining the advantages of the multi-scale Harris corner as a detector with its SIFT descriptor for natural object recognition, 2) RANSAC clustering for robust object recognition in the presence of outliers, and 3) calculation of accurate metric information for the SLAM update. The proposed algorithms increase robustness through correct data association and accurate observation. Moreover, the scheme can easily be implemented in real time by reducing the number of representative landmarks, i.e., objects. The performance of the proposed algorithm was verified by experiments using EKF-SLAM with a stereo camera in home-like environments, which showed that the final pose error remained bounded after 50 minutes of battery-run-out autonomous navigation.
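The RANSAC step used for robust recognition follows the standard hypothesize-and-verify loop: sample a minimal set, fit a model, count inliers, keep the best. A generic sketch on the classic 2-D line-fitting problem (the paper applies the same idea to clustering feature matches; the function and parameters here are ours):

```python
import random

def ransac_line(points, iters=200, tol=0.5, seed=0):
    """Generic RANSAC sketch: fit a 2-D line y = m*x + c to points
    containing outliers by repeatedly sampling minimal (2-point) sets
    and keeping the hypothesis with the most inliers."""
    rng = random.Random(seed)
    best_inliers, best_model = [], None
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical sample pair, skip this hypothesis
        m = (y2 - y1) / (x2 - x1)
        c = y1 - m * x1
        inliers = [p for p in points if abs(p[1] - (m * p[0] + c)) < tol]
        if len(inliers) > len(best_inliers):
            best_inliers, best_model = inliers, (m, c)
    return best_model, best_inliers
```

Because each hypothesis needs only a minimal sample, a handful of gross outliers cannot drag the final model away from the consensus set.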

Patent
25 Jan 2006
TL;DR: An image recognition apparatus that can increase the recognition rate for a target even when recognition would otherwise deteriorate because good image information on the target cannot be obtained from the captured image alone.
Abstract: An image recognition apparatus is provided which can increase the recognition rate for a recognition target even when the recognition rate would otherwise deteriorate because good image information on the target cannot be obtained from the captured image alone. The apparatus includes an image information obtaining section 3, an imaging position obtaining section 7, a land object information storing section 8, a land object information obtaining section 9 for obtaining, from the storing section 8, the land object information on one or more land objects included within the imaging area of the image information, a determining section 15 for determining, based on the obtained land object information, whether a plurality of recognition target land objects are included within the imaging area, and an image recognizing section 10 for recognizing an image of one recognition target land object, based on the result of image recognition of another recognition target land object and on the positional relationship between the two given by the position information included in the land object information, when the determining section has determined that a plurality of recognition target land objects are included.

Proceedings ArticleDOI
02 Feb 2006
TL;DR: This work takes a very practical look at the automated shape recognition for common industrial tasks and presents a very fast novel approach for the detection of deformed shapes which are in the broadest sense elliptic.
Abstract: The detection of varying 2D shapes is a recurrent task for computer vision applications, and camera-based object recognition has become a standard procedure. Due to the discrete nature of digital images and aliasing effects, shape recognition can be complicated. There are many existing algorithms for identifying circles and ellipses, but they are very often limited in flexibility or speed, or require high-quality input data. Our work considers the application of shape recognition to processes in industrial environments; automation, especially, requires algorithms that are both reliable and fast. We take a very practical look at automated shape recognition for common industrial tasks and present a very fast novel approach for the detection of deformed shapes which are, in the broadest sense, elliptic. Furthermore, we consider the automated recognition of bacteria colonies and of coded markers for both 3D object tracking and an automated camera calibration procedure.

Book ChapterDOI
12 Sep 2006
TL;DR: A novel model for object recognition and detection that follows the widely adopted assumption that objects in images can be represented as a set of loosely coupled parts is presented and yields very competitive results for the commonly used Caltech object detection tasks.
Abstract: We present a novel model for object recognition and detection that follows the widely adopted assumption that objects in images can be represented as a set of loosely coupled parts. In contrast to previous models, the presented method can cope with an arbitrary number of object parts. Here, the object parts are modelled by image patches that are extracted at each position and then efficiently stored in a histogram. In addition to the patch appearance, the positions of the extracted patches are considered and provide a significant increase in the recognition performance. Additionally, a new and efficient histogram comparison method taking into account inter-bin similarities is proposed. The presented method is evaluated for the task of radiograph recognition, where it achieves the best result published so far. Furthermore, it yields very competitive results for the commonly used Caltech object detection tasks.
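The abstract does not give the details of the proposed comparison method, but the quadratic-form distance is one standard way to make a histogram distance respect inter-bin similarities: d(h, g)^2 = (h - g)^T A (h - g), where A[i][j] encodes how similar bins i and j are. The sketch below illustrates the effect with made-up bins and similarity values.

```python
def quadratic_form_distance(h, g, sim):
    """Histogram distance honouring inter-bin similarities:
    d(h, g)^2 = (h - g)^T A (h - g), where A[i][j] is the similarity
    between bins i and j (A = identity recovers squared Euclidean)."""
    d = [hi - gi for hi, gi in zip(h, g)]
    return sum(d[i] * sim[i][j] * d[j]
               for i in range(len(d)) for j in range(len(d)))

# Two histograms whose mass sits in adjacent bins: under an identity
# similarity they look maximally different, while a similarity matrix
# linking neighbouring bins recognises them as close.
h, g = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
neighbour = [[1.0, 0.9, 0.0], [0.9, 1.0, 0.9], [0.0, 0.9, 1.0]]
print(quadratic_form_distance(h, g, identity))             # 2.0
print(round(quadratic_form_distance(h, g, neighbour), 6))  # 0.2
```

For patch histograms this matters because two visually similar patches can fall into neighbouring appearance bins; a bin-wise distance would penalise that heavily, while a similarity-aware one does not.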

Journal Article
TL;DR: In this article, a class-specific edge classification method is proposed to prune edges which are not relevant to the object class and thereby improve the performance of subsequent processing; learning of class-specific edges is demonstrated for a number of object classes under challenging scale and illumination variation.
Abstract: Recent research into recognizing object classes (such as humans, cows and hands) has made use of edge features to hypothesize and localize class instances. However, for the most part, these edge-based methods operate solely on the geometric shape of edges, treating them equally and ignoring the fact that for certain object classes, the appearance of the object on the inside of the edge may provide valuable recognition cues. We show how, for such object classes, small regions around edges can be used to classify the edge into object or non-object. This classifier may then be used to prune edges which are not relevant to the object class, and thereby improve the performance of subsequent processing. We demonstrate learning class specific edges for a number of object classes - oranges, bananas and bottles - under challenging scale and illumination variation. Because class-specific edge classification provides a low-level analysis of the image it may be integrated into any edge-based recognition strategy without significant change in the high-level algorithms. We illustrate its application to two algorithms: (i) chamfer matching for object detection, and (ii) modulating contrast terms in MRF based object-specific segmentation. We show that performance of both algorithms (matching and segmentation) is considerably improved by the class-specific edge labelling.

Book ChapterDOI
13 Dec 2006
TL;DR: It is shown how, for certain object classes, small regions around edges can be used to classify the edge into object or non-object, and performance of both algorithms (matching and segmentation) is considerably improved by the class-specific edge labelling.
Abstract: Recent research into recognizing object classes (such as humans, cows and hands) has made use of edge features to hypothesize and localize class instances. However, for the most part, these edge-based methods operate solely on the geometric shape of edges, treating them equally and ignoring the fact that for certain object classes, the appearance of the object on the “inside” of the edge may provide valuable recognition cues. We show how, for such object classes, small regions around edges can be used to classify the edge into object or non-object. This classifier may then be used to prune edges which are not relevant to the object class, and thereby improve the performance of subsequent processing. We demonstrate learning class specific edges for a number of object classes — oranges, bananas and bottles — under challenging scale and illumination variation. Because class-specific edge classification provides a low-level analysis of the image it may be integrated into any edge-based recognition strategy without significant change in the high-level algorithms. We illustrate its application to two algorithms: (i) chamfer matching for object detection, and (ii) modulating contrast terms in MRF based object-specific segmentation. We show that performance of both algorithms (matching and segmentation) is considerably improved by the class-specific edge labelling.
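The pruning step described above can be illustrated with a toy stand-in for the learned classifier: each edge point carries the mean colour of a small patch beside it, and a nearest-centroid rule keeps only edges whose patch looks like the object class. The centroids, colours, and coordinates below are invented for illustration; the paper learns a proper classifier from training data.

```python
# Toy sketch of class-specific edge pruning: a nearest-centroid rule over
# patch colours stands in for the paper's learned object/non-object
# edge classifier. All values are illustrative.

def prune_edges(edge_patches, object_centroid, background_centroid):
    """Keep edge points whose patch colour is nearer the object centroid."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return [pt for pt, colour in edge_patches
            if dist2(colour, object_centroid) < dist2(colour, background_centroid)]

# (x, y) edge points with RGB patch means; "orange" object vs grey clutter.
edges = [((10, 12), (240, 140, 30)),   # orange-ish -> keep
         ((40, 55), (120, 120, 120)),  # grey clutter -> prune
         ((41, 56), (230, 150, 40))]   # orange-ish -> keep
kept = prune_edges(edges, object_centroid=(245, 145, 35),
                   background_centroid=(128, 128, 128))
print(kept)  # [(10, 12), (41, 56)]
```

Because the surviving edge set is just a filtered version of the original, any downstream edge-based algorithm (e.g. chamfer matching) can consume it unchanged, which is the integration property the abstract emphasizes.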

Proceedings ArticleDOI
01 Oct 2006
TL;DR: The effectiveness of the method, which represents and recognizes visual events using attribute grammars, is demonstrated on the tasks of recognizing vehicle casing in parking lots and events occurring on an airport tarmac.
Abstract: We present a method for representing and recognizing visual events using attribute grammars. In contrast to conventional grammars, attribute grammars are capable of describing features that are not easily represented by finite symbols. Our approach handles multiple concurrent events involving multiple entities by associating unique object identification labels with multiple event threads. Probabilistic parsing and probabilistic conditions on the attributes are used to achieve a robust recognition system. We demonstrate the effectiveness of our method for the task of recognizing vehicle casing in parking lots and events occurring on an airport tarmac.
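A minimal flavour of the attribute-grammar idea is that a production fires only when both the symbol sequence matches and a predicate over the symbols' attributes holds, here a shared object identification label tying primitive events into one thread. The event names, attribute layout, and threshold below are illustrative, not the paper's grammar.

```python
# Hedged sketch: flag "casing"-like behaviour when the same object id
# (the attribute carried by each primitive event) repeatedly generates a
# pass_zone event. Event vocabulary and threshold are made up.

def detect_casing(events, min_passes=3):
    """Return object ids whose pass_zone count reaches min_passes."""
    counts = {}
    for name, attrs in events:
        if name == "pass_zone":
            oid = attrs["id"]
            counts[oid] = counts.get(oid, 0) + 1
    return sorted(oid for oid, c in counts.items() if c >= min_passes)

stream = [("pass_zone", {"id": 7}), ("enter", {"id": 3}),
          ("pass_zone", {"id": 7}), ("pass_zone", {"id": 3}),
          ("pass_zone", {"id": 7})]
print(detect_casing(stream))  # [7]
```

Keying the count on the object id is what lets concurrent events from multiple entities coexist: each id effectively carries its own event thread, as the abstract describes.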