
Showing papers on "3D single-object recognition" published in 2010


Book ChapterDOI
05 Sep 2010
TL;DR: This paper introduces a method that adapts object models acquired in a particular visual domain to new imaging conditions by learning a transformation that minimizes the effect of domain-induced changes in the feature distribution.
Abstract: Domain adaptation is an important emerging topic in computer vision. In this paper, we present one of the first studies of domain shift in the context of object recognition. We introduce a method that adapts object models acquired in a particular visual domain to new imaging conditions by learning a transformation that minimizes the effect of domain-induced changes in the feature distribution. The transformation is learned in a supervised manner and can be applied to categories for which there are no labeled examples in the new domain. While we focus our evaluation on object recognition tasks, the transform-based adaptation technique we develop is general and could be applied to nonimage data. Another contribution is a new multi-domain object database, freely available for download. We experimentally demonstrate the ability of our method to improve recognition on categories with few or no target domain labels and moderate to large changes in the imaging conditions.
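The transform-based idea admits a compact sketch: given corresponding feature pairs from the two domains, fit a linear map that pulls source features toward their target counterparts, then apply that map to new categories. The ridge-regression objective below is an illustrative stand-in for the authors' supervised metric-learning formulation:

```python
import numpy as np

def learn_transform(src, tgt, lam=1e-3):
    """Fit W so that W @ src[i] ~ tgt[i] for paired same-class examples
    from the two domains (ridge regression; a stand-in for the paper's
    learned transformation)."""
    d = src.shape[1]
    return tgt.T @ src @ np.linalg.inv(src.T @ src + lam * np.eye(d))

# Toy domains: "webcam" features are a scaled, shifted copy of "dslr" ones.
rng = np.random.default_rng(0)
dslr = rng.normal(size=(200, 64))
webcam = 0.5 * dslr + 0.1 + rng.normal(scale=0.01, size=dslr.shape)
W = learn_transform(webcam, dslr)
adapted = webcam @ W.T                   # map webcam features into dslr space
print(np.abs(adapted - dslr).mean())     # small residual after adaptation
```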

2,624 citations


Book ChapterDOI
05 Sep 2010
TL;DR: It is argued that the appearance of words in the wild spans this range of difficulties and a new word recognition approach based on state-of-the-art methods from generic object recognition is proposed, in which object categories are considered to be the words themselves.
Abstract: We present a method for spotting words in the wild, i.e., in real images taken in unconstrained environments. Text found in the wild has a surprising range of difficulty. At one end of the spectrum, Optical Character Recognition (OCR) applied to scanned pages of well formatted printed text is one of the most successful applications of computer vision to date. At the other extreme lie visual CAPTCHAs - text that is constructed explicitly to fool computer vision algorithms. Both tasks involve recognizing text, yet one is nearly solved while the other remains extremely challenging. In this work, we argue that the appearance of words in the wild spans this range of difficulties and propose a new word recognition approach based on state-of-the-art methods from generic object recognition, in which we consider object categories to be the words themselves. We compare performance of leading OCR engines - one open source and one proprietary - with our new approach on the ICDAR Robust Reading data set and a new word spotting data set we introduce in this paper: the Street View Text data set. We show improvements of up to 16% on the data sets, demonstrating the feasibility of a new approach to a seemingly old problem.

503 citations


Book ChapterDOI
05 Sep 2010
TL;DR: The results demonstrate that incorporating user input drives up recognition accuracy to levels that are good enough for practical applications, while at the same time, computer vision reduces the amount of human interaction required.
Abstract: We present an interactive, hybrid human-computer method for object classification. The method applies to classes of objects that are recognizable by people with appropriate expertise (e.g., animal species or airplane model), but not (in general) by people without such expertise. It can be seen as a visual version of the 20 questions game, where questions based on simple visual attributes are posed interactively. The goal is to identify the true class while minimizing the number of questions asked, using the visual content of the image. We introduce a general framework for incorporating almost any off-the-shelf multi-class object recognition algorithm into the visual 20 questions game, and provide methodologies to account for imperfect user responses and unreliable computer vision algorithms. We evaluate our methods on Birds-200, a difficult dataset of 200 tightly-related bird species, and on the Animals With Attributes dataset. Our results demonstrate that incorporating user input drives up recognition accuracy to levels that are good enough for practical applications, while at the same time, computer vision reduces the amount of human interaction required.
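The question-selection step has a standard information-theoretic core: ask the attribute whose answer is expected to shrink the class posterior's entropy the most. A minimal sketch under a noiseless binary-answer assumption (the paper additionally models imperfect user responses and folds in computer-vision scores):

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

def pick_question(posterior, answers):
    """Choose the attribute question with the highest expected information
    gain. posterior: (n_classes,) probabilities. answers[q, c] = 1 iff
    class c has attribute q (noiseless-user simplification)."""
    h = entropy(posterior)
    best_q, best_gain = None, -1.0
    for q in range(answers.shape[0]):
        p_yes = posterior[answers[q] == 1].sum()
        gain = h
        for val, p_val in ((1, p_yes), (0, 1.0 - p_yes)):
            if p_val > 0:
                post = posterior * (answers[q] == val)
                gain -= p_val * entropy(post / post.sum())
        if gain > best_gain:
            best_q, best_gain = q, gain
    return best_q

post = np.full(4, 0.25)                  # 4 bird species, uniform prior
A = np.array([[1, 1, 0, 0],              # e.g. "does it have a red wing?"
              [1, 0, 1, 0],
              [1, 1, 1, 0]])
print(pick_question(post, A))            # question 0: a full 1-bit split
```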

492 citations


Journal ArticleDOI
TL;DR: This work addresses the problem of incorporating different types of contextual information for robust object categorization in computer vision, surveying the most common levels at which context is extracted and the different levels of contextual interaction.

383 citations


Posted Content
TL;DR: This work proposes a simple and efficient algorithm to learn basis functions, which provides a fast and smooth approximator to the optimal representation, achieving even better accuracy than exact sparse coding algorithms on visual object recognition tasks.
Abstract: Adaptive sparse coding methods learn a possibly overcomplete set of basis functions, such that natural image patches can be reconstructed by linearly combining a small subset of these bases. The applicability of these methods to visual object recognition tasks has been limited because of the prohibitive cost of the optimization algorithms required to compute the sparse representation. In this work we propose a simple and efficient algorithm to learn basis functions. After training, this model also provides a fast and smooth approximator to the optimal representation, achieving even better accuracy than exact sparse coding algorithms on visual object recognition tasks.
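Both halves of the recipe fit in a short sketch: ISTA computes the (near-)exact sparse code, and a feed-forward encoder of the form g * tanh(W x), as in predictive-sparse-decomposition-style models, is regressed onto those codes so that inference becomes a single pass. Treat the encoder form and learning rate as assumptions, not the paper's exact architecture:

```python
import numpy as np

def ista(x, D, lam=0.1, n_iter=100):
    """Sparse code of x under dictionary D via iterative soft-thresholding."""
    L = np.linalg.norm(D, 2) ** 2               # Lipschitz constant of D^T D
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        g = z - (D.T @ (D @ z - x)) / L         # gradient step on the fit term
        z = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft-threshold
    return z

def encoder_step(x, z_star, W, g, lr=1e-2):
    """One SGD step regressing the fast encoder g * tanh(W x) onto an
    exact code z_star (shapes: W (k, d), g (k,), x (d,))."""
    pre = np.tanh(W @ x)
    err = g * pre - z_star
    grad_W = np.outer(err * g * (1 - pre ** 2), x)  # d(0.5*||err||^2)/dW
    g = g - lr * err * pre                          # d(0.5*||err||^2)/dg
    W = W - lr * grad_W
    return W, g
```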

266 citations


Proceedings ArticleDOI
13 Jun 2010
TL;DR: This work develops a bottom-up motion-based approach to robustly segment out foreground objects in egocentric video and shows that it greatly improves object recognition accuracy.
Abstract: Identifying handled objects, i.e. objects being manipulated by a user, is essential for recognizing the person's activities. An egocentric camera as worn on the body enjoys many advantages such as having a natural first-person view and not needing to instrument the environment. It is also a challenging setting, where background clutter is known to be a major source of problems and is difficult to handle with the camera constantly and arbitrarily moving. In this work we develop a bottom-up motion-based approach to robustly segment out foreground objects in egocentric video and show that it greatly improves object recognition accuracy. Our key insight is that egocentric video of object manipulation is a special domain and many domain-specific cues can readily help. We compute dense optical flow and fit it into multiple affine layers. We then use a max-margin classifier to combine motion with empirical knowledge of object location and background movement as well as temporal cues of support region and color appearance. We evaluate our segmentation algorithm on the large Intel Egocentric Object Recognition dataset with 42 objects and 100K frames. We show that, when combined with temporal integration, figure-ground segmentation improves the accuracy of a SIFT-based recognition system from 33% to 60%, and that of a latent-HOG system from 64% to 86%.
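The dominant-motion cue at the heart of this pipeline is easy to reproduce with OpenCV: estimate dense flow, robustly fit one affine motion model (the camera/background), and flag large flow residuals as candidate foreground. The paper fits multiple affine layers and fuses further cues with a max-margin classifier; this single-layer version is a simplified sketch:

```python
import cv2
import numpy as np

def foreground_mask(prev_gray, gray, thresh=2.0):
    """Mark pixels whose optical flow disagrees with the dominant
    (background) affine motion as candidate foreground."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = gray.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel()], 1).astype(np.float32)
    dst = pts + flow.reshape(-1, 2)
    # RANSAC affine fit on a subsample; independently moving hands and
    # handled objects fall out as outliers.
    idx = np.random.default_rng(0).choice(len(pts), 2000, replace=False)
    A, _ = cv2.estimateAffine2D(pts[idx], dst[idx])
    residual = np.linalg.norm(dst - (pts @ A[:, :2].T + A[:, 2]), axis=1)
    return residual.reshape(h, w) > thresh
```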

191 citations


Dissertation
01 Jan 2010
TL;DR: This dissertation spans domain adaptation, dictionary learning, object recognition, activity recognition, and shape representation in computer vision, together with sparse representation and related methods in signal and image processing.
Abstract: Research interests. Security and privacy: active authentication, biometrics template protection, biometrics recognition. Computer vision: domain adaptation, dictionary learning, object recognition, activity recognition, shape representation. Machine learning: dimensionality reduction, clustering, kernel methods, weakly-supervised learning. Signal/image processing: sparse representation, compressive sampling, synthetic aperture radar imaging, millimeter wave imaging.

160 citations


Patent
13 Oct 2010
TL;DR: In this article, a system and method for controlling a device based on computer vision is described. It is based on receiving a sequence of images of a field of view; detecting movement of at least one object in the images; applying a shape recognition algorithm to the at least one moving object; confirming that the object is a user hand by combining information from at least two images of the object; and tracking the object to control the device.
Abstract: A system and method are provided for controlling a device based on computer vision. Embodiments of the system and method of the invention are based on receiving a sequence of images of a field of view; detecting movement of at least one object in the images; applying a shape recognition algorithm on the at least one moving object; confirming that the object is a user hand by combining information from at least two images of the object; and tracking the object to control the device.
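A toy version of the detect-then-verify pipeline the claims describe: frame differencing finds moving blobs, and a crude convexity test (standing in for the patent's unspecified shape-recognition algorithm) screens for hand-like outlines before any tracking begins:

```python
import cv2

def candidate_hands(prev, frame, min_area=500):
    """Moving blobs whose outlines have finger-like concavities."""
    diff = cv2.absdiff(cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY),
                       cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
    _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    mask = cv2.dilate(mask, None, iterations=2)
    cnts, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
    hands = []
    for c in cnts:
        area = cv2.contourArea(c)
        if area < min_area:
            continue
        hull_area = max(cv2.contourArea(cv2.convexHull(c)), 1.0)
        if area / hull_area < 0.9:       # concave outline, hand-like
            hands.append(c)
    return hands
```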

155 citations


Journal ArticleDOI
TL;DR: In this paper, a probabilistic framework for encoding the relationships between context and object properties is proposed, which can be used to reduce the search space by looking only in places in which the object is expected to be; this also increases performance, by rejecting patterns that look like the target but appear in unlikely places.
Abstract: Recognizing objects in images is an active area of research in computer vision. In the last two decades, there has been much progress and there are already object recognition systems operating in commercial products. However, most of the algorithms for detecting objects perform an exhaustive search across all locations and scales in the image comparing local image regions with an object model. That approach ignores the semantic structure of scenes and tries to solve the recognition problem by brute force. In the real world, objects tend to covary with other objects, providing a rich collection of contextual associations. These contextual associations can be used to reduce the search space by looking only in places in which the object is expected to be; this also increases performance, by rejecting patterns that look like the target but appear in unlikely places. Most modeling attempts so far have defined the context of an object in terms of other previously recognized objects. The drawback of this approach is that inferring the context becomes as difficult as detecting each object. An alternative view of context relies on using the entire scene information holistically. This approach is algorithmically attractive since it dispenses with the need for a prior step of individual object recognition. In this paper, we use a probabilistic framework for encoding the relationships between context and object properties and we show how an integrated system provides improved performance. We view this as a significant step toward general purpose machine vision systems.
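In its simplest form, this holistic use of context amounts to combining a local detector's score with a scene-conditioned prior over image locations. A toy log-linear fusion with a Gaussian location prior; both the prior form and the fusion rule are illustrative assumptions, not the paper's exact model:

```python
import numpy as np

def contextual_score(det_scores, prior_mu, prior_cov):
    """Fuse per-location detector scores (h, w) with a scene-driven
    Gaussian prior over normalized image coordinates."""
    h, w = det_scores.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pos = np.stack([xs / w, ys / h], axis=-1) - prior_mu
    inv = np.linalg.inv(prior_cov)
    log_prior = -0.5 * np.einsum('hwi,ij,hwj->hw', pos, inv, pos)
    return det_scores + log_prior        # unlikely places are suppressed

# A street scene: cars are expected in the lower half of the image.
scores = np.random.rand(120, 160)
fused = contextual_score(scores, prior_mu=np.array([0.5, 0.7]),
                         prior_cov=np.diag([0.08, 0.02]))
```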

147 citations


Journal ArticleDOI
TL;DR: This work builds a probabilistic model to transfer the labels from the retrieval set to the input image, demonstrates the effectiveness of this approach, and studies algorithm component contributions using held-out test sets from the LabelMe database.
Abstract: Current object recognition systems can only recognize a limited number of object categories; scaling up to many categories is the next challenge. We seek to build a system to recognize and localize many different object categories in complex scenes. We achieve this through a simple approach: by matching the input image, in an appropriate representation, to images in a large training set of labeled images. Due to regularities in object identities across similar scenes, the retrieved matches provide hypotheses for object identities and locations. We build a probabilistic model to transfer the labels from the retrieval set to the input image. We demonstrate the effectiveness of this approach and study algorithm component contributions using held-out test sets from the LabelMe database.
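The retrieval-plus-transfer recipe in miniature: match the query's global scene descriptor against the training set and let the nearest scenes vote per-pixel labels. The paper builds a full probabilistic transfer model; the majority vote below is a deliberately simple sketch, with the descriptor (e.g. GIST) left to the caller:

```python
import numpy as np

def transfer_labels(query_desc, train_descs, train_label_maps, k=5):
    """Nearest-neighbor scene matching followed by per-pixel label voting.
    train_label_maps: (n, h, w) integer label maps, one per training image."""
    dist = np.linalg.norm(train_descs - query_desc, axis=1)
    votes = train_label_maps[np.argsort(dist)[:k]]        # (k, h, w)
    n_labels = int(votes.max()) + 1
    counts = np.stack([(votes == l).sum(0) for l in range(n_labels)])
    return counts.argmax(0)                               # (h, w) label map
```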

119 citations


Journal ArticleDOI
TL;DR: Interactions between image segmentation (using different edge detection methods) and object recognition are discussed; the Expectation-Maximization (EM) algorithm, Otsu's method, and genetic algorithms were used to demonstrate the synergy between the segmented images and object recognition.
Abstract: Image segmentation is the task of partitioning an image into meaningful regions with respect to a particular application. Object recognition is the task of finding a given object in an image or video sequence. In this paper, interactions between image segmentation (using different edge detection methods) and object recognition are discussed. Edge detection methods such as Sobel, Prewitt, Roberts, Canny, and Laplacian of Gaussian (LoG) are used for segmenting the image. The Expectation-Maximization (EM) algorithm, Otsu's method, and genetic algorithms were used to demonstrate the synergy between the segmented images and object recognition.
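Most of the operators named above are single calls in OpenCV (Prewitt and Roberts need only a small custom kernel via cv2.filter2D), and Otsu's threshold is a flag; a sketch of the segmentation front end, with the EM and genetic-algorithm stages omitted and 'scene.png' as a placeholder input:

```python
import cv2

img = cv2.imread('scene.png', cv2.IMREAD_GRAYSCALE)   # placeholder image

edges_canny = cv2.Canny(img, 100, 200)
edges_sobel = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
edges_log = cv2.Laplacian(cv2.GaussianBlur(img, (5, 5), 0), cv2.CV_64F)

# Otsu's method picks the binarization threshold automatically.
_, otsu_mask = cv2.threshold(img, 0, 255,
                             cv2.THRESH_BINARY + cv2.THRESH_OTSU)
```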

Proceedings ArticleDOI
13 Jun 2010
TL;DR: A framework that retains ambiguity in feature matching to increase the performance of 3D object recognition systems is presented; model features are vector quantized and matched in a hierarchical manner to preserve ambiguity during matching.
Abstract: We present a framework that retains ambiguity in feature matching to increase the performance of 3D object recognition systems. Whereas previous systems removed ambiguous correspondences during matching, we show that ambiguity should be resolved during hypothesis testing and not at the matching phase. To preserve ambiguity during matching, we vector quantize and match model features in a hierarchical manner. This matching technique allows our system to be more robust to the distribution of model descriptors in feature space. We also show that we can address recognition under arbitrary viewpoint by using our framework to facilitate matching of additional features extracted from affine transformed model images. The evaluation of our algorithms in 3D object recognition is demonstrated on a difficult dataset of 620 images.
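Hierarchical, ambiguity-preserving matching can be sketched as a small k-means tree in which a query keeps several near branches at every level instead of committing to the single nearest one; the branching factor and the `keep` width here are illustrative choices:

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def build_tree(descs, branch=4, depth=2):
    """Small hierarchical k-means tree over model feature descriptors."""
    if depth == 0 or len(descs) < branch:
        return {'leaf': descs}
    centers, labels = kmeans2(descs, branch, minit='points')
    return {'centers': centers,
            'kids': [build_tree(descs[labels == i], branch, depth - 1)
                     for i in range(branch)]}

def match(tree, q, keep=2):
    """Descend the tree keeping the `keep` nearest branches per level,
    so ambiguous correspondences survive until hypothesis testing."""
    if 'leaf' in tree:
        return list(tree['leaf'])
    d = np.linalg.norm(tree['centers'] - q, axis=1)
    out = []
    for i in np.argsort(d)[:keep]:
        out += match(tree['kids'][i], q, keep)
    return out

descs = np.random.rand(500, 32)
tree = build_tree(descs)
candidates = match(tree, descs[0])    # several candidates, not one "best"
```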

Patent
07 Dec 2010
TL;DR: In this article, the authors describe embodiments that facilitate or enhance the implementation of image recognition processes which can perform recognition on images to identify objects and/or faces by class or by people.
Abstract: Embodiments described herein facilitate or enhance the implementation of image recognition processes which can perform recognition on images to identify objects and/or faces by class or by people.

Proceedings ArticleDOI
13 Jun 2010
TL;DR: This work has the unique ability to jointly reduce false-alarm and false-negative object detection rates, recover object locations and supporting planes within the 3D camera reference system, and infer camera parameters from a single uncalibrated image.
Abstract: Detecting objects in complex scenes while recovering the scene layout is a critical functionality in many vision-based applications. Inspired by the work of [18], we advocate the importance of geometric contextual reasoning for object recognition. We start from the intuition that objects' location and pose in the 3D space are not arbitrarily distributed but rather constrained by the fact that objects must lie on one or multiple supporting surfaces. We model such supporting surfaces by means of hidden parameters (i.e. not explicitly observed) and formulate the problem of joint scene reconstruction and object recognition as the one of finding the set of parameters that maximizes the joint probability of having a number of detected objects on K supporting planes given the observations. As a key ingredient for solving this optimization problem, we have demonstrated a novel relationship between object location and pose in the image, and the scene layout parameters (i.e. normal of one or more supporting planes in 3D and camera pose, location and focal length). Using the probabilistic formulation and the above relationship our method has the unique ability to jointly: i) reduce false alarm and false negative object detection rate; ii) recover object location and supporting planes within the 3D camera reference system; iii) infer camera parameters (view point and the focal length) from just one single uncalibrated image. Quantitative and qualitative experimental evaluation on a number of datasets (a novel in-house dataset and label-me[28] on car and pedestrian) demonstrates our theoretical claims.

Book ChapterDOI
21 Jun 2010
TL;DR: A novel face recognition technique that computes the SIFT descriptors at predefined (fixed) locations learned during the training stage is presented, which renders the approach more robust to illumination changes than related approaches from the literature.
Abstract: The Scale Invariant Feature Transform (SIFT) is an algorithm used to detect and describe scale-, translation- and rotation-invariant local features in images. The original SIFT algorithm has been successfully applied in general object detection and recognition tasks, panorama stitching and others. One of its more recent uses also includes face recognition, where it was shown to deliver encouraging results. SIFT-based face recognition techniques found in the literature rely heavily on the so-called keypoint detector, which locates interest points in the given image that are ultimately used to compute the SIFT descriptors. While these descriptors are known to be, among other things, (partially) invariant to illumination changes, the keypoint detector is not. Since varying illumination is one of the main issues affecting the performance of face recognition systems, the keypoint detector represents the main source of errors in face recognition systems relying on SIFT features. To overcome this shortcoming of SIFT-based methods, we present in this paper a novel face recognition technique that computes the SIFT descriptors at predefined (fixed) locations learned during the training stage. By doing so, it eliminates the need for keypoint detection on the test images and renders our approach more robust to illumination changes than related approaches from the literature. Experiments, performed on the Extended Yale B face database, show that the proposed technique compares favorably with several popular techniques from the literature in terms of performance.
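OpenCV's SIFT implementation can compute descriptors at caller-supplied keypoints, which is exactly the ingredient this technique needs; in the sketch below a regular grid stands in for the fixed locations that the paper learns during training:

```python
import cv2
import numpy as np

def fixed_location_sift(gray, step=16, size=16.0):
    """SIFT descriptors at fixed positions (no keypoint detection), so the
    same image locations are described for every aligned face image."""
    sift = cv2.SIFT_create()
    h, w = gray.shape
    kps = [cv2.KeyPoint(float(x), float(y), size)
           for y in range(step, h - step, step)
           for x in range(step, w - step, step)]
    _, desc = sift.compute(gray, kps)
    return np.asarray(desc)

# desc = fixed_location_sift(cv2.imread('face.png', cv2.IMREAD_GRAYSCALE))
```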

Journal ArticleDOI
TL;DR: A robust image processing methodology to effectively extract the objects of interest from construction-site digital images makes use of advanced imaging algorithms and a three-dimensional computer aided design perspective view to increase the accuracy of the object recognition.
Abstract: Construction-site images that are now easily obtained from digital cameras have the potential to automatically provide the project status information. For example, once construction objects such as concrete columns are accurately identified and counted, the current level of project progress in the column installation activity can easily be measured. However, in order to identify and count the number of concrete columns installed at a particular point of time, a robust object recognition methodology is required. Without the successful recognition and extraction of the construction object of interest, it is almost impossible to understand the current level of project progress. This paper presents a robust image processing methodology to effectively extract the objects of interest from construction-site digital images. The proposed methodology makes use of advanced imaging algorithms and a three-dimensional computer aided design perspective view to increase the accuracy of the object recognition. Tests show that the methodology is promising and expected to provide a solid base for the successful, automatic acquisition of project information.

BookDOI
19 Nov 2010
TL;DR: This book applies graph theory to low-level processing of digital images, presents graph-theoretic learning algorithms for high-level computer vision and pattern recognition applications, and provides detailed descriptions of several applications of graph-based methods to real-world pattern recognition tasks.
Abstract: This book presents novel graph-theoretic methods for complex computer vision and pattern recognition tasks. It presents the application of graph theory to low-level processing of digital images, presents graph-theoretic learning algorithms for high-level computer vision and pattern recognition applications, and provides detailed descriptions of several applications of graph-based methods to real-world pattern recognition tasks.

Proceedings ArticleDOI
03 May 2010
TL;DR: This paper presents a system for the autonomous acquisition of visual object representations, which endows a humanoid robot with the ability to enrich its internal object representation and allows the realization of complex visual tasks.
Abstract: The autonomous acquisition of object representations that allow recognition, localization and grasping of objects in the environment is a challenging task, which has proven difficult. In this paper, we present a system for autonomous acquisition of visual object representations, which endows a humanoid robot with the ability to enrich its internal object representation and allows the realization of complex visual tasks. More precisely, we present techniques for segmentation and modeling of objects held in the five-fingered robot hand. Multiple object views are generated by rotating the held objects in the robot's field of view. The acquired object representations are evaluated in the context of visual search and object recognition tasks in cluttered environments. Experimental results show successful implementation of the complete cycle from object exploration to object recognition on a humanoid robot.

Book ChapterDOI
Caifeng Shan1
01 Jan 2010
TL;DR: This chapter reviews existing research on face recognition and retrieval in video, and the relevant techniques are comprehensively surveyed and discussed.
Abstract: Automatic face recognition has long been established as one of the most active research areas in computer vision. Face recognition in unconstrained environments remains challenging for most practical applications. In contrast to traditional still-image based approaches, recently the research focus has shifted towards video-based approaches. Video data provides rich and redundant information, which can be exploited to resolve the inherent ambiguities of image-based recognition like sensitivity to low resolution, pose variations and occlusion, leading to more accurate and robust recognition. Face recognition has also been considered in the content-based video retrieval setup, for example, character-based video search. In this chapter, we review existing research on face recognition and retrieval in video. The relevant techniques are comprehensively surveyed and discussed.

Patent
20 Aug 2010
TL;DR: In this article, a system for translating user motion into multiple object responses of an on-screen object based on user interaction with an application executing on a computing device is provided, where user motion data is received from a capture device from one or more users.
Abstract: A system for translating user motion into multiple object responses of an on-screen object based on user interaction of an application executing on a computing device is provided. User motion data is received from a capture device from one or more users. The user motion data corresponds to user interaction with an on-screen object presented in the application. The on-screen object corresponds to an object other than an on-screen representation of a user that is displayed by the computing device. The user motion data is automatically translated into multiple object responses of the on-screen object. The multiple object responses of the on-screen object are simultaneously displayed to the users.

Proceedings ArticleDOI
26 Jul 2010
TL;DR: This paper presents a public multiple-view object recognition database, called the Berkeley Multiview Wireless (BMW) database, and proposes a fast multiple-view classification method to jointly classify the object observed by the cameras.
Abstract: We propose an efficient distributed object recognition system for sensing, compression, and recognition of 3-D objects and landmarks using a network of wireless smart cameras. The foundation is based on a recent work that shows the representation of scale-invariant image features exhibit certain degree of sparsity: If a common object is observed by multiple cameras from different vantage points, the corresponding features can be efficiently compressed in a distributed fashion, and the joint signals can be simultaneously decoded based on distributed compressive sensing theory. In this paper, we first present a public multiple-view object recognition database, called the Berkeley Multiview Wireless (BMW) database. It captures the 3-D appearance of 20 landmark buildings sampled by five low-power, low-resolution camera sensors from multiple vantage points. Then we review and benchmark state-of-the-art methods to extract image features and compress their sparse representations. Finally, we propose a fast multiple-view recognition method to jointly classify the object observed by the cameras. To this end, a distributed object recognition system is implemented on the Berkeley CITRIC smart camera platform. The system is capable of adapting to different network configurations and the wireless bandwidth. The multiple-view classification improves the performance of object recognition upon the traditional per-view classification algorithms.
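The per-camera compression step can be sketched as a random projection of a sparse feature histogram: each camera transmits only the short measurement vector, and the joint l1 decoding at the base station is omitted here. The dimensions and sparsity level are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
d, m = 10000, 600                 # visual-word histogram dim, compressed dim

hist = np.zeros(d)                # sparse: most visual words unseen per view
hist[rng.choice(d, 80, replace=False)] = rng.random(80)

Phi = rng.normal(size=(m, d)) / np.sqrt(m)   # random matrix shared by cameras
measurement = Phi @ hist          # the only thing the camera transmits
print(measurement.shape)          # (600,) -- roughly 17x fewer values sent
```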

Book ChapterDOI
05 Sep 2010
TL;DR: Experimental results demonstrate, for the first time, the feasibility and effectiveness of a high-level syntactic method in face recognition, showing a new strategy for face representation and recognition.
Abstract: Automatically recognizing human faces with partial occlusions is one of the most challenging problems in face analysis community. This paper presents a novel string-based face recognition approach to address the partial occlusion problem in face recognition. In this approach, a new face representation, Stringface, is constructed to integrate the relational organization of intermediate-level features (line segments) into a high-level global structure (string). The matching of two faces is done by matching two Stringfaces through a string-to-string matching scheme, which is able to efficiently find the most discriminative local parts (substrings) for recognition without making any assumption on the distributions of the deformed facial regions. The proposed approach is compared against the state-of-the-art algorithms using both the AR database and FRGC (Face Recognition Grand Challenge) ver2.0 database. Very encouraging experimental results demonstrate, for the first time, the feasibility and effectiveness of a high-level syntactic method in face recognition, showing a new strategy for face representation and recognition.

Journal ArticleDOI
TL;DR: This work investigates the effects of two kinds of image processing methods, two common shapes of pixels (square and circular) and six resolutions (8x8, 16x16, 24x24, 32x32, 48x48 and 64x64) and shows that the mean recognition accuracy increased with the number of pixels.

Proceedings Article
06 Dec 2010
TL;DR: This work develops an efficient algorithm for multi-label multiple kernel learning (ML-MKL) that combines the worst-case analysis with stochastic approximation and shows that the complexity of the algorithm is O(m^{1/3}√(ln m)), where m is the number of classes.
Abstract: Recent studies have shown that multiple kernel learning is very effective for object recognition, leading to the popularity of kernel learning in computer vision problems. In this work, we develop an efficient algorithm for multi-label multiple kernel learning (ML-MKL). We assume that all the classes under consideration share the same combination of kernel functions, and the objective is to find the optimal kernel combination that benefits all the classes. Although several algorithms have been developed for ML-MKL, their computational cost is linear in the number of classes, making them unscalable when the number of classes is large, a challenge frequently encountered in visual object recognition. We address this computational challenge by developing a framework for ML-MKL that combines the worst-case analysis with stochastic approximation. Our analysis shows that the complexity of our algorithm is O(m^{1/3}√(ln m)), where m is the number of classes. Empirical studies with object recognition show that while achieving similar classification accuracy, the proposed method is significantly more efficient than the state-of-the-art algorithms for ML-MKL.
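The shared-combination assumption just means one weight vector over the base kernels serves every class. A sketch of evaluating such a combination over RBF base kernels, with the weights given rather than learned, since the optimization itself is the paper's contribution:

```python
import numpy as np

def combined_kernel(Ks, mu):
    """K = sum_t mu_t * K_t with one weight vector mu shared by all
    classes, as in the ML-MKL setup above (mu assumed, not learned)."""
    mu = np.asarray(mu, dtype=float)
    return np.tensordot(mu / mu.sum(), Ks, axes=1)       # (n, n)

n = 50
X = np.random.rand(n, 8)
sq = ((X[:, None] - X[None]) ** 2).sum(-1)
Ks = np.stack([np.exp(-sq / (2 * s ** 2)) for s in (0.5, 1.0, 2.0)])
K = combined_kernel(Ks, [1.0, 1.0, 1.0])                 # uniform placeholder
```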

Journal ArticleDOI
TL;DR: This work presents a statistical framework for 3D passive object recognition in the presence of noise and suggests that, with proper translation of the physical characteristics of the imaging system into the information processing algorithms, photon-counting imagery can be used for object classification.
Abstract: Three dimensional (3D) imaging systems have been recently suggested for passive sensing and recognition of objects in photon-starved environments where only a few photons are emitted or reflected from the object. In this paradigm, it is important to make optimal use of the limited information carried by photons. We present a statistical framework for 3D passive object recognition in the presence of noise. Since detector dark noise is present in the quantum-limited regime, our approach takes into account the effect of noise on information-bearing photons. The model is tested for identifying a target in a 3D scene when background noise and dark noise sources are present. It is shown that reliable object recognition is possible in the photon-counting domain. The results suggest that with proper translation of the physical characteristics of the imaging system into the information processing algorithms, photon-counting imagery can be used for object classification.
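A minimal instance of such a photon-counting classifier: model each pixel's count as Poisson with the object's photon rate plus an additive dark-noise rate, then pick the template with the highest log-likelihood. The rates and the additive noise model here are illustrative assumptions:

```python
import numpy as np
from scipy.stats import poisson

def classify(counts, templates, dark=0.1):
    """Maximum-likelihood template choice for photon counts with dark noise."""
    lls = [poisson.logpmf(counts, t + dark).sum() for t in templates]
    return int(np.argmax(lls))

rng = np.random.default_rng(2)
templates = [np.full(64, 0.3), np.full(64, 0.8)]   # per-pixel photon rates
obs = rng.poisson(templates[1] + 0.1)              # photon-starved observation
print(classify(obs, templates))                    # usually 1
```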

Journal ArticleDOI
TL;DR: This paper considers the performance of about twenty-five different subspace algorithms on data taken from four standard face and object databases, namely ORL, Yale, FERET, and the COIL-20 object database.

Proceedings ArticleDOI
13 Jun 2010
TL;DR: This work details a discriminative approach for optimizing one-shot recognition using micro-sets and presents experiments on the Animals with Attributes and Caltech-101 datasets that demonstrate the benefits of the formulation.
Abstract: For object category recognition to scale beyond a small number of classes, it is important that algorithms be able to learn from a small amount of labeled data per additional class. One-shot recognition aims to apply the knowledge gained from a set of categories with plentiful data to categories for which only a single exemplar is available for each. As with earlier efforts motivated by transfer learning, we seek an internal representation for the domain that generalizes across classes. However, in contrast to existing work, we formulate the problem in a fundamentally new manner by optimizing the internal representation for the one-shot task using the notion of micro-sets. A micro-set is a sample of data that contains only a single instance of each category, sampled from the pool of available data, which serves as a mechanism to force the learned representation to explicitly address the variability and noise inherent in the one-shot recognition task. We optimize our learned domain features so that they minimize an expected loss over micro-sets drawn from the training set and show that these features generalize effectively to previously unseen categories. We detail a discriminative approach for optimizing one-shot recognition using micro-sets and present experiments on the Animals with Attributes and Caltech-101 datasets that demonstrate the benefits of our formulation.
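Constructing a micro-set is itself a one-liner: draw exactly one exemplar per category from the pool, so each inner optimization step is scored under precisely the one-shot condition. A sketch assuming a dict-of-arrays data layout:

```python
import numpy as np

def sample_micro_set(features_by_class, rng):
    """One exemplar per category -- a micro-set -- forcing the learned
    representation to cope with one-shot variability and noise."""
    return {c: feats[rng.integers(len(feats))]
            for c, feats in features_by_class.items()}

rng = np.random.default_rng(0)
pool = {c: np.random.rand(30, 128) for c in range(10)}   # toy feature pool
micro = sample_micro_set(pool, rng)                      # 10 single exemplars
```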

Book
02 Aug 2010
TL;DR: Written in a tutorial style that avoids extensive use of mathematics, this book is suitable as an introduction to the field of object recognition for interested readers who are not yet experts.
Abstract: Object recognition has been an area of extensive research for a long time. During the last decades, a large number of algorithms have been proposed. This is due to the fact that, at a closer look, "object recognition" is an umbrella term for different algorithms designed for a wide variety of applications, where each application has its specific requirements and constraints. This book demonstrates the diversity of applications and highlights some important algorithm classes by presenting representative example algorithms for each class. This book is written in a tutorial style and is therefore suitable as an introduction into the field of object recognition for interested readers who are not yet experts. The presentation of each algorithm focuses on the main idea, which is described in detail, and avoids extensive usage of mathematics. Graphic illustrations of the algorithm flow facilitate understanding. The algorithms presented are classified according to the following categories: global approaches, transformation-search-based methods, geometrical model driven methods, 3D object recognition schemes, flexible contour fitting algorithms and feature-based methods. Typical example algorithms are presented for each of the categories.

Proceedings ArticleDOI
03 Dec 2010
TL;DR: A novel 3D measurement system that yields both depth and color information in real time by calibrating a time-of-flight camera against two CCD cameras is presented, along with a robust object recognition method that uses the resulting 3D visual sensor.
Abstract: This paper presents a novel 3D measurement system, which yields both depth and color information in real time, by calibrating a time-of-flight camera and two CCD cameras. The problem of occlusions is solved by the proposed fast occluded-pixel detection algorithm. Since the system uses two CCD cameras, missing color information for pixels occluded in one camera's view is recovered from the other. We also propose a robust object recognition method using the 3D visual sensor. Multiple cues, such as color, texture and 3D (depth) information, are integrated in order to recognize various types of objects under varying lighting conditions. We have implemented the system on our autonomous robot and made the robot perform recognition tasks (object learning, detection, and recognition) in various environments. The results revealed that the proposed recognition system provides far better performance than the previous system, which is based only on color and texture information.

Proceedings ArticleDOI
07 Jul 2010
TL;DR: The object search and recognition scheme proposed in the paper can improve the accuracy rate of object recognition, reduce the impact of lighting, and maintain a high recognition rate even when the target is partly occluded.
Abstract: A complete scheme for object search and recognition is proposed in this paper in order to realize object recognition in complex indoor environments. We design a new kind of object mark to assist object recognition. The mark is composed of two parts: an inner information representation and an outer logo. The inner information, including attribute information and operating information, is stored in a QR Code. The outer logo includes two concentric colored circles and four orientation regions. The concentric red circles are used to locate the mark from a distance, and the orientation regions help the robot operate the target properly. The mark can only be recognized at close range, so RFID technology is used to locate the object on a larger scale. Large furniture is tagged with reference tags, and the target is pasted with a target tag. As the robot moves around the space, it reads the tags one by one and can obtain the rough position of the target from the time sequence of tags. The object search and recognition scheme proposed in the paper can improve the accuracy rate of object recognition, reduce the impact of lighting, and maintain a high recognition rate even when the target is partly occluded. The experiments demonstrate the effectiveness and feasibility of the scheme.
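Decoding the mark's inner QR payload is directly supported by OpenCV; a sketch, with the colored-circle logo localization and the RFID coarse positioning left out and 'mark.png' as a placeholder input:

```python
import cv2

detector = cv2.QRCodeDetector()
img = cv2.imread('mark.png')                     # placeholder image
payload, corners, _ = detector.detectAndDecode(img)
if payload:
    # The mark stores attribute and operating information in the QR Code.
    print('object info:', payload)
```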