Search or ask a question

Showing papers by "Rob Fergus published in 2007"

PDF

Open Access

Journal Article•DOI•

Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories

[...]

Li Fei-Fei¹, Rob Fergus², Pietro Perona³•Institutions (3)

Princeton University¹, University of Oxford², California Institute of Technology³

01 Apr 2007-Computer Vision and Image Understanding

TL;DR: The incremental algorithm is compared experimentally to an earlier batch Bayesian algorithm, as well as to one based on maximum-likelihood, which have comparable classification performance on small training sets, but incremental learning is significantly faster, making real-time learning feasible.

...read moreread less

2,597 citations

Proceedings Article•DOI•

Image and depth from a conventional camera with a coded aperture

[...]

Anat Levin, Rob Fergus, Frédo Durand, William T. Freeman

29 Jul 2007

TL;DR: A simple modification to a conventional camera is proposed to insert a patterned occluder within the aperture of the camera lens, creating a coded aperture, and introduces a criterion for depth discriminability which is used to design the preferred aperture pattern.

...read moreread less

Abstract: A conventional camera captures blurred versions of scene information away from the plane of focus. Camera systems have been proposed that allow for recording all-focus images, or for extracting depth, but to record both simultaneously has required more extensive hardware and reduced spatial resolution. We propose a simple modification to a conventional camera that allows for the simultaneous recovery of both (a) high resolution image information and (b) depth information adequate for semi-automatic extraction of a layered depth representation of the image. Our modification is to insert a patterned occluder within the aperture of the camera lens, creating a coded aperture. We introduce a criterion for depth discriminability which we use to design the preferred aperture pattern. Using a statistical model of images, we can recover both depth information and an all-focus image from single photographs taken with the modified camera. A layered depth map is then extracted, requiring user-drawn strokes to clarify layer assignments in some cases. The resulting sharp image and layered depth map can be combined for various photographic applications, including automatic scene segmentation, post-exposure refocusing, or re-rendering of the scene from an alternate viewpoint.

...read moreread less

1,489 citations

Journal Article•DOI•

Weakly Supervised Scale-Invariant Learning of Models for Visual Recognition

[...]

Rob Fergus¹, Pietro Perona², Andrew Zisserman¹•Institutions (2)

University of Oxford¹, California Institute of Technology²

01 Mar 2007-International Journal of Computer Vision

TL;DR: The flexible nature of the model is demonstrated by results over six diverse object categories including geometrically constrained categories (e.g. faces, cars) and flexible objects (such as animals).

...read moreread less

Abstract: We investigate a method for learning object categories in a weakly supervised manner. Given a set of images known to contain the target category from a similar viewpoint, learning is translation and scale-invariant; does not require alignment or correspondence between the training images, and is robust to clutter and occlusion. Category models are probabilistic constellations of parts, and their parameters are estimated by maximizing the likelihood of the training data. The appearance of the parts, as well as their mutual position, relative scale and probability of detection are explicitly described in the model. Recognition takes place in two stages. First, a feature-finder identifies promising locations for the model"s parts. Second, the category model is used to compare the likelihood that the observed features are generated by the category model, or are generated by background clutter. The flexible nature of the model is demonstrated by results over six diverse object categories including geometrically constrained categories (e.g. faces, cars) and flexible objects (such as animals).

...read moreread less

234 citations

Proceedings Article•

Object Recognition by Scene Alignment

[...]

Bryan Russell¹, Antonio Torralba¹, Ce Liu¹, Rob Fergus¹, William T. Freeman¹ - Show less +1 more•Institutions (1)

Massachusetts Institute of Technology¹

03 Dec 2007

TL;DR: In this article, a probabilistic model is proposed to transfer the labels from the retrieval set to the input image, in an appropriate representation, to obtain hypotheses for object identities and locations.

...read moreread less

Abstract: Current object recognition systems can only recognize a limited number of object categories; scaling up to many categories is the next challenge. We seek to build a system to recognize and localize many different object categories in complex scenes. We achieve this through a simple approach: by matching the input image, in an appropriate representation, to images in a large training set of labeled images. Due to regularities in object identities across similar scenes, the retrieved matches provide hypotheses for object identities and locations. We build a probabilistic model to transfer the labels from the retrieval set to the input image. We demonstrate the effectiveness of this approach and study algorithm component contributions using held-out test sets from the LabelMe database.

...read moreread less

101 citations