
Showing papers by "Matthew Turk published in 2009"


Proceedings ArticleDOI
01 Dec 2009
TL;DR: A novel framework is proposed for searching for people in surveillance environments based on a parsing of human parts and their attributes, including facial hair, eyewear, and clothing color, which can be extracted using detectors learned from large amounts of training data.
Abstract: We propose a novel framework for searching for people in surveillance environments. Rather than relying on face recognition technology, which is known to be sensitive to typical surveillance conditions such as lighting changes, face pose variation, and low-resolution imagery, we approach the problem in a different way: we search for people based on a parsing of human parts and their attributes, including facial hair, eyewear, clothing color, etc. These attributes can be extracted using detectors learned from large amounts of training data. A complete system that implements our framework is presented. At the interface, the user can specify a set of personal characteristics, and the system then retrieves events that match the provided description. For example, a possible query is “show me the bald people who entered a given building last Saturday wearing a red shirt and sunglasses.” This capability is useful in several applications, such as finding suspects or missing people. To evaluate the performance of our approach, we present extensive experiments on a set of images collected from the Internet, on infrared imagery, and on two-and-a-half months of video from a real surveillance environment. We are not aware of any similar surveillance system capable of automatically finding people in video based on their fine-grained body parts and attributes.
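The retrieval step described above can be sketched in a few lines. This is an illustrative assumption about how attribute-based search might work, not the authors' implementation: each event carries per-attribute detector confidences, a query names required attributes, and events are ranked by summed confidence over the queried attributes.

```python
# Hypothetical sketch of attribute-based event retrieval. Event records,
# attribute names, and the scoring rule are illustrative assumptions.

def search_events(events, query):
    """Rank events by how well their detected attributes match the query.

    events: list of (event_id, {attribute: detector_confidence}) pairs
    query:  set of required attribute names, e.g. {"bald", "sunglasses"}
    """
    results = []
    for event_id, attributes in events:
        # Score = sum of detector confidences over the queried attributes;
        # attributes the detectors never fired on contribute zero.
        score = sum(attributes.get(a, 0.0) for a in query)
        if score > 0:
            results.append((score, event_id))
    return [eid for score, eid in sorted(results, reverse=True)]

events = [
    ("ev1", {"bald": 0.9, "red_shirt": 0.8, "sunglasses": 0.7}),
    ("ev2", {"beard": 0.6, "blue_shirt": 0.9}),
    ("ev3", {"bald": 0.5, "sunglasses": 0.4}),
]
print(search_events(events, {"bald", "sunglasses", "red_shirt"}))
# ['ev1', 'ev3'] -- ev1 matches all three queried attributes and ranks first
```

A production system would also need the attribute detectors themselves and temporal filtering ("last Saturday"), which this sketch omits.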

233 citations


Proceedings ArticleDOI
20 Jun 2009
TL;DR: This paper learns the matching scores themselves so as to produce shape similarity scores that minimize the classification loss, based on a max-margin formulation in the structured prediction setting.
Abstract: Many traditional methods for shape classification involve establishing point correspondences between shapes to produce matching scores, which are in turn used as similarity measures for classification. Learning techniques have been applied only in the second stage of this process, after the matching scores have been obtained. In this paper, instead of simply taking for granted the scores obtained by matching and then learning a classifier, we learn the matching scores themselves so as to produce shape similarity scores that minimize the classification loss. The solution is based on a max-margin formulation in the structured prediction setting. Experiments on shape databases reveal that such an integrated learning algorithm substantially improves on existing methods.
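The max-margin idea can be illustrated with a minimal sketch, assuming similarity is a learned linear function of matching features. The feature vectors, weights, and margin below are invented for illustration and are not the paper's formulation in detail:

```python
# Sketch of a structured hinge loss for learned shape similarity:
# the correct-class match must outscore every wrong-class match by a
# margin. Weights and features here are illustrative assumptions.

def similarity(w, features):
    # Learned similarity: dot product of weights and matching features.
    return sum(wi * fi for wi, fi in zip(w, features))

def structured_hinge(w, correct_feats, wrong_feats_list, margin=1.0):
    """Loss is zero only when the correct match beats the best wrong
    match by at least `margin`; otherwise it grows linearly."""
    s_correct = similarity(w, correct_feats)
    worst = max(similarity(w, f) for f in wrong_feats_list)
    return max(0.0, margin + worst - s_correct)

w = [1.0, 0.5]                       # learned weights
correct = [2.0, 2.0]                 # features of the same-class match
wrongs = [[1.0, 1.0], [0.0, 4.0]]    # features of wrong-class matches
print(structured_hinge(w, correct, wrongs))  # 0.0: margin satisfied
```

Training would minimize this loss (plus regularization) over all training shapes, e.g. by subgradient descent, so that the matching scores themselves are optimized for classification rather than fixed in advance.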

17 citations


Proceedings ArticleDOI
20 Jun 2009
TL;DR: The method extends the applicability of techniques that rely on analyzing shadows cast by multiple light sources at different positions: individual shadows that previously had to be captured at distinct instants of time can now be obtained from a single shot, enabling the processing of dynamic scenes.
Abstract: Consider a projector-camera setup where a sinusoidal pattern is projected onto the scene, and an image of the objects imprinted with the pattern is captured by the camera. In this configuration, the local frequency of the sinusoidal pattern as seen by the camera is a function of both the frequency of the projected sinusoid and the local geometry of objects in the scene. We observe that, by strategically placing the projector and the camera in canonical configuration and projecting sinusoidal patterns aligned with the epipolar lines, the frequency of the sinusoids seen in the image becomes invariant to the local object geometry. This property allows us to design systems composed of a camera and multiple projectors, which can be used to capture a single image of a scene illuminated by all projectors at the same time, and then demultiplex the frequencies generated by each individual projector separately. We show how such imaging systems can be used to segment, from a single image, the shadows cast by each individual projector - an application that we call coded shadow photography. The method extends the applicability of techniques that rely on analyzing shadows cast by multiple light sources at different positions: individual shadows that previously had to be captured at distinct instants of time can now be obtained from a single shot, enabling the processing of dynamic scenes.
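The demultiplexing step can be sketched on a single synthetic scanline. This is a minimal illustration, assuming each projector contributes a carrier at a distinct integer frequency (in cycles per scanline); where a projector's light is shadowed, its carrier's amplitude drops, so measuring per-frequency amplitude separates the shadows. The frequencies and signal model are invented for this example:

```python
# Sketch of frequency demultiplexing on one image scanline, assuming
# each projector uses its own sinusoidal carrier frequency. Pure-stdlib
# illustration; not the paper's actual pipeline.

import cmath
import math

def carrier_amplitude(signal, freq):
    """Amplitude of the sinusoid at `freq` cycles over the full signal,
    measured via the discrete Fourier coefficient at that frequency."""
    n = len(signal)
    coeff = sum(signal[t] * cmath.exp(-2j * math.pi * freq * t / n)
                for t in range(n))
    return 2.0 * abs(coeff) / n

# Two hypothetical projectors at distinct carrier frequencies; projector
# 2's carrier is absent here, as if this region lies in its shadow.
n = 256
f1, f2 = 8, 21
scanline = [1.0 + 0.5 * math.sin(2 * math.pi * f1 * t / n) for t in range(n)]

print(round(carrier_amplitude(scanline, f1), 2))  # 0.5: projector 1 visible
print(round(carrier_amplitude(scanline, f2), 2))  # 0.0: projector 2 shadowed
```

On a real image the analysis would be local (e.g. a windowed transform per pixel neighborhood) so that the shadow of each projector can be segmented spatially; the geometry-invariant frequency property is what makes the carriers stable enough to separate.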

9 citations


Book ChapterDOI
16 Dec 2009

1 citation