
Papers by Trevor Darrell published in 2001


Proceedings ArticleDOI
01 Jan 2001
TL;DR: This work develops a view-normalization approach to multi-view face and gait recognition, showing that it provides greater recognition accuracy than the unnormalized input sequences and that integrated face and gait recognition improves on either modality alone.
Abstract: We develop a view-normalization approach to multi-view face and gait recognition. An image-based visual hull (IBVH) is computed from a set of monocular views and used to render virtual views for tracking and recognition. We determine canonical viewpoints by examining the 3D structure, appearance (texture), and motion of the moving person. For optimal face recognition, we place virtual cameras to capture frontal face appearance; for gait recognition we place virtual cameras to capture a side-view of the person. Multiple cameras can be rendered simultaneously, and camera position is dynamically updated as the person moves through the workspace. Image sequences from each canonical view are passed to an unmodified face or gait recognition algorithm. We show that our approach provides greater recognition accuracy than is obtained using the unnormalized input sequences, and that integrated face and gait recognition provides improved performance over either modality alone. Canonical view estimation, rendering, and recognition have been efficiently implemented and can run at near real-time speeds.
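
The rendering machinery behind the IBVH is beyond a short excerpt, but the canonical-viewpoint geometry the abstract describes can be illustrated on its own. A minimal sketch, assuming the person's 3D trajectory is already tracked; the function name, the fixed camera distance, and the heading window are our own placeholders:

```python
import numpy as np

def canonical_camera_positions(trajectory, distance=3.0):
    """Place virtual cameras relative to a tracked, moving person.

    trajectory: (N, 3) array of body-centroid positions over time.
    Returns (frontal_cam, side_cam) positions for the latest frame:
    the frontal camera faces the person along their heading (for
    face images); the side camera sits perpendicular to it (for
    gait silhouettes).
    """
    pos = trajectory[-1]
    prev = trajectory[max(len(trajectory) - 5, 0)]  # short window for heading
    heading = pos - prev
    heading[2] = 0.0                                # keep cameras level
    heading /= np.linalg.norm(heading) + 1e-9
    side = np.cross(heading, np.array([0.0, 0.0, 1.0]))
    frontal_cam = pos + distance * heading          # ahead of the walker, looking back
    side_cam = pos + distance * side                # beside the walker
    return frontal_cam, side_cam
```

Both virtual cameras would be re-rendered every frame as the person moves, and the resulting frontal and side-view sequences passed unchanged to the face and gait recognizers.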

263 citations


Journal ArticleDOI
TL;DR: It is argued that privacy in context-aware computing, especially in systems with perceptually aware environments, will be quite complex, and that future research will need to consider how regulatory and technical solutions might be co-designed to form a public good.
Abstract: Context-aware computing offers the promise of significant user gains--the ability for systems to adapt more readily to user needs, models, and goals. Dey, Abowd, and Salber (2001 [this special issue]) present a masterful step toward understanding context-aware applications. We examine Dey et al. in the light of privacy issues--that is, individuals' control over their personal data--to highlight some of the thorny issues in context-aware computing that will be upon us soon. We argue that privacy in context-aware computing, especially in systems with perceptually aware environments, will be quite complex. Indeed, privacy forms a co-design space between the social, the technical, and the regulatory. We recognize that Dey et al. is a necessary first step in examining important software engineering concerns, but future research will need to consider how regulatory and technical solutions might be co-designed to form a public good.

234 citations


Proceedings ArticleDOI
01 Feb 2001
TL;DR: In this paper, the authors derive dense stereo models for object tracking by using long-term, extended-dynamic-range imagery and by detecting and interpolating uniform but unoccluded planar regions.
Abstract: In a known environment, objects may be tracked in multiple views using a set of background models. Stereo-based models can be illumination-invariant, but often have undefined values, which inevitably lead to foreground classification errors. We derive dense stereo models for object tracking by using long-term, extended-dynamic-range imagery and by detecting and interpolating uniform but unoccluded planar regions. Foreground points are detected quickly in new images using pruned disparity search. We adopt a "late-segmentation" strategy, using an integrated plan-view density representation. Foreground points are segmented into object regions only when a trajectory is finally estimated, using a dynamic-programming-based method. Object entry and exit are optimally determined and are not restricted to special spatial zones.
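
One simplified reading of the pruned disparity search: rather than sweeping the full disparity range, each pixel is tested only at its stored background disparity, and a poor photometric match there signals an occluding foreground object. A minimal sketch under that reading; the SAD cost, window size, and threshold are assumptions, not the paper's settings:

```python
import numpy as np

def foreground_mask(left, right, bg_disparity, window=5, thresh=30.0):
    """Classify pixels as foreground via pruned disparity search.

    For each pixel, the left/right patches are compared only at the
    stored background disparity.  If they still match there, the
    background is unoccluded; a high matching cost means something
    now sits in front of it.
    """
    h, w = left.shape
    r = window // 2
    mask = np.zeros((h, w), dtype=bool)
    for y in range(r, h - r):
        for x in range(r, w - r):
            d = bg_disparity[y, x]
            if not np.isfinite(d) or x - int(d) < r:   # undefined / off-image
                continue
            d = int(round(d))
            patch_l = left[y-r:y+r+1, x-r:x+r+1].astype(float)
            patch_r = right[y-r:y+r+1, x-d-r:x-d+r+1].astype(float)
            cost = np.mean(np.abs(patch_l - patch_r))  # SAD matching cost
            mask[y, x] = cost > thresh
    return mask
```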

125 citations


Proceedings ArticleDOI
07 May 2001
TL;DR: A rigid transformation (the d-motion) that maps between two disparity images of a rigidly moving object is introduced, its relation to the Euclidean rigid motion is shown, and a motion estimation algorithm is derived.
Abstract: A new method for 3D rigid motion estimation from stereo is proposed in this paper. The appealing feature of this method is that it directly uses the disparity images obtained from stereo matching. We assume that the stereo rig has parallel cameras and, in that case, show the geometric and topological properties of the disparity images. Then we introduce a rigid transformation (called d-motion) that maps two disparity images of a rigidly moving object. We show how it is related to the Euclidean rigid motion and derive a motion estimation algorithm. Experiments show that our approach is simple and more accurate than standard approaches.
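
The paper's exact algebraic form of the d-motion is not reproduced here, but its geometric content follows from the standard parallel-camera disparity model: back-project, apply the Euclidean motion, re-project. A minimal sketch, with focal length f, baseline b, and principal point (cx, cy) assumed known:

```python
import numpy as np

def d_motion(u, v, d, R, t, f, b, cx, cy):
    """Map a disparity-image point (u, v, d) through a rigid motion.

    For parallel cameras with focal length f and baseline b, a pixel
    (u, v) with disparity d back-projects to
        X = (b / d) * [u - cx, v - cy, f].
    Applying the Euclidean motion X' = R X + t and re-projecting
    gives the transformed disparity-image point, i.e. the induced
    transformation in disparity space.
    """
    X = (b / d) * np.array([u - cx, v - cy, f])
    Xp = R @ X + t
    dp = f * b / Xp[2]                  # new disparity
    up = cx + f * Xp[0] / Xp[2]         # new column
    vp = cy + f * Xp[1] / Xp[2]         # new row
    return up, vp, dp
```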

62 citations


Proceedings ArticleDOI
07 May 2001
TL;DR: A class of differential motion trackers that automatically stabilize in finite domains is developed; an approximation to the posterior distribution of pose changes serves as an uncertainty model for parametric motion, helping to arbitrate the use of multiple base frames.
Abstract: We develop a class of differential motion trackers that automatically stabilize when in finite domains. Most differential trackers compute motion only relative to one previous frame, accumulating errors indefinitely. We estimate pose changes between a set of past frames, and develop a probabilistic framework for integrating those estimates. We use an approximation to the posterior distribution of pose changes as an uncertainty model for parametric motion in order to help arbitrate the use of multiple base frames. We demonstrate this framework on a simple 2D translational tracker and a 3D, six-degree-of-freedom tracker.
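
A minimal sketch of the base-frame integration step, shown for a translational pose vector and using precision (inverse-covariance) weighting as the Gaussian-posterior approximation; the variable names and fusion rule are our reading of the abstract, not the paper's code:

```python
import numpy as np

def fuse_pose(base_poses, deltas, covariances):
    """Fuse current-pose estimates obtained from several base frames.

    Base frame i contributes the estimate base_poses[i] + deltas[i],
    where deltas[i] is the differentially tracked pose change and
    covariances[i] its uncertainty.  Under a Gaussian approximation,
    the minimum-variance combination weights each estimate by its
    inverse covariance (precision).
    """
    precision_sum = np.zeros_like(covariances[0])
    weighted_sum = np.zeros_like(base_poses[0], dtype=float)
    for pose0, delta, cov in zip(base_poses, deltas, covariances):
        P = np.linalg.inv(cov)
        precision_sum += P
        weighted_sum += P @ (pose0 + delta)
    fused_cov = np.linalg.inv(precision_sum)
    return fused_cov @ weighted_sum, fused_cov
```

Because well-anchored older base frames keep contributing, drift no longer accumulates indefinitely while the target stays within a bounded region.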

49 citations


Journal ArticleDOI
TL;DR: The use of Gaussian mixtures to model correspondence uncertainties for disparity and image-velocity estimation is introduced; properties of the disparity space are shown, along with how rigid transformations can be represented in it.
Abstract: In this paper we explore a multiple-hypothesis approach to estimating rigid motion from a moving stereo rig. More precisely, we introduce the use of Gaussian mixtures to model correspondence uncertainties for disparity and velocity (optical flow) estimation. We show some properties of the disparity space and show how rigid transformations can be represented in it. We give an algorithm, derived from standard random-sampling robust estimators, that efficiently estimates rigid transformations from multi-hypothesis disparity maps and velocity fields.
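
A minimal sketch of the robust-estimation step, assuming each scene point carries a short list of weighted correspondence hypotheses (one per mixture component); the hypothesis format, scoring rule, and thresholds are our own assumptions:

```python
import numpy as np

def fit_rigid(A, B):
    """Least-squares rigid fit (Kabsch) mapping points A onto B."""
    ca, cb = A.mean(0), B.mean(0)
    H = (A - ca).T @ (B - cb)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, cb - R @ ca

def ransac_rigid(hypotheses, n_iters=500, inlier_thresh=0.05):
    """RANSAC over multi-hypothesis 3D correspondences.

    hypotheses: per-point lists of (X_before, X_after, weight), one
    entry per mixture component.  Each iteration samples one component
    for each of three random points, fits a rigid motion, and scores
    it by the summed weight of every point whose best-fitting
    component is an inlier.
    """
    rng = np.random.default_rng(0)
    best_score, best_Rt = -1.0, None
    for _ in range(n_iters):
        idx = rng.choice(len(hypotheses), size=3, replace=False)
        pairs = [hypotheses[i][rng.integers(len(hypotheses[i]))] for i in idx]
        A = np.array([p[0] for p in pairs])
        B = np.array([p[1] for p in pairs])
        R, t = fit_rigid(A, B)
        score = 0.0
        for comps in hypotheses:
            errs = [np.linalg.norm(R @ Xa + t - Xb) for Xa, Xb, _ in comps]
            k = int(np.argmin(errs))
            if errs[k] < inlier_thresh:
                score += comps[k][2]        # credit the component's weight
        if score > best_score:
            best_score, best_Rt = score, (R, t)
    return best_Rt
```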

21 citations


Proceedings ArticleDOI
15 Nov 2001
TL;DR: This work presents an audio-video localization technique that combines the benefits of the two modalities, achieving an 8.9 dB improvement over a single far-field microphone and a 6.7 dB improvement over source separation based on video-only localization.
Abstract: Steerable microphone arrays provide a flexible infrastructure for audio source separation. In order for them to be used effectively in perceptual user interfaces, there must be a mechanism in place for steering the focus of the array to the sound source. Audio-only steering techniques often perform poorly in the presence of multiple sound sources or strong reverberation. Video-only techniques can achieve high spatial precision but require that the audio and video subsystems be accurately calibrated to preserve this precision. We present an audio-video localization technique that combines the benefits of the two modalities. We implement our technique in a test environment containing multiple stereo cameras and a room-sized microphone array. Our technique achieves an 8.9 dB improvement over a single far-field microphone and a 6.7 dB improvement over source separation based on video-only localization.
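
A minimal sketch of the steering step, assuming the stereo cameras already supply a 3D source position in the array's coordinate frame; delay-and-sum beamforming with nearest-sample delays stands in for whatever beamformer the test environment actually used:

```python
import numpy as np

def delay_and_sum(signals, mic_positions, source_pos, fs=16000, c=343.0):
    """Steer a microphone array at a visually localized 3D source.

    signals: (M, N) array of M synchronized microphone channels.
    Each channel is advanced by its relative propagation delay from
    the source position, then the channels are averaged; the source
    adds coherently while off-axis interference does not.
    """
    dists = np.linalg.norm(mic_positions - source_pos, axis=1)
    delays = (dists - dists.min()) / c            # relative delays, seconds
    shifts = np.round(delays * fs).astype(int)    # nearest-sample approximation
    out = np.zeros(signals.shape[1])
    for sig, s in zip(signals, shifts):
        n = len(out) - s
        out[:n] += sig[s:]                        # advance later channels
    return out / len(signals)
```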

20 citations


Proceedings ArticleDOI
15 Nov 2001
TL;DR: This work presents an information-theoretic approach to the fusion of multiple modalities and gives empirical results demonstrating audio-video localization and consistency measurement.
Abstract: Multi-modal fusion is an important, yet challenging task for perceptual user interfaces. Humans routinely perform both simple and complex tasks in which ambiguous auditory and visual data are combined in order to support accurate perception. By contrast, automated approaches for processing multi-modal data sources lag far behind. This is primarily because few methods adequately model the complexity of the audio/visual relationship. We present an information-theoretic approach for fusion of multiple modalities. Furthermore, we discuss a statistical model under which our approach to fusion is justified. We present empirical results demonstrating audio-video localization and consistency measurement. We show examples determining where a speaker is within a scene, and whether they are producing the specified audio stream.
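
A minimal sketch of one standard information-theoretic consistency measure: mutual information between an audio feature and per-pixel visual change under a joint-Gaussian assumption, for which I(a; v) = -0.5 log(1 - rho^2). Whether this matches the paper's estimator is an assumption, and the features chosen here are placeholders:

```python
import numpy as np

def audio_video_mi(audio_energy, video, eps=1e-9):
    """Score each pixel by its mutual information with the audio track.

    audio_energy: (T,) per-frame audio energy.
    video: (T, H, W) grayscale frames.  For jointly Gaussian variables,
    I(a; v) = -0.5 * log(1 - rho^2), so the per-pixel correlation with
    the audio gives an MI map; its peak suggests where the speaker is,
    and its magnitude whether audio and video are consistent.
    """
    a = (audio_energy - audio_energy.mean()) / (audio_energy.std() + eps)
    v = np.abs(np.diff(video.astype(float), axis=0))        # frame-to-frame change
    v = (v - v.mean(0)) / (v.std(0) + eps)
    rho = np.einsum('t,thw->hw', a[1:], v) / (len(a) - 1)   # per-pixel correlation
    rho = np.clip(rho, -0.999, 0.999)
    return -0.5 * np.log(1.0 - rho ** 2)
```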

17 citations


Journal ArticleDOI
TL;DR: A local image transform based on cumulative similarity measures is defined and shown to enable efficient correspondence and tracking near occluding boundaries; results compare this method to traditional least-squares and robust correspondence matching.
Abstract: A local image transform based on cumulative similarity measures is defined and is shown to enable efficient correspondence and tracking near occluding boundaries. Unlike traditional methods, this transform allows correspondences to be found when the only contrast present is the occluding boundary itself and when the sign of contrast along the boundary is possibly reversed. The transform is based on the idea of a cumulative similarity measure that characterizes the shape of local image homogeneity; both the value of the image at a particular point and the shape of the region with locally similar and connected values are captured. This representation is insensitive to structure beyond an occluding boundary but is sensitive to the shape of the boundary itself, which is often an important cue. We show results comparing this method to traditional least-squares and robust correspondence matching.
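
A heavily simplified sketch of a cumulative similarity descriptor in the spirit the abstract describes: similarity to the center value is accumulated multiplicatively along rays, so the response collapses at the first strong contrast and structure beyond the boundary cannot leak in. The ray count, radius, and Gaussian similarity kernel are our assumptions, not the paper's definition:

```python
import numpy as np

def cumulative_similarity(img, y, x, radius=8, sigma=10.0, n_rays=16):
    """Cumulative similarity descriptor at pixel (y, x).

    Along each of n_rays directions, similarity to the center value is
    accumulated multiplicatively; one strong contrast drives a ray's
    accumulator toward zero and keeps it there, so the descriptor
    encodes where each ray hits a boundary while staying blind to
    whatever lies beyond it.
    """
    center = float(img[y, x])
    desc = np.zeros((n_rays, radius))
    for k in range(n_rays):
        theta = 2 * np.pi * k / n_rays
        acc = 1.0
        for r in range(1, radius + 1):
            yy = int(round(y + r * np.sin(theta)))
            xx = int(round(x + r * np.cos(theta)))
            if not (0 <= yy < img.shape[0] and 0 <= xx < img.shape[1]):
                break
            sim = np.exp(-(float(img[yy, xx]) - center) ** 2 / (2 * sigma ** 2))
            acc *= sim                    # cumulative: one boundary kills the ray
            desc[k, r - 1] = acc
    return desc
```

Descriptors from two images would then be compared directly (e.g. by summed squared difference) to establish correspondence near the boundary.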

9 citations


Journal ArticleDOI
TL;DR: In cases where the background pattern is stationary, it is shown how visibility constraints from other views can generate virtual background values at points with no valid depth in the primary view.
Abstract: Visibility constraints can aid the segmentation of foreground objects observed with multiple range images. In our approach, points are defined as foreground if they can be determined to occlude some empty space in the scene. We present an efficient algorithm to estimate foreground points in each range view using explicit epipolar search. In cases where the background pattern is stationary, we show how visibility constraints from other views can generate virtual background values at points with no valid depth in the primary view. We demonstrate the performance of both algorithms for detecting people in indoor office environments.
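
A minimal sketch of the visibility test for a single point, assuming calibrated views with known relative pose T_ab and intrinsics K_b; the paper's explicit epipolar search is reduced here to a single project-and-compare, and the depth margin is a placeholder:

```python
import numpy as np

def occludes_empty_space(point, depth_b, K_b, T_ab, margin=0.05):
    """Test whether a 3D point from view A is foreground w.r.t. view B.

    The point (in A's frame) is transformed into B's frame and
    projected.  If B measures a strictly larger depth along that
    pixel's ray, B sees past the point: the point occupies space that
    B observes to be empty, so it is labeled foreground.
    """
    p_b = T_ab[:3, :3] @ point + T_ab[:3, 3]   # point in B's frame
    if p_b[2] <= 0:
        return False                            # behind camera B
    uv = K_b @ (p_b / p_b[2])
    u, v = int(round(uv[0])), int(round(uv[1]))
    h, w = depth_b.shape
    if not (0 <= v < h and 0 <= u < w):
        return False
    z_seen = depth_b[v, u]
    return np.isfinite(z_seen) and z_seen > p_b[2] + margin
```

Points that pass this test in some other view can also donate "virtual background" depth values to pixels where the primary view's stereo failed, as the abstract describes for stationary backgrounds.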

8 citations