Showing papers by "Shai Avidan" published in 2004

Journal Article

01 Aug 2004 - IEEE Transactions on Pattern Analysis and Machine Intelligence

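TL;DR: Support Vector Tracking integrates the Support Vector Machine (SVM) classifier into an optic-flow-based tracker, maximizing the SVM classification score instead of minimizing an intensity difference between successive frames, and uses a coarse-to-fine pyramid built from the support vectors to account for large motions.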

Abstract: Support Vector Tracking (SVT) integrates the Support Vector Machine (SVM) classifier into an optic-flow-based tracker. Instead of minimizing an intensity difference function between successive frames, SVT maximizes the SVM classification score. To account for large motions between successive frames, we build pyramids from the support vectors and use a coarse-to-fine approach in the classification stage. We show results of using SVT for vehicle tracking in image sequences.

1,131 citations

Proceedings Article

Crowd detection in video sequences

P. Reisman, O. Mano, Shai Avidan, Amnon Shashua

14 Jun 2004

TL;DR: A real-time system that detects moving crowds in a video sequence by looking at crowd motion patterns in the spatio-temporal domain, together with an efficient implementation.

Abstract: We present a real-time system that detects moving crowd in a video sequence. Crowd detection differs from pedestrian detection in that we assume that no individual pedestrian can be properly segmented in the image. We propose a scheme that looks at the motion patterns of crowd in the spatio-temporal domain and give an efficient implementation that can detect crowd in real-time. In our experiments we detected crowd at distances of up to 70 m.

86 citations

Proceedings Article

Joint feature-basis subset selection

Shai Avidan¹

Mitsubishi¹

19 Jul 2004

TL;DR: The masking matrix helps extend feature and basis selection methods while blurring the lines between them; because finding an optimal masking matrix is NP-hard, the paper offers a sub-optimal probabilistic method to find it.

Abstract: We treat feature selection and basis selection in a unified framework by introducing the masking matrix. If one considers feature selection as finding a binary mask vector that determines which features participate in the learning process, and similarly, basis selection as finding a binary mask vector that determines which basis vectors are needed for the learning process, then the masking matrix is, in particular, the outer product of the feature masking vector and the basis masking vector. This representation allows for a joint estimation of both features and basis. In addition, it allows one to select features that appear in only part of the basis functions. This joint selection of feature/basis subset is not possible when using feature selection and basis selection algorithms independently; thus, the masking matrix helps extend feature and basis selection methods while blurring the lines between them. The problem of searching for an optimal masking matrix is NP-hard, and we offer a sub-optimal probabilistic method to find it. In particular, we demonstrate our ideas on the problem of feature and basis selection for SVM classification and show results for the problem of image classification on faces and vehicles.
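
The outer-product relationship described in the abstract can be illustrated with a small sketch. This is not the paper's code: the function names, the list-of-lists layout (rows are features, columns are basis vectors), and the example masks are assumptions made purely for illustration.

```python
# Illustrative sketch only; names and sizes are assumptions, not from the paper.

def outer_mask(feature_mask, basis_mask):
    """Masking matrix M with M[i][j] = feature_mask[i] * basis_mask[j]."""
    return [[f * b for b in basis_mask] for f in feature_mask]

def is_outer_product(mask):
    """A binary matrix is an outer product of two binary vectors
    exactly when all of its nonzero rows are identical."""
    nonzero_rows = [row for row in mask if any(row)]
    return all(row == nonzero_rows[0] for row in nonzero_rows)

feature_mask = [1, 0, 1]    # keep features 0 and 2
basis_mask = [1, 1, 0, 1]   # keep basis vectors 0, 1 and 3
M = outer_mask(feature_mask, basis_mask)

# A general masking matrix can switch a feature on for only some basis
# vectors: here feature 1 participates in basis vector 0 only.
M_general = [row[:] for row in M]
M_general[1][0] = 1
```

Here `M` corresponds to running feature selection and basis selection independently, while `M_general` is a mask that no pair of independent mask vectors can produce, which is the extra flexibility the abstract refers to.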

14 citations

Book Chapter

Probabilistic Multi-view Correspondence in a Distributed Setting with No Central Server

Shai Avidan, Yael Moses, Yoram Moses¹

Technion – Israel Institute of Technology¹

11 May 2004

TL;DR: A probabilistic distributed algorithm for multi-view correspondence, with a theoretical analysis of the number of times \(\mathcal{WBS}\) must be performed to ensure that an overwhelming portion of the correspondence information is extracted; the analysis can also be used to improve the performance of centralized algorithms for correspondence.

Abstract: We present a probabilistic algorithm for finding correspondences across multiple images. The algorithm runs in a distributed setting, where each camera is attached to a separate computing unit, and the cameras communicate over a network. No central computer is involved in the computation. The algorithm runs with low computational and communication cost. Our distributed algorithm assumes access to a standard pairwise wide-baseline stereo matching algorithm (\(\mathcal{WBS}\)) and our goal is to minimize the number of images transmitted over the network, as well as the number of times the \(\mathcal{WBS}\) is computed. We employ the theory of random graphs to provide an efficient probabilistic algorithm that performs \(\mathcal{WBS}\) on a small number of image pairs, followed by a correspondence propagation phase. The heart of the paper is a theoretical analysis of the number of times \(\mathcal{WBS}\) must be performed to ensure that an overwhelming portion of the correspondence information is extracted. The analysis is extended to show how to combat computer and communication failures, which are expected to occur in such settings, as well as correspondence misses. This analysis yields an efficient distributed algorithm, but it can also be used to improve the performance of centralized algorithms for correspondence.
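
The two phases named in the abstract, pairwise matching on a small random set of image pairs followed by correspondence propagation, can be sketched roughly as follows. This is a centralized toy simulation, not the paper's distributed algorithm: it assumes a single scene point visible in every camera and assumes that every sampled pair is matched successfully, and all function names are hypothetical. Propagation is modeled as connected components using union-find.

```python
import random

def sample_pairs(n_cameras, n_pairs, rng):
    """Choose n_pairs distinct camera pairs uniformly at random."""
    all_pairs = [(i, j) for i in range(n_cameras) for j in range(i + 1, n_cameras)]
    return rng.sample(all_pairs, n_pairs)

def propagate(n_cameras, matched_pairs):
    """Merge cameras linked by a pairwise match; return one group label per camera."""
    parent = list(range(n_cameras))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in matched_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a
    return [find(i) for i in range(n_cameras)]

def count_groups(labels):
    """Number of disjoint correspondence groups after propagation."""
    return len(set(labels))

# Assumed toy setting: 10 cameras, 15 randomly chosen pairs sent to the matcher.
rng = random.Random(0)
pairs = sample_pairs(10, 15, rng)
print(count_groups(propagate(10, pairs)))
```

A single group means the point's correspondence has reached every camera even though only 15 of the 45 possible pairs were matched directly; how many pairs are needed for this to happen with high probability is the kind of question the paper's random-graph analysis addresses.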

10 citations

Proceedings Article

The power of feature clustering: An application to object detection

Shai Avidan, Moshe Butman

01 Dec 2004

TL;DR: A fast rejection scheme based on image segments that is simple and fast to learn, making it an excellent pre-processing step to accelerate standard machine learning classifiers such as neural networks, Bayes classifiers, or SVMs.

Abstract: We give a fast rejection scheme that is based on image segments and demonstrate it on the canonical example of face detection. However, instead of focusing on the detection step, we focus on the rejection step and show that our method is simple and fast to learn, thus making it an excellent pre-processing step to accelerate standard machine learning classifiers, such as neural networks, Bayes classifiers, or SVM. We decompose a collection of face images into regions of pixels with similar behavior over the image set. The relationships between the mean and variance of image segments are used to form a cascade of rejectors that can reject over 99.8% of image patches; thus, only a small fraction of the image patches must be passed to a full-scale classifier. Moreover, the training time for our method is much less than an hour on a standard PC. The shape of the features (i.e., image segments) we use is data-driven; they are very cheap to compute, and they form a very low dimensional feature space in which exhaustive search for the best features is tractable.
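
A cascade of rejectors over segment statistics might look roughly like the sketch below. It is a simplification under stated assumptions: it uses only differences of segment means (the abstract also mentions variances), and the flat patch layout, segment definitions, thresholds, and function names are all hypothetical rather than taken from the paper.

```python
from statistics import mean

def segment_means(patch, segments):
    """Mean intensity of each segment; a segment is a list of pixel indices."""
    return [mean(patch[i] for i in seg) for seg in segments]

def passes_cascade(patch, segments, rejectors):
    """Each rejector is (a, b, low, high): the patch survives that stage only
    if low <= mean(segment a) - mean(segment b) <= high."""
    means = segment_means(patch, segments)
    for a, b, low, high in rejectors:
        if not (low <= means[a] - means[b] <= high):
            return False  # rejected early; remaining stages are skipped
    return True

# Assumed toy setup: a 4-pixel patch split into two segments, one rejector
# requiring segment 0 to be brighter than segment 1 by 10 to 100 levels.
segments = [[0, 1], [2, 3]]
rejectors = [(0, 1, 10, 100)]

kept = passes_cascade([50, 60, 20, 30], segments, rejectors)      # difference 30
rejected = passes_cascade([20, 20, 20, 20], segments, rejectors)  # difference 0
```

Only patches for which every stage passes would be handed to the full-scale classifier, which is where the speed-up described in the abstract would come from.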

8 citations