
Showing papers by "Shai Avidan published in 2012"


Proceedings ArticleDOI
16 Jun 2012
TL;DR: This work provides a probabilistic model of the object variations over time and shows LOT's tracking capabilities on challenging video sequences, both commonly used and new, demonstrating performance comparable to state-of-the-art methods.
Abstract: Locally Orderless Tracking (LOT) is a visual tracking algorithm that automatically estimates the amount of local (dis)order in the object. This lets the tracker specialize in both rigid and deformable objects on-line and with no prior assumptions. We provide a probabilistic model of the object variations over time. The model is implemented using the Earth Mover's Distance (EMD) with two parameters that control the cost of moving pixels and changing their color. We adjust these costs on-line during tracking to account for the amount of local (dis)order in the object. We show LOT's tracking capabilities on challenging video sequences, both commonly used and new, demonstrating performance comparable to state-of-the-art methods.
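The EMD underlying LOT can be illustrated with a small transportation-problem solver. The `alpha`/`beta` ground-cost weights below stand in for the paper's two parameters (cost of moving pixels vs. changing their color); the function names and the LP formulation are an illustrative sketch, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import linprog

def lot_ground_cost(pos_p, col_p, pos_q, col_q, alpha, beta):
    """Ground distance mixing spatial movement and color change,
    weighted by alpha and beta (the two costs LOT adapts on-line)."""
    d_pos = np.linalg.norm(pos_p[:, None, :] - pos_q[None, :, :], axis=-1)
    d_col = np.abs(col_p[:, None] - col_q[None, :])
    return alpha * d_pos + beta * d_col

def emd(w_p, w_q, cost):
    """Earth Mover's Distance via the transportation LP (equal total mass)."""
    n, m = cost.shape
    A_eq, b_eq = [], []
    for i in range(n):            # supply constraints: source i ships w_p[i]
        row = np.zeros(n * m)
        row[i * m:(i + 1) * m] = 1.0
        A_eq.append(row); b_eq.append(w_p[i])
    for j in range(m):            # demand constraints: sink j receives w_q[j]
        col = np.zeros(n * m)
        col[j::m] = 1.0
        A_eq.append(col); b_eq.append(w_q[j])
    res = linprog(cost.ravel(), A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=(0, None), method="highs")
    return res.fun
```

Raising `alpha` penalizes pixel movement (suited to rigid objects), while raising `beta` penalizes appearance change; adapting these on-line is the paper's core idea.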

262 citations


Proceedings ArticleDOI
16 Jun 2012
TL;DR: A new spatially-constrained similarity measure (SCSM) is proposed to handle object rotation, scaling, viewpoint change and appearance deformation, along with a novel and robust re-ranking method that uses the k-nearest neighbors of the query to automatically refine the initial search results.
Abstract: One fundamental problem in object retrieval with the bag-of-visual words (BoW) model is its lack of spatial information. Although various approaches have been proposed to incorporate spatial constraints into the BoW model, most of them are either too strict or too loose, so they are effective only in limited cases. We propose a new spatially-constrained similarity measure (SCSM) to handle object rotation, scaling, viewpoint change and appearance deformation. The similarity measure can be efficiently calculated by a voting-based method using inverted files. Object retrieval and localization are then simultaneously achieved without post-processing. Furthermore, we introduce a novel and robust re-ranking method with the k-nearest neighbors of the query for automatically refining the initial search results. Extensive performance evaluations on six public datasets show that SCSM significantly outperforms other spatial models, while k-NN re-ranking outperforms most state-of-the-art approaches using query expansion.
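A translation-only caricature of the voting scheme might look like the following (the paper's full measure also handles rotation, scaling and deformation; the bin sizes and word/position tuples here are illustrative). Each visual word shared between query and database image votes for the translation that would align them, and the peak of the voting map both scores and localizes the object.

```python
from collections import defaultdict
import numpy as np

def build_inverted_index(database):
    """Inverted file: visual word -> list of (image_id, x, y) postings."""
    index = defaultdict(list)
    for img_id, features in database.items():
        for word, x, y in features:
            index[word].append((img_id, x, y))
    return index

def scsm_vote(query_features, index, half_bins=8, cell=16.0):
    """Each shared visual word votes for the translation aligning the
    query with a database image; the voting-map peak scores it."""
    size = 2 * half_bins + 1
    maps = defaultdict(lambda: np.zeros((size, size)))
    for word, qx, qy in query_features:
        for img_id, x, y in index[word]:
            bx = int(round((x - qx) / cell)) + half_bins
            by = int(round((y - qy) / cell)) + half_bins
            if 0 <= bx < size and 0 <= by < size:
                maps[img_id][by, bx] += 1.0
    return {img_id: m.max() for img_id, m in maps.items()}
```

Because the votes are accumulated per posting, the cost is linear in the inverted-file traversal, which is what makes the voting-based evaluation efficient.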

223 citations


Book ChapterDOI
07 Oct 2012
TL;DR: It is shown that a sequence of key design decisions can make k-d trees run as fast as recently proposed state-of-the-art methods, and because of image coherency it is enough to consider only a sparse grid of patches across the image plane.
Abstract: TreeCANN is a fast algorithm for approximately matching all patches between two images. It does so by following the established convention of finding an initial set of matching patch candidates between the two images and then propagating good matches to neighboring patches in the image plane. TreeCANN accelerates each of these components substantially, leading to an algorithm that is ×3 to ×5 faster than existing methods. Seed matching is achieved using a properly tuned k-d tree on a sparse grid of patches. In particular, we show that a sequence of key design decisions can make k-d trees run as fast as recently proposed state-of-the-art methods, and that because of image coherency it is enough to consider only a sparse grid of patches across the image plane. We then develop a novel propagation step that is based on the integral image, which drastically reduces the computational load that is dominated by the need to repeatedly measure similarity between pairs of patches. As a by-product we give an optimal algorithm for exact matching that is based on the integral image. The proposed exact algorithm is faster than previously reported results and depends only on the size of the images, not on the size of the patches. We report results on large and varied data sets and show that TreeCANN is orders of magnitude faster than exact NN search, yet produces matches that are within 1% error compared to the exact NN search.
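The integral-image trick behind the patch-size-independent exact computation can be sketched as follows: for a fixed translation between the two images, a single squared-difference image plus one box filter yields the SSD of every patch at that offset. The function names below are my own, not the paper's.

```python
import numpy as np

def integral_image(a):
    """Summed-area table with a zero border, so any box sum is 4 lookups."""
    s = a.cumsum(axis=0).cumsum(axis=1)
    return np.pad(s, ((1, 0), (1, 0)))

def box_sums(ii, p):
    """Sum over every p x p window, O(1) per window via the integral image."""
    return ii[p:, p:] - ii[:-p, p:] - ii[p:, :-p] + ii[:-p, :-p]

def ssd_at_offset(A, B, dy, dx, p):
    """SSD between each p x p patch of A and the patch at offset (dy, dx)
    in B. Cost depends only on the image size, never on the patch size p."""
    H, W = A.shape
    ys, xs = max(0, -dy), max(0, -dx)
    ye, xe = min(H, H - dy), min(W, W - dx)
    diff2 = (A[ys:ye, xs:xe] - B[ys + dy:ye + dy, xs + dx:xe + dx]) ** 2
    return box_sums(integral_image(diff2), p)
```

Sweeping `(dy, dx)` over all offsets and keeping the per-patch minimum gives an exact nearest-neighbor field whose cost is governed by the image size alone, which is the property the abstract highlights.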

53 citations


Book ChapterDOI
07 Oct 2012
TL;DR: This work proposes a method for photo-sequencing --- temporally ordering a set of still images taken asynchronously by a set of uncalibrated cameras --- and uses rank aggregation to combine them into a globally consistent temporal order of images.
Abstract: Dynamic events such as family gatherings, concerts or sports events are often captured by a group of people. The set of still images obtained this way is rich in dynamic content but lacks accurate temporal information. We propose a method for photo-sequencing --- temporally ordering a set of still images taken asynchronously by a set of uncalibrated cameras. Photo-sequencing is an essential tool in analyzing (or visualizing) a dynamic scene captured by still images. The first step of the method detects sets of corresponding static and dynamic feature points across images. The static features are used to determine the epipolar geometry between pairs of images, and each dynamic feature votes for the temporal order of the images in which it appears. The partial orders provided by the dynamic features are not necessarily consistent, and we use rank aggregation to combine them into a globally consistent temporal order of images. We demonstrate successful photo sequencing on several challenging collections of images taken using a number of mobile phones.
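The rank-aggregation step can be approximated with a simple pairwise-precedence (Borda-style) scoring scheme; the paper's actual aggregation method may differ, so treat this as an illustrative sketch of how inconsistent partial orders combine into one global order.

```python
from collections import defaultdict
from itertools import combinations

def aggregate_orders(partial_orders):
    """Combine possibly-inconsistent partial orders of image ids into one
    global order by scoring pairwise precedence votes (Borda-style)."""
    score = defaultdict(int)
    for order in partial_orders:
        for earlier, later in combinations(order, 2):
            score[earlier] += 1   # 'earlier' won a precedence vote
            score[later] -= 1
    return sorted(score, key=lambda item: -score[item])
```

Here each dynamic feature contributes one partial order (the temporal order of the images it appears in), and disagreements between features simply cancel in the vote totals.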

46 citations


Proceedings ArticleDOI
16 Jun 2012
TL;DR: The core idea is to use a dense multi-camera array to construct a novel, dense 3D volumetric representation of the 3D space where each voxel holds an estimated intensity value and a confidence measure of this value.
Abstract: We propose a method for estimating the 3D structure and the dense 3D motion (scene flow) of a dynamic nonrigid 3D scene, using a camera array. The core idea is to use a dense multi-camera array to construct a novel, dense 3D volumetric representation of the 3D space where each voxel holds an estimated intensity value and a confidence measure of this value. The problem of 3D structure and 3D motion estimation of a scene is thus reduced to a nonrigid registration of two volumes, hence the term "Scene Registration". Registering two dense 3D scalar volumes does not require recovering the 3D structure of the scene as a preprocessing step, nor does it require explicit reasoning about occlusions. From this nonrigid registration we accurately extract the 3D scene flow and the 3D structure of the scene, and successfully recover the sharp discontinuities in both time and space. We demonstrate the advantages of our method on a number of challenging synthetic and real data sets.
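One plausible reading of the per-voxel representation, assuming the confidence measure is driven by cross-camera agreement (the abstract does not specify the exact formula, so both the variance-based confidence and the array layout below are assumptions):

```python
import numpy as np

def voxel_volume(samples):
    """samples: (num_cameras, D, H, W) array of the intensity each camera's
    ray contributes to each voxel. Returns a per-voxel intensity estimate
    and a confidence that is high where the cameras agree."""
    intensity = samples.mean(axis=0)
    confidence = 1.0 / (1.0 + samples.var(axis=0))
    return intensity, confidence
```

Voxels on a true surface receive consistent intensities from all unoccluded cameras (high confidence), while voxels in free space or behind occluders do not, which is what lets the registration proceed without explicit occlusion reasoning.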

36 citations


Proceedings ArticleDOI
16 Jun 2012
TL;DR: A method for browsing multiple videos with a common theme, such as the result of a search query on a video sharing website, or videos of an event covered by multiple cameras, is proposed.
Abstract: We propose a method for browsing multiple videos with a common theme, such as the result of a search query on a video sharing website, or videos of an event covered by multiple cameras. Given the collection of videos we first align each video with all others. This pairwise video alignment forms the basis of a novel browsing interface, termed the Browsing Companion. It is used to play a primary video and, alongside it as thumbnails, other video clips that are temporally synchronized with it. The user can, at any time, click on one of the thumbnails to make it the primary. We also show that video alignment can be used for other applications such as automatic highlight detection and multi-video summarization.

27 citations


Proceedings ArticleDOI
01 Jan 2012
TL;DR: This work introduces an automatic system that receives a set of natural images taken at running sports events and outputs the participants' Racing Bib Numbers (RBNs), which are used to identify competitors during the race.
Abstract: Running races, such as marathons, are broadly covered by professional as well as amateur photographers. This leads to a constantly growing number of photos covering a race, making the process of identifying a particular runner in such datasets difficult. Today, such identification is often done manually. In running races, each competitor has an identification number, called the Racing Bib Number (RBN), used to identify that competitor during the race. RBNs are usually printed on a paper or cardboard tag and pinned onto the competitor's T-shirt during the race. We introduce an automatic system that receives a set of natural images taken at running sports events and outputs the participants' RBNs.

24 citations