
Showing papers by "Shai Avidan published in 2011"


Proceedings ArticleDOI
06 Nov 2011
TL;DR: Coherency Sensitive Hashing is at least three to four times faster than PatchMatch and more accurate, especially in textured regions where reconstruction artifacts are most noticeable to the human eye; it is verified on a new, large-scale data set of 133 image pairs.
Abstract: Coherency Sensitive Hashing (CSH) extends Locality Sensitive Hashing (LSH) and PatchMatch to quickly find matching patches between two images. LSH relies on hashing, which maps similar patches to the same bin, in order to find matching patches. PatchMatch, on the other hand, relies on the observation that images are coherent to propagate good matches to their neighbors in the image plane. It uses random patch assignment to seed the initial matching. CSH relies on hashing to seed the initial patch matching and on image coherence to propagate good matches. In addition, hashing lets it propagate information between patches with similar appearance (i.e., patches that map to the same bin). This way, information is propagated much faster because it can use similarity in appearance space as well as neighborhood in the image plane. As a result, CSH is at least three to four times faster than PatchMatch and more accurate, especially in textured regions, where reconstruction artifacts are most noticeable to the human eye. We verified CSH on a new, large-scale data set of 133 image pairs.
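The seeding idea described above can be illustrated with a small sketch. This is not the paper's CSH algorithm (which also propagates matches via image coherence and appearance); it only shows the LSH-style step of hashing patches with shared random projections so that similar patches land in the same bucket and are compared directly, avoiding an exhaustive search. All function and parameter names here are illustrative assumptions.

```python
import numpy as np

def patches(img, p):
    """All p x p patches of a 2D grayscale image, flattened, with top-left coords."""
    H, W = img.shape
    coords = [(y, x) for y in range(H - p + 1) for x in range(W - p + 1)]
    return np.array([img[y:y+p, x:x+p].ravel() for y, x in coords]), coords

def hash_seeded_matches(src, dst, p=4, n_bits=8, seed=0):
    """Nearest-neighbor field seeded by LSH-style random-projection hashing:
    patches that fall in the same bucket are compared directly, so similar-
    looking patches find each other without an exhaustive search."""
    sp, sc = patches(src, p)
    dp, dc = patches(dst, p)
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((p * p, n_bits))   # same projections for both images
    mean = np.vstack([sp, dp]).mean(axis=0)       # center patches before hashing
    bits = 1 << np.arange(n_bits)
    key = lambda v: int(((v - mean) @ proj > 0) @ bits)
    buckets = {}
    for j, v in enumerate(dp):                    # index destination patches by hash
        buckets.setdefault(key(v), []).append(j)
    matches = []
    for i, v in enumerate(sp):                    # match each source patch in-bucket
        cand = buckets.get(key(v), [rng.integers(len(dp))])  # fallback: random seed
        best = min(cand, key=lambda j: np.sum((v - dp[j]) ** 2))
        matches.append((sc[i], dc[best]))
    return matches
```

On identical images every patch hashes into the same bucket as its twin and matches itself with zero error, which is exactly the behavior the hashing seed is meant to give cheaply.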

197 citations


Proceedings ArticleDOI
16 Oct 2011
TL;DR: Pause-and-Play is presented, a system that helps users work along with existing video tutorials by using computer vision to detect events in existing videos and leverages application scripting APIs to obtain real time usage traces.
Abstract: Video tutorials provide a convenient means for novices to learn new software applications. Unfortunately, staying in sync with a video while trying to use the target application at the same time requires users to repeatedly switch from the application to the video to pause or scrub backwards to replay missed steps. We present Pause-and-Play, a system that helps users work along with existing video tutorials. Pause-and-Play detects important events in the video and links them with corresponding events in the target application as the user tries to replicate the depicted procedure. This linking allows our system to automatically pause and play the video to stay in sync with the user. Pause-and-Play also supports convenient video navigation controls that are accessible from within the target application and allow the user to easily replay portions of the video without switching focus out of the application. Finally, since our system uses computer vision to detect events in existing videos and leverages application scripting APIs to obtain real time usage traces, our approach is largely independent of the specific target application and does not require access or modifications to application source code. We have implemented Pause-and-Play for two target applications, Google SketchUp and Adobe Photoshop, and we report on a user study that shows our system improves the user experience of working with video tutorials.
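The core control loop described above (detect whether the user has caught up with the tutorial, then pause or resume the video) can be sketched minimally. This is a hypothetical toy, not the paper's system: the real Pause-and-Play links vision-detected video events with application scripting-API traces, while here "in sync" is reduced to a mean-absolute-difference frame comparison with an assumed tolerance.

```python
import numpy as np

def frames_match(video_frame, app_screenshot, tol=10.0):
    """Hypothetical event check: consider the user 'in sync' when the current
    application screenshot is close (mean absolute pixel difference) to the
    video frame at the next important event."""
    diff = np.abs(video_frame.astype(float) - app_screenshot.astype(float))
    return float(diff.mean()) < tol

def playback_action(in_sync, paused):
    """Pause the video when the user falls behind; resume when they catch up."""
    if in_sync and paused:
        return "play"
    if not in_sync and not paused:
        return "pause"
    return "noop"
```

The design point the abstract makes is that both signals (video events and application traces) are obtained without touching the application's source code, so the pause/play decision itself can stay this simple.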

155 citations


Proceedings ArticleDOI
06 Nov 2011
TL;DR: A novel method for retargeting a pair of stereo images that minimizes the visual distortion in each of the images as well as the depth distortion and guarantees that the retargeted pair is geometrically consistent with a feasible 3D scene, similar to the original one.
Abstract: Image retargeting algorithms attempt to adapt the image content to the screen without distorting the important objects in the scene. Existing methods address retargeting of a single image. In this paper we propose a novel method for retargeting a pair of stereo images. Naively retargeting each image independently will distort the geometric structure and make it impossible to perceive the 3D structure of the scene. We show how to extend a single image seam carving to work on a pair of images. Our method minimizes the visual distortion in each of the images as well as the depth distortion. A key property of the proposed method is that it takes into account the visibility relations between pixels in the image pair (occluded and occluding pixels). As a result, our method guarantees, as we formally prove, that the retargeted pair is geometrically consistent with a feasible 3D scene, similar to the original one. Hence, the retargeted stereo pair can be viewed on a stereoscopic display or processed by any computer vision algorithm. We demonstrate our method on a number of challenging indoor and outdoor stereo images.
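The method extends single-image seam carving to a coupled stereo pair; the single-image primitive it builds on is a dynamic-programming search for the minimum-energy vertical seam, sketched below. This shows only the base operation, not the paper's coupled formulation or its geometric-consistency guarantee.

```python
import numpy as np

def vertical_seam(energy):
    """Minimum-cost vertical seam (one column index per row) by dynamic
    programming: each pixel's cumulative cost adds the cheapest of its
    three upper neighbors, then the seam is recovered by backtracking."""
    H, W = energy.shape
    cost = energy.astype(float).copy()
    for y in range(1, H):
        left  = np.r_[np.inf, cost[y - 1, :-1]]   # upper-left neighbor
        right = np.r_[cost[y - 1, 1:], np.inf]    # upper-right neighbor
        cost[y] += np.minimum(np.minimum(left, cost[y - 1]), right)
    seam = [int(np.argmin(cost[-1]))]             # cheapest endpoint, bottom row
    for y in range(H - 2, -1, -1):                # backtrack upward
        x = seam[-1]
        lo, hi = max(0, x - 1), min(W, x + 2)
        seam.append(lo + int(np.argmin(cost[y, lo:hi])))
    return seam[::-1]
```

In the stereo setting, removing a seam independently from each image would break the disparity between corresponding pixels; coupling the two seams (and accounting for occluded/occluding pixels) is what lets the paper prove the retargeted pair remains consistent with a feasible 3D scene.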

68 citations


Proceedings ArticleDOI
20 Jun 2011
TL;DR: A new tracker is proposed, dubbed Scatter Tracker, that can efficiently deal with scattered occlusion and is based on a new similarity measure between images that combines order statistics with a spatial prior that forces the order statistics to work on non-overlapping patches.
Abstract: Scattered occlusion is an occlusion that is not localized in space or time. It occurs because of heavy smoke, rain, snow and fog, as well as tree branches and leaves, or any other thick flora for that matter. As a result, we cannot assume that there is correlation in the visibility of nearby pixels. We propose a new tracker, dubbed Scatter Tracker, that can efficiently deal with this type of occlusion. Our tracker is based on a new similarity measure between images that combines order statistics with a spatial prior that forces the order statistics to work on non-overlapping patches. We analyze the probability of detection, and false detection, of our tracker and show that it can be modeled as a sequence of independent Bernoulli trials on pixel similarity. In addition, to handle appearance variations of the tracked target, an appearance model update scheme based on an incremental-PCA procedure is incorporated into the tracker. We show that the combination of order statistics and spatial prior greatly enhances the quality of our tracker and demonstrate its effectiveness on a number of challenging video sequences.
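The combination of order statistics with a non-overlapping-patch spatial prior can be illustrated with a minimal sketch. This is an assumption-laden toy, not the paper's measure: it splits two windows into non-overlapping patches, computes a per-patch sum of squared differences, and returns the k-th smallest value, so a few patches corrupted by scattered occlusion simply do not contribute.

```python
import numpy as np

def scatter_similarity(template, candidate, p=4, k=None):
    """Robust distance between two equally sized grayscale windows:
    split both into non-overlapping p x p patches, compute per-patch SSD,
    and return the k-th smallest value (an order statistic) so occluded
    patches are ignored rather than dominating the score."""
    H, W = template.shape
    dists = []
    for y in range(0, H - p + 1, p):          # non-overlapping grid
        for x in range(0, W - p + 1, p):
            d = template[y:y+p, x:x+p] - candidate[y:y+p, x:x+p]
            dists.append(float(np.sum(d * d)))
    dists.sort()
    if k is None:
        k = len(dists) // 2                   # default: median patch distance
    return dists[k]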

2 citations


Book ChapterDOI
Shai Avidan1
15 Jun 2011
TL;DR: Two extensions of AdaBoost are discussed: Ensemble Tracking, which extends AdaBoost in the temporal domain and adapts it to tracking an object in a video sequence, and SpatialBoost, which extends it in the spatial domain and adapts it to interactive image segmentation.
Abstract: Ensemble methods offer an elegant way of training an ensemble of weak classifiers into a strong classifier through the use of the AdaBoost algorithm. In this abstract we discuss two extensions of AdaBoost and demonstrate them on two problems in the field of Computer Vision. The first, termed Ensemble Tracking, extends AdaBoost in the temporal domain and adapts it to the problem of tracking an object in a video sequence. The second, termed SpatialBoost, extends AdaBoost in the spatial domain and adapts it to the problem of interactive image segmentation.
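Both extensions build on the basic AdaBoost step of combining weak classifiers into a strong one. A minimal sketch of that primitive, using threshold stumps (an illustrative choice, not the weak learners used in the papers):

```python
import numpy as np

def adaboost_stumps(X, y, rounds=10):
    """Minimal AdaBoost with threshold stumps; labels y must be in {-1, +1}.
    Each round picks the stump with lowest weighted error, then re-weights
    the examples so the next stump focuses on the mistakes."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)
    ensemble = []                                  # (feature, threshold, polarity, alpha)
    for _ in range(rounds):
        best = None
        for f in range(d):                         # exhaustive stump search
            for thr in np.unique(X[:, f]):
                for pol in (1, -1):
                    pred = np.where(X[:, f] * pol > thr * pol, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, f, thr, pol, pred)
        err, f, thr, pol, pred = best
        err = min(max(err, 1e-10), 1 - 1e-10)      # avoid log of 0
        alpha = 0.5 * np.log((1 - err) / err)      # weak classifier's vote weight
        w *= np.exp(-alpha * y * pred)             # up-weight the mistakes
        w /= w.sum()
        ensemble.append((f, thr, pol, alpha))
    return ensemble

def predict(ensemble, X):
    """Strong classifier: sign of the alpha-weighted vote of all stumps."""
    score = sum(a * np.where(X[:, f] * p > t * p, 1, -1)
                for f, t, p, a in ensemble)
    return np.sign(score)
```

Ensemble Tracking re-runs this training loop over time, treating tracking as per-frame classification of pixels into object and background, while SpatialBoost adds weak hypotheses defined over spatial neighborhoods; both reuse the weighted-vote combination shown here.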

2 citations