
Showing papers by "Shai Avidan" published in 2009


Journal ArticleDOI
27 Jul 2009
TL;DR: A new image similarity measure, termed Bi-Directional Warping (BDW), is defined and used with a dynamic programming algorithm to find an optimal path in the resizing space, where a path defines a sequence of operations to retarget media.
Abstract: Content-aware resizing has gained popularity lately, and users can now choose from a battery of methods to retarget their media. However, no single retargeting operator performs well on all images and all target sizes. In a user study we conducted, we found that users prefer to combine seam carving with cropping and scaling to produce results they are satisfied with. This inspires us to propose an algorithm that combines different operators in an optimal manner. We define a resizing space as a conceptual multi-dimensional space combining several resizing operators, and show how a path in this space defines a sequence of operations to retarget media. We define a new image similarity measure, which we term Bi-Directional Warping (BDW), and use it with a dynamic programming algorithm to find an optimal path in the resizing space. In addition, we show a simple and intuitive user interface that allows users to explore the resizing space of various image sizes interactively. Using key-frames and interpolation, we also extend our technique to retarget video, providing the flexibility to use the best combination of operators at different times in the sequence.
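
The idea of a path through a resizing space can be illustrated with a toy dynamic program. This is only a sketch under strong assumptions, not the paper's algorithm: the BDW measure is not reproduced, and `step_cost` is a hypothetical stand-in that is assumed to depend only on how many unit steps of each operator have been applied so far. The function name and parameters are illustrative.

```python
def best_operator_sequence(num_ops, total_steps, step_cost):
    """Toy DP over a lattice of operator counts.

    A state is a tuple holding how many unit steps of each operator have
    been applied. step_cost(counts, op) is a hypothetical callback that
    returns the cost of applying operator `op` once at state `counts`.
    Returns (cost, sequence) for the cheapest way to take `total_steps`.
    """
    start = (0,) * num_ops
    layer = {start: (0.0, [])}  # state -> (best cost, operator sequence)
    for _ in range(total_steps):
        next_layer = {}
        for counts, (cost, seq) in layer.items():
            for op in range(num_ops):
                new_counts = counts[:op] + (counts[op] + 1,) + counts[op + 1:]
                new_cost = cost + step_cost(counts, op)
                # Keep only the cheapest path reaching each lattice point.
                if new_counts not in next_layer or new_cost < next_layer[new_counts][0]:
                    next_layer[new_counts] = (new_cost, seq + [op])
        layer = next_layer
    # Every state in the final layer has exactly total_steps steps applied.
    return min(layer.values(), key=lambda item: item[0])


def example_cost(counts, op):
    # Hypothetical costs: operator 0 is cheap for two steps and then
    # expensive; operator 1 has a constant moderate cost.
    if op == 0:
        return 1 if counts[0] < 2 else 10
    return 3


cost, sequence = best_operator_sequence(2, 4, example_cost)
```

With these made-up costs, the cheapest plan mixes both operators rather than using either one alone, which mirrors the motivation stated in the abstract.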

404 citations


Journal ArticleDOI
TL;DR: It is shown that computing a seam reduces to a dynamic programming problem for images and a graph min-cut search for video, and several image and video operations can be recast as a successive operation of the seam carving operator.
Abstract: Traditional image resizing techniques are oblivious to the content of the image when changing its width or height. In contrast, media (i.e., image and video) retargeting takes content into account. For example, one would like to change the aspect ratio of a video without making human figures look too fat or too skinny, or change the size of an image by automatically removing "unnecessary" portions while keeping the "important" features intact. We propose a simple operator, which we term seam carving, to support image and video retargeting. A seam is an optimal 1D path of pixels in an image, or a 2D manifold in a video cube, going from top to bottom, or left to right. Optimality is defined by minimizing an energy function that assigns costs to pixels. We show that computing a seam reduces to a dynamic programming problem for images and a graph min-cut search for video. We demonstrate that several image and video operations, such as aspect ratio correction, size change, and object removal, can be recast as a successive operation of the seam carving operator.
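
The image case described above can be sketched as a short dynamic program. This is a minimal illustration, assuming a precomputed per-pixel energy grid (the abstract does not specify the energy function) and a seam that moves at most one column between adjacent rows. The function names are illustrative, and the video min-cut case is not covered.

```python
def find_vertical_seam(energy):
    """Return one column index per row for a minimum-energy vertical seam.

    energy is a list of rows of per-pixel costs (assumed to be given).
    """
    h, w = len(energy), len(energy[0])
    cost = [[float(e) for e in energy[0]]]
    for y in range(1, h):
        prev = cost[-1]
        row = []
        for x in range(w):
            # Cheapest of the up-left, up, and up-right neighbours.
            best_above = min(prev[max(x - 1, 0):min(x + 2, w)])
            row.append(energy[y][x] + best_above)
        cost.append(row)
    # Backtrack from the cheapest cell in the bottom row.
    x = min(range(w), key=cost[-1].__getitem__)
    seam = [x]
    for y in range(h - 2, -1, -1):
        lo, hi = max(x - 1, 0), min(x + 2, w)
        x = min(range(lo, hi), key=cost[y].__getitem__)
        seam.append(x)
    seam.reverse()
    return seam


def remove_vertical_seam(image, seam):
    """Delete one pixel per row, narrowing the image by one column."""
    return [row[:x] + row[x + 1:] for row, x in zip(image, seam)]


energy = [[1, 9, 9],
          [9, 1, 9],
          [9, 9, 1]]
seam = find_vertical_seam(energy)
narrower = remove_vertical_seam([[0, 1, 2], [3, 4, 5], [6, 7, 8]], seam)
```

Repeating the find-and-remove step reduces the width one column at a time. Horizontal seams follow by applying the same functions to the transposed image.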

98 citations


Proceedings ArticleDOI
01 Sep 2009
TL;DR: This work proposes a new mode detection step that greatly accelerates performance and adopts a particular extension of the median, known as the Tukey median, and shows that it can be computed efficiently using random projections of the high dimensional data onto 1D lines, just like LSH, leading to a tightly integrated and efficient algorithm.
Abstract: Median-shift is a mode seeking algorithm that relies on computing the median of local neighborhoods, instead of the mean. We further combine median-shift with Locality Sensitive Hashing (LSH) and show that the combined algorithm is suitable for clustering large scale, high dimensional data sets. In particular, we propose a new mode detection step that greatly accelerates performance. In the past, LSH was used in conjunction with mean shift only to accelerate nearest neighbor queries. Here we show that we can analyze the density of the LSH bins to quickly detect potential mode candidates and use only them to initialize the median-shift procedure. We use the median, instead of the mean (or its discrete counterpart - the medoid) because the median is more robust and because the median of a set is a point in the set. A median is well defined for scalars but there is no single agreed upon extension of the median to high dimensional data. We adopt a particular extension, known as the Tukey median, and show that it can be computed efficiently using random projections of the high dimensional data onto 1D lines, just like LSH, leading to a tightly integrated and efficient algorithm.
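
One common way to approximate a Tukey median with random 1D projections is sketched below. It is a generic approximation under stated assumptions, not necessarily the procedure in the paper: the halfspace depth of each point is estimated as the smallest one-sided rank over a fixed number of random directions, and the deepest sample point is returned. The function name, `num_directions`, and `seed` are illustrative, and the LSH-based mode detection step is not included.

```python
import random


def approx_tukey_median(points, num_directions=50, seed=0):
    """Return the sample point with the largest estimated halfspace depth."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    depth = [n] * n
    for _ in range(num_directions):
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        proj = [sum(d * c for d, c in zip(direction, p)) for p in points]
        order = sorted(range(n), key=proj.__getitem__)
        for rank, idx in enumerate(order):
            # Number of points on the smaller side along this direction,
            # counting the point itself.
            one_sided = min(rank + 1, n - rank)
            depth[idx] = min(depth[idx], one_sided)
    deepest = max(range(n), key=depth.__getitem__)
    return points[deepest]


points = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1), (100, 100)]
center = approx_tukey_median(points)
```

The result is always one of the input points, and a single distant outlier such as `(100, 100)` does not pull it away from the bulk of the data the way it would pull the mean.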

41 citations


Book ChapterDOI
01 Jan 2009
TL;DR: The fusion of a novel cryptographic protocol and recent advances in computer vision results in a secure and efficient protocol for image matching, which uses a secure fuzzy match of strings and sets as its building block.
Abstract: Video surveillance is an intrusive operation that violates privacy. It is therefore desirable to devise surveillance protocols that minimize or even eliminate privacy intrusion. A principled way of doing so is to resort to Secure Multi-Party methods, which are provably secure, and adapt them to various vision algorithms. In this chapter, we describe an Oblivious Image Matching protocol, which is a secure protocol for image matching. Image matching is a generalization of detection and recognition tasks, since detection can be viewed as matching a particular image to a given object class (i.e., does this image contain a face?) while recognition can be viewed as matching an image of a particular instance of a class to another image of the same instance (i.e., does this image contain a particular car?). Instead of applying Oblivious Image Matching to the entire image, one can apply it to various sub-images, thus solving the localization problem (i.e., where is the gun in the image?). A leading approach to object detection and recognition is the bag-of-features approach, where each object is reduced to a set of features and matching objects is reduced to matching their corresponding sets of features. Oblivious Image Matching uses a secure fuzzy match of strings and sets as its building block. In the proposed protocol, two parties, Alice and Bob, wish to match their images without leaking additional information. We use a novel cryptographic protocol for fuzzy matching and adapt it to the bag-of-features approach. Fuzzy matching compares two sets (or strings) and declares them to match if a certain percentage of their elements match. To apply fuzzy matching to images, we represent images as a set of visual words that can be fed to the secure fuzzy matching protocol. The fusion of a novel cryptographic protocol and recent advances in computer vision results in a secure and efficient protocol for image matching. Experiments on real images are presented.
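
The matching rule itself, stripped of all cryptography, can be sketched in a few lines. This plaintext version offers no privacy and only illustrates the function that a secure protocol would evaluate. The normalization by the larger set, the default threshold, and the integer visual-word IDs are assumptions made for the example.

```python
def fuzzy_set_match(words_a, words_b, threshold=0.75):
    """Declare a match if enough visual words are shared.

    Plaintext illustration only: both parties' words are visible here,
    which is exactly what the secure protocol is meant to avoid.
    """
    set_a, set_b = set(words_a), set(words_b)
    if not set_a or not set_b:
        return False
    shared = len(set_a & set_b)
    # Assumed normalization: fraction of the larger set that is shared.
    return shared / max(len(set_a), len(set_b)) >= threshold


alice = [3, 17, 42, 58, 99]     # hypothetical visual-word IDs
bob_similar = [3, 17, 42, 58, 7]
bob_different = [3, 17, 1, 2, 5]

similar = fuzzy_set_match(alice, bob_similar)
different = fuzzy_set_match(alice, bob_different)
```

Here `bob_similar` shares four of five words with `alice` and passes the threshold, while `bob_different` shares only two and does not.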

5 citations