Showing papers by "Shai Avidan" published in 2008


Journal ArticleDOI
01 Aug 2008
TL;DR: This work replaces the dynamic programming method of seam carving with graph cuts that are suitable for 3D volumes and presents a novel energy criterion that improves the visual quality of the retargeted images and videos.
Abstract: Video, like images, should support content aware resizing. We present video retargeting using an improved seam carving operator. Instead of removing 1D seams from 2D images we remove 2D seam manifolds from 3D space-time volumes. To achieve this we replace the dynamic programming method of seam carving with graph cuts that are suitable for 3D volumes. In the new formulation, a seam is given by a minimal cut in the graph and we show how to construct a graph such that the resulting cut is a valid seam. That is, the cut is monotonic and connected. In addition, we present a novel energy criterion that improves the visual quality of the retargeted images and videos. The original seam carving operator is focused on removing seams with the least amount of energy, ignoring energy that is introduced into the images and video by applying the operator. To counter this, the new criterion is looking forward in time - removing seams that introduce the least amount of energy into the retargeted result. We show how to encode the improved criterion into graph cuts (for images and video) as well as dynamic programming (for images). We apply our technique to images and videos and present results of various applications.
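As a rough illustration of the dynamic programming variant for images mentioned in the abstract, the sketch below finds one vertical seam using a forward-looking cost. It assumes a grayscale image stored as a list of rows, and it assumes the cost of removing a pixel is the sum of absolute intensity differences between the pixels that become neighbors afterward. The function names `find_vertical_seam` and `remove_seam` are hypothetical, and this is not the graph-cut construction used for video.

```python
def find_vertical_seam(img):
    """Return one column index per row, forming a connected vertical seam.

    Assumed forward cost: removing pixel (i, j) makes its left and right
    neighbors adjacent, and a diagonal step also creates a new vertical
    neighbor pair. Each new pair is charged its absolute difference.
    """
    h, w = len(img), len(img[0])
    cost = [[0.0] * w for _ in range(h)]
    back = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            left = img[i][max(j - 1, 0)]        # replicate the border pixel
            right = img[i][min(j + 1, w - 1)]
            c_up = abs(right - left)            # new horizontal neighbors
            if i == 0:
                cost[i][j] = c_up
                continue
            up = img[i - 1][j]
            best, best_j = cost[i - 1][j] + c_up, j
            if j > 0:                           # seam arrives from upper left
                c = cost[i - 1][j - 1] + c_up + abs(up - left)
                if c < best:
                    best, best_j = c, j - 1
            if j < w - 1:                       # seam arrives from upper right
                c = cost[i - 1][j + 1] + c_up + abs(up - right)
                if c < best:
                    best, best_j = c, j + 1
            cost[i][j] = best
            back[i][j] = best_j
    # Backtrack from the cheapest pixel in the last row.
    j = min(range(w), key=lambda c: cost[h - 1][c])
    seam = [0] * h
    for i in range(h - 1, -1, -1):
        seam[i] = j
        j = back[i][j]
    return seam


def remove_seam(img, seam):
    """Delete the seam pixel from every row, shrinking the width by one."""
    return [row[:j] + row[j + 1:] for row, j in zip(img, seam)]
```

Because each row's seam column differs from the previous row's by at most one, the result is monotonic and connected, which is the validity condition the abstract describes for seams.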

775 citations


Proceedings ArticleDOI
23 Jun 2008
TL;DR: The patch transform is introduced, where an image is broken into non-overlapping patches, and modifications or constraints are applied in the “patch domain”, and a modified image is reconstructed from the patches, subject to those constraints.
Abstract: We introduce the patch transform, where an image is broken into non-overlapping patches, and modifications or constraints are applied in the “patch domain”. A modified image is then reconstructed from the patches, subject to those constraints. When no constraints are given, the reconstruction problem reduces to solving a jigsaw puzzle. Constraints the user may specify include the spatial locations of patches, the size of the output image, or the pool of patches from which an image is reconstructed. We define terms in a Markov network to specify a good image reconstruction from patches: neighboring patches must fit to form a plausible image, and each patch should be used only once. We find an approximate solution to the Markov network using loopy belief propagation, introducing an approximation to handle the combinatorially difficult patch exclusion constraint. The resulting image reconstructions show the original image, modified to respect the user's changes. We apply the patch transform to various image editing tasks and show that the algorithm performs well on real world images.

174 citations


Journal ArticleDOI
01 Aug 2008
TL;DR: This work proposes a white balance technique for scenes with two light types that are specified by the user, which can neutralize the light colors and render visually pleasing images and can be used to achieve post-exposure relighting effects.
Abstract: White balance is a crucial step in the photographic pipeline. It ensures the proper rendition of images by eliminating color casts due to differing illuminants. Digital cameras and editing programs provide white balance tools that assume a single type of light per image, such as daylight. However, many photos are taken under mixed lighting. We propose a white balance technique for scenes with two light types that are specified by the user. This covers many typical situations involving indoor/outdoor or flash/ambient light mixtures. Since we work from a single image, the problem is highly underconstrained. Our method recovers a set of dominant material colors which allows us to estimate the local intensity mixture of the two light types. Using this mixture, we can neutralize the light colors and render visually pleasing images. Our method can also be used to achieve post-exposure relighting effects.
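The final neutralization step can be sketched as follows, assuming the per-pixel mixture of the two lights has already been estimated (the hard part of the problem, which is not shown here). The sketch assumes the effective illuminant at a pixel is a linear blend of the two light colors and that neutralizing means dividing each channel by that blend. The function name `neutralize_pixel` and its parameters are hypothetical.

```python
def neutralize_pixel(rgb, alpha, light_a, light_b):
    """Remove the color cast of a two-light mixture from one RGB pixel.

    alpha is the assumed fraction of light_a at this pixel (0..1); the
    remaining 1 - alpha is attributed to light_b. Light colors are RGB
    triples with strictly positive channels.
    """
    mixed = [alpha * a + (1.0 - alpha) * b for a, b in zip(light_a, light_b)]
    return [c / m for c, m in zip(rgb, mixed)]


# Example: a white surface lit 25% by a warm light and 75% by a cool light.
warm = (1.0, 0.8, 0.6)
cool = (0.6, 0.8, 1.0)
observed = (0.7, 0.8, 0.9)
balanced = neutralize_pixel(observed, 0.25, warm, cool)
```

In this example the observed color equals the blended light color, so the balanced pixel comes out neutral, with all three channels approximately equal to 1.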

151 citations


Proceedings ArticleDOI
23 Jun 2008
TL;DR: A system for generating “infinite” images from large collections of photos by means of transformed image retrieval, which represents images in the database as a graph where each node is an image and different types of edges correspond to different types of geometric transformations simulating different camera motions.
Abstract: We present a system for exploring large collections of photos in a virtual 3D space. Our system does not assume the photographs are of a single real 3D location, nor that they were taken at the same time. Instead, we organize the photos in themes, such as city streets or skylines, and let users navigate within each theme using intuitive 3D controls that include move left/right, zoom and rotate. Themes allow us to maintain a coherent semantic meaning of the tour, while visual similarity allows us to create a “being there” impression, as if the images were of a particular location. We present results on a collection of several million images downloaded from Flickr and broken into themes that consist of a few hundred thousand images each. A byproduct of our system is the ability to construct extremely long panoramas, as well as image taxi, a program that generates a virtual tour between user-supplied start and finish images. The system and its underlying technology can be used in a variety of applications such as games, movies and online virtual 3D spaces like Second Life.

73 citations


Proceedings ArticleDOI
15 Aug 2008
TL;DR: This work extends the l0-norm “subspectral” algorithms developed for sparse-LDA and sparse-PCA to more general quadratic costs such as MSE in linear (or kernel) regression and generalizes Natarajan's algorithm, also known as order-recursive matching pursuit.
Abstract: We extend the l0-norm “subspectral” algorithms developed for sparse-LDA (Moghaddam, 2006) and sparse-PCA (Moghaddam, 2006) to more general quadratic costs such as MSE in linear (or kernel) regression. The resulting “sparse least squares” (SLS) problem is also NP-hard, by way of its equivalence to a rank-1 sparse eigenvalue problem. Specifically, for minimizing general quadratic cost functions we use a highly efficient method for direct eigenvalue computation based on partitioned matrix inverse techniques that leads to ×10^3 speed-ups over standard eigenvalue decomposition. This increased efficiency mitigates the O(n^4) complexity that limited the previous algorithms' utility for high-dimensional problems. Moreover, the new computation prioritizes the role of the less-myopic backward elimination stage which becomes even more efficient than forward selection. Similarly, branch-and-bound search for exact sparse least squares (ESLS) also benefits from partitioned matrix techniques. Our greedy sparse least squares (GSLS) algorithm generalizes Natarajan's algorithm (Natarajan, 1995) also known as order-recursive matching pursuit (ORMP). Specifically, the forward pass of GSLS is exactly equivalent to ORMP but is more efficient, and by including the backward pass, which only doubles the computation, we can achieve a lower MSE than ORMP. In experimental comparisons with LARS (Efron, 2004), forward-GSLS is shown to be not only more efficient and accurate but more flexible in terms of choice of regularization.
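To make the forward pass concrete, here is a minimal sketch of plain greedy forward selection for sparse least squares: at each step it adds the column that yields the lowest squared error after refitting. This is a naive illustration that refits from scratch through the normal equations; it does not use the partitioned matrix techniques or the backward elimination pass described in the abstract. It also assumes the candidate columns are linearly independent. All function names are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= f * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]


def subset_fit(X, y, cols):
    """Least-squares fit of y on the chosen columns of X.

    Returns (coefficients, squared error). Uses the normal equations,
    so the chosen columns are assumed to be linearly independent.
    """
    gram = [[sum(row[a] * row[b] for row in X) for b in cols] for a in cols]
    rhs = [sum(row[a] * yi for row, yi in zip(X, y)) for a in cols]
    w = solve(gram, rhs)
    err = sum(
        (yi - sum(wk * row[c] for wk, c in zip(w, cols))) ** 2
        for row, yi in zip(X, y)
    )
    return w, err


def forward_select(X, y, k):
    """Greedily pick k columns, each time adding the one that lowers the error most."""
    chosen = []
    for _ in range(k):
        candidates = [c for c in range(len(X[0])) if c not in chosen]
        best = min(candidates, key=lambda c: subset_fit(X, y, chosen + [c])[1])
        chosen.append(best)
    w, err = subset_fit(X, y, chosen)
    return chosen, w, err


# Example with orthogonal columns, where y = 2 * column 0 - column 2.
X = [[1, 1, 1], [1, -1, 1], [1, 1, -1], [1, -1, -1]]
y = [1, 1, 3, 3]
chosen, w, err = forward_select(X, y, 2)
```

On this small example the procedure selects columns 0 and 2 and recovers the coefficients 2 and -1. In general, greedy selection is not guaranteed to find the best subset, which is why the abstract also discusses a backward pass and an exact branch-and-bound search.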

24 citations


Proceedings ArticleDOI
23 Jun 2008
TL;DR: Boundary snapping allows the user to enforce hard constraints on the boundary directly, at the expense of moderate user labor in positioning the landmark points, and is fast, works on a variety of images, and handles situations where the boundary is not obvious.
Abstract: Boundary snapping is an interactive image cutout algorithm that requires a small number of user supplied control points, or landmarks, to infer the cutout contour. The key idea is to match the appearance of all points along the desired contour to the landmark points, where appearance is given by an intensity profile perpendicular to the boundary. An optimization process attempts to find a contour that maximizes the similarity score of its points with the landmarks. This approach works well in the typical case where the foreground and background differ in appearance, as well as in challenging cases where the subject is clearly perceived, but the regions on both sides of the boundary are similar and cannot be easily discriminated. By enabling the user to define the boundary points directly, the technique is not limited to boundaries that necessarily have to be the most salient or high gradient feature in the region. It can also be used for margin cutout around the boundary. The use of multiple control points along the boundary can handle spatially varying attributes as both foreground and background may change in appearance along the boundary. The final result is accurate, because it allows the user to enforce hard constraints on the boundary directly, at the expense of moderate user labor in positioning the landmark points. Finally, the algorithm is fast, works on a variety of images, and handles situations where the boundary is not obvious.

10 citations


Proceedings ArticleDOI
12 Dec 2008
TL;DR: Efficient and practical protocols are given for privacy-preserving pattern classification that allow a client to have his data classified by a server without revealing information to either party other than the classification result.
Abstract: We give efficient and practical protocols for privacy preserving pattern classification that allow a client to have his data classified by a server, without revealing information to either party, other than the classification result. We illustrate the advantages of such a framework on several real-world scenarios and give secure protocols for several classifiers.

6 citations