Author

Charles E. Jacobs

Other affiliations: University of Washington
Bio: Charles E. Jacobs is an academic researcher from Microsoft. The author has contributed to research in topics including layout engines and language models, has an h-index of 24, and has co-authored 37 publications receiving 4677 citations. Previous affiliations of Charles E. Jacobs include the University of Washington.

Papers
Proceedings ArticleDOI
01 Aug 2001
TL;DR: This paper describes a new framework for processing images by example, called “image analogies,” based on a simple multi-scale autoregression, inspired primarily by recent results in texture synthesis.
Abstract: This paper describes a new framework for processing images by example, called “image analogies.” The framework involves two stages: a design phase, in which a pair of images, with one image purported to be a “filtered” version of the other, is presented as “training data”; and an application phase, in which the learned filter is applied to some new target image in order to create an “analogous” filtered result. Image analogies are based on a simple multi-scale autoregression, inspired primarily by recent results in texture synthesis. By choosing different types of source image pairs as input, the framework supports a wide variety of “image filter” effects, including traditional image filters, such as blurring or embossing; improved texture synthesis, in which some textures are synthesized with higher quality than by previous approaches; super-resolution, in which a higher-resolution image is inferred from a low-resolution source; texture transfer, in which images are “texturized” with some arbitrary source texture; artistic filters, in which various drawing and painting styles are synthesized based on scanned real-world examples; and texture-by-numbers, in which realistic scenes, composed of a variety of textures, are created using a simple painting interface.
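
To make the core idea concrete, here is a minimal single-scale sketch of the example-based lookup: for each pixel of the new target B, find the pixel of the training image A whose neighborhood best matches, and copy the corresponding pixel of the filtered A'. The published algorithm is multi-scale, uses accelerated approximate search, and also matches against the partially synthesized output; this brute-force grayscale variant, with names of my own choosing, only illustrates the lookup.

```python
# Simplified single-scale "image analogies" lookup (illustration only).
import numpy as np

def analogy_filter(A, A_prime, B, radius=2):
    """Grayscale images as 2-D float arrays; returns B' with B's shape."""
    k = 2 * radius + 1
    pad = lambda img: np.pad(img, radius, mode="edge")
    A_pad, B_pad = pad(A), pad(B)

    # Flatten every k x k neighborhood of A into a row of a feature matrix.
    h, w = A.shape
    feats = np.empty((h * w, k * k))
    for i in range(h):
        for j in range(w):
            feats[i * w + j] = A_pad[i:i + k, j:j + k].ravel()

    # For each pixel of B, copy the A' pixel whose A-neighborhood matches best.
    B_prime = np.empty_like(B, dtype=float)
    for i in range(B.shape[0]):
        for j in range(B.shape[1]):
            q = B_pad[i:i + k, j:j + k].ravel()
            best = np.argmin(((feats - q) ** 2).sum(axis=1))
            B_prime[i, j] = A_prime[best // w, best % w]
    return B_prime
```

Brute force makes the search linear in the size of A per output pixel; the paper's multi-scale structure and accelerated nearest-neighbor search are what make the approach practical.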

1,794 citations

Proceedings ArticleDOI
15 Sep 1995
TL;DR: An “image querying metric” is introduced that compares how many significant wavelet coefficients the query has in common with potential targets; it includes parameters that can be tuned, using a statistical analysis, to accommodate the kinds of image distortions found in different types of image queries.
Abstract: We present a method for searching in an image database using a query image that is similar to the intended target. The query image may be a hand-drawn sketch or a (potentially low-quality) scan of the image to be retrieved. Our searching algorithm makes use of multiresolution wavelet decompositions of the query and database images. The coefficients of these decompositions are distilled into small “signatures” for each image. We introduce an “image querying metric” that operates on these signatures. This metric essentially compares how many significant wavelet coefficients the query has in common with potential targets. The metric includes parameters that can be tuned, using a statistical analysis, to accommodate the kinds of image distortions found in different types of image queries. The resulting algorithm is simple, requires very little storage overhead for the database of signatures, and is fast enough to be performed on a database of 20,000 images at interactive rates (on standard desktop machines) as a query is sketched. Our experiments with hundreds of queries in databases of 1000 and 20,000 images show dramatic improvement, in both speed and success rate, over using a conventional L1, L2, or color histogram norm.
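
As a rough illustration of the signature idea, the sketch below computes a 2-D Haar decomposition, keeps only the positions and signs of the m largest-magnitude coefficients, and scores a candidate by counting signed-coefficient agreements with the query. The published metric additionally weights matches by frequency-band bins tuned through the statistical analysis the abstract mentions; that weighting is omitted here, and the function names are mine.

```python
# Simplified wavelet-signature matching (illustration only).
import numpy as np

def haar2d(img):
    """Nonstandard 2-D Haar decomposition of a square 2^n x 2^n array."""
    out = img.astype(float)
    n = out.shape[0]
    while n > 1:
        half = n // 2
        # Rows: averages go to the left half, differences to the right.
        a = (out[:n, 0:n:2] + out[:n, 1:n:2]) / 2.0
        d = (out[:n, 0:n:2] - out[:n, 1:n:2]) / 2.0
        out[:n, :half], out[:n, half:n] = a, d
        # Columns: same averaging/differencing vertically.
        a = (out[0:n:2, :n] + out[1:n:2, :n]) / 2.0
        d = (out[0:n:2, :n] - out[1:n:2, :n]) / 2.0
        out[:half, :n], out[half:n, :n] = a, d
        n = half
    return out

def signature(img, m=60):
    """Positions and signs of the m largest non-DC Haar coefficients."""
    coeffs = haar2d(img).ravel()
    idx = np.argsort(-np.abs(coeffs[1:]))[:m] + 1   # skip the DC term
    return {int(i): np.sign(coeffs[i]) for i in idx}

def match_score(query_sig, target_sig):
    """Count signed-coefficient agreements; higher means more similar."""
    return sum(1 for i, s in query_sig.items() if target_sig.get(i) == s)
```

Because each signature is just m positions and signs, the database overhead is tiny and scoring is a dictionary intersection, which is consistent with the interactive rates the abstract reports.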

832 citations

Patent
02 May 2005
TL;DR: In this patent, a technique is described for creating a 3D face model using images obtained from an inexpensive camera associated with a general-purpose computer; the user identifies five facial features, which are used to calculate a mask and to perform fitting operations.
Abstract: Described herein is a technique for creating a 3D face model using images obtained from an inexpensive camera associated with a general-purpose computer. Two still images of the user are captured, along with two video sequences. The user is asked to identify five facial features, which are used to calculate a mask and to perform fitting operations. Based on a comparison of the still images, deformation vectors are applied to a neutral face model to create the 3D model. The video sequences are used to create a texture map. The process of creating the texture map references the previously obtained 3D model to determine the poses of the sequential video images.
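
Purely to illustrate the deformation step, here is a loose numpy sketch under an assumed representation the patent does not specify: the neutral face is an (n, 3) vertex array, and a bank of per-vertex deformation vectors is blended by coefficients estimated elsewhere (e.g. from the comparison of the two still images).

```python
# Hypothetical blend of deformation vectors onto a neutral face mesh.
import numpy as np

def deform_face(neutral_vertices, deformation_bank, coefficients):
    """neutral_vertices: (n, 3); deformation_bank: (k, n, 3);
    coefficients: (k,). Returns the deformed (n, 3) vertex array."""
    offsets = np.tensordot(coefficients, deformation_bank, axes=1)
    return neutral_vertices + offsets
```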

323 citations

Proceedings ArticleDOI
01 Jul 2003
TL;DR: This work presents a new approach to adaptive grid-based document layout, in which grid-based templates know how to adapt to a range of page sizes and other viewing conditions, and describes an XML-based representation for templates and content that maintains a clean separation between the two.
Abstract: Grid-based page designs are ubiquitous in commercially printed publications, such as newspapers and magazines. Yet, to date, no one has invented a good way to easily and automatically adapt such designs to arbitrarily-sized electronic displays. The difficulty of generalizing grid-based designs explains the generally inferior nature of on-screen layouts when compared to their printed counterparts, and is arguably one of the greatest remaining impediments to creating on-line reading experiences that rival those of ink on paper. In this work, we present a new approach to adaptive grid-based document layout, which attempts to bridge this gap. In our approach, an adaptive layout style is encoded as a set of grid-based templates that know how to adapt to a range of page sizes and other viewing conditions. These templates include various types of layout elements (such as text, figures, etc.) and define, through constraint-based relationships, just how these elements are to be laid out together as a function of both the properties of the content itself, such as a figure's size and aspect ratio, and the properties of the viewing conditions under which the content is being displayed. We describe an XML-based representation for our templates and content, which maintains a clean separation between the two. We also describe the various parts of our research prototype system: a layout engine for formatting the page; a paginator for determining a globally optimal allocation of content amongst the pages, as well as an optimal pairing of templates with content; and a graphical user interface for interactively creating adaptive templates. We also provide numerous examples demonstrating the capabilities of this prototype, including this paper, itself, which has been laid out with our system.
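
One plausible reading of the paginator, sketched below under assumptions of my own: treat pagination as a minimum-cost segmentation of the content stream, where each page tries every template and pays that template's badness for the content it receives, and let dynamic programming find the globally optimal allocation and template pairing at once. The badness functions here stand in for the paper's constraint-based layout engine.

```python
# Hypothetical globally-optimal paginator via dynamic programming.
from functools import lru_cache

def paginate(items, templates):
    """items: list of content blocks; templates: list of
    (capacity, badness) pairs, where badness(blocks) scores one page.
    Returns (total_badness, pages) with pages as (template_idx, blocks)."""
    n = len(items)

    @lru_cache(maxsize=None)
    def best(i):
        # Optimal layout of items[i:]; an empty remainder costs nothing.
        if i == n:
            return (0.0, ())
        candidates = []
        for t, (capacity, badness) in enumerate(templates):
            for take in range(1, min(capacity, n - i) + 1):
                blocks = tuple(items[i:i + take])
                cost, rest = best(i + take)
                candidates.append((badness(blocks) + cost,
                                   ((t, blocks),) + rest))
        return min(candidates, key=lambda c: c[0])

    return best(0)
```

For example, paginate(list("abcdef"), [(2, lambda b: abs(2 - len(b)))]) fills three two-block pages at zero total badness.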

206 citations

Patent
30 May 2002
TL;DR: In this patent, a system and method are presented for improving document layout on arbitrary devices of different resolutions and sizes using manifold representations of content: multiple versions of anything that might appear in a document, from text to images to stylistic conventions, are selected and formatted dynamically, on the fly, by a layout engine.
Abstract: A system and method for improving document layout on arbitrary devices of different resolutions and sizes using manifold representations of content. Manifold representations of content are multiple versions of anything that might appear in a document, from text, to images, to even such things as stylistic conventions. The specific content is selected and formatted dynamically, on the fly, by a layout engine in order to best adapt to a given viewing situation. A user interface for authoring and editing such manifold content is disclosed.
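
A hypothetical sketch of the selection step: each logical element carries several versions, and the layout engine picks the version that best fits the current viewing situation. The version records and fitness rule below are invented for illustration; the patent does not prescribe them.

```python
# Invented example of picking one version of a "manifold" content element.
def select_version(versions, viewport_width_px):
    """versions: list of dicts like {"min_width": 320, "payload": ...}.
    Pick the richest (widest-requirement) version that still fits."""
    fitting = [v for v in versions if v["min_width"] <= viewport_width_px]
    if not fitting:
        # Nothing fits: degrade gracefully to the least demanding version.
        return min(versions, key=lambda v: v["min_width"])
    return max(fitting, key=lambda v: v["min_width"])
```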

176 citations


Cited by
Proceedings ArticleDOI
21 Jul 2017
TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since the release of the pix2pix software associated with this paper, hundreds of Twitter users have posted their own artistic experiments using our system. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.
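
The learned objective the abstract alludes to combines a conditional GAN loss with an L1 reconstruction term, weighted by lambda = 100 in the paper. The PyTorch sketch below assumes a discriminator D(x, y) that scores input/output pairs; the network definitions themselves are omitted.

```python
# Sketch of the pix2pix training objective: cGAN loss + lambda * L1.
import torch
import torch.nn.functional as F

def generator_loss(D, x, y, fake_y, lam=100.0):
    """x: input batch, y: real target, fake_y = G(x)."""
    pred_fake = D(x, fake_y)                        # D sees the (input, output) pair
    adv = F.binary_cross_entropy_with_logits(
        pred_fake, torch.ones_like(pred_fake))      # try to fool the discriminator
    recon = F.l1_loss(fake_y, y)                    # stay close to ground truth
    return adv + lam * recon

def discriminator_loss(D, x, y, fake_y):
    pred_real = D(x, y)
    pred_fake = D(x, fake_y.detach())               # don't backprop into G here
    real = F.binary_cross_entropy_with_logits(
        pred_real, torch.ones_like(pred_real))
    fake = F.binary_cross_entropy_with_logits(
        pred_fake, torch.zeros_like(pred_fake))
    return 0.5 * (real + fake)
```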

11,958 citations

Proceedings ArticleDOI
01 Oct 2017
TL;DR: CycleGAN, as discussed by the authors, learns a mapping G : X → Y such that the distribution of images from G(X) is indistinguishable from the distribution Y, using an adversarial loss.
Abstract: Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. However, for many tasks, paired training data will not be available. We present an approach for learning to translate an image from a source domain X to a target domain Y in the absence of paired examples. Our goal is to learn a mapping G : X → Y such that the distribution of images from G(X) is indistinguishable from the distribution Y using an adversarial loss. Because this mapping is highly under-constrained, we couple it with an inverse mapping F : Y → X and introduce a cycle consistency loss to push F(G(X)) ≈ X (and vice versa). Qualitative results are presented on several tasks where paired training data does not exist, including collection style transfer, object transfiguration, season transfer, photo enhancement, etc. Quantitative comparisons against several prior methods demonstrate the superiority of our approach.
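
The cycle consistency term is easy to state in code: after a round trip X → Y → X (and Y → X → Y), an L1 penalty pushes the reconstruction back toward the original. A PyTorch sketch, with the two generators passed in as callables (the paper weights this term with lambda = 10):

```python
# Sketch of CycleGAN's cycle-consistency loss: F(G(x)) ~ x and G(F(y)) ~ y.
import torch.nn.functional as nnF

def cycle_consistency_loss(G, F, x, y, lam=10.0):
    forward = nnF.l1_loss(F(G(x)), x)    # F(G(x)) should reproduce x
    backward = nnF.l1_loss(G(F(y)), y)   # G(F(y)) should reproduce y
    return lam * (forward + backward)
```

This term is added to the two adversarial losses; without it, the under-constrained mapping could map all inputs to any convincing member of the target domain.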

11,682 citations

Posted Content
TL;DR: Conditional adversarial networks are proposed as a general-purpose solution to image-to-image translation problems that can be used to synthesize photos from label maps, reconstruct objects from edge maps, and colorize images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Indeed, since the release of the pix2pix software associated with this paper, a large number of internet users (many of them artists) have posted their own experiments with our system, further demonstrating its wide applicability and ease of adoption without the need for parameter tweaking. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.

11,127 citations

Journal ArticleDOI
TL;DR: The working conditions of content-based retrieval: patterns of use, types of pictures, the role of semantics, and the sensory gap are discussed, as well as aspects of system engineering: databases, system architecture, and evaluation.
Abstract: Presents a review of 200 references in content-based image retrieval. The paper starts with discussing the working conditions of content-based retrieval: patterns of use, types of pictures, the role of semantics, and the sensory gap. Subsequent sections discuss computational steps for image retrieval systems. Step one of the review is image processing for retrieval sorted by color, texture, and local geometry. Features for retrieval are discussed next, sorted by: accumulative and global features, salient points, object and shape features, signs, and structural combinations thereof. Similarity of pictures and objects in pictures is reviewed for each of the feature types, in close connection to the types and means of feedback the user of the systems is capable of giving by interaction. We briefly discuss aspects of system engineering: databases, system architecture, and evaluation. In the concluding section, we present our view on: the driving force of the field, the heritage from computer vision, the influence on computer vision, the role of similarity and of interaction, the need for databases, the problem of evaluation, and the role of the semantic gap.

6,447 citations

Proceedings ArticleDOI
27 Jun 2016
TL;DR: A Neural Algorithm of Artistic Style is introduced that can separate and recombine the image content and style of natural images, providing new insights into the deep image representations learned by convolutional neural networks and demonstrating their potential for high-level image synthesis and manipulation.
Abstract: Rendering the semantic content of an image in different styles is a difficult image processing task. Arguably, a major limiting factor for previous approaches has been the lack of image representations that explicitly represent semantic information and thus make it possible to separate image content from style. Here we use image representations derived from Convolutional Neural Networks optimised for object recognition, which make high-level image information explicit. We introduce A Neural Algorithm of Artistic Style that can separate and recombine the image content and style of natural images. The algorithm allows us to produce new images of high perceptual quality that combine the content of an arbitrary photograph with the appearance of numerous well-known artworks. Our results provide new insights into the deep image representations learned by Convolutional Neural Networks and demonstrate their potential for high-level image synthesis and manipulation.
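
The style representation at the heart of this method is the Gram matrix of CNN feature maps: correlations between channels capture style while discarding spatial arrangement. A PyTorch sketch follows; feature extraction from a pretrained network such as VGG is assumed to happen elsewhere, and the normalization shown is one common variant.

```python
# Sketch of the Gram-matrix style loss from neural style transfer.
import torch

def gram_matrix(features):
    """features: (channels, height, width) feature tensor from one layer."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.t() / (c * h * w)   # normalized channel correlations

def style_loss(generated_feats, style_feats):
    """Both arguments: lists of per-layer feature tensors."""
    return sum(
        torch.mean((gram_matrix(g) - gram_matrix(s)) ** 2)
        for g, s in zip(generated_feats, style_feats))
```

The full method optimizes the generated image against this style loss plus a content loss on raw feature activations, trading the two off to recombine content and style.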

4,888 citations