Author

James A. Ferwerda

Other affiliations: Cornell University
Bio: James A. Ferwerda is an academic researcher at the Rochester Institute of Technology. The author has contributed to research in topics such as rendering (computer graphics) and computer graphics. The author has an h-index of 24 and has co-authored 75 publications receiving 4,965 citations. Previous affiliations of James A. Ferwerda include Cornell University.


Papers
Proceedings ArticleDOI
01 Jul 2002
TL;DR: The work presented in this paper leverages the time-tested techniques of photographic practice to develop a new tone reproduction operator and uses and extends the techniques developed by Ansel Adams to deal with digital images.
Abstract: A classic photographic task is the mapping of the potentially high dynamic range of real world luminances to the low dynamic range of the photographic print. This tone reproduction problem is also faced by computer graphics practitioners who map digital images to a low dynamic range print or screen. The work presented in this paper leverages the time-tested techniques of photographic practice to develop a new tone reproduction operator. In particular, we use and extend the techniques developed by Ansel Adams to deal with digital images. The resulting algorithm is simple and produces good results for a wide variety of images.
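The heart of such a photographic operator can be sketched in a few lines: compute the scene's "key" as a log-average luminance, scale the image so that key lands on a chosen mid-grey (the analogue of Adams' zone placement), and compress with a sigmoid. The sketch below is a minimal illustration of that global step under those assumptions, not the paper's full operator (which also adds local, dodging-and-burning-style refinements); parameter names are illustrative rather than the paper's notation.

```python
import numpy as np

def photographic_tonemap(L_w, a=0.18, eps=1e-6):
    """Minimal global photographic tone-mapping sketch.

    L_w : array of world luminances; a : mid-grey "key" target.
    """
    # Log-average luminance of the scene (its "key").
    L_avg = np.exp(np.mean(np.log(L_w + eps)))
    # Scale the scene so its key maps to the chosen mid-grey.
    L_m = (a / L_avg) * L_w
    # Sigmoidal compression: highlights roll off smoothly toward 1.
    return L_m / (1.0 + L_m)
```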

1,708 citations

Proceedings ArticleDOI
01 Aug 1996
TL;DR: A computational model of visual adaptation for realistic image synthesis based on psychophysical experiments that captures the changes in threshold visibility, color appearance, visual acuity, and sensitivity over time that are caused by the visual system’s adaptation mechanisms.
Abstract: In this paper we develop a computational model of visual adaptation for realistic image synthesis based on psychophysical experiments. The model captures the changes in threshold visibility, color appearance, visual acuity, and sensitivity over time that are caused by the visual system’s adaptation mechanisms. We use the model to display the results of global illumination simulations illuminated at intensities ranging from daylight down to starlight. The resulting images better capture the visual characteristics of scenes viewed over a wide range of illumination levels. Because the model is based on psychophysical data it can be used to predict the visibility and appearance of scene features. This allows the model to be used as the basis of perceptually-based error metrics for limiting the precision of global illumination computations.
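One way to read this class of operator is as threshold matching: choose a display value whose just-noticeable difference corresponds to the world value's just-noticeable difference at the current adaptation level. The sketch below shows only that skeleton (after Ward's 1994 operator, which this paper generalizes); the placeholder threshold-vs-intensity curve is an assumption of this sketch, whereas the paper fits separate psychophysically measured rod and cone curves and adds acuity and time-course effects.

```python
import numpy as np

def tvi(L_a):
    # Placeholder threshold-vs-intensity curve: square-root behaviour at
    # low adaptation luminance, Weber-like at high. The paper instead
    # uses separate psychophysically fitted rod and cone functions.
    return 0.02 * np.where(L_a > 1.0, L_a, np.sqrt(np.maximum(L_a, 0.0)))

def adaptation_scale(L_world_adapt, L_display_adapt):
    # Threshold matching: a just-visible world contrast should remain
    # just visible on the display at its (much lower) adaptation level.
    return tvi(L_display_adapt) / tvi(L_world_adapt)
```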

489 citations

Proceedings ArticleDOI
24 Jul 1998
TL;DR: The model is based on a multiscale representation of pattern, luminance, and color processing in the human visual system and can be usefully applied to image quality metrics, image compression methods, and perceptually-based image synthesis algorithms.
Abstract: In this paper we develop a computational model of adaptation and spatial vision for realistic tone reproduction. The model is based on a multiscale representation of pattern, luminance, and color processing in the human visual system. We incorporate the model into a tone reproduction operator that maps the vast ranges of radiances found in real and synthetic scenes into the small fixed ranges available on conventional display devices such as CRTs and printers. The model allows the operator to address the two major problems in realistic tone reproduction: wide absolute range and high dynamic range scenes can be displayed, and the displayed images match our perceptions of the scenes at both threshold and suprathreshold levels to the degree possible given a particular display device. Although in this paper we apply our visual model to the tone reproduction problem, the model is general and can be usefully applied to image quality metrics, image compression methods, and perceptually-based image synthesis algorithms.
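Structurally, such an operator decomposes the image into bandpass channels, applies per-band gain control, and recombines. The toy below shows only that multiscale skeleton; the uniform gains and the untouched log-domain base are assumptions of this sketch, whereas the paper calibrates each band against measured human contrast sensitivity.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def multiscale_tonemap(L, n_levels=4, gains=None, eps=1e-6):
    # Toy difference-of-Gaussians decomposition of log luminance.
    gains = gains if gains is not None else [1.0] * n_levels
    log_L = np.log(L + eps)
    pyramid = [gaussian_filter(log_L, sigma=2.0 ** i) for i in range(n_levels + 1)]
    out = pyramid[-1]  # lowpass base; a real operator compresses this band
    for i in range(n_levels):
        band = pyramid[i] - pyramid[i + 1]  # bandpass detail at scale i
        out = out + gains[i] * band         # per-band gain control
    return np.exp(out)
```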

458 citations

Journal ArticleDOI
TL;DR: Cue theory, which states that the visual system computes the distances of objects in the environment based on information from the posture of the eyes and from the patterns of light projected onto the retinas by the environment, is presented.
Abstract: The sources of visual information that must be present to correctly interpret spatial relations in images, the relative importance of different visual information sources with regard to metric judgments of spatial relations in images, and the ways that the task in which the images are used affect the visual information's usefulness are discussed. Cue theory, which states that the visual system computes the distances of objects in the environment based on information from the posture of the eyes and from the patterns of light projected onto the retinas by the environment, is presented. Three experiments in which the influence of pictorial cues on perceived spatial relations in computer-generated images was assessed are discussed. Each experiment examined the accuracy with which subjects matched the position, orientation, and size of a test object with a standard by interactively translating, rotating, and scaling the test object.

300 citations

Proceedings ArticleDOI
01 Jul 2000
TL;DR: A new psychophysically-based light reflection model is introduced in which the dimensions of the model are perceptually meaningful, and variations along the dimensions are perceptually uniform.
Abstract: In this paper we introduce a new light reflection model for image synthesis based on experimental studies of surface gloss perception. To develop the model, we conducted two experiments that explore the relationships between the physical parameters used to describe the reflectance properties of glossy surfaces and the perceptual dimensions of glossy appearance. In the first experiment we use multidimensional scaling techniques to reveal the dimensionality of gloss perception for simulated painted surfaces. In the second experiment we use magnitude estimation methods to place metrics on these dimensions that relate changes in apparent gloss to variations in surface reflectance properties. We use the results of these experiments to rewrite the parameters of a physically-based light reflection model in perceptual terms. The result is a new psychophysically-based light reflection model whose dimensions are perceptually meaningful and whose variations along those dimensions are perceptually uniform. We demonstrate that the model can facilitate describing surface gloss in graphics rendering applications. This work represents a new methodology for developing light reflection models for image synthesis.
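The published reparameterization is compact enough to state directly: contrast gloss c and distinctness-of-image gloss d are written in terms of the Ward model's diffuse reflectance, specular reflectance, and specular-lobe spread. A sketch under that reading follows; treat the exact formulas as this sketch's recollection of the published model rather than a definitive statement.

```python
def ward_to_perceptual(rho_d, rho_s, alpha):
    """Map Ward model parameters to perceptual gloss dimensions (c, d).

    rho_d, rho_s : diffuse and specular reflectance; alpha : lobe spread.
    """
    # Contrast gloss: cube-root (lightness-like) compression of the
    # specular energy relative to the diffuse background.
    c = (rho_s + rho_d / 2.0) ** (1.0 / 3.0) - (rho_d / 2.0) ** (1.0 / 3.0)
    # Distinctness-of-image gloss: sharper lobes (small alpha) look glossier.
    d = 1.0 - alpha
    return c, d
```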

259 citations


Cited by
Proceedings ArticleDOI
21 Jul 2017
TL;DR: This paper proposes SRGAN, which uses a perceptual loss function consisting of an adversarial loss and a content loss; the adversarial loss pushes the solution to the natural image manifold using a discriminator network trained to differentiate between super-resolved images and original photo-realistic images.
Abstract: Despite the breakthroughs in accuracy and speed of single image super-resolution using faster and deeper convolutional neural networks, one central problem remains largely unsolved: how do we recover the finer texture details when we super-resolve at large upscaling factors? The behavior of optimization-based super-resolution methods is principally driven by the choice of the objective function. Recent work has largely focused on minimizing the mean squared reconstruction error. The resulting estimates have high peak signal-to-noise ratios, but they are often lacking high-frequency details and are perceptually unsatisfying in the sense that they fail to match the fidelity expected at the higher resolution. In this paper, we present SRGAN, a generative adversarial network (GAN) for image super-resolution (SR). To our knowledge, it is the first framework capable of inferring photo-realistic natural images for 4x upscaling factors. To achieve this, we propose a perceptual loss function which consists of an adversarial loss and a content loss. The adversarial loss pushes our solution to the natural image manifold using a discriminator network that is trained to differentiate between the super-resolved images and original photo-realistic images. In addition, we use a content loss motivated by perceptual similarity instead of similarity in pixel space. Our deep residual network is able to recover photo-realistic textures from heavily downsampled images on public benchmarks. An extensive mean-opinion-score (MOS) test shows hugely significant gains in perceptual quality using SRGAN. The MOS scores obtained with SRGAN are closer to those of the original high-resolution images than to those obtained with any state-of-the-art method.
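The loss the abstract describes combines a VGG-feature content term with a lightly weighted adversarial term (the paper weights the adversarial loss by 10^-3). Below is a hedged PyTorch sketch of that combination; the discriminator interface (assumed to output probabilities) and the exact VGG slice are assumptions of this sketch, not the paper's precise implementation.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg19

# Frozen VGG19 feature extractor (roughly the paper's deep "VGG54" features).
vgg = vgg19(pretrained=True).features[:36].eval()
for p in vgg.parameters():
    p.requires_grad = False

def perceptual_loss(sr, hr, discriminator, adv_weight=1e-3):
    # Content loss: MSE between feature maps, not between raw pixels.
    content = F.mse_loss(vgg(sr), vgg(hr))
    # Adversarial loss: push generated images toward the natural-image
    # manifold as judged by the discriminator.
    adversarial = -torch.log(discriminator(sr) + 1e-8).mean()
    return content + adv_weight * adversarial
```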

6,884 citations

Posted Content
TL;DR: SRGAN, a generative adversarial network (GAN) for image super-resolution (SR), is presented; to the authors' knowledge it is the first framework capable of inferring photo-realistic natural images for 4x upscaling factors, and it uses a perceptual loss function that consists of an adversarial loss and a content loss.

4,404 citations

Book
30 Sep 2010
TL;DR: Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images and takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene.
Abstract: Humans perceive the three-dimensional structure of the world with apparent ease. However, despite all of the recent advances in computer vision research, the dream of having a computer interpret an image at the same level as a two-year old remains elusive. Why is computer vision such a challenging problem and what is the current state of the art? Computer Vision: Algorithms and Applications explores the variety of techniques commonly used to analyze and interpret images. It also describes challenging real-world applications where vision is being successfully used, both for specialized applications such as medical imaging, and for fun, consumer-level tasks such as image editing and stitching, which students can apply to their own personal photos and videos. More than just a source of recipes, this exceptionally authoritative and comprehensive textbook/reference also takes a scientific approach to basic vision problems, formulating physical models of the imaging process before inverting them to produce descriptions of a scene. These problems are also analyzed using statistical models and solved using rigorous engineering techniques. Topics and features: structured to support active curricula and project-oriented courses, with tips in the Introduction for using the book in a variety of customized courses; presents exercises at the end of each chapter with a heavy emphasis on testing algorithms and containing numerous suggestions for small mid-term projects; provides additional material and more detailed mathematical topics in the Appendices, which cover linear algebra, numerical techniques, and Bayesian estimation theory; suggests additional reading at the end of each chapter, including the latest research in each sub-field, in addition to a full Bibliography at the end of the book; supplies supplementary course material for students at the associated website, http://szeliski.org/Book/. Suitable for an upper-level undergraduate or graduate-level course in computer science or engineering, this textbook focuses on basic techniques that work under real-world conditions and encourages students to push their creative boundaries. Its design and exposition also make it eminently suitable as a unique reference to the fundamental techniques and current research literature in computer vision.

4,146 citations

Proceedings ArticleDOI
03 Aug 1997
TL;DR: The authors discuss how this work applies to many areas of computer graphics involving digitized photographs, including image-based modeling, image compositing, and image processing, and demonstrate several applications of high dynamic range radiance maps.
Abstract: We present a method of recovering high dynamic range radiance maps from photographs taken with conventional imaging equipment. In our method, multiple photographs of the scene are taken with different amounts of exposure. Our algorithm uses these differently exposed photographs to recover the response function of the imaging process, up to a factor of scale, using the assumption of reciprocity. With the known response function, the algorithm can fuse the multiple photographs into a single, high dynamic range radiance map whose pixel values are proportional to the true radiance values in the scene. We demonstrate our method on images acquired with both photochemical and digital imaging processes. We discuss how this work is applicable in many areas of computer graphics involving digitized photographs, including image-based modeling, image compositing, and image processing. Lastly, we demonstrate a few applications of having high dynamic range radiance maps, such as synthesizing realistic motion blur and simulating the response of the human visual system.
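Once the response curve g has been recovered, the fusion step is a weighted average in the log domain: each pixel's log radiance is estimated as g(Z) − ln Δt, averaged over exposures with a hat weighting that trusts mid-range pixel values most. A minimal sketch of that fusion step, assuming g is already given as a 256-entry lookup table:

```python
import numpy as np

def fuse_exposures(images, exposure_times, g):
    """Fuse differently exposed 8-bit images into a radiance map.

    images : list of uint8 arrays; exposure_times : seconds per image;
    g : recovered log response, indexable by pixel value 0..255.
    """
    z = np.arange(256, dtype=float)
    w = np.minimum(z, 255.0 - z)  # hat weights: trust mid-range values
    num = np.zeros(images[0].shape, dtype=float)
    den = np.zeros_like(num)
    for img, dt in zip(images, exposure_times):
        num += w[img] * (g[img] - np.log(dt))
        den += w[img]
    return np.exp(num / np.maximum(den, 1e-8))  # radiance, up to scale
```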

2,967 citations

Journal ArticleDOI
TL;DR: This work uses a simple statistical analysis to impose one image's color characteristics on another by choosing an appropriate source image and applying its characteristics to a target image.
Abstract: We use a simple statistical analysis to impose one image's color characteristics on another. We can achieve color correction by choosing an appropriate source image and applying its characteristics to another image.
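The method is essentially per-channel moment matching in a decorrelated color space: shift and scale each channel of the target so its mean and standard deviation match the source's. The paper works in the lαβ space of Ruderman et al.; in the sketch below, skimage's Lab conversion stands in for that space, which is an assumption of this sketch.

```python
import numpy as np
from skimage.color import rgb2lab, lab2rgb

def color_transfer(source_rgb, target_rgb):
    """Impose source_rgb's color statistics on target_rgb (floats in [0, 1])."""
    src, tgt = rgb2lab(source_rgb), rgb2lab(target_rgb)
    out = np.empty_like(tgt)
    for c in range(3):
        # Match the target channel's mean/std to the source channel's.
        scale = src[..., c].std() / (tgt[..., c].std() + 1e-8)
        out[..., c] = (tgt[..., c] - tgt[..., c].mean()) * scale + src[..., c].mean()
    return np.clip(lab2rgb(out), 0.0, 1.0)
```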

2,615 citations