
Showing papers on "Computer graphics published in 2017"


Journal ArticleDOI
TL;DR: In many applications, such geometric data are large and complex (in the case of social networks, on the scale of billions) and are natural targets for machine-learning techniques as mentioned in this paper.
Abstract: Many scientific fields study data with an underlying structure that is non-Euclidean. Some examples include social networks in computational social sciences, sensor networks in communications, functional networks in brain imaging, regulatory networks in genetics, and meshed surfaces in computer graphics. In many applications, such geometric data are large and complex (in the case of social networks, on the scale of billions) and are natural targets for machine-learning techniques. In particular, we would like to use deep neural networks, which have recently proven to be powerful tools for a broad range of problems from computer vision, natural-language processing, and audio analysis. However, these tools have been most successful on data with an underlying Euclidean or grid-like structure and in cases where the invariances of these structures are built into networks used to model them.

2,565 citations


Proceedings ArticleDOI
21 Jul 2017
TL;DR: In this article, a unified framework for generalizing CNN architectures to non-Euclidean domains (graphs and manifolds) and learning local, stationary, and compositional task-specific features is proposed.
Abstract: Deep learning has achieved a remarkable performance breakthrough in several fields, most notably in speech recognition, natural language processing, and computer vision. In particular, convolutional neural network (CNN) architectures currently produce state-of-the-art performance on a variety of image analysis tasks such as object detection and recognition. Most deep learning research has so far focused on dealing with 1D, 2D, or 3D Euclidean-structured data such as acoustic signals, images, or videos. Recently, there has been an increasing interest in geometric deep learning, attempting to generalize deep learning methods to non-Euclidean structured data such as graphs and manifolds, with a variety of applications from the domains of network analysis, computational social science, or computer graphics. In this paper, we propose a unified framework that generalizes CNN architectures to non-Euclidean domains (graphs and manifolds) and learns local, stationary, and compositional task-specific features. We show that various non-Euclidean CNN methods previously proposed in the literature can be considered as particular instances of our framework. We test the proposed method on standard tasks from the realms of image, graph, and 3D shape analysis and show that it consistently outperforms previous approaches.
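To make the idea concrete, here is a minimal NumPy sketch of a single graph-convolution layer in the GCN style, one particular instance of the family of non-Euclidean CNN methods such a framework unifies (the toy graph and dimensions are illustrative, not from the paper):

```python
import numpy as np

def gcn_layer(A, X, W):
    """One layer: H = ReLU(D^-1/2 (A + I) D^-1/2 X W), i.e. each node
    averages its neighbors' features (with a self-loop) and applies W."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)                   # node degrees (all >= 1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # symmetric normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)

# Toy 4-node graph with 3-dimensional node features and 8 output features.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)
X = np.random.randn(4, 3)
W = np.random.randn(3, 8)
H = gcn_layer(A, X, W)                      # (4, 8) node embeddings
```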

1,594 citations


Journal ArticleDOI
TL;DR: A comprehensive overview and discussion of research in light field image processing, including basic light field representation and theory, acquisition, super-resolution, depth estimation, compression, editing, processing algorithms for light field display, and computer vision applications of light field data are presented.
Abstract: Light field imaging has emerged as a technology that allows us to capture richer visual information from our world. As opposed to traditional photography, which captures a 2D projection of the light in the scene integrating the angular domain, light fields collect radiance from rays in all directions, demultiplexing the angular information lost in conventional photography. On the one hand, this higher dimensional representation of visual data offers powerful capabilities for scene understanding, and substantially improves the performance of traditional computer vision problems such as depth sensing, post-capture refocusing, segmentation, video stabilization, material classification, etc. On the other hand, the high-dimensionality of light fields also brings up new challenges in terms of data capture, data compression, content editing, and display. Taking these two elements together, research in light field image processing has become increasingly popular in the computer vision, computer graphics, and signal processing communities. In this paper, we present a comprehensive overview and discussion of research in this field over the past 20 years. We focus on all aspects of light field image processing, including basic light field representation and theory, acquisition, super-resolution, depth estimation, compression, editing, processing algorithms for light field display, and computer vision applications of light field data.
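As a concrete example of the post-capture capabilities discussed above, the classic shift-and-add algorithm synthesizes refocused photographs directly from the 4D light field; a minimal sketch, assuming `lf` is a (U, V, S, T) array of sub-aperture images:

```python
import numpy as np

def refocus(lf, alpha):
    """Synthesize a photo focused at relative depth `alpha` by shifting
    each sub-aperture view proportionally to its (u, v) offset from the
    aperture center and averaging all views."""
    U, V, S, T = lf.shape
    out = np.zeros((S, T))
    cu, cv = (U - 1) / 2.0, (V - 1) / 2.0
    shift = 1.0 - 1.0 / alpha
    for u in range(U):
        for v in range(V):
            du = int(round((u - cu) * shift))
            dv = int(round((v - cv) * shift))
            out += np.roll(lf[u, v], (du, dv), axis=(0, 1))
    return out / (U * V)

photo = refocus(np.random.rand(5, 5, 64, 64), alpha=1.2)  # toy light field
```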

412 citations


Proceedings ArticleDOI
01 Jul 2017
TL;DR: In this paper, the authors proposed a deep adversarial image synthesis architecture that is conditioned on sketched boundaries and sparse color strokes to generate realistic cars, bedrooms, or faces, which allows users to scribble over the sketch to indicate preferred color for objects.
Abstract: Several recent works have used deep convolutional networks to generate realistic imagery. These methods sidestep the traditional computer graphics rendering pipeline and instead generate imagery at the pixel level by learning from large collections of photos (e.g. faces or bedrooms). However, these methods are of limited utility because it is difficult for a user to control what the network produces. In this paper, we propose a deep adversarial image synthesis architecture that is conditioned on sketched boundaries and sparse color strokes to generate realistic cars, bedrooms, or faces. We demonstrate a sketch-based image synthesis system which allows users to scribble over the sketch to indicate preferred color for objects. Our network can then generate convincing images that satisfy both the color and the sketch constraints of the user. The network is feed-forward, which allows users to see the effect of their edits in real time. We compare to recent work on sketch-to-image synthesis and show that our approach generates more realistic, diverse, and controllable outputs. The architecture is also effective at user-guided colorization of grayscale images.
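A hypothetical sketch of how the conditioning described above can be fed to a feed-forward generator: the sketch and the sparse color strokes are stacked channel-wise into a single input tensor (shapes and values are illustrative, not the paper's exact format):

```python
import numpy as np

H = W = 64
sketch  = np.random.rand(1, H, W)       # 1-channel boundary sketch
strokes = np.zeros((3, H, W))           # RGB color strokes, mostly empty
strokes[:, 20:24, 30:34] = [[[0.9]], [[0.1]], [[0.1]]]  # a red scribble

x = np.concatenate([sketch, strokes], axis=0)   # (4, H, W) generator input
# generator(x) would map this to a (3, H, W) realistic image in one forward
# pass, which is what makes real-time feedback on user edits possible.
```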

404 citations


Book ChapterDOI
01 Jan 2017
TL;DR: In a geometric context, a collision or proximity query reports information about the relative configuration or placement of two objects as mentioned in this paper, and the common examples of such queries include checking whether two objects overlap in space, or whether their boundaries intersect, or computing the minimum Euclidean separation distance between their boundaries.
Abstract: In a geometric context, a collision or proximity query reports information about the relative configuration or placement of two objects. Some of the common examples of such queries include checking whether two objects overlap in space, or whether their boundaries intersect, or computing the minimum Euclidean separation distance between their boundaries. Hundreds of papers have been published on different aspects of these queries in computational geometry and related areas such as robotics, computer graphics, virtual environments, and computer-aided design. These queries arise in different applications including robot motion planning, dynamic simulation, haptic rendering, virtual prototyping, interactive walkthroughs, computer gaming, and molecular modeling. For example, a large-scale virtual environment, e.g., a walkthrough, creates a model of the environment with virtual objects. Such an environment is used to give the user a sense of presence in a synthetic world, and it should make the images of both the user and the surrounding objects feel solid. The objects should not pass through each other, and objects should move as expected when pushed, pulled, or grasped; see Fig. 39.0.1. Such actions require fast and accurate collision detection between the geometric representations of both real and virtual objects. Another example is rapid prototyping, where digital representations of mechanical parts, tools, and machines need to be tested for interconnectivity, functionality, and reliability. In Fig. 39.0.2, the motion of the pistons within the combustion chamber wall is simulated to check for tolerances and verify the design.
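A minimal sketch of the two most basic queries described above, implemented for axis-aligned bounding boxes (the standard textbook formulation, not tied to this chapter's specific algorithms):

```python
import numpy as np

def aabb_overlap(min_a, max_a, min_b, max_b):
    """True if boxes [min_a, max_a] and [min_b, max_b] intersect."""
    return bool(np.all(min_a <= max_b) and np.all(min_b <= max_a))

def aabb_distance(min_a, max_a, min_b, max_b):
    """Minimum Euclidean separation between two AABBs (0 if they overlap)."""
    gap = np.maximum(0.0, np.maximum(min_a - max_b, min_b - max_a))
    return float(np.linalg.norm(gap))

a_min, a_max = np.array([0., 0., 0.]), np.array([1., 1., 1.])
b_min, b_max = np.array([2., 0., 0.]), np.array([3., 1., 1.])
assert not aabb_overlap(a_min, a_max, b_min, b_max)
assert aabb_distance(a_min, a_max, b_min, b_max) == 1.0
```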

273 citations


Proceedings ArticleDOI
04 Dec 2017
TL;DR: The proposed method uses a Convolutional Neural Network with a custom pooling layer to optimize the feature extraction scheme of current best-performing algorithms, and outperforms state-of-the-art methods for both local and full image classification.
Abstract: This paper presents a deep-learning method for distinguishing computer generated graphics from real photographic images. The proposed method uses a Convolutional Neural Network (CNN) with a custom pooling layer to optimize the feature extraction scheme of current best-performing algorithms. Local estimates of class probabilities are computed and aggregated to predict the label of the whole picture. We evaluate our work on recent photo-realistic computer graphics and show that it outperforms state-of-the-art methods for both local and full image classification.
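A hedged sketch of the aggregation step described in the abstract: local class-probability estimates from a patch classifier are pooled into a whole-image label (the stand-in classifier below is purely illustrative; the paper trains a CNN):

```python
import numpy as np

def classify_image(patches, patch_classifier, threshold=0.5):
    """Aggregate per-patch probabilities of 'computer-generated' into a
    label for the whole picture by simple averaging."""
    probs = np.array([patch_classifier(p) for p in patches])
    mean_prob = float(probs.mean())
    label = "computer-generated" if mean_prob > threshold else "photographic"
    return label, mean_prob

# Purely illustrative stand-in; the paper uses a trained CNN instead.
fake_cnn = lambda patch: float(patch.mean() > 0.5)
patches = np.random.rand(16, 64, 64)        # 16 patches from one image
label, score = classify_image(patches, fake_cnn)
```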

262 citations


BookDOI
01 Jan 2017
TL;DR: Novel intelligent multiple watermarking techniques are proposed that reduce the amount of data to be embedded and consequently improve the perceptual quality of the watermarked image.
Abstract: Most of the past document image watermarking schemes focus on providing the same level of integrity and copyright protection for all information present in the source document image. However, in a document image the information contents possess various levels of sensitivity. Each level of sensitivity needs a different type of protection, and this demands multiple watermarking techniques. In this paper, novel intelligent multiple watermarking techniques are proposed. The sensitivity of the information content of a block is based on the homogeneity and relative energy contribution parameters. An appropriate watermarking scheme is applied based on the sensitivity classification of the block. Experiments are conducted exhaustively on documents. Experimental results reveal the accurate identification of the sensitivity of information content in the block. The results reveal that multiple watermarking schemes have reduced the amount of data to be embedded and consequently improved the perceptual quality of the watermarked image.
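A hypothetical sketch of the sensitivity classification described above, routing each block to a watermarking scheme from its homogeneity and relative energy contribution; the thresholds and labels here are illustrative assumptions, not values from the paper:

```python
import numpy as np

def classify_block(block, total_energy, homog_thresh=0.95, energy_thresh=0.01):
    """Classify a block's sensitivity from homogeneity and relative energy."""
    homogeneity = float((block == block.flat[0]).mean())  # identical-pixel ratio
    rel_energy = float((block.astype(float) ** 2).sum()) / total_energy
    if homogeneity >= homog_thresh:
        return "low"       # near-uniform background block
    return "high" if rel_energy >= energy_thresh else "medium"

doc = (np.random.rand(256, 256) * 255).astype(np.uint8)   # toy document image
total = float((doc.astype(float) ** 2).sum())
blocks = [doc[i:i + 32, j:j + 32]
          for i in range(0, 256, 32) for j in range(0, 256, 32)]
labels = [classify_block(b, total) for b in blocks]        # one scheme per label
```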

187 citations


Proceedings ArticleDOI
18 Mar 2017
TL;DR: A detailed introduction to augmented reality can be found in this paper, where the authors present selected technical achievements in this field and highlight some examples of successful application prototypes, as well as a review of the current state-of-the-art.
Abstract: This tutorial will provide a detailed introduction to Augmented Reality (AR). AR is a key user-interface technology for personalized, situated information delivery, navigation, on-demand instruction and games. The widespread availability and rapid evolution of smartphones and new devices such as Hololens enables software-only solutions for AR, where it was previously necessary to assemble custom hardware solutions. However, ergonomic and technical limitations of existing devices make this a challenging endeavor. In particular, it is necessary to design novel efficient real-time computer vision and computer graphics algorithms, and create new lightweight forms of interaction with the environment through small form-factor devices. This tutorial will present selected technical achievements in this field and highlight some examples of successful application prototypes.

137 citations


Journal ArticleDOI
TL;DR: The diagonal problem is considered: synthesizing appearance from given per-pixel attributes using a CNN. The resulting Deep Shading renders screen space effects at competitive quality and speed while not being programmed by human experts but learned from example images.
Abstract: In computer vision, convolutional neural networks (CNNs) achieve unprecedented performance for inverse problems where RGB pixel appearance is mapped to attributes such as positions, normals or reflectance. In computer graphics, screen space shading has boosted the quality of real-time rendering, converting the same kind of attributes of a virtual scene back to appearance, enabling effects like ambient occlusion, indirect light, scattering and many more. In this paper we consider the diagonal problem: synthesizing appearance from given per-pixel attributes using a CNN. The resulting Deep Shading renders screen space effects at competitive quality and speed while not being programmed by human experts but learned from example images.
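A minimal sketch of the data flow behind this idea: per-pixel scene attributes (a "G-buffer") are stacked as input channels and mapped to appearance by learned convolutions. A single untrained 3x3 layer stands in for the full Deep Shading CNN; names and sizes are illustrative:

```python
import numpy as np
from scipy.signal import convolve2d

H = W = 32
gbuffer = {                                  # per-pixel scene attributes
    "depth":    np.random.rand(H, W),
    "normal_z": np.random.rand(H, W),
}
# One 3x3 kernel per attribute channel (random here; in the actual system
# these weights are learned from example images).
kernels = {k: np.random.randn(3, 3) * 0.1 for k in gbuffer}

# Appearance = ReLU of the summed per-channel convolutions.
shaded = sum(convolve2d(img, kernels[k], mode="same")
             for k, img in gbuffer.items())
shaded = np.maximum(shaded, 0.0)             # (H, W) screen-space effect image
```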

101 citations


Journal ArticleDOI
Thomas Müller, Markus Gross, Jan Novák
01 Jul 2017
TL;DR: This work proposes an adaptive spatio‐directional hybrid data structure, referred to as SD‐tree, for storing and sampling incident radiance, and presents a principled way to automatically budget training and rendering computations to minimize the variance of the final image.
Abstract: We present a robust, unbiased technique for intelligent light-path construction in path-tracing algorithms. Inspired by existing path-guiding algorithms, our method learns an approximate representation of the scene's spatio-directional radiance field in an unbiased and iterative manner. To that end, we propose an adaptive spatio-directional hybrid data structure, referred to as SD-tree, for storing and sampling incident radiance. The SD-tree consists of an upper part (a binary tree that partitions the 3D spatial domain of the light field) and a lower part (a quadtree that partitions the 2D directional domain). We further present a principled way to automatically budget training and rendering computations to minimize the variance of the final image. Our method does not require tuning hyperparameters, although we allow limiting the memory footprint of the SD-tree. The aforementioned properties, its ease of implementation, and its stable performance make our method compatible with production environments. We demonstrate the merits of our method on scenes with difficult visibility, detailed geometry, and complex specular-glossy light transport, achieving better performance than previous state-of-the-art algorithms.
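A simplified sketch of the SD-tree layout as described: a binary tree partitions the 3D spatial domain, and each spatial node owns a quadtree over the 2D directional domain that accumulates radiance for later sampling (adaptive subdivision and the sampling logic are omitted; class and field names are illustrative):

```python
class DirQuadNode:
    """Quadtree over the 2D directional domain (u, v) in [0, 1)^2."""
    def __init__(self):
        self.radiance = 0.0      # accumulated incident radiance
        self.children = None     # four children once subdivided

    def splat(self, u, v, value, depth=0, max_depth=3):
        self.radiance += value
        if depth == max_depth:
            return
        if self.children is None:
            self.children = [DirQuadNode() for _ in range(4)]
        i, j = int(u >= 0.5), int(v >= 0.5)          # quadrant index
        self.children[2 * i + j].splat((u * 2) % 1.0, (v * 2) % 1.0,
                                       value, depth + 1, max_depth)

class SpatialNode:
    """Binary tree over the 3D spatial domain; each node owns a quadtree."""
    def __init__(self, lo, hi, axis=0):
        self.lo, self.hi, self.axis = lo, hi, axis
        self.left = self.right = None                # subdivided adaptively
        self.directions = DirQuadNode()              # directional distribution

tree = SpatialNode(lo=(0, 0, 0), hi=(1, 1, 1))
tree.directions.splat(0.3, 0.7, value=1.5)           # record one radiance sample
```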

100 citations


Journal ArticleDOI
TL;DR: A light field-based CGH rendering pipeline is presented allowing for reproduction of high-definition 3D scenes with continuous depth and support of intra-pupil view-dependent occlusion and it is shown that the rendering accurately models the spherical illumination introduced by the eye piece and produces the desired 3D imagery at the designated depth.
Abstract: Holograms display a 3D image in high resolution and allow viewers to focus freely as if looking through a virtual window, yet computer-generated holography (CGH) hasn't delivered the same visual quality under plane wave illumination, in part due to its heavy computational cost. Light field displays have been popular due to their capability to provide continuous focus cues. However, light field displays must trade off between spatial and angular resolution, and do not model diffraction. We present a light field-based CGH rendering pipeline allowing for reproduction of high-definition 3D scenes with continuous depth and support of intra-pupil view-dependent occlusion. Our rendering accurately accounts for diffraction and supports various types of reference illuminations for the hologram. We avoid under- and over-sampling and geometric clipping effects seen in previous work. We also demonstrate an implementation of light field rendering plus Fresnel diffraction integral based CGH calculation which is orders of magnitude faster than the state of the art [Zhang et al. 2015], achieving interactive volumetric 3D graphics. To verify our computational results, we build a see-through, near-eye, color CGH display prototype which enables co-modulation of both amplitude and phase. We show that our rendering accurately models the spherical illumination introduced by the eye piece and produces the desired 3D imagery at the designated depth. We also analyze aliasing, theoretical resolution limits, depth of field, and other design trade-offs for near-eye CGH.
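For reference, the Fresnel diffraction integral on which the CGH calculation above is based takes the standard textbook form, with wavenumber $k = 2\pi/\lambda$ (this is the general formulation, not the authors' exact discretization):

$$U(x, y; z) = \frac{e^{ikz}}{i\lambda z} \iint U_0(x', y')\, \exp\!\left(\frac{ik}{2z}\left[(x - x')^2 + (y - y')^2\right]\right) dx'\, dy'$$

Here $U_0$ is the complex field at the source plane and $U$ the propagated field at distance $z$.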

Journal ArticleDOI
TL;DR: Recent research on how computer vision techniques benefit computer graphics techniques and vice versa is surveyed, and research on analysis, manipulation, synthesis, and interaction is covered.
Abstract: The computer graphics and computer vision communities have been working closely together in recent years, and a variety of algorithms and applications have been developed to analyze and manipulate the visual media around us. There are three major driving forces behind this phenomenon: 1) the availability of big data from the Internet has created a demand for dealing with the ever-increasing, vast amount of resources; 2) powerful processing tools, such as deep neural networks, provide effective ways for learning how to deal with heterogeneous visual data; 3) new data capture devices, such as the Kinect, bridge the gap between algorithms for 2D image understanding and 3D model analysis. These driving forces have emerged only recently, and we believe that the computer graphics and computer vision communities are still at the beginning of their honeymoon phase. In this work we survey recent research on how computer vision techniques benefit computer graphics techniques and vice versa, and cover research on analysis, manipulation, synthesis, and interaction. We also discuss existing problems and suggest possible further research directions.

Journal ArticleDOI
TL;DR: A new synthetic ground-truth dataset is introduced and used to evaluate the validity of these priors and the performance of the methods, and the performance of the different methods is also evaluated in the context of image-editing applications.
Abstract: Intrinsic images are a mid-level representation of an image that decomposes the image into reflectance and illumination layers. The reflectance layer captures the color/texture of surfaces in the scene, while the illumination layer captures shading effects caused by interactions between scene illumination and surface geometry. Intrinsic images have a long history in computer vision and recently in computer graphics, and have been shown to be a useful representation for tasks ranging from scene understanding and reconstruction to image editing. In this report, we review and evaluate past work on this problem. Specifically, we discuss each work in terms of the priors they impose on the intrinsic image problem. We introduce a new synthetic ground-truth dataset that we use to evaluate the validity of these priors and the performance of the methods. Finally, we evaluate the performance of the different methods in the context of image-editing applications.
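For context, the intrinsic image model the surveyed methods build on is the standard multiplicative decomposition

$$I(x) = R(x)\, S(x),$$

where $I$ is the observed image, $R$ the reflectance layer, and $S$ the shading/illumination layer; the priors the report discusses are constraints placed on $R$ (e.g., piecewise constancy) and on $S$ (e.g., smoothness).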

Journal ArticleDOI
TL;DR: Experimental results demonstrate that the proposed layer-based algorithm can reconstruct quality 3D scenes with accurate depth information, as well as the occlusion effect.
Abstract: We propose a layer-based algorithm with single-viewpoint rendering geometry to calculate a three-dimensional (3D) computer-generated hologram (CGH) with occlusion effect. The 3D scene is sliced into multiple parallel layers according to the depth information. Slab-based orthographic projection is implemented to generate shading information for each layer, which renders hidden primitives for occlusion processing. The layer-based angular spectrum with silhouette mask culling is used to calculate the wave propagations from the layers to the CGH plane without paraxial approximation. The algorithm is compatible with the computer graphics pipeline for photorealistic rendering and robust for CGHs with different parameters. Experimental results demonstrate that the proposed algorithm can reconstruct quality 3D scenes with accurate depth information, as well as the occlusion effect.
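A hedged sketch of plain angular-spectrum propagation, the non-paraxial step used to carry each layer's field to the CGH plane (the standard textbook method in NumPy, not the authors' implementation; the toy field and parameters are illustrative):

```python
import numpy as np

def angular_spectrum(field, wavelength, pitch, distance):
    """Propagate a complex field by `distance` via the angular spectrum."""
    n, m = field.shape
    fx = np.fft.fftfreq(m, d=pitch)               # spatial frequencies
    fy = np.fft.fftfreq(n, d=pitch)
    FX, FY = np.meshgrid(fx, fy)
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    kz = 2.0 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    H = np.exp(1j * kz * distance) * (arg > 0)    # drop evanescent waves
    return np.fft.ifft2(np.fft.fft2(field) * H)

layer = np.exp(1j * np.random.rand(256, 256))     # toy layer field
holo = angular_spectrum(layer, wavelength=532e-9, pitch=8e-6, distance=0.05)
```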

Journal ArticleDOI
TL;DR: In this paper, the authors present recent advances in the field of transient imaging from a graphics and vision perspective, including capture techniques, analysis, applications and simulation, as well as a comprehensive overview of the current state of the art.

Proceedings ArticleDOI
01 Dec 2017
TL;DR: A new approach for detecting highly realistic computer-generated images by exploring inconsistencies in the region of the eyes, captured via the expressive power of features extracted through transfer learning with the VGG19 deep neural network model.
Abstract: Advances in computer graphics techniques have revolutionized the game and movie industries. Creating very realistic characters entirely from computer graphics models is nowadays a reality. However, this advance comes at a price: the realism of images is so high that it is difficult to tell whether we are looking at a computer-generated image or a real photo. In this paper we propose a new approach for detecting highly realistic computer-generated images by exploring inconsistencies in the region of the eyes. Such inconsistencies are captured by exploiting the expressive power of features extracted via a transfer-learning approach with the VGG19 deep neural network model. Unlike state-of-the-art approaches, which evaluate the entire image, the proposed method focuses on specific regions (the eyes) where computer graphics modeling still needs improvement. Experiments conducted over two different datasets containing extremely realistic images achieved an accuracy of 0.80 and an AUC of 0.88.

Proceedings ArticleDOI
01 Sep 2017
TL;DR: In this paper, a sequence-to-sequence model is proposed that encodes a set of objects and their locations as an input sequence using an LSTM network and decodes this representation using an LSTM language model.
Abstract: Generating captions for images is a task that has recently received considerable attention. Another type of visual inputs are abstract scenes or object layouts where the only information provided is a set of objects and their locations. This type of imagery is commonly found in many applications in computer graphics, virtual reality, and storyboarding. We explore in this paper OBJ2TEXT, a sequence-to-sequence model that encodes a set of objects and their locations as an input sequence using an LSTM network, and decodes this representation using an LSTM language model. We show in our paper that this model, despite using a sequence encoder, can effectively represent complex spatial object-object relationships and produce descriptions that are globally coherent and semantically relevant. We test our approach for the task of describing object layouts in the MS-COCO dataset by producing sentences given only object annotations. We additionally show that our model combined with a state-of-the-art object detector can improve the accuracy of an image captioning model.
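A hypothetical sketch of how an object layout can be serialized into the kind of input sequence the LSTM encoder consumes; the token format and quantization below are illustrative assumptions, not the paper's exact scheme:

```python
def layout_to_sequence(objects, bins=32):
    """objects: list of (category, x, y, w, h) with coordinates in [0, 1].
    Each object contributes a category token plus quantized location/size."""
    q = lambda v: min(int(v * bins), bins - 1)   # quantize to grid bins
    seq = []
    for cat, x, y, w, h in objects:
        seq += [cat, f"x{q(x)}", f"y{q(y)}", f"w{q(w)}", f"h{q(h)}"]
    return seq

print(layout_to_sequence([("dog", 0.1, 0.5, 0.3, 0.4),
                          ("frisbee", 0.6, 0.2, 0.1, 0.1)]))
```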

Journal ArticleDOI
16 Aug 2017
TL;DR: Undergraduate physics students are shown to experience many difficulties when describing vector fields both symbolically and graphically.
Abstract: Undergraduate physics students experience many difficulties describing vector fields both symbolically and graphically.

Journal ArticleDOI
TL;DR: This report presents the key research and models that exploit the limitations of perception to tackle visual quality and workload alike, and presents the open problems and promising future research targeting the question of how to minimize the effort to compute and display only the necessary pixels while still offering a user full visual experience.
Abstract: Advances in computer graphics enable us to create digital images of astonishing complexity and realism. However, processing resources are still a limiting factor. Hence, many costly but desirable aspects of realism are often not accounted for, including global illumination, accurate depth of field and motion blur, spectral effects, etc., especially in real-time rendering. At the same time, there is a strong trend towards more pixels per display due to larger displays, higher pixel densities, or larger fields of view. Further observable trends in current display technology include more bits per pixel (high dynamic range, wider color gamut/fidelity), increasing refresh rates (better motion depiction), and an increasing number of displayed views per pixel (stereo, multi-view, all the way to holographic or lightfield displays). These developments cause significant unsolved technical challenges due to aspects such as limited compute power and bandwidth. Fortunately, the human visual system has certain limitations, which mean that providing the highest possible visual quality is not always necessary. In this report, we present the key research and models that exploit the limitations of perception to tackle visual quality and workload alike. Moreover, we present the open problems and promising future research targeting the question of how we can minimize the effort to compute and display only the necessary pixels while still offering a user full visual experience.


Journal ArticleDOI
TL;DR: This STAR examines recent systems developed by the computer graphics community in which designers specify higher‐level goals ranging from structural integrity and deformation to appearance and aesthetics, with the final detailed shape and manufacturing instructions emerging as the result of computation.
Abstract: Computational manufacturing technologies such as 3D printing hold the potential for creating objects with previously undreamed-of combinations of functionality and physical properties. Human designers, however, typically cannot exploit the full geometric and often material complexity of which these devices are capable. This STAR examines recent systems developed by the computer graphics community in which designers specify higher-level goals ranging from structural integrity and deformation to appearance and aesthetics, with the final detailed shape and manufacturing instructions emerging as the result of computation. It summarizes frameworks for interaction, simulation, and optimization, as well as documents the range of general objectives and domain-specific goals that have been considered. An important unifying thread in this analysis is that different underlying geometric and physical representations are necessary for different tasks: we document over a dozen classes of representations that have been used for fabrication-aware design in the literature. We analyze how these classes possess obvious advantages for some needs, but have also been used in creative manners to facilitate unexpected problem solutions.

Journal ArticleDOI
TL;DR: JPEG is celebrating the 25th anniversary of its approval as a standard this year, and what are the fundamental components that have given it longevity?
Abstract: JPEG is celebrating the 25th anniversary of its approval as a standard this year. Where did JPEG come from, and what are the fundamental components that have given it longevity?

Journal ArticleDOI
TL;DR: A detailed description of the framework is provided, including a novel method for efficiently storing, evaluating, integrating, and sampling spherical and hemispherical datasets appropriate for the representation of modeled or measured bidirectional scattering, reflectance, and transmission distribution functions.
Abstract: The digital imaging and remote sensing image generation model is a physics-based image and data simulation model that is primarily used to generate synthetic imagery across the visible to thermal infrared regions using engineering-driven descriptions of remote sensing systems. The model recently went through a major redesign and reimplementation effort to address changes in user requirements and numerical computation trends that have emerged in the 15 years since the last major development effort. The new model architecture adopts some of the latest light transport algorithms matured by the computer graphics community and features a framework that is easily parallelized at the microscale (multithreading) and macroscale (cluster-based computing). A detailed description of the framework is provided, including a novel method for efficiently storing, evaluating, integrating, and sampling spherical and hemispherical datasets appropriate for the representation of modeled or measured bidirectional scattering, reflectance, and transmission distribution functions. The capabilities of the model are then briefly demonstrated and cross-verified with scenarios of interest to the remote sensing community.
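As a small example of the hemispherical sampling such a framework requires, here is the standard cosine-weighted hemisphere sampler (Malley's method) commonly used when integrating reflectance distribution functions; this is the textbook routine, not the model's specific storage or sampling scheme:

```python
import numpy as np

def cosine_weighted_hemisphere(n):
    """Draw n cosine-weighted unit directions on the +z hemisphere."""
    u1, u2 = np.random.rand(n), np.random.rand(n)
    r, phi = np.sqrt(u1), 2.0 * np.pi * u2   # uniform point on the unit disk
    x, y = r * np.cos(phi), r * np.sin(phi)
    z = np.sqrt(np.maximum(0.0, 1.0 - u1))   # project up to the sphere
    return np.stack([x, y, z], axis=1)       # shape (n, 3), unit vectors

dirs = cosine_weighted_hemisphere(1024)      # e.g. for BRDF integration
```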

Proceedings ArticleDOI
01 Jan 2017
TL;DR: It is shown that in many scenarios of practical importance such aligned data can be synthetically generated using computer graphics pipelines allowing domain adaptation through distillation, and the technique improves recognition performance on the low-quality data and beats strong baselines for domain adaptation.
Abstract: Model compression and knowledge distillation have been successfully applied for cross-architecture and cross-domain transfer learning. However, a key requirement is that training examples are in correspondence across the domains. We show that in many scenarios of practical importance such aligned data can be synthetically generated using computer graphics pipelines allowing domain adaptation through distillation. We apply this technique to learn models for recognizing low-resolution images using labeled high-resolution images, non-localized objects using labeled localized objects, line-drawings using labeled color images, etc. Experiments on various fine-grained recognition datasets demonstrate that the technique improves recognition performance on the low-quality data and beats strong baselines for domain adaptation. Finally, we present insights into the workings of the technique through visualizations and by relating it to existing literature.
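A minimal sketch of the distillation objective underlying this kind of transfer: a student network on low-quality inputs is trained to match the soft predictions a teacher produces on the aligned high-quality inputs (plain NumPy; the random logits stand in for real networks):

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    e = np.exp((z - z.max(axis=-1, keepdims=True)) / T)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """Cross-entropy of student against teacher soft targets at temperature T."""
    p_teacher = softmax(teacher_logits, T)
    log_p_student = np.log(softmax(student_logits, T) + 1e-12)
    return float(-(p_teacher * log_p_student).sum(axis=-1).mean())

# Toy batch of 8 examples, 10 classes; real logits come from the two networks.
loss = distillation_loss(np.random.randn(8, 10), np.random.randn(8, 10))
```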

Book ChapterDOI
01 Jan 2017
TL;DR: In this paper, the authors present a mobile service that allows exploring urban models at different Levels of Detail (LoDs) using well-known standards such as CityGML, which enables researchers, city planners and technicians to explore urban energy datasets in an interactive and immersive manner as Virtual Globes, Virtual Reality and Augmented Reality.
Abstract: The visualization of cross-domain spatial data sets has become an important task within the analysis of energy models. The representation of these models is especially important in urban areas, in which the understanding of patterns of energy production and demand is key for efficient city planning. Location Based Services (LBS) provide a valuable addition towards the analysis and visualization of those data sets, as the user can explore the output of different models and simulations in the real environment at the location of interest. Towards this aim, the present research explores mobile alternatives to the visual analysis of temporal data series and 3D building models. Based on the fields of numerical simulation, GIS and computer graphics, this work presents a novel mobile service that allows exploring urban models at different Levels of Detail (LoDs) using well-known standards such as CityGML. Ultimately, the project enables researchers, city planners and technicians to explore urban energy datasets in an interactive and immersive manner as Virtual Globes, Virtual Reality and Augmented Reality. Using models of the city of Karlsruhe, the final service has been implemented and tested on the iOS platform, providing an empirical insight into the performance of the system. In addition, this research provides a holistic approach by developing one application that is capable of seamlessly changing the visualization mode.

Dissertation
01 Jan 2017
TL;DR: This PhD dissertation explores the role that language and culture play in the creation and development of digital media technologies.
Abstract: Over the past few decades there have emerged greater possibilities for users and consumers of media to create or engage in the creation of digital media technologies. This PhD dissertation explores ...

Journal ArticleDOI
TL;DR: This paper proposes a method for realistically transferring details from existing high-quality 3D models to simple shapes that may be created with easy-to-learn modeling tools, using metric learning to find a combination of geometric features that successfully predicts detail-map similarities on the source mesh.
Abstract: The visual richness of computer graphics applications is frequently limited by the difficulty of obtaining high-quality, detailed 3D models. This paper proposes a method for realistically transferring details (specifically, displacement maps) from existing high-quality 3D models to simple shapes that may be created with easy-to-learn modeling tools. Our key insight is to use metric learning to find a combination of geometric features that successfully predicts detail-map similarities on the source mesh; we use the learned feature combination to drive the detail transfer. The latter uses a variant of multi-resolution non-parametric texture synthesis, augmented by a high-frequency detail transfer step in texture space. We demonstrate that our technique can successfully transfer details among a variety of shapes including furniture and clothing.
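A hypothetical sketch of the metric-learning step: learn non-negative feature weights so that a weighted distance over geometric features predicts the given detail-map similarities (ordinary least squares stands in for the paper's actual learner; shapes and names are illustrative):

```python
import numpy as np

def learn_metric(F, sims):
    """F: (n_pairs, n_feats) per-pair absolute feature differences;
    sims: (n_pairs,) target detail-map dissimilarities.
    Returns one weight per geometric feature."""
    w, *_ = np.linalg.lstsq(F, sims, rcond=None)
    return np.maximum(w, 0.0)                # clamp to keep a valid pseudo-metric

def detail_distance(fa, fb, w):
    """Learned weighted distance between two per-point feature vectors."""
    return float(np.abs(fa - fb) @ w)

w = learn_metric(np.random.rand(100, 6), np.random.rand(100))  # toy data
d = detail_distance(np.random.rand(6), np.random.rand(6), w)
```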

Journal ArticleDOI
TL;DR: Mesh planarization algorithms, which are currently used in computer graphics and in digital fabrication methods, are adapted to automate and optimize the parsing of non-planar surfaces to EnergyPlus (E+), a popular BEM engine.

Journal ArticleDOI
TL;DR: Efforts and research results toward meeting the requirements for modeling, rendering, and animating clouds realistically are explained, together with related research on the visual simulation of clouds.

Journal ArticleDOI
TL;DR: A set of visualization metrics for quantifying visualization techniques is presented, together with a framework for optimizing the layout of a visualization technique based on an evolutionary algorithm, using treemaps as a case study.