
Showing papers on "Software rendering published in 2009"


Proceedings ArticleDOI
01 Jan 2009
TL;DR: This STAR gives an account of recent developments in ToF-technology and discusses the current state of the integration of this technology into various graphics-related applications.
Abstract: A growing number of applications depend on accurate and fast 3D scene analysis. Examples are model and lightfield acquisition, collision prevention, mixed reality, and gesture recognition. The estimation of a range map by image analysis or laser scan techniques is still a time-consuming and expensive part of such systems. Time-of-Flight (ToF) cameras are a lower-priced, fast, and robust alternative for distance measurement. Recently, significant advances have been made in producing low-cost and compact ToF devices, which have the potential to revolutionize many fields of research, including Computer Graphics, Computer Vision and Human Machine Interaction (HMI). These technologies are starting to have an impact on research and commercial applications. The upcoming generation of ToF sensors, however, will be even more powerful and will have the potential to become “ubiquitous real-time geometry devices” for gaming, web-conferencing, and numerous other applications. This STAR gives an account of recent developments in ToF technology and discusses the current state of the integration of this technology into various graphics-related applications.

234 citations
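
As background for readers new to the sensing principle the STAR surveys: a ToF camera measures the round-trip time of emitted light, either directly (pulse-based) or via the phase shift of a modulated signal. A minimal sketch with illustrative numbers, not taken from the paper:

```python
import math

# Illustrative sketch of the two ToF measurement principles; example values are made up.
C = 299_792_458.0  # speed of light in m/s

def pulse_tof_depth(round_trip_s):
    """Pulse-based ToF: depth is half the distance light travels in the measured time."""
    return C * round_trip_s / 2.0

def cw_tof_depth(phase_rad, mod_freq_hz):
    """Continuous-wave ToF: the phase shift of an amplitude-modulated signal encodes
    depth, unambiguous up to C / (2 * mod_freq_hz)."""
    return C * phase_rad / (4.0 * math.pi * mod_freq_hz)

print(f"{pulse_tof_depth(20e-9):.2f} m")           # 20 ns round trip -> ~3.00 m
print(f"{cw_tof_depth(math.pi / 2, 20e6):.2f} m")  # 90 deg at 20 MHz -> ~1.87 m
```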


Journal ArticleDOI
TL;DR: A novel system called Equalizer is introduced, a toolkit for scalable parallel rendering based on OpenGL which provides an application programming interface (API) to develop scalable graphics applications for a wide range of systems ranging from large distributed visualization clusters and multi-processor multipipe graphics systems to single-processor single-pipe desktop machines.
Abstract: Continuing improvements in CPU and GPU performance, as well as increasing multi-core processor and cluster-based parallelism, demand flexible and scalable parallel rendering solutions that can exploit multipipe hardware-accelerated graphics. In fact, to achieve interactive visualization, scalable rendering systems are essential to cope with the rapid growth of data sets. However, parallel rendering systems are non-trivial to develop, and often only application-specific implementations have been proposed. The task of developing a scalable parallel rendering framework is even more difficult if it should be generic enough to support various types of data and visualization applications and, at the same time, work efficiently on a cluster with distributed graphics cards. In this paper we introduce a novel system called Equalizer, a toolkit for scalable parallel rendering based on OpenGL which provides an application programming interface (API) to develop scalable graphics applications for a wide range of systems, ranging from large distributed visualization clusters and multi-processor multipipe graphics systems to single-processor single-pipe desktop machines. We describe the system architecture and the basic API, discuss its advantages over previous approaches, and present example configurations, usage scenarios, and scalability results.

180 citations
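
Equalizer's actual API is not reproduced here; as a purely hypothetical illustration of one decomposition mode such a toolkit must support, the sketch below splits a framebuffer into per-node bands for sort-first (screen-space) parallel rendering.

```python
# Hypothetical sketch of sort-first (screen-space) work decomposition, one of the
# scalable rendering modes a toolkit like Equalizer supports; not Equalizer's real API.
def split_viewport(width, height, n_nodes):
    """Split the framebuffer into n_nodes horizontal bands, one per render node."""
    band = height // n_nodes
    tiles = []
    for i in range(n_nodes):
        y0 = i * band
        y1 = height if i == n_nodes - 1 else y0 + band
        tiles.append((0, y0, width, y1 - y0))  # (x, y, w, h) for node i
    return tiles

print(split_viewport(1920, 1080, 4))
# [(0, 0, 1920, 270), (0, 270, 1920, 270), (0, 540, 1920, 270), (0, 810, 1920, 270)]
```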


Journal ArticleDOI
Micah Dowty, Jeremy Sugerman
TL;DR: This paper describes in detail the specific GPU virtualization architecture developed for VMware's hosted products (VMware Workstation and VMware Fusion) and finds that taking advantage of hardware acceleration significantly closes the gap between pure emulation and native, but that different implementations and host graphics stacks show distinct variation.
Abstract: Modern graphics co-processors (GPUs) can produce high fidelity images several orders of magnitude faster than general purpose CPUs, and this performance expectation is rapidly becoming ubiquitous in personal computers. Despite this, GPU virtualization is a nascent field of research. This paper introduces a taxonomy of strategies for GPU virtualization and describes in detail the specific GPU virtualization architecture developed for VMware's hosted products (VMware Workstation and VMware Fusion). We analyze the performance of our GPU virtualization with a combination of applications and microbenchmarks. We also compare against software rendering, the GPU virtualization in Parallels Desktop 3.0, and the native GPU. We find that taking advantage of hardware acceleration significantly closes the gap between pure emulation and native, but that different implementations and host graphics stacks show distinct variation. The microbenchmarks show that our architecture amplifies the overheads in the traditional graphics API bottlenecks: draw calls, downloading buffers, and batch sizes. Our virtual GPU architecture runs modern graphics-intensive games and applications at interactive frame rates while preserving virtual machine portability. The applications we tested achieve from 86% to 12% of native rates and 43 to 18 frames per second with VMware Fusion 2.0.

179 citations
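
A back-of-the-envelope model of the draw-call overhead amplification the microbenchmarks expose; all numbers are illustrative, not the paper's measurements.

```python
# Toy cost model: if virtualization multiplies per-draw-call cost, frame time
# suffers most for workloads with many small batches. Numbers are made up.
def frame_ms(draw_calls, per_draw_us, per_frame_fixed_ms=2.0):
    return per_frame_fixed_ms + draw_calls * per_draw_us / 1000.0

for label, per_draw_us in [("native     ", 5.0), ("virtualized", 25.0)]:
    print(label, "few big batches :", frame_ms(200, per_draw_us), "ms")
    print(label, "many small ones :", frame_ms(5000, per_draw_us), "ms")
```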


Proceedings ArticleDOI
27 Feb 2009
TL;DR: An approach for rendering the surface of a particle-based fluid that is simple to implement, has real-time performance with a configurable speed/quality trade-off, and smoothes the surface to prevent the fluid from looking "blobby" or jelly-like is presented.
Abstract: We present an approach for rendering the surface of a particle-based fluid that is simple to implement, has real-time performance with a configurable speed/quality trade-off, and smoothes the surface to prevent the fluid from looking "blobby" or jelly-like. The method is not based on polygonization and as such circumvents the usual grid artifacts of marching cubes. It only renders the surface where it is visible, and has inherent view-dependent level-of-detail. We use Perlin noise to add detail to the surface of the fluid. All the processing, rendering and shading steps are directly implemented on graphics hardware.

133 citations
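
A toy CPU stand-in for the smoothing step the abstract highlights: low-pass the particle depth buffer while refusing to smooth across silhouette edges. The paper does this far more carefully, and on the GPU.

```python
import numpy as np

# Toy edge-preserving smoothing of a particle depth buffer, so the fluid surface
# stops looking blobby. Illustrative only; borders wrap due to np.roll.
def smooth_depth(depth, iters=10, edge_thresh=0.1):
    d = depth.astype(np.float64).copy()
    for _ in range(iters):
        # 4-neighbour average via shifted copies
        nb = (np.roll(d, 1, 0) + np.roll(d, -1, 0) +
              np.roll(d, 1, 1) + np.roll(d, -1, 1)) / 4.0
        mask = np.abs(nb - d) < edge_thresh  # don't smooth across depth edges
        d[mask] = nb[mask]
    return d

depth = np.random.rand(64, 64) * 0.02 + 1.0      # noisy, nearly flat surface
print(smooth_depth(depth).std() < depth.std())   # True: the surface gets smoother
```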


Journal ArticleDOI
24 Feb 2009
TL;DR: The impact of new GPU features on the development process of an efficient finite difference time domain (FDTD) implementation is described.
Abstract: Graphics processing units (GPUs) have for years been dedicated mostly to real-time rendering. Recently, leading GPU manufacturers have extended their research area and decided to also support general-purpose computing. In this paper, we describe the impact of new GPU features on the development process of an efficient finite difference time domain (FDTD) implementation.

86 citations
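
For context, the computational core that GPU FDTD ports parallelize is a simple leapfrog stencil, one thread per cell on the GPU. A normalized 1-D sketch, not the paper's code:

```python
import numpy as np

# Minimal 1-D FDTD (Yee scheme) update loop in normalized units; on a GPU each
# cell update becomes one thread. Illustrative sketch only.
def fdtd_1d(steps=500, n=400, src=100):
    ez = np.zeros(n)  # electric field
    hy = np.zeros(n)  # magnetic field
    for t in range(steps):
        hy[:-1] += ez[1:] - ez[:-1]          # H update
        ez[1:]  += 0.5 * (hy[1:] - hy[:-1])  # E update, stable Courant factor
        ez[src] += np.exp(-((t - 30) / 10.0) ** 2)  # soft Gaussian source
    return ez

print(np.abs(fdtd_1d()).max())  # a propagating pulse remains in the grid
```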


Journal ArticleDOI
TL;DR: A software system that enables path-traced rendering of complex scenes with strong performance and scalability, including an efficient implementation of a path tracer application in which GPUs perform functions such as ray tracing, shadow tracing, importance-driven light sampling, and surface shading.
Brian C. Budge, Tony Bernardin, Jeff A. Stuart, Shubhabrata Sengupta, Ken Joy, John D. Owens
Abstract: We present a software system that enables path-traced rendering of complex scenes. The system consists of two primary components: an application layer that implements the basic rendering algorithm, and an out-of-core scheduling and data-management layer designed to assist the application layer in exploiting hybrid computational resources (e.g., CPUs and GPUs) simultaneously. We describe the basic system architecture, discuss design decisions of the system's data-management layer, and outline an efficient implementation of a path tracer application, where GPUs perform functions such as ray tracing, shadow tracing, importance-driven light sampling, and surface shading. The use of GPUs speeds up the runtime of these components by factors ranging from two to twenty, resulting in a substantial overall increase in rendering speed. The path tracer scales well with respect to CPUs, GPUs, and memory per node, as well as with the number of nodes. The result is a system that can render large complex scenes with strong performance and scalability.

65 citations
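
A hedged sketch of the scheduling idea: independent task queues feed heterogeneous processors, with each kernel routed to the device it runs best on. The real system batches ray and shading work out-of-core; here workers just run callables.

```python
from queue import Queue
from threading import Thread

# Sketch of a hybrid CPU/GPU work-queue scheduler; devices are just labels here.
queues = {"cpu": Queue(), "gpu": Queue()}

def worker(device):
    while True:
        task = queues[device].get()
        if task is None:
            break
        task(device)          # e.g. trace rays on "gpu", build batches on "cpu"
        queues[device].task_done()

threads = [Thread(target=worker, args=(d,)) for d in queues for _ in range(2)]
for t in threads: t.start()
for i in range(8):
    dev = "gpu" if i % 2 else "cpu"   # schedule by where the kernel runs best
    queues[dev].put(lambda d, i=i: print(f"task {i} on {d}"))
for q in queues.values(): q.join()
for d in queues:
    for _ in range(2): queues[d].put(None)   # shut the workers down
for t in threads: t.join()
```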


Proceedings ArticleDOI
19 Oct 2009
TL;DR: Experiments prove that the remote rendering framework can be effectively used for quality 3D video rendering on mobile devices in real time.
Abstract: At the convergence of computer vision, graphics, and multimedia, the emerging 3D video technology promises immersive experiences in a truly seamless environment. However, the requirements of huge network bandwidth and computing resources still make it a big challenge to render 3D video on mobile devices in real time. In this paper, we present how a remote rendering framework can be used to solve the problem. The differences between dynamic 3D video and static graphic models are analyzed. A general proxy-based framework is presented to render 3D video streams on the proxy and transmit the rendered scene to mobile devices over a wireless network. An image-based approach is proposed to enhance 3D interactivity and reduce the interaction delay. Experiments prove that the remote rendering framework can be effectively used for quality 3D video rendering on mobile devices in real time.

49 citations
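
A sketch of the image-based interaction idea: while the next proxy-rendered frame is in flight, the client warps (here merely shifts) the last frame to approximate a small camera pan, hiding round-trip latency. Purely illustrative.

```python
import numpy as np

# Client-side stand-in for image-based latency hiding: shift the last received
# frame to approximate a small pan until the proxy's next real frame arrives.
def approximate_pan(last_frame, dx, dy):
    h, w = last_frame.shape[:2]
    out = np.zeros_like(last_frame)
    xs, ys = slice(max(dx, 0), w + min(dx, 0)), slice(max(dy, 0), h + min(dy, 0))
    xs_src = slice(max(-dx, 0), w + min(-dx, 0))
    ys_src = slice(max(-dy, 0), h + min(-dy, 0))
    out[ys, xs] = last_frame[ys_src, xs_src]
    return out  # edges stay black until the proxy sends the real frame

frame = np.arange(16).reshape(4, 4)
print(approximate_pan(frame, 1, 0))  # content shifted one pixel right
```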


Journal ArticleDOI
01 Dec 2009
TL;DR: This work presents a new GPU-based rendering system for ray casting of multiple volumes which provides interactive frame rates when concurrently rendering more than 50 arbitrarily overlapping volumes on current graphics hardware.
Abstract: We present a new GPU-based rendering system for ray casting of multiple volumes. Our approach supports a large number of volumes, complex translucent and concave polyhedral objects as well as CSG intersections of volumes and geometry in any combination. The system (including the rasterization stage) is implemented entirely in CUDA, which allows full control of the memory hierarchy, in particular access to high bandwidth and low latency shared memory. High depth complexity, which is problematic for conventional approaches based on depth peeling, can be handled successfully. As far as we know, our approach is the first framework for multivolume rendering which provides interactive frame rates when concurrently rendering more than 50 arbitrarily overlapping volumes on current graphics hardware.

44 citations
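
The accumulation rule at the heart of any ray caster, front-to-back compositing with early ray termination, shown for a single ray; multi-volume rendering interleaves samples from all volumes a ray crosses in depth order.

```python
import numpy as np

# Front-to-back compositing along one ray; illustrative of the accumulation rule only.
def composite_front_to_back(colors, alphas):
    """colors: (n, 3) RGB samples in depth order; alphas: (n,) opacities."""
    acc_c, acc_a = np.zeros(3), 0.0
    for c, a in zip(colors, alphas):
        acc_c += (1.0 - acc_a) * a * c
        acc_a += (1.0 - acc_a) * a
        if acc_a > 0.99:      # early ray termination
            break
    return acc_c, acc_a

c, a = composite_front_to_back(np.array([[1, 0, 0], [0, 1, 0]]),
                               np.array([0.6, 0.8]))
print(c, a)  # red dominates: the front sample occludes the green one
```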



Book
16 Apr 2009
TL;DR: Precomputation-based relighting and radiance transfer has a long history with a spurt of renewed interest, including adoption in commercial video games, due to recent mathematical developments and hardware advances.
Abstract: High quality image synthesis is a long-standing goal in computer graphics. Complex lighting, reflection, shadow and global illumination effects can be rendered with modern image synthesis algorithms, but those methods are focused on offline computation of a single image. They are far from interactive, and the image must be recomputed from scratch when any aspect of the scene changes. On the other hand, real-time rendering often fixes the object geometry and other attributes, such as relighting a static image for lighting design. In these cases, the final image or rendering is a linear combination of basis images or radiance distributions due to individual lights. We can therefore precompute offline solutions to each individual light or lighting basis function, combining them efficiently for real-time image synthesis. Precomputation-based relighting and radiance transfer has a long history with a spurt of renewed interest, including adoption in commercial video games, due to recent mathematical developments and hardware advances. In this survey, we describe the mathematical foundations, history, current research and future directions for precomputation-based rendering.

37 citations
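
The linearity the survey builds on can be stated in a few lines: with geometry fixed, the final image is a weighted sum of precomputed basis images, so relighting reduces to a runtime dot product. A minimal sketch with random stand-in data:

```python
import numpy as np

# Precomputed relighting in miniature: one basis image per light/basis function,
# combined by the current lighting coefficients at runtime. Data is random filler.
basis_images = np.random.rand(9, 32, 32, 3)   # e.g. 9 spherical-harmonic bases
light_coeffs = np.array([1.0, 0.3, 0.0, 0.2, 0.0, 0.0, 0.1, 0.0, 0.0])

relit = np.tensordot(light_coeffs, basis_images, axes=1)  # (32, 32, 3) image
print(relit.shape)  # changing light_coeffs re-renders instantly, no ray tracing
```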


Patent
Adam Jackson
26 Aug 2009
TL;DR: In this paper, a hypervisor that runs on the server is extended to include a redirection module, which receives a rendering request from a virtual machine and redirects the rendering request to a graphics driver.
Abstract: Graphics rendering in a virtual machine system is accelerated by utilizing host graphics hardware. In one embodiment, the virtual machine system includes a server that hosts a plurality of virtual machines. The server includes one or more graphics processing units. Each graphics processing unit can be allocated to multiple virtual machines to render images. A hypervisor that runs on the server is extended to include a redirection module, which receives a rendering request from a virtual machine and redirects the rendering request to a graphics driver. The graphics driver can command an allocated portion of a graphics processing unit to render an image on the server.

Proceedings ArticleDOI
11 Oct 2009
TL;DR: This tutorial is an introduction to GPU programming using the OpenGL Shading Language (GLSL) and comprises an overview of graphics concepts and a walk-through of the graphics card rendering pipeline.
Abstract: One of the challenging advents in Computer Science in recent years was the fast evolution of parallel processors, especially the GPU (graphics processing unit). GPUs today play a major role in many computational environments, most notably those regarding real-time graphics applications, such as games. The digital game industry is one of the main driving forces behind GPUs; it persistently elevates the state of the art in Computer Graphics, pushing outstandingly realistic scenes to interactive levels. The evolution of photorealistic scenes consequently demands better graphics cards from the hardware industry. Over the last decade, the hardware has not only become a hundred times more powerful, but has also become increasingly customizable, allowing programmers to alter some previously fixed functionalities. This tutorial is an introduction to GPU programming using the OpenGL Shading Language (GLSL). It comprises an overview of graphics concepts and a walk-through of the graphics card rendering pipeline. A thorough understanding of the graphics pipeline is extremely important when designing a program that runs on the GPU, known as a shader. Throughout this tutorial, the exposition of the GLSL language and GPU programming details is followed closely by examples ranging from very simple to more practical applications. It is aimed at an audience with little or no knowledge of the subject.
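
To make the pipeline stages concrete, here is a minimal software analogue (in Python rather than GLSL): screen-space vertices play the role of vertex-shader output, a barycentric rasterizer determines coverage, and the per-pixel attribute interpolation is what a fragment shader would receive.

```python
import numpy as np

# A tiny software pipeline: transform -> rasterize -> shade. Sketch, not GLSL.
W = H = 24
verts = np.array([[2.0, 2.0], [21.0, 4.0], [10.0, 20.0]])   # "vertex shader" output
colors = np.eye(3)                                           # per-vertex red/green/blue

def edge(a, b, p):                 # signed area, used for barycentric coordinates
    return (b[0]-a[0])*(p[1]-a[1]) - (b[1]-a[1])*(p[0]-a[0])

img = np.zeros((H, W, 3))
area = edge(verts[0], verts[1], verts[2])
for y in range(H):
    for x in range(W):
        p = (x + 0.5, y + 0.5)
        w0 = edge(verts[1], verts[2], p) / area
        w1 = edge(verts[2], verts[0], p) / area
        w2 = edge(verts[0], verts[1], p) / area
        if w0 >= 0 and w1 >= 0 and w2 >= 0:            # pixel inside the triangle
            # interpolated color: the input a "fragment shader" would see
            img[y, x] = w0*colors[0] + w1*colors[1] + w2*colors[2]

print(img.sum(axis=2).astype(bool).sum(), "pixels covered")
```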

Patent
13 May 2009
TL;DR: In this paper, the authors present a system and method for providing an API that allows users to write complex graphics and visualization applications with little knowledge of how to parallelize or distribute the application across a graphics cluster.
Abstract: A system and method for providing an Application Programming Interface (API) that allows users to write complex graphics and visualization applications with little knowledge of how to parallelize or distribute the application across a graphics cluster. The interface enables users to develop an application program using a common programming paradigm (e.g., scene graph) in a manner that accommodates handling parallel rendering tasks and rendering environments. The visualization applications written by developers take better advantage of the aggregate resources of a cluster. The programming model provided by API function calls handles scene-graph data in a manner such that the scene and data management are decoupled from the rendering, compositing, and display. As a result, the system and method are not beholden to one particular graphics rendering API (e.g., OpenGL, DirectX, etc.) and provide the ability to switch between these APIs even during runtime.
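
A sketch of the decoupling the patent describes: scene management talks to an abstract renderer interface, so the backend can be swapped, even at runtime. All class and method names are hypothetical.

```python
# Hypothetical sketch of backend-agnostic rendering: the scene layer never depends
# on a specific graphics API, so the backend can change mid-run.
class Renderer:
    def draw(self, scene_graph): raise NotImplementedError

class GLRenderer(Renderer):
    def draw(self, scene_graph): print("OpenGL pass over", scene_graph)

class DXRenderer(Renderer):
    def draw(self, scene_graph): print("DirectX pass over", scene_graph)

class Viewer:
    def __init__(self, backend): self.backend = backend
    def set_backend(self, backend): self.backend = backend  # runtime switch
    def frame(self, scene_graph): self.backend.draw(scene_graph)

v = Viewer(GLRenderer())
v.frame("city_scene")
v.set_backend(DXRenderer())   # no change to scene management needed
v.frame("city_scene")
```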

Journal ArticleDOI
TL;DR: An algorithm for automatically computing tight positional and normal bounds on the fly for a base primitive; the bounds are derived from an arbitrary vertex shader program, which may include a curved surface evaluation and different types of displacements, for example.
Abstract: Graphics processing units supporting tessellation of curved surfaces with displacement mapping exist today. Still, to our knowledge, culling only occurs after tessellation, that is, after the base primitives have been tessellated into triangles. We introduce an algorithm for automatically computing tight positional and normal bounds on the fly for a base primitive. These bounds are derived from an arbitrary vertex shader program, which may include a curved surface evaluation and different types of displacements, for example. The obtained bounds are used for backface, view frustum, and occlusion culling before tessellation. For highly tessellated scenes, we show that up to 80% of the vertex shader instructions can be avoided, which implies an “instruction speedup” of 5×. Our technique can also be used for offline software rendering.
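
A simplified sketch of the pre-tessellation culling idea (not the paper's bounding machinery): grow the control-point AABB by the maximum displacement, then reject the primitive against frustum planes before any tessellation work is spent.

```python
import numpy as np

# Conservative bound-and-cull before tessellation; a simplification of the idea.
def patch_bounds(control_pts, max_disp):
    lo = control_pts.min(axis=0) - max_disp
    hi = control_pts.max(axis=0) + max_disp
    return lo, hi

def outside_plane(lo, hi, plane):          # plane = (nx, ny, nz, d); inside: n.x + d >= 0
    n, d = np.array(plane[:3]), plane[3]
    far_corner = np.where(n >= 0, hi, lo)  # AABB corner most inside the half-space
    return np.dot(n, far_corner) + d < 0   # even the best corner is outside -> cull

pts = np.array([[0., 0., -5.], [1., 0., -5.], [0., 1., -6.], [1., 1., -6.]])
lo, hi = patch_bounds(pts, max_disp=0.25)
near = (0.0, 0.0, -1.0, -1.0)              # keeps z <= -1
print(outside_plane(lo, hi, near))          # False: patch survives, gets tessellated
```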

Patent
07 Dec 2009
TL;DR: This disclosure describes techniques for removing vertex points during two-dimensional (2D) graphics rendering using three-dimensional (3D) graphics hardware, for more efficient utilization of the hardware resources of the GPU.
Abstract: This disclosure describes techniques for removing vertex points during two-dimensional (2D) graphics rendering using three-dimensional (3D) graphics hardware. In accordance with the described techniques, one or more vertex points may be removed during 2D graphics rendering using 3D graphics hardware. For example, the techniques may remove redundant vertex points in the display coordinate space by discarding vertex points that have substantially the same positional coordinates in the display coordinate space as a previous vertex point. Alternatively or additionally, the techniques may remove excess vertex points that lie in a straight line. Removing the redundant vertex points or vertex points that lie in a straight line allows for more efficient utilization of the hardware resources of the GPU and increases the speed at which the GPU renders the image for display.
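
The two removal rules are easy to state in code. A sketch, with an epsilon standing in for "substantially the same position":

```python
# Sketch of both pruning rules: drop near-duplicate vertices, and drop interior
# vertices of straight runs, since neither adds geometry.
def collinear(a, b, c, eps=1e-9):
    return abs((b[0]-a[0])*(c[1]-a[1]) - (b[1]-a[1])*(c[0]-a[0])) < eps

def prune_path(verts, eps=1e-6):
    out = []
    for v in verts:
        if out and abs(v[0]-out[-1][0]) < eps and abs(v[1]-out[-1][1]) < eps:
            continue                       # redundant duplicate vertex
        if len(out) >= 2 and collinear(out[-2], out[-1], v):
            out.pop()                      # middle vertex of a straight segment
        out.append(v)
    return out

path = [(0, 0), (1, 0), (1, 0), (2, 0), (3, 0), (3, 1)]
print(prune_path(path))  # [(0, 0), (3, 0), (3, 1)]
```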

Book ChapterDOI
26 Nov 2009
TL;DR: This paper uses NMM, a distributed multimedia middleware, to build a powerful and flexible rendering framework that is highly modular and can be easily reconfigured, even at runtime, to meet the changing demands of applications built on top of it.
Abstract: The available rendering performance on current computers increases constantly, primarily by employing parallel algorithms on the newest many-core hardware, such as multi-core CPUs or GPUs. This development enables faster rasterization, as well as conspicuously faster software-based real-time ray tracing. Despite the tremendous progress in rendering power, there are and always will be applications in classical computer graphics and Virtual Reality which require distributed configurations employing multiple machines for both rendering and display. In this paper we address this problem and use NMM, a distributed multimedia middleware, to build a powerful and flexible rendering framework. Our framework is highly modular and can be easily reconfigured, even at runtime, to meet the changing demands of applications built on top of it. We show that the flexibility of our approach comes at a negligible cost in comparison to a specialized and highly-optimized implementation of distributed rendering.

Proceedings ArticleDOI
22 Sep 2009
TL;DR: A client-server framework for network distribution and real-time point-based rendering of large 3D models on commodity graphics platforms is presented and model inspection, based on a one-touch interface, is enriched by a bidirectional hyperlink system.
Abstract: We present a client-server framework for network distribution and real-time point-based rendering of large 3D models on commodity graphics platforms. Model inspection, based on a one-touch interface, is enriched by a bidirectional hyperlink system which provides access to multiple layers of multimedia content, linking different parts of the 3D model to many information sources. In addition to view and light control, users can perform simple 3D operations like angle, distance and area measurements on the 3D model. An authoring tool derived from the basic client allows users to add multimedia content to the model description. Our rendering method is based on a coarse-grained multiresolution structure, where each node contains thousands of point samples. At runtime, a view-dependent refinement process incrementally updates the current GPU-cached model representation from local or remote out-of-core data. Vertex and fragment shaders are used for high-quality elliptical sample drawing and a variety of shading effects. The system is demonstrated with examples that range from documentation and inspection of small artifacts to exploration of large sites, in both a museum and a large-scale distribution setting.
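
A sketch of the view-dependent refinement loop: descend the multiresolution tree while a node's projected size exceeds a pixel budget, otherwise draw its cached point samples. The node layout and screen-size estimate below are made up.

```python
# Hypothetical multiresolution refinement driven by projected node size.
def projected_px(radius, dist, fov_scale=1000.0):
    return fov_scale * radius / max(dist, 1e-6)     # rough on-screen size in pixels

def refine(node, cam_dist, budget_px=2.0):
    if projected_px(node["radius"], cam_dist) <= budget_px or not node["kids"]:
        return [node["name"]]                        # draw this node's cached points
    drawn = []
    for kid in node["kids"]:                         # fetch finer data (possibly remote)
        drawn += refine(kid, cam_dist, budget_px)
    return drawn

tree = {"name": "root", "radius": 1.0,
        "kids": [{"name": f"child{i}", "radius": 0.4, "kids": []} for i in range(4)]}
print(refine(tree, cam_dist=80.0))    # near view: children are refined in
print(refine(tree, cam_dist=900.0))   # far view: the root's coarse points suffice
```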

Book ChapterDOI
18 Aug 2009
TL;DR: Boosting Feature Selection (BFS) is used to greatly reduce the dimensionality of the features while boosting the machine-learning-based classification performance to a fairly high level.
Abstract: Computer graphics identification has gained importance in the digital era as it relates to image forgery detection and to the enhancement of highly photorealistic rendering software. In this paper, statistical moments of 1-D and 2-D characteristic functions are employed to derive image features that can well capture the statistical differences between computer graphics and photographic images. The YCbCr color system is selected because it has shown better performance in computer graphics classification than the RGB color system and it has been adopted by the most popularly used JPEG images. Furthermore, only the Y and Cb color channels are used in feature extraction, because our study shows that features derived from Cb and Cr are so highly correlated that there is no need to use features extracted from both components, which substantially reduces computational complexity. Concretely, in each selected color component, features are extracted from each image in both the image pixel 2-D array and the JPEG 2-D array (a 2-D array consisting of the magnitudes of the JPEG coefficients), their prediction-error 2-D arrays, and all of their three-level wavelet subbands, referred to in this paper as the various 2-D arrays generated from a given image. The rationale behind using the prediction-error image is to reduce the influence caused by image content. To generate image features from 1-D characteristic functions, the various 2-D arrays of a given image are the inputs, yielding 156 features in total. For the features generated from 2-D characteristic functions, only the JPEG 2-D array and its prediction-error 2-D array are the inputs; one-unit-apart 2-D histograms of the JPEG 2-D array along the horizontal, vertical and diagonal directions are utilized to generate 2-D characteristic functions, from which the marginal moments are generated to form 234 features. Together, the process results in 390 features per color channel, and 780 features in total. Finally, Boosting Feature Selection (BFS) is used to greatly reduce the dimensionality of the features while boosting the machine-learning-based classification performance to a fairly high level.
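
The core feature construction can be sketched briefly: a characteristic function is the Fourier transform of a histogram, and its low-order absolute moments summarize any of the paper's 2-D arrays. A simplified illustration, not the paper's exact recipe:

```python
import numpy as np

# Simplified characteristic-function moments of a 2-D array (pixels, JPEG
# coefficients, a wavelet subband, ...). Bin count and moment orders are arbitrary.
def cf_moments(arr2d, bins=64, n_moments=3):
    hist, _ = np.histogram(arr2d.ravel(), bins=bins)
    cf = np.abs(np.fft.fft(hist))[1:bins // 2]      # one-sided CF magnitude, skip DC
    freqs = np.arange(1, bins // 2)
    return [float((freqs ** k * cf).sum() / cf.sum()) for k in range(1, n_moments + 1)]

photo_like = np.random.randn(128, 128)               # stand-ins for real images
cg_like = np.round(np.random.randn(128, 128) * 4) / 4
print(cf_moments(photo_like))
print(cf_moments(cg_like))                            # moment vectors feed a classifier
```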

Proceedings ArticleDOI
25 May 2009
TL;DR: An alpha-stable distribution model is built to characterize the wavelet decomposition coefficients of natural images and shows that the proposed method performs better than the previous higher-order statistical approaches.
Abstract: With the use of advanced computer graphics rendering software, computer-generated images have become difficult to differentiate visually from natural images captured using digital cameras. The need to automatically distinguish computer-generated images from natural images is becoming significantly important for image forensic techniques. In this paper, a novel approach is proposed to differentiate the two image categories. An alpha-stable distribution model is built to characterize the wavelet decomposition coefficients of natural images. The suitability of the model is then illustrated. The fractional lower order moments in the image wavelet domain are extracted and evaluated with a Support Vector Machine classifier. The experimental results show that the proposed method performs better than previous higher-order statistical approaches.
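
A sketch of why fractional lower-order moments (FLOMs) suit alpha-stable models: E[|X|^p] with fractional p stays finite and informative for heavy-tailed data whose higher-order moments diverge.

```python
import numpy as np

# A fractional lower-order moment; the paper computes these on wavelet coefficients.
def flom(x, p=0.5):
    return float(np.mean(np.abs(x) ** p))

gaussian = np.random.randn(100_000)
heavy = np.random.standard_cauchy(100_000)    # alpha-stable with alpha = 1
print(flom(gaussian, 0.5), flom(heavy, 0.5))  # both finite and discriminative
print(np.var(heavy))  # the sample variance is outlier-dominated: it diverges in theory
```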

Journal ArticleDOI
29 Jun 2009
TL;DR: This paper presents a computational pipeline for particle volume rendering that is easily accelerated by the current GPU, and demonstrates that high quality volume renderings can be easily produced from large particle datasets in time frames of a few seconds to less than a minute.
Abstract: Visualizing dynamic participating media in particle form by fully solving equations from the light transport theory is a computationally very expensive process. In this paper, we present a computational pipeline for particle volume rendering that is easily accelerated by the current GPU. To fully harness its massively parallel computing power, we transform input particles into a volumetric density field using a GPU-assisted, adaptive density estimation technique that iteratively adapts the smoothing length for local grid cells. Then, the volume data is visualized efficiently based on the volume photon mapping method where our GPU techniques further improve the rendering quality offered by previous implementations while performing rendering computation in acceptable time. It is demonstrated that high quality volume renderings can be easily produced from large particle datasets in time frames of a few seconds to less than a minute.
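
A 1-D sketch of adaptive density estimation of the kind described: start from a global smoothing length and widen it where too few particles contribute. The paper adapts per grid cell, in 3-D, on the GPU.

```python
import numpy as np

# Adaptive kernel density estimation in 1-D; kernel and thresholds are illustrative.
def density_on_grid(particles, grid, h0=0.05, min_neighbors=8, iters=4):
    dens = np.zeros_like(grid)
    for i, x in enumerate(grid):
        d = np.abs(particles - x)
        h = h0
        for _ in range(iters):
            if (d < h).sum() >= min_neighbors:
                break
            h *= 1.5                    # too few neighbours: widen the kernel
        inside = d < h
        w = 1.0 - d[inside] / h         # simple hat kernel
        dens[i] = w.sum() / h
    return dens

pts = np.concatenate([np.random.normal(0.3, 0.02, 500),
                      np.random.normal(0.7, 0.10, 50)])   # dense + sparse clumps
print(np.round(density_on_grid(pts, np.linspace(0, 1, 11)), 1))
```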

Patent
27 Jan 2009
TL;DR: In this paper, a method is proposed that intercepts the 3D graphics commands generated by an application executing on a first computing machine and analyzes the characteristics of a remoting system to determine a location for rendering the 3D data from those commands.
Abstract: Methods and systems for rendering three dimensional graphical data by intercepting a three dimensional graphics stream comprising three dimensional graphics commands generated by an application executing on a first computing machine, and then analyzing the characteristics associated with a remoting system to determine a location for rendering three dimensional data from the three dimensional graphics commands. The remoting system may comprise at least the first computing machine having a graphics rendering component, a second computing machine having a graphics rendering component and a network. Based on the analysis, a rendering location is determined and the application is induced to reinitialize a context for determining where to render three dimensional data. The three dimensional data is then rendered from the three dimensional graphics commands at the rendering location.

Patent
14 Sep 2009
TL;DR: A circuit arrangement and program product render stereoscopic images in a multithreaded rendering software pipeline using first and second rendering channels respectively configured to render the left and right views for the stereoscopic image, as discussed by the authors.
Abstract: A circuit arrangement and program product render stereoscopic images in a multithreaded rendering software pipeline using first and second rendering channels respectively configured to render the left and right views for the stereoscopic image. Separate transformations are applied to received vertex data to generate transformed vertex data for use by each of the first and second rendering channels in rendering the left and right views for the stereoscopic image.
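
The two-channel idea in miniature: the same vertex stream receives two view transforms whose eye positions differ by the interocular distance, yielding the left and right images. Numbers below are illustrative only.

```python
import numpy as np

# Two view transforms over one vertex stream, one per stereo rendering channel.
def view_matrix(eye_x):
    m = np.eye(4)
    m[0, 3] = -eye_x          # translate the world opposite to the eye offset
    return m

def transform(verts_h, eye_sep=0.065):
    left = (view_matrix(-eye_sep / 2) @ verts_h.T).T
    right = (view_matrix(+eye_sep / 2) @ verts_h.T).T
    return left, right        # each channel rasterizes its own stream

verts = np.array([[0.0, 0.0, -2.0, 1.0], [0.5, 0.1, -3.0, 1.0]])
l, r = transform(verts)
print(l[:, 0] - r[:, 0])      # constant horizontal disparity of 0.065
```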

Proceedings ArticleDOI
01 Aug 2009
TL;DR: An adaptive anti-aliasing (AA) filter for real-time rendering on the GPU is described, using information from neighboring pixel samples to compute both an approximation of the gradient of primitive edges and the final pixel color.
Abstract: The latest generation of graphics hardware provides direct access to multisample anti-aliasing (MSAA) rendering data. By taking advantage of these existing pixel subsample values, an intelligent reconstruction filter can be computed using programmable GPU shader units. This paper describes an adaptive anti-aliasing (AA) filter for real-time rendering on the GPU. Improved quality is achieved by using information from neighboring pixel samples to compute both an approximation of the gradient of primitive edges and the final pixel color.
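
A simplified CPU-side sketch of an edge-adaptive resolve: where a pixel's subsamples agree, a plain average suffices; where they disagree (a primitive edge), neighbouring pixel means are blended in. The paper's actual filter is gradient-based and runs in a shader.

```python
import numpy as np

# Edge-adaptive MSAA resolve sketch; the edge metric and blend weights are made up.
def adaptive_resolve(samples):
    """samples: (H, W, S) luminance subsamples per pixel."""
    mean = samples.mean(axis=2)
    spread = samples.max(axis=2) - samples.min(axis=2)   # crude edge indicator
    nb = (np.roll(mean, 1, 0) + np.roll(mean, -1, 0) +
          np.roll(mean, 1, 1) + np.roll(mean, -1, 1)) / 4.0
    t = np.clip(spread, 0.0, 1.0)                        # 0 = flat, 1 = hard edge
    return (1.0 - 0.5 * t) * mean + 0.5 * t * nb         # widen the filter at edges

img = np.random.rand(8, 8, 4)
print(adaptive_resolve(img).shape)   # (8, 8) final pixel values
```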

Patent
25 Jun 2009
TL;DR: In the field of rendering a 3D model into a 2D graphical representation, the resources available to a general-purpose computer can be used to efficiently render the scene, as discussed by the authors.
Abstract: In the field of rendering a three-dimensional model into a two-dimensional graphical representation, the resources available to a general-purpose computer can be used to efficiently render the scene. The high speed and low resource usage with which the rendering is performed result from utilizing different inputs to the rendering engine and from the manner in which the rendering engine handles these inputs.

Proceedings Article
01 Jan 2009
TL;DR: This paper presents an approach to real-time rendering of non-planar projections with a single center and straight projection rays that operates entirely in object space to remove the need for image resampling.
Abstract: This paper presents an approach to real-time rendering of non-planar projections with a single center and straight projection rays. Its goal is to provide optimal and consistent image quality. It operates entirely in object space to remove the need for image resampling. In contrast to most other object-space approaches, it does not evaluate non-linear functions on the GPU, but approximates the projection itself by a set of perspective projection pieces. Within each piece, graphics hardware can provide optimal image quality. The result is a coherent and crisp rendering. Procedural textures and stylization effects greatly benefit from our method as they usually rely on screen-space operations. The real-time implementation runs entirely on GPU. It replicates input primitives on demand and renders them into all relevant projection pieces. The method is independent of the input mesh density and is not restricted to static meshes. Thus, it is well suited for interactive applications. We demonstrate it for an analytic and a freely designed projection.
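
The central approximation in miniature: a wide non-planar projection is covered by a fan of narrow perspective pieces, and primitives are assigned (and, in the real system, replicated) to the pieces they fall in. The angles and piece count below are arbitrary.

```python
# Sketch of approximating a 180-degree cylindrical projection by N perspective pieces.
def piece_of(direction_deg, n_pieces=6, total_fov=180.0):
    half = total_fov / 2.0
    idx = int((direction_deg + half) / (total_fov / n_pieces))
    return min(max(idx, 0), n_pieces - 1)   # primitives replicate per piece hit

def piece_axis(idx, n_pieces=6, total_fov=180.0):
    step = total_fov / n_pieces
    return -total_fov / 2.0 + (idx + 0.5) * step   # central view direction, degrees

for d in (-80.0, -10.0, 45.0):
    i = piece_of(d)
    print(f"direction {d:6.1f} deg -> piece {i} (axis {piece_axis(i):6.1f} deg)")
```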

Journal ArticleDOI
TL;DR: This paper describes a robust, modular, complete GPU architecture—the Tile-Load-Map (TLM)—designed for the real-time visualization of wide textured terrains created with arbitrary meshes, and shows that this texturing architecture is well suited to current challenges, and takes into account most of the distinctive aspects of terrain rendering.
Abstract: This paper describes a robust, modular, complete GPU architecture—the Tile-Load-Map (TLM)—designed for the real-time visualization of wide textured terrains created with arbitrary meshes. It extends and completes our previous succinct paper Amara et al. (ISVC 2007, Part 1, Lecture Notes in Computer Science, vol. 4841, pp. 586–597, Springer, Berlin, 2007) by giving further technical and implementation details. It provides new solutions to problems that had been left unresolved, in the context of a joint use of OpenGL and CUDA, optimized on the G80 graphics chip. We explain the crucial components of the shaders, and emphasize the progress we have proposed, while resolving some difficulties. We show that this texturing architecture is well suited to current challenges, and takes into account most of the distinctive aspects of terrain rendering. Finally, we demonstrate how the design of the TLM facilitates the integration of geomatic input-data into procedural selection/rendering tasks on the GPU, and immediate applications to amplification.

Patent
27 Oct 2009
TL;DR: In this paper, a technique for controlling animation rendering frame rate of an application is disclosed, where an animation rendering update interval of an animation timer may be adjusted based upon a rendering system state and/or an application state.
Abstract: Many computer applications incorporate and support animation (e.g., interactive user interfaces). Unfortunately, it may be challenging for computer applications and rendering systems to render animation frames at a smooth and consistent rate while conserving system resources. Accordingly, a technique for controlling animation rendering frame rate of an application is disclosed herein. An animation rendering update interval of an animation timer may be adjusted based upon a rendering system state (e.g., a rate of compositing visual layouts from animation frames) of a rendering system and/or an application state (e.g., a rate at which an application renders frames) of an application. Adjusting the animation rendering update interval allows the animation timer to adjust the frequency of performing rendering callback notifications (work requests to an application to render animation frames) to an application based upon rendering system performance and application performance.
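
A sketch of the interval-adjustment idea: lengthen the animation callback interval when frames miss their budget and shorten it again when there is headroom, rather than ticking at a fixed 60 Hz. All thresholds are arbitrary.

```python
import time

# Adaptive animation-timer interval; sleep stands in for actual frame rendering.
interval = 1.0 / 60.0
for frame in range(5):
    start = time.perf_counter()
    time.sleep(0.02 if frame % 2 else 0.005)   # stand-in for "render a frame"
    elapsed = time.perf_counter() - start
    if elapsed > interval:                      # app/compositor can't keep up
        interval = min(interval * 1.25, 1.0 / 15.0)
    else:                                       # headroom: creep back toward 60 Hz
        interval = max(interval * 0.9, 1.0 / 60.0)
    print(f"frame {frame}: {elapsed*1000:.1f} ms -> next tick in {interval*1000:.1f} ms")
```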

Patent
02 Jul 2009
TL;DR: In this paper, the authors present a system that facilitates the execution of a web application by loading a native code module that includes a scenegraph renderer into a secure runtime environment.
Abstract: One embodiment provides a system that facilitates the execution of a web application. During operation, the system loads a native code module that includes a scenegraph renderer into a secure runtime environment. Next, the system uses the scenegraph renderer to create a scenegraph from a graphics model associated with the web application and generate a set of rendering commands from the scenegraph. The system then writes the rendering commands to a command buffer and reads the rendering commands from the command buffer. Finally, the system uses the rendering commands to render, for the web application, an image corresponding to the graphics model by executing the rendering commands using a graphics-processing unit (GPU).
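
The command-buffer handoff in miniature: the sandboxed renderer serializes rendering commands into a buffer, and a trusted reader drains it and drives the GPU. The command names below are made up.

```python
from collections import deque

# Sketch of a command-buffer producer/consumer; commands are hypothetical tuples.
command_buffer = deque()

def scenegraph_to_commands(model):           # sandboxed "native module" side
    command_buffer.append(("bind_mesh", model))
    command_buffer.append(("set_uniform", "mvp", "identity"))
    command_buffer.append(("draw", 0, 36))

def execute_commands():                      # trusted GPU-process side
    while command_buffer:
        op, *args = command_buffer.popleft()
        print("GPU executes:", op, args)     # the real system calls into the GPU here

scenegraph_to_commands("teapot")
execute_commands()
```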

Proceedings ArticleDOI
27 Feb 2009
TL;DR: This work presents a method that exploits graphics hardware to provide fast and robust line visibility in models of moderate complexity and provides much higher visual quality and flexibility for stylization.
Abstract: Lines drawn over or in place of shaded 3D models can often provide greater comprehensibility and stylistic freedom than shading alone. A substantial challenge for making stylized line drawings from 3D models is the visibility computation. Current algorithms for computing line visibility in models of moderate complexity are either too slow for interactive rendering, or too brittle for coherent animation. We present a method that exploits graphics hardware to provide fast and robust line visibility. Rendering speed for our system is usually within a factor of two of an optimized rendering pipeline using conventional lines, and our system provides much higher visual quality and flexibility for stylization.
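
One simple form of depth-based line visibility (the general operation the paper accelerates and makes robust on the GPU): a line sample is visible if it is not behind the depth buffer at its pixel, with a small bias against z-fighting.

```python
import numpy as np

# Depth-buffer visibility test for line samples; bias and data are illustrative.
def visible_samples(samples, depth_buffer, bias=1e-3):
    """samples: (n, 3) array of (x, y, depth) in screen space."""
    xs = samples[:, 0].astype(int)
    ys = samples[:, 1].astype(int)
    return samples[:, 2] <= depth_buffer[ys, xs] + bias

depth = np.full((4, 4), 0.5)                  # a wall at depth 0.5
pts = np.array([[1.0, 1.0, 0.4], [2.0, 2.0, 0.7]])
print(visible_samples(pts, depth))            # [ True False ]
```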

01 May 2009
TL;DR: The notion of true back patch culling is introduced, simplifying crack hiding and sampling, and it is demonstrated that software rasterization with geometry amplification can work in real time.
Abstract: Recently, sort-middle triangle rasterization, implemented as software on a many-core GPU with vector units (Larrabee), has been proposed as an alternative to hardware rasterization. The main argument is that only a fraction of the time per frame is spent sorting and rasterizing triangles. However, is this still a valid argument in the context of geometry amplification, when the number of primitives increases quickly? A REYES-like approach, sorting parametric patches instead, could avoid many of the problems with tiny triangles. To demonstrate that software rasterization with geometry amplification can work in real time, we implement a tile-based sort-middle rasterizer in CUDA and analyze its behavior: first we adaptively subdivide rational bicubic Bézier patches. Then we sort those into buckets, and for each bucket we dice, grid-shade, and rasterize the micropolygons into the corresponding tile using on-chip caches. Despite being limited by the amount of available shared memory, the number of registers, and the lack of an L3 cache, we manage to rasterize 1600×1200 images, containing 200k sub-patches, at 10-12 fps on an nVidia GTX 280. This is about 3x to 5x slower than a hybrid approach subdividing with CUDA and using the hardware rasterizer. We hide cracks caused by adaptive subdivision using a flatness metric in combination with rasterizing Bézier convex hulls. Using a k-buffer with a fuzzy z-test, we can render transparent objects despite the overlap our approach creates. Further, we introduce the notion of true back patch culling, allowing us to simplify crack hiding and sampling.
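
A sketch of the adaptive-subdivision driver for a single cubic Bézier curve (the paper subdivides rational bicubic patches): a control-polygon flatness metric decides whether to split with de Casteljau or hand the piece to the rasterizer.

```python
import numpy as np

# Flatness-driven adaptive subdivision of a cubic Bezier; thresholds are arbitrary.
def flat_enough(p, eps=0.01):
    # max deviation of the inner control points from the chord p0-p3
    chord = p[3] - p[0]
    n = np.array([-chord[1], chord[0]]) / (np.linalg.norm(chord) + 1e-12)
    return max(abs(np.dot(p[1] - p[0], n)), abs(np.dot(p[2] - p[0], n))) < eps

def subdivide(p):                      # de Casteljau split at t = 0.5
    a, b, c = (p[0]+p[1])/2, (p[1]+p[2])/2, (p[2]+p[3])/2
    d, e = (a+b)/2, (b+c)/2
    f = (d+e)/2
    return np.array([p[0], a, d, f]), np.array([f, e, c, p[3]])

def tessellate(p, out):
    if flat_enough(p):
        out.append(p)                  # rasterize this nearly-flat piece
    else:
        l, r = subdivide(p)
        tessellate(l, out); tessellate(r, out)

pieces = []
tessellate(np.array([[0., 0.], [0.3, 1.], [0.7, -1.], [1., 0.]]), pieces)
print(len(pieces), "flat pieces")
```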