Author
Kenneth J. Mitchell
Bio: Kenneth J. Mitchell is an academic researcher from The Walt Disney Company. The author has contributed to research in topics including Rendering (computer graphics) and Augmented reality. The author has an h-index of 8 and has co-authored 23 publications receiving 171 citations.
Topics: Rendering (computer graphics), Augmented reality, Foveal, Pixel, Grid
Papers
•
15 Mar 2013
TL;DR: In this paper, the first physical object is identified as a first predetermined object type based on one or more object identifiers associated with it, and a sequence of frames is rendered for display in which the captured visual scene is augmented based on predefined geometric information for that object type.
Abstract: Techniques for displaying an augmented reality toy. Embodiments capture a visual scene for display. The visual scene includes a first physical object and is captured using one or more camera devices. The first physical object is identified as a first predetermined object type, based on one or more object identifiers associated with the first physical object. Embodiments retrieve predefined geometric information corresponding to the first predetermined object type and render a sequence of frames for display in which the captured visual scene is augmented, based on the predefined geometric information.
22 citations
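The pipeline in the abstract above (identify a physical object from its identifiers, retrieve predefined geometry for that type, render augmented frames) can be sketched roughly as follows. This is a minimal illustration; the registry names, data layout, and compositing step are assumptions, not details from the patent.

```python
from dataclasses import dataclass

@dataclass
class Geometry:
    """Predefined geometric information for a known toy type."""
    object_type: str
    mesh_path: str
    scale: float

# Hypothetical registry mapping detected identifiers (e.g. marker IDs) to toy types.
OBJECT_TYPES = {"marker_042": "toy_castle"}
GEOMETRY_DB = {"toy_castle": Geometry("toy_castle", "castle.obj", 1.0)}

def identify(object_ids):
    """Map detected identifiers to a predetermined object type, if any."""
    for object_id in object_ids:
        if object_id in OBJECT_TYPES:
            return OBJECT_TYPES[object_id]
    return None

def augment_frame(frame, geometry):
    """Placeholder: composite the retrieved geometry over the captured frame."""
    return {"frame": frame, "overlay": geometry.mesh_path}

def render_sequence(frames, detected_ids):
    object_type = identify(detected_ids)
    if object_type is None:
        return frames  # nothing recognized; pass the scene through unchanged
    geometry = GEOMETRY_DB[object_type]
    return [augment_frame(f, geometry) for f in frames]
```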
•
11 Jan 2013
TL;DR: In this article, a multiuser augmented reality system is proposed in which a visual scene is displayed on a first augmented reality device and visual scene data for a second user is received from a second AR device.
Abstract: Techniques for displaying a multiuser augmented reality world on a first augmented reality device. Embodiments capture a visual scene for display, where the visual scene includes a first user and is captured using one or more camera devices. Visual scene data for a second user is received from a second augmented reality device. Embodiments render a sequence of frames for display which depict the first user and the second user in an augmented reality world, where the depiction of the first user is based on the captured visual scene, and where the depiction of the second user is based on the received visual scene data.
19 citations
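A minimal sketch of the per-frame compositing the entry above describes, combining locally captured frames with scene data received from a second device; the dictionary-based frame format is purely illustrative.

```python
def render_multiuser_frames(local_frames, remote_scene_data):
    """Depict the local user from captured frames and the remote user
    from visual scene data received from a second AR device."""
    frames = []
    for i, frame in enumerate(local_frames):
        # Reuse the last received remote update if the remote stream lags behind.
        remote = remote_scene_data[min(i, len(remote_scene_data) - 1)]
        frames.append({
            "first_user": frame,    # from the local camera devices
            "second_user": remote,  # from the second augmented reality device
        })
    return frames
```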
•
24 Aug 2016
TL;DR: In this article, frames are rendered with focal areas that include a foveal region, the region along the user's line of sight that permits high visual acuity relative to the periphery, using rendering parameter values that differ from those outside the focal area.
Abstract: Individual images for individual frames of an animation may be rendered to include individual focal areas. A focal area may include one or more of a foveal region corresponding to a gaze direction of a user, an area surrounding the foveal region, and/or other components. The foveal region may comprise a region along the user's line of sight that permits high visual acuity with respect to a periphery of the line of sight. A focal area within an image may be rendered based on parameter values of rendering parameters that are different from parameter values for an area outside the focal area.
16 citations
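One simple way to picture the focal-area idea above is a per-pixel map of rendering parameters that is higher inside a region around the gaze point than in the periphery. A hedged sketch, with the circular region shape and sample counts chosen only for illustration:

```python
import numpy as np

def render_with_focal_area(height, width, gaze_xy, foveal_radius):
    """Return a per-pixel samples-per-pixel map: high inside the focal
    area around the gaze direction, low in the periphery."""
    ys, xs = np.mgrid[0:height, 0:width]
    dist = np.hypot(xs - gaze_xy[0], ys - gaze_xy[1])
    spp = np.where(dist <= foveal_radius, 8, 1)  # illustrative parameter values
    return spp

spp_map = render_with_focal_area(1080, 1920, gaze_xy=(960, 540), foveal_radius=200)
```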
•
30 Jun 2011
TL;DR: In this article, a visual scene captured using one or more camera devices of an augmented reality device is adjusted and then output for display on the AR device.
Abstract: Techniques for displaying content using an augmented reality device are described. Embodiments provide a visual scene for display, the visual scene captured using one or more camera devices of the augmented reality device. Embodiments adjust physical display geometry characteristics of the visual scene to correct for optimal projection. Additionally, illumination characteristics of the visual scene are modified based on environmental illumination data to improve realism of the visual scene when it is displayed. Embodiments further adjust display characteristics of the visual scene to improve tone mapping output. The adjusted visual scene is then output for display on the augmented reality device.
15 citations
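The adjustment chain in the abstract above (geometry correction, illumination matching, tone mapping) might be organised roughly as below. The individual operators are stand-ins; in particular, the Reinhard-style tone map and the single ambient gain are illustrative assumptions, not the patented method.

```python
import numpy as np

def adjust_geometry(frame, warp):
    """Placeholder geometric correction (e.g. a warp matched to the display)."""
    return frame  # a real implementation would resample the image here

def relight(frame, ambient_gain):
    """Scale pixel intensities to roughly match environmental illumination data."""
    return np.clip(frame * ambient_gain, 0.0, 1.0)

def tone_map(frame):
    """Simple Reinhard-style operator as a stand-in for the tone mapping step."""
    return frame / (1.0 + frame)

def process_frame(frame, warp=None, ambient_gain=1.2):
    frame = adjust_geometry(frame, warp)
    frame = relight(frame, ambient_gain)
    return tone_map(frame)

out = process_frame(np.random.rand(480, 640, 3).astype(np.float32))
```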
•
30 Sep 2016
TL;DR: In this paper, a physics-based tracking framework is proposed to train motion priors using different deep learning techniques, such as convolutional neural networks (CNN) and Recurrent Temporal Restricted Boltzmann Machines (RTRBMs).
Abstract: Training data from multiple types of sensors and captured in previous capture sessions can be fused within a physics-based tracking framework to train motion priors using different deep learning techniques, such as convolutional neural networks (CNN) and Recurrent Temporal Restricted Boltzmann Machines (RTRBMs). In embodiments employing one or more CNNs, two streams of filters can be used. In those embodiments, one stream of the filters can be used to learn the temporal information and the other stream of the filters can be used to learn spatial information. In embodiments employing one or more RTRBMs, all visible nodes of the RTRBMs can be clamped with values obtained from the training data or data synthesized from the training data. In cases where sensor data is unavailable, the input nodes may be unclamped and the one or more RTRBMs can generate the missing sensor data.
13 citations
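As a rough sketch of the two-stream CNN idea described above, one stream can operate on a single frame (spatial information) and the other on a short stack of frames (temporal information), with the two outputs fused into a motion-prior feature. The layer sizes and the channel-stacking of frames are assumptions; the physics-based fusion and the RTRBM path are omitted.

```python
import torch
import torch.nn as nn

class TwoStreamMotionPrior(nn.Module):
    """One stream of filters for spatial structure (a single frame),
    one stream for temporal information (a short stack of frames)."""
    def __init__(self, num_frames=4, feat=32, out_dim=64):
        super().__init__()
        self.spatial = nn.Sequential(
            nn.Conv2d(3, feat, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.temporal = nn.Sequential(
            nn.Conv2d(3 * num_frames, feat, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(2 * feat, out_dim)

    def forward(self, frame, frame_stack):
        s = self.spatial(frame).flatten(1)        # spatial stream
        t = self.temporal(frame_stack).flatten(1)  # temporal stream
        return self.head(torch.cat([s, t], dim=1))

model = TwoStreamMotionPrior()
prior = model(torch.randn(2, 3, 64, 64), torch.randn(2, 12, 64, 64))
```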
Cited by
•
10 Nov 2017
TL;DR: In this article, a spatio-temporal sub-pixel convolution network is proposed to exploit temporal redundancies and improve reconstruction accuracy while maintaining real-time speed; a novel joint motion compensation and video super-resolution algorithm, orders of magnitude more efficient than competing methods, is also proposed.
Abstract: Convolutional neural networks have enabled accurate image super-resolution in real-time. However, recent attempts to benefit from temporal correlations in video super-resolution have been limited to naive or inefficient architectures. In this paper, we introduce spatio-temporal sub-pixel convolution networks that effectively exploit temporal redundancies and improve reconstruction accuracy while maintaining real-time speed. Specifically, we discuss the use of early fusion, slow fusion and 3D convolutions for the joint processing of multiple consecutive video frames. We also propose a novel joint motion compensation and video super-resolution algorithm that is orders of magnitude more efficient than competing methods, relying on a fast multi-resolution spatial transformer module that is end-to-end trainable. These contributions provide both higher accuracy and temporally more consistent videos, which we confirm qualitatively and quantitatively. Relative to single-frame models, spatio-temporal networks can either reduce the computational cost by 30% whilst maintaining the same quality or provide a 0.2dB gain for a similar computational cost. Results on publicly available datasets demonstrate that the proposed algorithms surpass current state-of-the-art performance in both accuracy and efficiency.
622 citations
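A minimal early-fusion variant of the spatio-temporal sub-pixel convolution idea above: consecutive low-resolution frames are concatenated on the channel axis and upscaling happens in a final sub-pixel (pixel shuffle) layer. Layer widths and frame count are placeholders, not the published architecture, and motion compensation is omitted.

```python
import torch
import torch.nn as nn

class EarlyFusionSubPixelSR(nn.Module):
    """Joint processing of several consecutive low-resolution frames,
    with upscaling carried out by a sub-pixel (pixel shuffle) layer."""
    def __init__(self, num_frames=3, scale=4, feat=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3 * num_frames, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, 3 * scale * scale, 3, padding=1),
        )
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, lr_frames):
        # lr_frames: (batch, 3 * num_frames, H, W) -> (batch, 3, H * scale, W * scale)
        return self.shuffle(self.body(lr_frames))

sr = EarlyFusionSubPixelSR()(torch.randn(1, 9, 32, 32))  # -> (1, 3, 128, 128)
```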
•
01 Aug 2009
TL;DR: It is shown that motion induces a shear in the frequency domain and that the spectrum of moving scenes can be approximated by a wedge, which allows adaptive space-time sampling rates to be computed to accelerate rendering.
Abstract: Motion blur is crucial for high-quality rendering, but is also very expensive. Our first contribution is a frequency analysis of motion-blurred scenes, including moving objects, specular reflections, and shadows. We show that motion induces a shear in the frequency domain, and that the spectrum of moving scenes can be approximated by a wedge. This allows us to compute adaptive space-time sampling rates, to accelerate rendering. For uniform velocities and standard axis-aligned reconstruction, we show that the product of spatial and temporal bandlimits or sampling rates is constant, independent of velocity. Our second contribution is a novel sheared reconstruction filter that is aligned to the first-order direction of motion and enables even lower sampling rates. We present a rendering algorithm that computes a sheared reconstruction filter per pixel, without any intermediate Fourier representation. This often permits synthesis of motion-blurred images with far fewer rendering samples than standard techniques require.
72 citations
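The core observation above, that motion induces a shear in the frequency domain, can be checked numerically: a pattern translating at velocity v has its spatio-temporal spectrum concentrated on the line omega_t = -v * omega_x. A small self-contained check (the signal, frame count, and frequency are arbitrary choices, not from the paper):

```python
import numpy as np

# A 1D pattern translating at constant velocity v: f(x, t) = g(x - v * t).
# Its 2D spectrum concentrates on the sheared line omega_t = -v * omega_x.
n, v = 64, 3
x = np.arange(n)
frames = np.stack([np.sin(2 * np.pi * 5 * (x - v * t) / n) for t in range(n)])

spectrum = np.abs(np.fft.fftshift(np.fft.fft2(frames)))
wt, wx = np.unravel_index(np.argmax(spectrum), spectrum.shape)
print("peak at (omega_t, omega_x) =", wt - n // 2, wx - n // 2)  # ratio is about -v
```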
•
25 Jul 2018
TL;DR: This paper parameterizes foveated rendering by embedding polynomial kernel functions in the classic log-polar mapping, yielding a GPU-driven technique that uses closed-form, parameterized foveation to mimic the distribution of photoreceptors in the human retina.
Abstract: Foveated rendering coupled with eye-tracking has the potential to dramatically accelerate interactive 3D graphics with minimal loss of perceptual detail. In this paper, we parameterize foveated rendering by embedding polynomial kernel functions in the classic log-polar mapping. Our GPU-driven technique uses closed-form, parameterized foveation that mimics the distribution of photoreceptors in the human retina. We present a simple two-pass kernel foveated rendering (KFR) pipeline that maps well onto modern GPUs. In the first pass, we compute the kernel log-polar transformation and render to a reduced-resolution buffer. In the second pass, we carry out the inverse-log-polar transformation with anti-aliasing to map the reduced-resolution rendering to the full-resolution screen. We have carried out pilot and formal user studies to empirically identify the KFR parameters. We observe a 2.8X -- 3.2X speedup in rendering on 4K UHD (2160p) displays with minimal perceptual loss of detail. The relevance of eye-tracking-guided kernel foveated rendering can only increase as the anticipated rise of display resolution makes it ever more difficult to resolve the mutually conflicting goals of interactive rendering and perceptual realism.
65 citations
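The first pass of the KFR pipeline above is a kernel log-polar transform from screen space into a reduced-resolution buffer. The sketch below uses a simple power function as the kernel; the paper embeds polynomial kernel functions, so treat this as an illustrative stand-in rather than the published formulation.

```python
import numpy as np

def kernel_log_polar(x, y, fovea, screen_radius, buffer_w, buffer_h, alpha=4.0):
    """Forward pass of a kernel log-polar mapping: screen pixel -> coordinates
    in a reduced-resolution buffer. The kernel here is a power function with
    exponent alpha, an illustrative choice only."""
    dx, dy = x - fovea[0], y - fovea[1]
    r = np.hypot(dx, dy) / screen_radius           # normalized radius, roughly [0, 1]
    theta = np.arctan2(dy, dx)                     # angle about the gaze point
    log_r = np.log1p(r * (np.e - 1.0))             # normalized log radius in [0, 1]
    u = log_r ** (1.0 / alpha)                     # kernel: devote more buffer area to the fovea
    v = (theta / (2.0 * np.pi)) + 0.5              # angle normalized to [0, 1]
    return u * buffer_w, v * buffer_h
```

The second pass would invert this mapping with anti-aliasing to fill the full-resolution screen, which is omitted here.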
•
24 Jul 2014
TL;DR: In this paper, a proprioceptive user interface defines interface locations relative to body parts of an observer, including the head and the shoulders, and the observer may enable a process associated with an interface location by indicating an intersection of a physical object with that location.
Abstract: A display system renders a motion parallax view of object images based upon multiple observers. Also, a headset renders stereoscopic images that augment either physical objects viewed through the headset, or virtual objects projected by a stereoscopic display separate from the headset, or both. The headset includes a system for locating both physical objects and object images within a stereographic projection. Also, a proprioceptive user interface defines interface locations relative to body parts of an observer, the body parts including the head and the shoulders of the observer. The observer may enable a process associated with the interface location by indicating an intersection of a physical object, such as a wand or a finger of the observer, with the interface location.
45 citations
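The proprioceptive interface described above can be pictured as a set of named locations anchored to tracked body parts, triggered when a pointer (wand tip or fingertip) comes within range. The offsets, radii, and action names below are illustrative assumptions, not values from the patent.

```python
import numpy as np

def interface_locations(head, left_shoulder, right_shoulder, offset=0.25):
    """Define interface locations relative to tracked body parts
    (offsets in metres are illustrative)."""
    return {
        "menu": np.asarray(head) + np.array([0.0, -0.1, offset]),
        "undo": np.asarray(left_shoulder) + np.array([-offset, 0.0, 0.0]),
        "redo": np.asarray(right_shoulder) + np.array([offset, 0.0, 0.0]),
    }

def triggered(locations, pointer_tip, radius=0.05):
    """Return the interface locations the pointer currently intersects."""
    return [name for name, pos in locations.items()
            if np.linalg.norm(np.asarray(pointer_tip) - pos) <= radius]

locs = interface_locations(head=(0, 1.7, 0), left_shoulder=(-0.2, 1.5, 0),
                           right_shoulder=(0.2, 1.5, 0))
print(triggered(locs, pointer_tip=(0.45, 1.5, 0)))  # ['redo']
```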
•
31 Jan 2014
TL;DR: In this paper, a display control device is described that includes a display controller configured to place a virtual object within an augmented reality space corresponding to a real space in accordance with a recognition result of a real object shown in an image captured by an imaging part, and an operation acquisition part configured to acquire a user operation.
Abstract: There is provided a display control device including a display controller configured to place a virtual object within an augmented reality space corresponding to a real space in accordance with a recognition result of a real object shown in an image captured by an imaging part, and an operation acquisition part configured to acquire a user operation. When the user operation is a first operation, the display controller causes the virtual object to move within the augmented reality space.
44 citations
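A toy sketch of the behaviour described above: a controller anchors a virtual object to a recognized real object and moves it within the augmented reality space when the acquired user operation matches a "first operation". The gesture name and step size are assumptions made for illustration.

```python
class DisplayController:
    """Places a virtual object relative to a recognized real object and
    moves it within the augmented reality space on a matching user operation."""
    def __init__(self):
        self.virtual_object_pose = None

    def on_recognition(self, real_object_pose):
        # Anchor the virtual object to the recognized real object.
        self.virtual_object_pose = list(real_object_pose)

    def on_user_operation(self, operation, delta=(0.0, 0.0, 0.1)):
        # Treat "drag" as the first operation that moves the virtual object.
        if operation == "drag" and self.virtual_object_pose is not None:
            self.virtual_object_pose = [p + d for p, d in
                                        zip(self.virtual_object_pose, delta)]

controller = DisplayController()
controller.on_recognition((0.0, 0.0, 1.0))
controller.on_user_operation("drag")
print(controller.virtual_object_pose)  # [0.0, 0.0, 1.1]
```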