Author

Vineet Thumuluri

Bio: Vineet Thumuluri is an academic researcher from the Indian Institute of Technology Madras. The author has contributed to research in topics: Deep learning & Rendering (computer graphics). The author has an h-index of 1 and has co-authored 1 publication receiving 1 citation.

Papers
Proceedings Article
15 Dec 2020
TL;DR: In this paper, an end-to-end convolutional neural network was designed to perform both foveated reconstruction and view synthesis using only 1.2% of the total light field data.
Abstract: Near-eye light field displays provide a solution to visual discomfort when using head-mounted displays (HMDs) by presenting accurate depth and focal cues. However, light field HMDs require rendering the scene from a large number of viewpoints. This paper tackles the computational challenge of rendering sharp imagery in the foveal region and reproducing retinal defocus blur that correctly drives accommodation. We designed a novel end-to-end convolutional neural network that leverages human vision to perform both foveated reconstruction and view synthesis using only 1.2% of the total light field data. The proposed architecture comprises a log-polar sampling scheme followed by an interpolation stage and a convolutional neural network. To the best of our knowledge, this is the first attempt to synthesize the entire light field from sparse RGB-D inputs while simultaneously addressing foveated rendering for computational displays. Our algorithm achieves fidelity in the fovea without any perceptible artifacts in the peripheral regions. The performance in the fovea is comparable to state-of-the-art view synthesis methods, despite using around 10x less light field data.
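The log-polar sampling stage named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `log_polar_samples`, the ring and angle counts, and the minimum radius are all hypothetical choices made for illustration. It only shows the general idea that sample radii grow geometrically away from the gaze point, so the fovea is sampled densely and the periphery sparsely.

```python
import math

def log_polar_samples(width, height, gaze, n_rings=16, n_angles=32, r_min=1.0):
    """Return unique pixel coordinates on a log-polar grid centred on `gaze`.

    All parameters are illustrative assumptions, not values from the paper.
    """
    gx, gy = gaze
    # Largest radius needed to reach the image corner farthest from the gaze point.
    r_max = max(math.hypot(cx - gx, cy - gy)
                for cx in (0, width - 1) for cy in (0, height - 1))
    samples = set()
    for i in range(n_rings):
        # Radii grow geometrically: rings are close together near the gaze
        # and far apart in the periphery.
        r = r_min * (r_max / r_min) ** (i / (n_rings - 1))
        for j in range(n_angles):
            theta = 2.0 * math.pi * j / n_angles
            x = round(gx + r * math.cos(theta))
            y = round(gy + r * math.sin(theta))
            # Keep only positions that fall inside the image.
            if 0 <= x < width and 0 <= y < height:
                samples.add((x, y))
    return sorted(samples)

pts = log_polar_samples(256, 256, gaze=(128, 128))
fraction = len(pts) / (256 * 256)
print(f"{len(pts)} samples, {fraction:.2%} of pixels")
```

In a pipeline like the one the abstract describes, the sparse samples would then be passed to an interpolation stage and a neural network; those stages are not sketched here.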

2 citations


Cited by
Journal Article
TL;DR: Foveated rendering, as discussed in this paper, adapts the image synthesis process to the user's gaze. By exploiting the human visual system's limitations, in particular reduced acuity in peripheral vision, it strives to deliver high-quality visual experiences at greatly reduced computational, storage, and transmission costs.

6 citations

Journal Article
TL;DR: This paper revisits the depth estimation problem, avoiding the explicit stereo matching step by using a simple two-tower convolutional neural network. The proposed algorithm, 2T-UNet, surpasses state-of-the-art monocular and stereo depth estimation methods on the challenging Scene flow dataset.
Abstract: Stereo correspondence matching is an essential part of the multi-step stereo depth estimation process. This paper revisits the depth estimation problem, avoiding the explicit stereo matching step by using a simple two-tower convolutional neural network. The proposed algorithm is called 2T-UNet. The idea behind 2T-UNet is to replace cost volume construction with twin convolution towers. These towers are allowed to have different weights. Additionally, the inputs to the twin encoders in 2T-UNet differ from those of existing stereo methods. Generally, a stereo network takes a right and left image pair as input to determine the scene geometry. In the 2T-UNet model, however, the right stereo image is taken as one input, and the left stereo image, along with its monocular depth clue information, is taken as the other. Depth clues provide complementary suggestions that help enhance the quality of the predicted scene geometry. 2T-UNet surpasses state-of-the-art monocular and stereo depth estimation methods on the challenging Scene flow dataset, both quantitatively and qualitatively. The architecture performs well on complex natural scenes, highlighting its usefulness for various real-time applications. Pretrained weights and code will be made readily available.