Author

Rohit Pandey

Bio: Rohit Pandey is an academic researcher from Google. The author has contributed to research in topics: Rendering (computer graphics) & Augmented reality. The author has an h-index of 17, co-authored 69 publications receiving 985 citations. Previous affiliations of Rohit Pandey include Graphic Era Hill University & University at Buffalo.

[Figure: papers published per year]

Papers
Journal ArticleDOI
TL;DR: Neural rendering is a new and rapidly emerging field that combines generative machine learning techniques with physical knowledge from computer graphics, e.g., by integrating differentiable rendering into network training.
Abstract: Efficient rendering of photo-realistic virtual worlds is a long-standing effort of computer graphics. Modern graphics techniques have succeeded in synthesizing photo-realistic images from hand-crafted scene representations. However, the automatic generation of shape, materials, lighting, and other aspects of scenes remains a challenging problem that, if solved, would make photo-realistic computer graphics more widely accessible. Concurrently, progress in computer vision and machine learning has given rise to a new approach to image synthesis and editing, namely deep generative models. Neural rendering is a new and rapidly emerging field that combines generative machine learning techniques with physical knowledge from computer graphics, e.g., by the integration of differentiable rendering into network training. With a plethora of applications in computer graphics and vision, neural rendering is poised to become a new area in the graphics community, yet no survey of this emerging field exists. This state-of-the-art report summarizes the recent trends and applications of neural rendering. We focus on approaches that combine classic computer graphics techniques with deep generative models to obtain controllable and photo-realistic outputs. Starting with an overview of the underlying computer graphics and machine learning concepts, we discuss critical aspects of neural rendering approaches. This state-of-the-art report is focused on the many important use cases for the described algorithms such as novel view synthesis, semantic photo manipulation, facial and body reenactment, relighting, free-viewpoint video, and the creation of photo-realistic avatars for virtual and augmented reality telepresence. Finally, we conclude with a discussion of the social implications of such technology and investigate open research problems.

190 citations
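
The survey's central mechanism, integrating a differentiable renderer into network training so that an image loss can update scene parameters, can be shown with a minimal sketch. Everything below (the render stand-in, the tensor shapes, the learning rate) is a hypothetical toy, not the survey's code:

    import torch

    scene_params = torch.randn(64, requires_grad=True)   # learnable scene representation
    optimizer = torch.optim.Adam([scene_params], lr=1e-3)

    def render(params):
        # Stand-in for a differentiable renderer: any mapping from scene
        # parameters to an image through which gradients can flow.
        return torch.sigmoid(params).reshape(8, 8)

    target_image = torch.rand(8, 8)                      # observed photo (dummy data)

    for step in range(100):
        optimizer.zero_grad()
        image = render(scene_params)
        loss = torch.nn.functional.mse_loss(image, target_image)
        loss.backward()      # gradients flow back through the renderer
        optimizer.step()     # scene parameters updated from an image-space loss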

Proceedings ArticleDOI
15 Jun 2019
TL;DR: Starting from Internet photos of a tourist landmark, this work applies traditional 3D reconstruction to register the photos and approximate the scene as a point cloud, then trains a deep neural network to map initial renderings of those points to the actual photos.
Abstract: We explore total scene capture --- recording, modeling, and rerendering a scene under varying appearance such as season and time of day. Starting from Internet photos of a tourist landmark, we apply traditional 3D reconstruction to register the photos and approximate the scene as a point cloud. For each photo, we render the scene points into a deep framebuffer, and train a deep neural network to learn the mapping of these initial renderings to the actual photos. This rerendering network also takes as input a latent appearance vector and a semantic mask indicating the location of transient objects like pedestrians. The model is evaluated on several datasets of publicly available images spanning a broad range of illumination conditions. We create short videos that demonstrate realistic manipulation of the image viewpoint, appearance, and semantic labels. We also compare results to prior work on scene reconstruction from Internet photos.

181 citations
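
A minimal sketch of how the rerendering network's inputs described above might be assembled: a deep framebuffer rendered from the point cloud, a latent appearance vector broadcast across the image, and a semantic mask marking transient objects. All tensor shapes, channel counts, and the toy network are illustrative assumptions, not the paper's implementation:

    import torch
    import torch.nn as nn

    B, H, W = 1, 256, 256
    framebuffer = torch.rand(B, 8, H, W)       # e.g. per-pixel albedo, depth, normals
    appearance = torch.randn(B, 16)            # latent appearance (season, time of day)
    semantic_mask = torch.rand(B, 1, H, W)     # transient objects such as pedestrians

    # Broadcast the appearance code over the image plane and stack all inputs.
    appearance_map = appearance[:, :, None, None].expand(B, 16, H, W)
    net_input = torch.cat([framebuffer, appearance_map, semantic_mask], dim=1)

    rerender = nn.Sequential(                  # stand-in for the rerendering network
        nn.Conv2d(8 + 16 + 1, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid(),
    )
    photo = rerender(net_input)                # (B, 3, H, W) predicted photo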

Proceedings ArticleDOI
09 Aug 2021
TL;DR: A presentation on loss functions for neural rendering, given by Jun-Yan Zhu.
Abstract: Loss functions for neural rendering (presented by Jun-Yan Zhu).

174 citations

Posted Content
TL;DR: This state-of-the-art report summarizes the recent trends and applications of neural rendering and focuses on approaches that combine classic computer graphics techniques with deep generative models to obtain controllable and photo-realistic outputs.
Abstract: Identical to the abstract of the journal version of this state-of-the-art report, listed above.

149 citations

Journal ArticleDOI
TL;DR: Real-time performance capture systems are augmented with a deep architecture that takes a rendering from an arbitrary viewpoint and jointly performs completion, super resolution, and denoising of the imagery in real time.
Abstract: Motivated by augmented and virtual reality applications such as telepresence, there has been a recent focus on real-time performance capture of humans under motion. However, given the real-time constraint, these systems often suffer from artifacts in geometry and texture such as holes and noise in the final rendering, poor lighting, and low-resolution textures. We take the novel approach to augment such real-time performance capture systems with a deep architecture that takes a rendering from an arbitrary viewpoint, and jointly performs completion, super resolution, and denoising of the imagery in real-time. We call this approach neural (re-)rendering, and our live system "LookinGood". Our deep architecture is trained to produce high resolution and high quality images from a coarse rendering in real-time. First, we propose a self-supervised training method that does not require manual ground-truth annotation. We contribute a specialized reconstruction error that uses semantic information to focus on relevant parts of the subject, e.g. the face. We also introduce a salient reweighing scheme of the loss function that is able to discard outliers. We specifically design the system for virtual and augmented reality headsets where the consistency between the left and right eye plays a crucial role in the final user experience. Finally, we generate temporally stable results by explicitly minimizing the difference between two consecutive frames. We tested the proposed system in two different scenarios: the first involving a single RGB-D sensor and upper-body reconstruction of an actor, the second consisting of full-body 360° capture. Through extensive experimentation, we demonstrate how our system generalizes across unseen sequences and subjects.

125 citations
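
Two loss ideas named in the abstract lend themselves to a short sketch: a reconstruction error reweighted by semantic information (e.g., up-weighting the face) and a temporal term penalizing the difference between consecutive predicted frames. The L1 form, the weight values, and all names below are assumptions rather than the paper's exact losses:

    import torch
    import torch.nn.functional as F

    def semantic_weighted_loss(pred, target, face_mask, face_weight=5.0):
        # Per-pixel L1 error, up-weighted where the semantic mask marks the face.
        weights = 1.0 + (face_weight - 1.0) * face_mask
        return (weights * (pred - target).abs()).mean()

    def temporal_loss(frame_t, frame_t1):
        # Penalize change between two consecutive predicted frames.
        return F.l1_loss(frame_t, frame_t1)

    pred_t = torch.rand(1, 3, 128, 128)
    pred_t1 = torch.rand(1, 3, 128, 128)
    target = torch.rand(1, 3, 128, 128)
    mask = torch.zeros(1, 1, 128, 128)
    mask[..., 32:64, 48:80] = 1.0            # dummy face region

    loss = semantic_weighted_loss(pred_t, target, mask) + 0.1 * temporal_loss(pred_t, pred_t1)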


Cited by
Reference EntryDOI
15 Oct 2004

2,118 citations

Proceedings Article
01 Jan 1999

2,010 citations

Book
01 Dec 1988
TL;DR: A method is presented for obtaining the spectral energy distribution of the light reflected from an object made of a specific real material, together with a procedure for accurately reproducing the color associated with that spectrum.
Abstract: This paper presents a new reflectance model for rendering computer synthesized images. The model accounts for the relative brightness of different materials and light sources in the same scene. It describes the directional distribution of the reflected light and a color shift that occurs as the reflectance changes with incidence angle. The paper presents a method for obtaining the spectral energy distribution of the light reflected from an object made of a specific real material and discusses a procedure for accurately reproducing the color associated with the spectral energy distribution. The model is applied to the simulation of a metal and a plastic.

1,401 citations
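
The reflectance model described above (widely known as the Cook-Torrance model) combines a facet-slope distribution D, a geometric attenuation term G, and a Fresnel term F, which produces the color shift with incidence angle. A minimal numeric sketch follows; the Beckmann distribution, the Schlick Fresnel approximation, and the sample vectors are common stand-ins, not necessarily the paper's exact formulation:

    import numpy as np

    def cook_torrance_specular(n, l, v, roughness=0.3, f0=0.04):
        # n, l, v: unit surface normal, light direction, and view direction.
        h = l + v
        h = h / np.linalg.norm(h)                    # half vector
        nh = max(np.dot(n, h), 1e-6)
        nl = max(np.dot(n, l), 1e-6)
        nv = max(np.dot(n, v), 1e-6)
        vh = max(np.dot(v, h), 1e-6)
        m2 = roughness ** 2
        # D: Beckmann facet-slope distribution (directional spread of facets).
        d = np.exp((nh * nh - 1.0) / (m2 * nh * nh)) / (np.pi * m2 * nh ** 4)
        # G: geometric attenuation from facet masking and shadowing.
        g = min(1.0, 2.0 * nh * nv / vh, 2.0 * nh * nl / vh)
        # F: Schlick approximation to Fresnel reflectance, which drives the
        # color shift as the incidence angle grazes the surface.
        f = f0 + (1.0 - f0) * (1.0 - vh) ** 5
        # The paper normalizes by pi; many modern variants use 4 instead.
        return (f * d * g) / (np.pi * nl * nv)

    n = np.array([0.0, 0.0, 1.0])
    l = np.array([0.0, 0.6, 0.8])                    # already unit length
    v = np.array([0.5, 0.0, 0.8]) / np.linalg.norm([0.5, 0.0, 0.8])
    print(cook_torrance_specular(n, l, v))           # scalar specular response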

Proceedings Article
04 Jun 2019
TL;DR: Scene Representation Networks (SRNs), a continuous, 3D-structure-aware scene representation that encodes both geometry and appearance, are proposed and demonstrated on novel view synthesis, few-shot reconstruction, joint shape and appearance interpolation, and unsupervised discovery of a non-rigid face model.
Abstract: Unsupervised learning with generative models has the potential of discovering rich representations of 3D scenes. While geometric deep learning has explored 3D-structure-aware representations of scene geometry, these models typically require explicit 3D supervision. Emerging neural scene representations can be trained only with posed 2D images, but existing methods ignore the three-dimensional structure of scenes. We propose Scene Representation Networks (SRNs), a continuous, 3D-structure-aware scene representation that encodes both geometry and appearance. SRNs represent scenes as continuous functions that map world coordinates to a feature representation of local scene properties. By formulating the image formation as a differentiable ray-marching algorithm, SRNs can be trained end-to-end from only 2D images and their camera poses, without access to depth or shape. This formulation naturally generalizes across scenes, learning powerful geometry and appearance priors in the process. We demonstrate the potential of SRNs by evaluating them for novel view synthesis, few-shot reconstruction, joint shape and appearance interpolation, and unsupervised discovery of a non-rigid face model.

832 citations
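
The core construction in the abstract, a continuous function from world coordinates to local scene features queried by a differentiable ray marcher, can be sketched compactly. The MLP sizes, the sigmoid step rule, and the fixed step count below are illustrative simplifications (the actual SRN ray marcher is learned differently), so treat this as a sketch of the idea rather than the paper's architecture:

    import torch
    import torch.nn as nn

    scene_fn = nn.Sequential(                # world coordinate -> feature vector
        nn.Linear(3, 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
    )
    step_fn = nn.Linear(64, 1)               # feature -> next step length
    rgb_fn = nn.Linear(64, 3)                # feature -> color at the surface

    def march(ray_origin, ray_dir, n_steps=10):
        # Differentiable ray marching: every step depends smoothly on the
        # features, so gradients from a 2D image loss reach scene_fn.
        t = torch.zeros(ray_origin.shape[0], 1)
        for _ in range(n_steps):
            x = ray_origin + t * ray_dir
            feat = scene_fn(x)
            t = t + torch.sigmoid(step_fn(feat))   # always step forward
        return torch.sigmoid(rgb_fn(scene_fn(ray_origin + t * ray_dir)))

    rays_o = torch.zeros(4, 3)                      # dummy batch of 4 rays
    rays_d = torch.tensor([[0.0, 0.0, 1.0]]).expand(4, 3)
    colors = march(rays_o, rays_d)                  # (4, 3), trainable end-to-end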

Proceedings ArticleDOI
01 Oct 2019
TL;DR: This paper presents a simple method for “do as I do” motion transfer: given a source video of a person dancing, that performance can be transferred to a novel (amateur) target after only a few minutes of the target subject performing standard moves.
Abstract: This paper presents a simple method for “do as I do” motion transfer: given a source video of a person dancing, we can transfer that performance to a novel (amateur) target after only a few minutes of the target subject performing standard moves. We approach this problem as video-to-video translation using pose as an intermediate representation. To transfer the motion, we extract poses from the source subject and apply the learned pose-to-appearance mapping to generate the target subject. We predict two consecutive frames for temporally coherent video results and introduce a separate pipeline for realistic face synthesis. Although our method is quite simple, it produces surprisingly compelling results (see video). This motivates us to also provide a forensics tool for reliable synthetic content detection, which is able to distinguish videos synthesized by our system from real data. In addition, we release a first-of-its-kind open-source dataset of videos that can be legally used for training and motion transfer.

585 citations
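
The pipeline described above, with pose as the intermediate representation between source and target, reduces to two stages that a short sketch can show: extract poses from the source frames, then map each pose to the target subject's appearance. Both pose_estimator and generator below are hypothetical stand-ins for the components named in the abstract; face synthesis and the two-frame temporal prediction are omitted:

    import torch
    import torch.nn as nn

    pose_estimator = nn.Conv2d(3, 18, 1)      # stand-in: frame -> 18 keypoint heatmaps
    generator = nn.Sequential(                # stand-in: pose -> target appearance
        nn.Conv2d(18, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),
    )

    def transfer(source_frames):
        outputs = []
        for frame in source_frames:           # frame: (1, 3, H, W)
            pose = pose_estimator(frame)      # extract pose from the source subject
            outputs.append(generator(pose))   # re-render that pose as the target
        return outputs

    dummy_video = [torch.rand(1, 3, 64, 64) for _ in range(3)]
    target_video = transfer(dummy_video)      # 3 frames of the target subject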