Author

Maria Kolos

Other affiliations: Skolkovo Institute of Science and Technology

Bio: Maria Kolos is an academic researcher at Samsung who has contributed to research on rendering (computer graphics) and point clouds. The author has an h-index of 3 and has co-authored 6 publications receiving 148 citations. Previous affiliations of Maria Kolos include the Skolkovo Institute of Science and Technology.

Papers

Posted Content

Neural Point-Based Graphics

Kara-Ali Aliev¹, Artem Sevastopolsky¹, Maria Kolos¹, Dmitry Ulyanov, Victor Lempitsky¹

Samsung¹

19 Jun 2019-arXiv: Computer Vision and Pattern Recognition

TL;DR: Each point of a raw point cloud is augmented with a learnable neural descriptor, and a deep rendering network is learned in parallel with these descriptors, so that new views of the scene can be obtained by passing rasterizations of the point cloud from new viewpoints through this network.

Abstract: We present a new point-based approach for modeling the appearance of real scenes. The approach uses a raw point cloud as the geometric representation of a scene, and augments each point with a learnable neural descriptor that encodes local geometry and appearance. A deep rendering network is learned in parallel with the descriptors, so that new views of the scene can be obtained by passing the rasterizations of a point cloud from new viewpoints through this network. The input rasterizations use the learned descriptors as point pseudo-colors. We show that the proposed approach can be used for modeling complex scenes and obtaining their photorealistic views, while avoiding explicit surface estimation and meshing. In particular, compelling results are obtained for scenes scanned using hand-held commodity RGB-D sensors as well as standard RGB cameras even in the presence of objects that are challenging for standard mesh-based modeling.

161 citations
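
The rasterization step described in the abstract above, where per-point descriptors act as pseudo-colors, can be illustrated with a minimal sketch. It assumes a pinhole camera, one-pixel splats, and a simple z-buffer; the function name `rasterize_points` and all parameters are hypothetical and are not taken from the authors' code. In the described method the resulting multi-channel image would then be passed through the rendering network, which is not shown here.

```python
import numpy as np

def rasterize_points(points, descriptors, K, R, t, height, width):
    """Project 3D points into an image and keep the nearest point per pixel.

    points:      (N, 3) world-space coordinates
    descriptors: (N, D) per-point descriptors (learnable in the method; fixed here)
    K:           (3, 3) pinhole intrinsics (assumed camera model)
    R, t:        world-to-camera rotation (3, 3) and translation (3,)
    Returns a (height, width, D) image; pixels hit by no point stay zero.
    """
    cam = points @ R.T + t                      # world -> camera coordinates
    in_front = cam[:, 2] > 1e-6                 # discard points behind the camera
    cam, desc = cam[in_front], descriptors[in_front]

    proj = cam @ K.T                            # perspective projection
    u = np.round(proj[:, 0] / proj[:, 2]).astype(int)
    v = np.round(proj[:, 1] / proj[:, 2]).astype(int)
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, z, desc = u[inside], v[inside], cam[inside, 2], desc[inside]

    image = np.zeros((height, width, descriptors.shape[1]), dtype=float)
    depth = np.full((height, width), np.inf)    # z-buffer
    for ui, vi, zi, di in zip(u, v, z, desc):
        if zi < depth[vi, ui]:                  # keep only the closest point
            depth[vi, ui] = zi
            image[vi, ui] = di                  # descriptor used as pseudo-color
    return image
```

The explicit loop is slow but makes the depth test unambiguous; a practical implementation would likely run this on the GPU.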

Book Chapter

Neural Point-Based Graphics

Kara-Ali Aliev¹, Artem Sevastopolsky¹, Maria Kolos¹, Dmitry Ulyanov, Victor Lempitsky¹

Samsung¹

23 Aug 2020

TL;DR: This work presents a new point-based approach for modeling the appearance of real scenes that uses a raw point cloud as the geometric representation of a scene, and augments each point with a learnable neural descriptor that encodes local geometry and appearance.

Abstract: We present a new point-based approach for modeling the appearance of real scenes. The approach uses a raw point cloud as the geometric representation of a scene, and augments each point with a learnable neural descriptor that encodes local geometry and appearance. A deep rendering network is learned in parallel with the descriptors, so that new views of the scene can be obtained by passing the rasterizations of a point cloud from new viewpoints through this network. The input rasterizations use the learned descriptors as point pseudo-colors. We show that the proposed approach can be used for modeling complex scenes and obtaining their photorealistic views, while avoiding explicit surface estimation and meshing. In particular, compelling results are obtained for scenes scanned using hand-held commodity RGB-D sensors as well as standard RGB cameras even in the presence of objects that are challenging for standard mesh-based modeling.

80 citations

Book Chapter

Procedural Synthesis of Remote Sensing Images for Robust Change Detection with Neural Networks

Maria Kolos¹, Anton Marin¹, Alexey Artemov¹, Evgeny Burnaev¹

Skolkovo Institute of Science and Technology¹

10 Jul 2019

TL;DR: In this article, the authors propose a method for creating realistic targeted synthetic datasets in the remote sensing domain, leveraging the opportunities offered by game development engines, and provide a description of the pipeline for procedural geometry generation and rendering as well as an evaluation of the efficiency of produced datasets in a change detection scenario.

Abstract: Data-driven methods such as convolutional neural networks (CNNs) are known to deliver state-of-the-art performance on image recognition tasks when the training data are abundant. However, in some instances, such as change detection in remote sensing images, annotated data cannot be obtained in sufficient quantities. In this work, we propose a simple and efficient method for creating realistic targeted synthetic datasets in the remote sensing domain, leveraging the opportunities offered by game development engines. We provide a description of the pipeline for procedural geometry generation and rendering as well as an evaluation of the efficiency of produced datasets in a change detection scenario. Our evaluations demonstrate that our pipeline helps to improve the performance and convergence of deep learning models when the amount of real-world data is severely limited.

11 citations

Posted Content

TRANSPR: Transparency Ray-Accumulating Neural 3D Scene Point Renderer

Maria Kolos¹, Artem Sevastopolsky¹, Victor Lempitsky¹

Samsung¹

06 Sep 2020-arXiv: Computer Vision and Pattern Recognition

TL;DR: A neural point-based graphics method that can model semi-transparent scene parts; it uses point clouds to model proxy geometry and augments each point with a neural descriptor and a learnable transparency value.

Abstract: We propose and evaluate a neural point-based graphics method that can model semi-transparent scene parts. Similarly to its predecessor pipeline, ours uses point clouds to model proxy geometry, and augments each point with a neural descriptor. Additionally, a learnable transparency value is introduced in our approach for each point. Our neural rendering procedure consists of two steps. Firstly, the point cloud is rasterized using ray grouping into a multi-channel image. This is followed by the neural rendering step that "translates" the rasterized image into an RGB output using a learnable convolutional network. New scenes can be modeled using gradient-based optimization of neural descriptors and of the rendering network. We show that novel views of semi-transparent point cloud scenes can be generated after training with our approach. Our experiments demonstrate the benefit of introducing semi-transparency into the neural point-based modeling for a range of scenes with semi-transparent parts.

9 citations

Proceedings Article

TRANSPR: Transparency Ray-Accumulating Neural 3D Scene Point Renderer

Maria Kolos¹, Artem Sevastopolsky¹, Victor Lempitsky¹

Samsung¹

06 Sep 2020

TL;DR: A neural point-based graphics method that can model semi-transparent scene parts is proposed and evaluated; it uses point clouds to model proxy geometry and augments each point with a neural descriptor.

Abstract: We propose and evaluate a neural point-based graphics method that can model semi-transparent scene parts. Similarly to its predecessor pipeline, ours uses point clouds to model proxy geometry, and augments each point with a neural descriptor. Additionally, a learnable transparency value is introduced in our approach for each point. Our neural rendering procedure consists of two steps. Firstly, the point cloud is rasterized using ray grouping into a multi-channel image. This is followed by the neural rendering step that “translates” the rasterized image into an RGB output using a learnable convolutional network. New scenes can be modeled using gradient-based optimization of neural descriptors and of the rendering network. We show that novel views of semi-transparent point cloud scenes can be generated after training with our approach. Our experiments demonstrate the benefit of introducing semi-transparency into the neural point-based modeling for a range of scenes with semi-transparent parts. The project materials and the code are available at http://saic-violet.github.io/transpr.

8 citations

Cited by

Posted Content

NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections

Ricardo Martin-Brualla¹, Noha Radwan¹, Mehdi S. M. Sajjadi¹, Jonathan T. Barron¹, Alexey Dosovitskiy¹, Daniel Duckworth¹

Google¹

05 Aug 2020-arXiv: Computer Vision and Pattern Recognition

TL;DR: A learning-based method, NeRF-W, synthesizes novel views of complex scenes using only unstructured collections of in-the-wild photographs; applied to internet photo collections of famous landmarks, it produces temporally consistent novel view renderings that are significantly closer to photorealism than the prior state of the art.

Abstract: We present a learning-based method for synthesizing novel views of complex scenes using only unstructured collections of in-the-wild photographs. We build on Neural Radiance Fields (NeRF), which uses the weights of a multilayer perceptron to model the density and color of a scene as a function of 3D coordinates. While NeRF works well on images of static subjects captured under controlled settings, it is incapable of modeling many ubiquitous, real-world phenomena in uncontrolled images, such as variable illumination or transient occluders. We introduce a series of extensions to NeRF to address these issues, thereby enabling accurate reconstructions from unstructured image collections taken from the internet. We apply our system, dubbed NeRF-W, to internet photo collections of famous landmarks, and demonstrate temporally consistent novel view renderings that are significantly closer to photorealism than the prior state of the art.

476 citations

Posted Content

Neural Sparse Voxel Fields

Lingjie Liu¹, Jiatao Gu¹, Kyaw Zaw Lin², Tat-Seng Chua³, Christian Theobalt³

Max Planck Society¹, Facebook², National University of Singapore³

22 Jul 2020-arXiv: Computer Vision and Pattern Recognition

TL;DR: This work introduces Neural Sparse Voxel Fields (NSVF), a new neural scene representation for fast and high-quality free-viewpoint rendering that is over 10 times faster than the state-of-the-art (namely, NeRF) at inference time while achieving higher quality results.

Abstract: Photo-realistic free-viewpoint rendering of real-world scenes using classical computer graphics techniques is challenging, because it requires the difficult step of capturing detailed appearance and geometry models. Recent studies have demonstrated promising results by learning scene representations that implicitly encode both geometry and appearance without 3D supervision. However, existing approaches in practice often show blurry renderings caused by the limited network capacity or the difficulty in finding accurate intersections of camera rays with the scene geometry. Synthesizing high-resolution imagery from these representations often requires time-consuming optical ray marching. In this work, we introduce Neural Sparse Voxel Fields (NSVF), a new neural scene representation for fast and high-quality free-viewpoint rendering. NSVF defines a set of voxel-bounded implicit fields organized in a sparse voxel octree to model local properties in each cell. We progressively learn the underlying voxel structures with a differentiable ray-marching operation from only a set of posed RGB images. With the sparse voxel octree structure, rendering novel views can be accelerated by skipping the voxels containing no relevant scene content. Our method is typically over 10 times faster than the state-of-the-art (namely, NeRF (Mildenhall et al., 2020)) at inference time while achieving higher quality results. Furthermore, by utilizing an explicit sparse voxel representation, our method can easily be applied to scene editing and scene composition. We also demonstrate several challenging tasks, including multi-scene learning, free-viewpoint rendering of a moving human, and large-scale scene rendering. Code and data are available at our website: this https URL.

405 citations

Proceedings Article

IBRNet: Learning Multi-View Image-Based Rendering

Qianqian Wang¹, Zhicheng Wang¹, Kyle Genova¹, Pratul P. Srinivasan¹, Howard Zhou¹, Jonathan T. Barron¹, Ricardo Martin-Brualla¹, Noah Snavely¹, Thomas Funkhouser¹

Google¹

20 Jun 2021

TL;DR: A method that synthesizes novel views of complex scenes by interpolating a sparse set of nearby views using a network architecture that includes a multilayer perceptron and a ray transformer that estimates radiance and volume density at continuous 5D locations.

Abstract: We present a method that synthesizes novel views of complex scenes by interpolating a sparse set of nearby views. The core of our method is a network architecture that includes a multilayer perceptron and a ray transformer that estimates radiance and volume density at continuous 5D locations (3D spatial locations and 2D viewing directions), drawing appearance information on the fly from multiple source views. By drawing on source views at render time, our method hearkens back to classic work on image-based rendering (IBR), and allows us to render high-resolution imagery. Unlike neural scene representation work that optimizes per-scene functions for rendering, we learn a generic view interpolation function that generalizes to novel scenes. We render images using classic volume rendering, which is fully differentiable and allows us to train using only multi-view posed images as supervision. Experiments show that our method outperforms recent novel view synthesis methods that also seek to generalize to novel scenes. Further, if fine-tuned on each scene, our method is competitive with state-of-the-art single-scene neural rendering methods.

402 citations
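
The "classic volume rendering" mentioned in the abstract above is commonly discretized by alpha-compositing density and radiance samples along each camera ray. The sketch below shows that standard quadrature for a single ray, assuming the densities and colors have already been estimated by some model; the function name `composite_ray` and its arguments are hypothetical, and the paper's own implementation may differ in details.

```python
import numpy as np

def composite_ray(sigmas, colors, deltas):
    """Alpha-composite samples along one ray (front to back).

    sigmas: (S,)   volume density at each sample
    colors: (S, 3) radiance (RGB) at each sample
    deltas: (S,)   distance between consecutive samples
    Returns the rendered RGB value and the per-sample weights.
    """
    # Opacity contributed by each segment of the ray.
    alphas = 1.0 - np.exp(-sigmas * deltas)
    # Transmittance: probability that the ray reaches sample i unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = trans * alphas
    rgb = (weights[:, None] * colors).sum(axis=0)
    return rgb, weights
```

Every operation here is differentiable with respect to the densities and colors, which is why a photometric loss on the rendered pixels can supervise the model that produces them.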

Proceedings Article

Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans

Sida Peng¹, Yuanqing Zhang¹, Yinghao Xu², Qianqian Wang³, Qing Shuai¹, Hujun Bao¹, Xiaowei Zhou¹

Zhejiang University¹, The Chinese University of Hong Kong², Cornell University³

20 Jun 2021

TL;DR: In this paper, the authors propose Neural Body, a new human body representation which assumes that learned neural representations at different frames share the same set of latent codes anchored to a deformable mesh, so that the observations across frames can be naturally integrated.

Abstract: This paper addresses the challenge of novel view synthesis for a human performer from a very sparse set of camera views. Some recent works have shown that learning implicit neural representations of 3D scenes achieves remarkable view synthesis quality given dense input views. However, the representation learning will be ill-posed if the views are highly sparse. To solve this ill-posed problem, our key idea is to integrate observations over video frames. To this end, we propose Neural Body, a new human body representation which assumes that the learned neural representations at different frames share the same set of latent codes anchored to a deformable mesh, so that the observations across frames can be naturally integrated. The deformable mesh also provides geometric guidance for the network to learn 3D representations more efficiently. To evaluate our approach, we create a multi-view dataset named ZJU-MoCap that captures performers with complex motions. Experiments on ZJU-MoCap show that our approach outperforms prior works by a large margin in terms of novel view synthesis quality. We also demonstrate the capability of our approach to reconstruct a moving person from a monocular video on the People-Snapshot dataset.

364 citations

Proceedings Article

SynSin: End-to-End View Synthesis From a Single Image

Olivia Wiles¹, Georgia Gkioxari², Richard Szeliski², Justin Johnson³

University of Oxford¹, Facebook², University of Michigan³

14 Jun 2020

TL;DR: This work proposes a novel differentiable point cloud renderer that is used to transform a latent 3D point cloud of features into the target view and outperforms baselines and prior work on the Matterport, Replica, and RealEstate10K datasets.

Abstract: View synthesis allows for the generation of new views of a scene given one or more images. This is challenging; it requires comprehensively understanding the 3D scene from images. As a result, current methods typically use multiple images, train on ground-truth depth, or are limited to synthetic data. We propose a novel end-to-end model for this task using a single image at test time; it is trained on real images without any ground-truth 3D information. To this end, we introduce a novel differentiable point cloud renderer that is used to transform a latent 3D point cloud of features into the target view. The projected features are decoded by our refinement network to inpaint missing regions and generate a realistic output image. The 3D component inside of our generative model allows for interpretable manipulation of the latent feature space at test time, e.g. we can animate trajectories from a single image. Additionally, we can generate high resolution images and generalise to other input resolutions. We outperform baselines and prior work on the Matterport, Replica, and RealEstate10K datasets.

298 citations
