Home
/
Authors
/
Dushyant Rao

Author

Dushyant Rao

Other affiliations: University of Sydney, University of Illinois at Urbana–Champaign, AmeriCorps VISTA ...read more

Bio: Dushyant Rao is an academic researcher from Carnegie Mellon University. The author has contributed to research in topics: Computer science & Reinforcement learning. The author has an hindex of 16, co-authored 33 publications receiving 1952 citations. Previous affiliations of Dushyant Rao include University of Sydney & University of Illinois at Urbana–Champaign.

Papers

PDF

Open Access

More filters

Posted Content•

Meta-Learning with Latent Embedding Optimization

[...]

Andrei Rusu¹, Dushyant Rao², Jakub Sygnowski¹, Oriol Vinyals¹, Razvan Pascanu¹, Simon Osindero¹, Raia Hadsell¹ - Show less +3 more•Institutions (2)

Google¹, Carnegie Mellon University²

16 Jul 2018-arXiv: Learning

TL;DR: In this article, a data-dependent latent generative representation of model parameters is learned and a gradient-based meta-learning is performed in a low-dimensional latent space for few-shot learning.

...read moreread less

Abstract: Gradient-based meta-learning techniques are both widely applicable and proficient at solving challenging few-shot learning and fast adaptation problems. However, they have practical difficulties when operating on high-dimensional parameter spaces in extreme low-data regimes. We show that it is possible to bypass these limitations by learning a data-dependent latent generative representation of model parameters, and performing gradient-based meta-learning in this low-dimensional latent space. The resulting approach, latent embedding optimization (LEO), decouples the gradient-based adaptation procedure from the underlying high-dimensional space of model parameters. Our evaluation shows that LEO can achieve state-of-the-art performance on the competitive miniImageNet and tieredImageNet few-shot classification tasks. Further analysis indicates LEO is able to capture uncertainty in the data, and can perform adaptation more effectively by optimizing in latent space.

...read moreread less

807 citations

Proceedings Article•

Meta-Learning with Latent Embedding Optimization

[...]

Andrei Rusu¹, Dushyant Rao², Jakub Sygnowski¹, Oriol Vinyals¹, Razvan Pascanu¹, Simon Osindero¹, Raia Hadsell¹ - Show less +3 more•Institutions (2)

Google¹, Carnegie Mellon University²

27 Sep 2018

TL;DR: In this paper, a data-dependent latent generative representation of model parameters is learned and a gradient-based meta-learning is performed in a low-dimensional latent space for few-shot learning.

...read moreread less

447 citations

Proceedings Article•DOI•

Vote3Deep: Fast object detection in 3D point clouds using efficient convolutional neural networks

[...]

Martin Engelcke¹, Dushyant Rao¹, Dominic Zeng Wang¹, Chi Hay Tong¹, Ingmar Posner¹ - Show less +1 more•Institutions (1)

University of Oxford¹

01 May 2017

TL;DR: VoteSDeep as mentioned in this paper leverages a feature-centric voting scheme to implement novel convolutional layers which explicitly exploit the sparsity encountered in the input, and additionally proposes to use an L 1 penalty on the filter activations to further encourage sparsity in the intermediate representations.

...read moreread less

Abstract: This paper proposes a computationally efficient approach to detecting objects natively in 3D point clouds using convolutional neural networks (CNNs). In particular, this is achieved by leveraging a feature-centric voting scheme to implement novel convolutional layers which explicitly exploit the sparsity encountered in the input. To this end, we examine the trade-off between accuracy and speed for different architectures and additionally propose to use an L 1 penalty on the filter activations to further encourage sparsity in the intermediate representations. To the best of our knowledge, this is the first work to propose sparse convolutional layers and L 1 regularisation for efficient large-scale processing of 3D data. We demonstrate the efficacy of our approach on the KITTI object detection benchmark and show that VoteSDeep models with as few as three layers outperform the previous state of the art in both laser and laser-vision based approaches by margins of up to 40% while remaining highly competitive in terms of processing time.

...read moreread less

436 citations

Journal Article•DOI•

Embracing Change: Continual Learning in Deep Neural Networks

[...]

Raia Hadsell, Dushyant Rao, Andrei Rusu, Razvan Pascanu

01 Dec 2020-Trends in Cognitive Sciences

TL;DR: This review relates continual learning to the learning dynamics of neural networks, highlighting the potential it has to considerably improve data efficiency and consider the many new biologically inspired approaches that have emerged in recent years.

...read moreread less

222 citations

Proceedings Article•

Continual Unsupervised Representation Learning.

[...]

Dushyant Rao¹, Francesco Visin², Andrei Rusu³, Razvan Pascanu², Yee Whye Teh⁴, Raia Hadsell² - Show less +2 more•Institutions (4)

Carnegie Mellon University¹, Google², West University of Timișoara³, University of Oxford⁴

01 Jan 2019

TL;DR: The proposed approach (CURL) performs task inference directly within the model, is able to dynamically expand to capture new concepts over its lifetime, and incorporates additional rehearsal-based techniques to deal with catastrophic forgetting.

...read moreread less

Abstract: Continual learning aims to improve the ability of modern learning systems to deal with non-stationary distributions, typically by attempting to learn a series of tasks sequentially. Prior art in the field has largely considered supervised or reinforcement learning tasks, and often assumes full knowledge of task labels and boundaries. In this work, we propose an approach (CURL) to tackle a more general problem that we will refer to as unsupervised continual learning. The focus is on learning representations without any knowledge about task identity, and we explore scenarios when there are abrupt changes between tasks, smooth transitions from one task to another, or even when the data is shuffled. The proposed approach performs task inference directly within the model, is able to dynamically expand to capture new concepts over its lifetime, and incorporates additional rehearsal-based techniques to deal with catastrophic forgetting. We demonstrate the efficacy of CURL in an unsupervised learning setting with MNIST and Omniglot, where the lack of labels ensures no information is leaked about the task. Further, we demonstrate strong performance compared to prior art in an i.i.d setting, or when adapting the technique to supervised tasks such as incremental class learning.

...read moreread less

170 citations

1
2
3
4
…
5
6
7
8
9

Collapse

Cited by

PDF

Open Access

More filters

Computer vision : a modern approach = 计算机视觉 : 一种现代的方法

[...]

David Forsyth, Jean Ponce

01 Jan 2004

TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance and describes numerous important application areas such as image based rendering and digital libraries.

...read moreread less

Abstract: From the Publisher: The accessible presentation of this book gives both a general view of the entire computer vision enterprise and also offers sufficient detail to be able to build useful applications. Users learn techniques that have proven to be useful by first-hand experience and a wide range of mathematical methods. A CD-ROM with every copy of the text contains source code for programming practice, color images, and illustrative movies. Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance. Topics are discussed in substantial and increasing depth. Application surveys describe numerous important application areas such as image based rendering and digital libraries. Many important algorithms broken down and illustrated in pseudo code. Appropriate for use by engineers as a comprehensive reference to the computer vision enterprise.

...read moreread less

3,627 citations

Proceedings Article•DOI•

Multi-view 3D Object Detection Network for Autonomous Driving

[...]

Xiaozhi Chen¹, Huimin Ma¹, Ji Wan², Bo Li², Xia Tian² - Show less +1 more•Institutions (2)

Tsinghua University¹, Baidu²

21 Jul 2017

TL;DR: This paper proposes Multi-View 3D networks (MV3D), a sensory-fusion framework that takes both LIDAR point cloud and RGB images as input and predicts oriented 3D bounding boxes and designs a deep fusion scheme to combine region-wise features from multiple views and enable interactions between intermediate layers of different paths.

...read moreread less

Abstract: This paper aims at high-accuracy 3D object detection in autonomous driving scenario. We propose Multi-View 3D networks (MV3D), a sensory-fusion framework that takes both LIDAR point cloud and RGB images as input and predicts oriented 3D bounding boxes. We encode the sparse 3D point cloud with a compact multi-view representation. The network is composed of two subnetworks: one for 3D object proposal generation and another for multi-view feature fusion. The proposal network generates 3D candidate boxes efficiently from the birds eye view representation of 3D point cloud. We design a deep fusion scheme to combine region-wise features from multiple views and enable interactions between intermediate layers of different paths. Experiments on the challenging KITTI benchmark show that our approach outperforms the state-of-the-art by around 25% and 30% AP on the tasks of 3D localization and 3D detection. In addition, for 2D detection, our approach obtains 14.9% higher AP than the state-of-the-art on the hard data among the LIDAR-based methods.

...read moreread less

2,569 citations

Proceedings Article•DOI•

VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection

[...]

Yin Zhou¹, Oncel Tuzel¹•Institutions (1)

Apple Inc.¹

18 Jun 2018

TL;DR: Zhou et al. as mentioned in this paper propose VoxelNet, a generic 3D detection network that unifies feature extraction and bounding box prediction into a single stage, end-to-end trainable deep network.

...read moreread less

Abstract: Accurate detection of objects in 3D point clouds is a central problem in many applications, such as autonomous navigation, housekeeping robots, and augmented/virtual reality. To interface a highly sparse LiDAR point cloud with a region proposal network (RPN), most existing efforts have focused on hand-crafted feature representations, for example, a bird's eye view projection. In this work, we remove the need of manual feature engineering for 3D point clouds and propose VoxelNet, a generic 3D detection network that unifies feature extraction and bounding box prediction into a single stage, end-to-end trainable deep network. Specifically, VoxelNet divides a point cloud into equally spaced 3D voxels and transforms a group of points within each voxel into a unified feature representation through the newly introduced voxel feature encoding (VFE) layer. In this way, the point cloud is encoded as a descriptive volumetric representation, which is then connected to a RPN to generate detections. Experiments on the KITTI car detection benchmark show that VoxelNet outperforms the state-of-the-art LiDAR based 3D detection methods by a large margin. Furthermore, our network learns an effective discriminative representation of objects with various geometries, leading to encouraging results in 3D detection of pedestrians and cyclists, based on only LiDAR.

...read moreread less

1,948 citations

Proceedings Article•DOI•

Frustum PointNets for 3D Object Detection from RGB-D Data

[...]

Charles R. Qi¹, Wei Liu, Chenxia Wu, Hao Su², Leonidas J. Guibas¹ - Show less +1 more•Institutions (2)

Stanford University¹, University of California, San Diego²

18 Jun 2018

TL;DR: This work directly operates on raw point clouds by popping up RGBD scans and leverages both mature 2D object detectors and advanced 3D deep learning for object localization, achieving efficiency as well as high recall for even small objects.

...read moreread less

Abstract: In this work, we study 3D object detection from RGBD data in both indoor and outdoor scenes. While previous methods focus on images or 3D voxels, often obscuring natural 3D patterns and invariances of 3D data, we directly operate on raw point clouds by popping up RGB-D scans. However, a key challenge of this approach is how to efficiently localize objects in point clouds of large-scale scenes (region proposal). Instead of solely relying on 3D proposals, our method leverages both mature 2D object detectors and advanced 3D deep learning for object localization, achieving efficiency as well as high recall for even small objects. Benefited from learning directly in raw point clouds, our method is also able to precisely estimate 3D bounding boxes even under strong occlusion or with very sparse points. Evaluated on KITTI and SUN RGB-D 3D detection benchmarks, our method outperforms the state of the art by remarkable margins while having real-time capability.

...read moreread less

1,947 citations

Posted Content•

PointPillars: Fast Encoders for Object Detection from Point Clouds

[...]

Alex H. Lang, Sourabh Vora, Holger Caesar, Lubing Zhou, Jiong Yang, Oscar Beijbom - Show less +2 more

14 Dec 2018-arXiv: Learning

TL;DR: PointPillars as mentioned in this paper utilizes PointNets to learn a representation of point clouds organized in vertical columns (pillars), which can be used with any standard 2D convolutional detection architecture.

...read moreread less

Abstract: Object detection in point clouds is an important aspect of many robotics applications such as autonomous driving. In this paper we consider the problem of encoding a point cloud into a format appropriate for a downstream detection pipeline. Recent literature suggests two types of encoders; fixed encoders tend to be fast but sacrifice accuracy, while encoders that are learned from data are more accurate, but slower. In this work we propose PointPillars, a novel encoder which utilizes PointNets to learn a representation of point clouds organized in vertical columns (pillars). While the encoded features can be used with any standard 2D convolutional detection architecture, we further propose a lean downstream network. Extensive experimentation shows that PointPillars outperforms previous encoders with respect to both speed and accuracy by a large margin. Despite only using lidar, our full detection pipeline significantly outperforms the state of the art, even among fusion methods, with respect to both the 3D and bird's eye view KITTI benchmarks. This detection performance is achieved while running at 62 Hz: a 2 - 4 fold runtime improvement. A faster version of our method matches the state of the art at 105 Hz. These benchmarks suggest that PointPillars is an appropriate encoding for object detection in point clouds.

...read moreread less

1,311 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse