Home
/
Authors
/
Samyak Parajuli

Author

Samyak Parajuli

Bio: Samyak Parajuli is an academic researcher from University of California, Berkeley. The author has contributed to research in topics: Artificial neural network & Reinforcement learning. The author has an hindex of 3, co-authored 6 publications receiving 188 citations.

Papers

PDF

Open Access

More filters

Posted Content•

The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization

[...]

Dan Hendrycks¹, Steven Basart², Norman Mu³, Saurav Kadavath¹, Frank Wang³, Evan Dorundo³, Rahul Desai¹, Tyler Zhu³, Samyak Parajuli¹, Mike Guo¹, Dawn Song¹, Jacob Steinhardt¹, Justin Gilmer³ - Show less +9 more•Institutions (3)

University of California, Berkeley¹, University of Chicago², Google³

29 Jun 2020-arXiv: Computer Vision and Pattern Recognition

TL;DR: It is found that using larger models and artificial data augmentations can improve robustness on real-world distribution shifts, contrary to claims in prior work.

...read moreread less

Abstract: We introduce four new real-world distribution shift datasets consisting of changes in image style, image blurriness, geographic location, camera operation, and more With our new datasets, we take stock of previously proposed methods for improving out-of-distribution robustness and put them to the test We find that using larger models and artificial data augmentations can improve robustness on real-world distribution shifts, contrary to claims in prior work We find improvements in artificial robustness benchmarks can transfer to real-world distribution shifts, contrary to claims in prior work Motivated by our observation that data augmentations can help with real-world distribution shifts, we also introduce a new data augmentation method which advances the state-of-the-art and outperforms models pretrained with 1000 times more labeled data Overall we find that some methods consistently help with distribution shifts in texture and local image statistics, but these methods do not help with some other distribution shifts like geographic changes Our results show that future research must study multiple distribution shifts simultaneously, as we demonstrate that no evaluated method consistently improves robustness

...read moreread less

739 citations

Proceedings Article•

The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization

[...]

University of California, Berkeley¹, University of Chicago², Google³

29 Jun 2020

TL;DR: In this article, the authors introduce four new real-world distribution shift datasets consisting of changes in image style, image blurriness, geographic location, camera operation, and more.

...read moreread less

Abstract: We introduce four new real-world distribution shift datasets consisting of changes in image style, image blurriness, geographic location, camera operation, and more. With our new datasets, we take stock of previously proposed methods for improving out-of-distribution robustness and put them to the test. We find that using larger models and artificial data augmentations can improve robustness on real-world distribution shifts, contrary to claims in prior work. We find improvements in artificial robustness benchmarks can transfer to real-world distribution shifts, contrary to claims in prior work. Motivated by our observation that data augmentations can help with real-world distribution shifts, we also introduce a new data augmentation method which advances the state-of-the-art and outperforms models pretrained with 1000 times more labeled data. Overall we find that some methods consistently help with distribution shifts in texture and local image statistics, but these methods do not help with some other distribution shifts like geographic changes. Our results show that future research must study multiple distribution shifts simultaneously, as we demonstrate that no evaluated method consistently improves robustness.

...read moreread less

15 citations

Posted Content•

Inter-Level Cooperation in Hierarchical Reinforcement Learning

[...]

Abdul Rahman Kreidieh¹, Samyak Parajuli¹, Nathan Lichtle¹, Yiling You¹, Rayyan Nasr¹, Alexandre M. Bayen¹ - Show less +2 more•Institutions (1)

University of California, Berkeley¹

05 Dec 2019-arXiv: Learning

TL;DR: It is hypothesized that improved cooperation between the internal agents of a hierarchy can simplify the credit assignment problem from the perspective of the high-level policies, thereby leading to significant improvements to training in situations where intricate sets of action primitives must be performed to yield improvements in performance.

...read moreread less

Abstract: Hierarchical models for deep reinforcement learning (RL) have emerged as powerful methods for generating meaningful control strategies in difficult long time horizon tasks. Training of said hierarchical models, however, continue to suffer from instabilities that limit their applicability. In this paper, we address instabilities that arise from the concurrent optimization of goal-assignment and goal-achievement policies. Drawing connections between this concurrent optimization scheme and communication and cooperation in multi-agent RL, we redefine the standard optimization procedure to explicitly promote cooperation between these disparate tasks. Our method is demonstrated to achieve superior results to existing techniques in a set of difficult long time horizon tasks, and serves to expand the scope of solvable tasks by hierarchical reinforcement learning. Videos of the results are available at: this https URL.

...read moreread less

8 citations

Posted Content•

Generalized Ternary Connect: End-to-End Learning and Compression of Multiplication-Free Deep Neural Networks

[...]

Samyak Parajuli, Aswin Raghavan, Sek M. Chai

12 Nov 2018-arXiv: Learning

TL;DR: Generalized Ternary Connect is proposed, which allows an arbitrary number of levels while at the same time eliminating multiplications by restricting the parameters to integer powers of two, and demonstrates superior compression and similar accuracy of GTC in comparison to several state-of-the-art methods for neural network compression.

...read moreread less

Abstract: The use of deep neural networks in edge computing devices hinges on the balance between accuracy and complexity of computations. Ternary Connect (TC) \cite{lin2015neural} addresses this issue by restricting the parameters to three levels $-1, 0$, and $+1$, thus eliminating multiplications in the forward pass of the network during prediction. We propose Generalized Ternary Connect (GTC), which allows an arbitrary number of levels while at the same time eliminating multiplications by restricting the parameters to integer powers of two. The primary contribution is that GTC learns the number of levels and their values for each layer, jointly with the weights of the network in an end-to-end fashion. Experiments on MNIST and CIFAR-10 show that GTC naturally converges to an `almost binary' network for deep classification networks (e.g. VGG-16) and deep variational auto-encoders, with negligible loss of classification accuracy and comparable visual quality of generated samples respectively. We demonstrate superior compression and similar accuracy of GTC in comparison to several state-of-the-art methods for neural network compression. We conclude with simulations showing the potential benefits of GTC in hardware.

...read moreread less

6 citations

Posted Content•

Dynamically Throttleable Neural Networks (TNN).

[...]

Hengyue Liu, Samyak Parajuli, Jesse Hostetler, Sek M. Chai, Bir Bhanu - Show less +1 more

01 Nov 2020-arXiv: Learning

TL;DR: This work presents a runtime throttleable neural network (TNN) that can adaptively self-regulate its own performance target and computing resources and designed TNN with several properties that enable more flexibility for dynamic execution based on runtime context.

...read moreread less

Abstract: Conditional computation for Deep Neural Networks (DNNs) reduce overall computational load and improve model accuracy by running a subset of the network. In this work, we present a runtime throttleable neural network (TNN) that can adaptively self-regulate its own performance target and computing resources. We designed TNN with several properties that enable more flexibility for dynamic execution based on runtime context. TNNs are defined as throttleable modules gated with a separately trained controller that generates a single utilization control parameter. We validate our proposal on a number of experiments, including Convolution Neural Networks (CNNs such as VGG, ResNet, ResNeXt, DenseNet) using CiFAR-10 and ImageNet dataset, for object classification and recognition tasks. We also demonstrate the effectiveness of dynamic TNN execution on a 3D Convolustion Network (C3D) for a hand gesture task. Results show that TNN can maintain peak accuracy performance compared to vanilla solutions, while providing a graceful reduction in computational requirement, down to 74% reduction in latency and 52% energy savings.

...read moreread less

4 citations

Cited by

PDF

Open Access

More filters

The PASCAL Visual Object Classes Challenge

[...]

Jianguo Zhang

01 Jan 2006

3,012 citations

Proceedings Article•DOI•

A ConvNet for the 2020s

[...]

Zhuang Liu, Hanzi Mao, Chaozheng Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie - Show less +2 more

10 Jan 2022

TL;DR: This work gradually “modernize” a standard ResNet toward the design of a vision Transformer, and discovers several key components that contribute to the performance difference along the way, leading to a family of pure ConvNet models dubbed ConvNeXt.

...read moreread less

Abstract: The “Roaring 20s” of visual recognition began with the introduction of Vision Transformers (ViTs), which quickly superseded ConvNets as the state-of-the-art image classification model. A vanilla ViT, on the other hand, faces difficulties when applied to general computer vision tasks such as object detection and semantic segmentation. It is the hierarchical Transformers (e.g., Swin Transformers) that reintroduced several ConvNet priors, making Transformers practically viable as a generic vision backbone and demonstrating remarkable performance on a wide variety of vision tasks. However, the effectiveness of such hybrid approaches is still largely credited to the intrinsic superiority of Transformers, rather than the inherent inductive biases of convolutions. In this work, we reexamine the design spaces and test the limits of what a pure ConvNet can achieve. We gradually “modernize” a standard ResNet toward the design of a vision Transformer, and discover several key components that contribute to the performance difference along the way. The outcome of this exploration is a family of pure ConvNet models dubbed ConvNeXt. Constructed entirely from standard ConvNet modules, ConvNeXts compete favorably with Transformers in terms of accuracy and scalability, achieving 87.8% ImageNet top-1 accuracy and outperforming Swin Transformers on COCO detection and ADE20K segmentation, while maintaining the simplicity and efficiency of standard ConvNets.

...read moreread less

1,203 citations

Posted Content•

WILDS: A Benchmark of in-the-Wild Distribution Shifts

[...]

Pang Wei Koh¹, Shiori Sagawa¹, Henrik Marklund¹, Sang Michael Xie², Marvin Zhang¹, Akshay Balsubramani¹, Weihua Hu¹, Michihiro Yasunaga³, Richard Lanas Phillips¹, Irena Gao¹, Tony Lee¹, Etienne David⁴, Ian Stavness⁵, Wei Guo⁵, Berton A. Earnshaw, Imran S. Haque⁶, Sara Beery¹, Jure Leskovec¹, Anshul Kundaje⁷, Emma Pierson², Sergey Levine¹, Chelsea Finn¹, Percy Liang¹ - Show less +19 more•Institutions (7)

Stanford University¹, University of California, Berkeley², Cornell University³, University of Saskatchewan⁴, University of Tokyo⁵, California Institute of Technology⁶, Microsoft⁷

14 Dec 2020-arXiv: Learning

TL;DR: WILDS is presented, a benchmark of in-the-wild distribution shifts spanning diverse data modalities and applications, and is hoped to encourage the development of general-purpose methods that are anchored to real-world distribution shifts and that work well across different applications and problem settings.

...read moreread less

Abstract: Distribution shifts -- where the training distribution differs from the test distribution -- can substantially degrade the accuracy of machine learning (ML) systems deployed in the wild. Despite their ubiquity, these real-world distribution shifts are under-represented in the datasets widely used in the ML community today. To address this gap, we present WILDS, a curated collection of 8 benchmark datasets that reflect a diverse range of distribution shifts which naturally arise in real-world applications, such as shifts across hospitals for tumor identification; across camera traps for wildlife monitoring; and across time and location in satellite imaging and poverty mapping. On each dataset, we show that standard training results in substantially lower out-of-distribution than in-distribution performance, and that this gap remains even with models trained by existing methods for handling distribution shifts. This underscores the need for new training methods that produce models which are more robust to the types of distribution shifts that arise in practice. To facilitate method development, we provide an open-source package that automates dataset loading, contains default model architectures and hyperparameters, and standardizes evaluations. Code and leaderboards are available at this https URL.

...read moreread less

579 citations

Posted Content•

Natural Adversarial Examples

[...]

Dan Hendrycks¹, Kevin Zhao², Steven Basart³, Jacob Steinhardt¹, Dawn Song¹ - Show less +1 more•Institutions (3)

University of California, Berkeley¹, University of Washington², University of Chicago³

16 Jul 2019-arXiv: Learning

TL;DR: This work introduces two challenging datasets that reliably cause machine learning model performance to substantially degrade and curates an adversarial out-of-distribution detection dataset called IMAGENET-O, which is the first out- of-dist distribution detection dataset created for ImageNet models.

...read moreread less

Abstract: We introduce two challenging datasets that reliably cause machine learning model performance to substantially degrade. The datasets are collected with a simple adversarial filtration technique to create datasets with limited spurious cues. Our datasets' real-world, unmodified examples transfer to various unseen models reliably, demonstrating that computer vision models have shared weaknesses. The first dataset is called ImageNet-A and is like the ImageNet test set, but it is far more challenging for existing models. We also curate an adversarial out-of-distribution detection dataset called ImageNet-O, which is the first out-of-distribution detection dataset created for ImageNet models. On ImageNet-A a DenseNet-121 obtains around 2% accuracy, an accuracy drop of approximately 90%, and its out-of-distribution detection performance on ImageNet-O is near random chance levels. We find that existing data augmentation techniques hardly boost performance, and using other public training datasets provides improvements that are limited. However, we find that improvements to computer vision architectures provide a promising path towards robust models.

...read moreread less

550 citations

Proceedings Article•DOI•

A ConvNet for the 2020s

[...]

01 Jun 2022

TL;DR: ConvNeXt as discussed by the authors is a family of pure ConvNet models, which compete favorably with Transformers in terms of accuracy and scalability, achieving 87.8% ImageNet top-1 accuracy and outperforming Swin Transformers on COCO detection and ADE20K segmentation.

...read moreread less

502 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155

Collapse