Home
/
Authors
/
Sven Gowal

Author

Sven Gowal

Other affiliations: École Polytechnique Fédérale de Lausanne

Bio: Sven Gowal is an academic researcher from Google. The author has contributed to research in topics: Robustness (computer science) & Artificial neural network. The author has an hindex of 20, co-authored 58 publications receiving 1877 citations. Previous affiliations of Sven Gowal include École Polytechnique Fédérale de Lausanne.

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2013
2012
2011
2010
2009

Papers

PDF

Open Access

More filters

Posted Content•

On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models

[...]

Sven Gowal, Krishnamurthy Dvijotham, Robert Stanforth, Rudy Bunel, Chongli Qin, Jonathan Uesato, Relja Arandjelovic, Timothy A. Mann, Pushmeet Kohli - Show less +5 more

30 Oct 2018-arXiv: Learning

TL;DR: This work shows how a simple bounding technique, interval bound propagation (IBP), can be exploited to train large provably robust neural networks that beat the state-of-the-art in verified accuracy and allows the largest model to be verified beyond vacuous bounds on a downscaled version of ImageNet.

...read moreread less

Abstract: Recent work has shown that it is possible to train deep neural networks that are provably robust to norm-bounded adversarial perturbations. Most of these methods are based on minimizing an upper bound on the worst-case loss over all possible adversarial perturbations. While these techniques show promise, they often result in difficult optimization procedures that remain hard to scale to larger networks. Through a comprehensive analysis, we show how a simple bounding technique, interval bound propagation (IBP), can be exploited to train large provably robust neural networks that beat the state-of-the-art in verified accuracy. While the upper bound computed by IBP can be quite weak for general networks, we demonstrate that an appropriate loss and clever hyper-parameter schedule allow the network to adapt such that the IBP bound is tight. This results in a fast and stable learning algorithm that outperforms more sophisticated methods and achieves state-of-the-art results on MNIST, CIFAR-10 and SVHN. It also allows us to train the largest model to be verified beyond vacuous bounds on a downscaled version of ImageNet.

...read moreread less

423 citations

Journal Article•DOI•

Competition-level code generation with AlphaCode

[...]

Yujia Li, David H. Choi, Junyoung Chung, Nate Kushman, Julian Schrittwieser, Rémi Leblond, Tom, Eccles, James Keeling, Felix Gimeno, Agustin Dal Lago, Thomas Hubert, Peter Choy, Cyprien de, Masson d’Autume, Igor Babuschkin, Xinyun Chen, Po-Sen Huang, Johannes Welbl, Sven Gowal, Alexey, Cherepanov, James L. Molloy, Daniel J. Mankowitz, Esme Sutherland Robson, Pushmeet Kohli, Nando de, Freitas, Koray Kavukcuoglu, Oriol Vinyals - Show less +26 more

08 Feb 2022-Science

TL;DR: Yujia Li*, David Choi*, Junyoung Chung*, Nate Kushman*, Julian Schrittwieser*, Rémi Leblond*, Tom Eccles*, James Keeling*, Felix Gimeno*, Agustin Dal Lago*, Thomas Hubert*, Peter Choy*, Cyprien de Masson d’Autume*, Igor Babuschkin, Xinyun Chen

...read moreread less

Abstract: Programming is a powerful and ubiquitous problem-solving tool. Systems that can assist programmers or even generate programs themselves could make programming more productive and accessible. Recent transformer-based neural network models show impressive code generation abilities yet still perform poorly on more complex tasks requiring problem-solving skills, such as competitive programming problems. Here, we introduce AlphaCode, a system for code generation that achieved an average ranking in the top 54.3% in simulated evaluations on recent programming competitions on the Codeforces platform. AlphaCode solves problems by generating millions of diverse programs using specially trained transformer-based networks and then filtering and clustering those programs to a maximum of just 10 submissions. This result marks the first time an artificial intelligence system has performed competitively in programming competitions. Description Machine learning systems can program too Computer programming competitions are popular tests among programmers that require critical thinking informed by experience and creating solutions to unforeseen problems, both of which are key aspects of human intelligence but challenging to mimic by machine learning models. Using self-supervised learning and an encoder-decoder transformer architecture, Li et al. developed AlphaCode, a deep-learning model that can achieve approximately human-level performance on the Codeforces platform, which regularly hosts these competitions and attracts numerous participants worldwide (see the Perspective by Kolter). The development of such coding platforms could have a huge impact on programmers’ productivity. It may even change the culture of programming by shifting human work to formulating problems, with machine learning being the main one responsible for generating and executing codes. —YS Modern machine learning systems can achieve average human-level performance in popular competitive programming contests.

...read moreread less

329 citations

Posted Content•

A Dual Approach to Scalable Verification of Deep Networks

[...]

Krishnamurthy, Dvijotham, Robert Stanforth, Sven Gowal, Timothy A. Mann, Pushmeet Kohli - Show less +2 more

17 Mar 2018-arXiv: Learning

TL;DR: This paper addresses the problem of formally verifying desirable properties of neural networks by forming verification as an optimization problem and solving a Lagrangian relaxation of the optimization problem to obtain an upper bound on the worst case violation of the specification being verified.

...read moreread less

Abstract: This paper addresses the problem of formally verifying desirable properties of neural networks, i.e., obtaining provable guarantees that neural networks satisfy specifications relating their inputs and outputs (robustness to bounded norm adversarial perturbations, for example). Most previous work on this topic was limited in its applicability by the size of the network, network architecture and the complexity of properties to be verified. In contrast, our framework applies to a general class of activation functions and specifications on neural network inputs and outputs. We formulate verification as an optimization problem (seeking to find the largest violation of the specification) and solve a Lagrangian relaxation of the optimization problem to obtain an upper bound on the worst case violation of the specification being verified. Our approach is anytime i.e. it can be stopped at any time and a valid bound on the maximum violation can be obtained. We develop specialized verification algorithms with provable tightness guarantees under special assumptions and demonstrate the practical significance of our general verification approach on a variety of verification tasks.

...read moreread less

285 citations

Posted Content•

Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples

[...]

Sven Gowal, Chongli Qin, Jonathan Uesato, Timothy A. Mann, Pushmeet Kohli - Show less +1 more

07 Oct 2020-arXiv: Machine Learning

TL;DR: This paper systematically study the effect of different training losses, model sizes, activation functions, the addition of unlabeled data (through pseudo-labeling) and other factors on adversarial robustness, and discovers that it is possible to train robust models that go well beyond state-of-the-art results.

...read moreread less

Abstract: Adversarial training and its variants have become de facto standards for learning robust deep neural networks. In this paper, we explore the landscape around adversarial training in a bid to uncover its limits. We systematically study the effect of different training losses, model sizes, activation functions, the addition of unlabeled data (through pseudo-labeling) and other factors on adversarial robustness. We discover that it is possible to train robust models that go well beyond state-of-the-art results by combining larger models, Swish/SiLU activations and model weight averaging. We demonstrate large improvements on CIFAR-10 and CIFAR-100 against $\ell_\infty$ and $\ell_2$ norm-bounded perturbations of size $8/255$ and $128/255$, respectively. In the setting with additional unlabeled data, we obtain an accuracy under attack of 65.88% against $\ell_\infty$ perturbations of size $8/255$ on CIFAR-10 (+6.35% with respect to prior art). Without additional data, we obtain an accuracy under attack of 57.20% (+3.46%). To test the generality of our findings and without any additional modifications, we obtain an accuracy under attack of 80.53% (+7.62%) against $\ell_2$ perturbations of size $128/255$ on CIFAR-10, and of 36.88% (+8.46%) against $\ell_\infty$ perturbations of size $8/255$ on CIFAR-100. All models are available at this https URL.

...read moreread less

196 citations

Journal Article•DOI•

Challenges of real-world reinforcement learning: definitions, benchmarks and analysis

[...]

Gabriel Dulac-Arnold¹, Nir Levine, Daniel J. Mankowitz, Jerry Li, Cosmin Paduraru, Sven Gowal, Todd Hester - Show less +3 more•Institutions (1)

Google¹

22 Apr 2021-Machine Learning

TL;DR: This work identifies and formalizes a series of independent challenges that embody the difficulties that must be addressed for RL to be commonly deployed in real-world systems and proposes an as an open-source benchmark.

...read moreread less

Abstract: Reinforcement learning (RL) has proven its worth in a series of artificial domains, and is beginning to show some successes in real-world scenarios. However, much of the research advances in RL are hard to leverage in real-world systems due to a series of assumptions that are rarely satisfied in practice. In this work, we identify and formalize a series of independent challenges that embody the difficulties that must be addressed for RL to be commonly deployed in real-world systems. For each challenge, we define it formally in the context of a Markov Decision Process, analyze the effects of the challenge on state-of-the-art learning algorithms, and present some existing attempts at tackling it. We believe that an approach that addresses our set of proposed challenges would be readily deployable in a large number of real world problems. Our proposed challenges are implemented in a suite of continuous control environments called realworldrl-suite which we propose an as an open-source benchmark.

...read moreread less

195 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14

Collapse

Cited by

PDF

Open Access

More filters

Pattern Recognition and Machine Learning

[...]

Christopher M. Bishop¹•Institutions (1)

Microsoft¹

01 Jan 2006

TL;DR: Probability distributions of linear models for regression and classification are given in this article, along with a discussion of combining models and combining models in the context of machine learning and classification.

...read moreread less

Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

...read moreread less

10,141 citations

Journal Article•

“Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”の学習報告

[...]

杉山拓海

12 Sep 2017-Computers & Graphics

3,940 citations

Matrix Factorization Techniques for Recommender Systems

[...]

Patrick Seemann

01 Jan 2014

2,080 citations

Journal Article•

PaLM: Scaling Language Modeling with Pathways

[...]

Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, Parker Schuh, Kensen Shi, Sasha Tsvyashchenko, Joshua Maynez, Abhishek Rao, Parker Barnes, Yi Tay, Noam Shazeer, Velu Prabhakaran, Emily Reif, Nan Du, B. C. Hutchinson, Reiner Pope, James Bradbury, Jacob Austin, Michael Isard, Guy Gur-Ari, Peng Yin, Toju Duke, Anselm Levskaya, Sanjay Ghemawat, Sunipa Dev, Henryk Michalewski, Xavier Garcia, Vedant Misra, Kevin Robinson, L Fedus, Denny Zhou, Daphne Ippolito, David Luan, Hyeontaek Lim, Barret Zoph, Alexander Spiridonov, Ryan Sepassi, David Dohan, Shivani Agrawal, Mark Omernick, Andrew M. Dai, Thanumalayan Sankaranarayana Pillai, Marie Pellat, Aitor Lewkowycz, Erica Oliveira Moreira, Rewon Child, Oleksandr Polozov, Katherine Lee, Zong Tuan Zhou, Xuezhi Wang, Brennan Saeta, Mark Díaz, Orhan Firat, M. Catasta, Jason Loh Seong Wei, Kathleen S. Meier-Hellstern, Douglas Eck, Jeffrey Dean, Slav Petrov, Noah Fiedel - Show less +63 more

05 Apr 2022-arXiv.org

TL;DR: A 540-billion parameter, densely activated, Transformer language model, which is called PaLM achieves breakthrough performance, outperforming the state-of-the-art on a suite of multi-step reasoning tasks, and outperforming average human performance on the recently released BIG-bench benchmark.

...read moreread less

Abstract: Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning , which drastically reduces the number of task-speciﬁc training examples needed to adapt the model to a particular application. To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model (PaLM). We trained PaLM on 6144 TPU v4 chips using Pathways, a new ML system which enables highly eﬃcient training across multiple TPU Pods. We demonstrate continued beneﬁts of scaling by achieving state-of-the-art few-shot learning results on hundreds of language understanding and generation benchmarks. On a number of these tasks, PaLM 540B achieves breakthrough performance, outperforming the ﬁnetuned state-of-the-art on a suite of multi-step reasoning tasks, and outperforming average human performance on the recently released BIG-bench benchmark. A signiﬁcant number of BIG-bench tasks showed discontinuous improvements from model scale, meaning that performance steeply increased as we scaled to our largest model. PaLM also has strong capabilities in multilingual tasks and source code generation, which we demonstrate on a wide array of benchmarks. We additionally provide a comprehensive analysis on bias and toxicity, and study the extent of training data memorization with respect to model scale. Finally, we discuss the ethical considerations related to large language models and discuss potential mitigation strategies.

...read moreread less

1,429 citations

Posted Content•

Certified Adversarial Robustness via Randomized Smoothing

[...]

Jeremy M. Cohen¹, Elan Rosenfeld¹, J. Zico Kolter¹•Institutions (1)

Carnegie Mellon University¹

08 Feb 2019-arXiv: Learning

TL;DR: Strong empirical results suggest that randomized smoothing is a promising direction for future research into adversarially robust classification on smaller-scale datasets where competing approaches to certified $\ell_2$ robustness are viable, smoothing delivers higher certified accuracies.

...read moreread less

Abstract: We show how to turn any classifier that classifies well under Gaussian noise into a new classifier that is certifiably robust to adversarial perturbations under the $\ell_2$ norm. This "randomized smoothing" technique has been proposed recently in the literature, but existing guarantees are loose. We prove a tight robustness guarantee in $\ell_2$ norm for smoothing with Gaussian noise. We use randomized smoothing to obtain an ImageNet classifier with e.g. a certified top-1 accuracy of 49% under adversarial perturbations with $\ell_2$ norm less than 0.5 (=127/255). No certified defense has been shown feasible on ImageNet except for smoothing. On smaller-scale datasets where competing approaches to certified $\ell_2$ robustness are viable, smoothing delivers higher certified accuracies. Our strong empirical results suggest that randomized smoothing is a promising direction for future research into adversarially robust classification. Code and models are available at this http URL.

...read moreread less

719 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse