Author

Adrian Vladu

Other affiliations: Boston University, Brown University, Massachusetts Institute of Technology

Bio: Adrian Vladu is an academic researcher at the University of Paris. The author has contributed to research on topics including time complexity and the maximum flow problem. The author has an h-index of 15 and has co-authored 38 publications receiving 7,091 citations. Previous affiliations of Adrian Vladu include Boston University and Brown University.

Papers

Posted Content

Towards Deep Learning Models Resistant to Adversarial Attacks

Aleksander Madry¹, Aleksandar Makelov¹, Ludwig Schmidt¹, Dimitris Tsipras¹, Adrian Vladu¹

Massachusetts Institute of Technology¹

19 Jun 2017-arXiv: Machine Learning

TL;DR: This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.

Abstract: Recent work has demonstrated that deep neural networks are vulnerable to adversarial examples: inputs that are almost indistinguishable from natural data and yet classified incorrectly by the network. In fact, some of the latest findings suggest that the existence of adversarial attacks may be an inherent weakness of deep learning models. To address this problem, we study the adversarial robustness of neural networks through the lens of robust optimization. This approach provides us with a broad and unifying view on much of the prior work on this topic. Its principled nature also enables us to identify methods for both training and attacking neural networks that are reliable and, in a certain sense, universal. In particular, they specify a concrete security guarantee that would protect against any adversary. These methods let us train networks with significantly improved resistance to a wide range of adversarial attacks. They also suggest the notion of security against a first-order adversary as a natural and broad security guarantee. We believe that robustness against such well-defined classes of adversaries is an important stepping stone towards fully resistant deep learning models. Code and pre-trained models are available at this https URL and this https URL.

5,789 citations
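
The "first-order adversary" mentioned in the abstract above is usually realized as projected gradient descent (PGD) on the loss within a small norm ball around the input. The following is a minimal sketch of an L-infinity PGD attack, assuming a toy logistic-regression model in place of a neural network; the function names, step size, and radius are illustrative choices and are not taken from the authors' released code.

```python
import math
import random


def logistic_loss_and_input_grad(w, b, x, y):
    """Logistic loss for a label y in {0, 1} and its gradient w.r.t. the input x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    # Stable form of -[y*log(p) + (1-y)*log(1-p)].
    loss = math.log1p(math.exp(-z)) + (1 - y) * z
    grad = [(p - y) * wi for wi in w]
    return loss, grad


def sign(v):
    return (v > 0) - (v < 0)


def pgd_linf(w, b, x, y, eps=0.1, step=0.02, iters=20, seed=0):
    """Maximize the loss over the L-infinity ball of radius eps around x."""
    rng = random.Random(seed)
    # Random start inside the ball (an assumption of this sketch).
    x_adv = [xi + rng.uniform(-eps, eps) for xi in x]
    for _ in range(iters):
        _, g = logistic_loss_and_input_grad(w, b, x_adv, y)
        # Ascent step along the sign of the gradient.
        x_adv = [xa + step * sign(gi) for xa, gi in zip(x_adv, g)]
        # Project back onto the ball around the clean input.
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv


w, b = [1.0, -2.0, 0.5], 0.0
x, y = [0.5, -0.5, 1.0], 1
x_adv = pgd_linf(w, b, x, y)
clean_loss, _ = logistic_loss_and_input_grad(w, b, x, y)
adv_loss, _ = logistic_loss_and_input_grad(w, b, x_adv, y)
```

Adversarial training in the robust-optimization view would then minimize the loss on `x_adv` instead of `x` at each training step; that outer loop is omitted here.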

Proceedings Article

Towards Deep Learning Models Resistant to Adversarial Attacks

Aleksander Madry¹, Aleksandar Makelov¹, Ludwig Schmidt¹, Dimitris Tsipras¹, Adrian Vladu²

Massachusetts Institute of Technology¹, Boston University²

15 Feb 2018

TL;DR: This article studied the adversarial robustness of neural networks through the lens of robust optimization and identified methods for both training and attacking neural networks that are reliable and, in a certain sense, universal.

Abstract: Recent work has demonstrated that deep neural networks are vulnerable to adversarial examples—inputs that are almost indistinguishable from natural data and yet classified incorrectly by the network. In fact, some of the latest findings suggest that the existence of adversarial attacks may be an inherent weakness of deep learning models. To address this problem, we study the adversarial robustness of neural networks through the lens of robust optimization. This approach provides us with a broad and unifying view on much of the prior work on this topic. Its principled nature also enables us to identify methods for both training and attacking neural networks that are reliable and, in a certain sense, universal. In particular, they specify a concrete security guarantee that would protect against any adversary. These methods let us train networks with significantly improved resistance to a wide range of adversarial attacks. They also suggest the notion of security against a first-order adversary as a natural and broad security guarantee. We believe that robustness against such well-defined classes of adversaries is an important stepping stone towards fully resistant deep learning models. Code and pre-trained models are available at this https URL and this https URL.

3,581 citations

Proceedings Article

Matrix Scaling and Balancing via Box Constrained Newton's Method and Interior Point Methods

Michael B. Cohen¹, Aleksander Madry¹, Dimitris Tsipras¹, Adrian Vladu¹

Massachusetts Institute of Technology¹

07 Apr 2017

TL;DR: A new second-order optimization framework is developed that treats matrix scaling and balancing in a unified and principled manner. It identifies a generalization of linear system solving that can be used to efficiently minimize a broad class of functions, which the authors call second-order robust.

Abstract: In this paper, we study matrix scaling and balancing, which are fundamental problems in scientific computing, with a long line of work on them that dates back to the 1960s. We provide algorithms for both these problems that, ignoring logarithmic factors involving the dimension of the input matrix and the size of its entries, both run in time $\widetilde{O}(m \log \kappa \log^2(1/\epsilon))$, where $\epsilon$ is the amount of error we are willing to tolerate. Here, $\kappa$ represents the ratio between the largest and the smallest entries of the optimal scalings. This implies that our algorithms run in nearly-linear time whenever $\kappa$ is quasi-polynomial, which includes, in particular, the case of strictly positive matrices. We complement our results by providing a separate algorithm that uses an interior-point method and runs in time $\widetilde{O}(m^{3/2} \log(1/\epsilon))$. In order to establish these results, we develop a new second-order optimization framework that enables us to treat both problems in a unified and principled manner. This framework identifies a certain generalization of linear system solving that we can use to efficiently minimize a broad class of functions, which we call second-order robust. We then show that in the context of the specific functions capturing matrix scaling and balancing, we can leverage and generalize the work on Laplacian system solving to make the algorithms obtained via this framework very efficient.

75 citations
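
To make the matrix scaling problem in the abstract above concrete: given a nonnegative square matrix A, the goal is to find positive diagonal scalings x and y so that diag(x) A diag(y) has all row and column sums equal to 1. The sketch below uses the classical Sinkhorn-Knopp alternating normalization, which is a simple first-order baseline and not the box-constrained Newton method or the interior-point method of the paper; the function name and iteration count are illustrative assumptions.

```python
def sinkhorn_scale(a, iters=200):
    """Alternately rescale rows and columns of a strictly positive square matrix."""
    n = len(a)
    x = [1.0] * n  # row scalings
    y = [1.0] * n  # column scalings
    for _ in range(iters):
        # Make every row of diag(x) A diag(y) sum to 1.
        x = [1.0 / sum(a[i][j] * y[j] for j in range(n)) for i in range(n)]
        # Make every column of diag(x) A diag(y) sum to 1.
        y = [1.0 / sum(a[i][j] * x[i] for i in range(n)) for j in range(n)]
    scaled = [[x[i] * a[i][j] * y[j] for j in range(n)] for i in range(n)]
    return x, y, scaled


x, y, scaled = sinkhorn_scale([[1.0, 2.0], [3.0, 4.0]])
row_sums = [sum(row) for row in scaled]
col_sums = [sum(scaled[i][j] for i in range(2)) for j in range(2)]
```

For strictly positive matrices this iteration converges, but its rate depends on the entries of the matrix, which is the kind of dependence the second-order methods in the paper aim to improve.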

Proceedings Article

Almost-linear-time algorithms for Markov chains and new spectral primitives for directed graphs

Michael B. Cohen¹, Jonathan A. Kelner¹, John Peebles¹, Richard Peng², Anup Rao², Aaron Sidford³, Adrian Vladu¹

Massachusetts Institute of Technology¹, Georgia Institute of Technology², Stanford University³

19 Jun 2017

TL;DR: This paper gives almost-linear-time algorithms for computing the stationary distribution of a Markov chain and the expected commute times in a directed graph, improving on the previous best running time of $O((nm^{3/4} + n^{2/3}m) \log^{O(1)}(n \kappa \epsilon^{-1}))$, where $n$ is the number of vertices, $m$ is the number of edges, $\kappa$ is a condition number associated with the problem, and $\epsilon$ is the desired accuracy.

Abstract: In this paper, we begin to address the longstanding algorithmic gap between general and reversible Markov chains. We develop directed analogues of several spectral graph-theoretic tools that had previously been available only in the undirected setting, and for which it was not clear that directed versions even existed. In particular, we provide a notion of approximation for directed graphs, prove sparsifiers under this notion always exist, and show how to construct them in almost linear time. Using this notion of approximation, we design the first almost-linear-time directed Laplacian system solver, and, by leveraging the recent framework of [Cohen-Kelner-Peebles-Peng-Sidford-Vladu, FOCS '16], we also obtain almost-linear-time algorithms for computing the stationary distribution of a Markov chain, computing expected commute times in a directed graph, and more. For each problem, our algorithms improve the previous best running times of $O((nm^{3/4} + n^{2/3}m) \log^{O(1)}(n \kappa \epsilon^{-1}))$ to $O((m + n 2^{O(\sqrt{\log n \log\log n})}) \log^{O(1)}(n \kappa \epsilon^{-1}))$, where $n$ is the number of vertices in the graph, $m$ is the number of edges, $\kappa$ is a natural condition number associated with the problem, and $\epsilon$ is the desired accuracy. We hope these results open the door for further studies into directed spectral graph theory, and that they will serve as a stepping stone for designing a new generation of fast algorithms for directed graphs.

72 citations
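
As a point of reference for the stationary-distribution problem in the abstract above, the sketch below computes the stationary distribution of a small Markov chain by plain power iteration. This is a textbook baseline whose convergence depends on the chain's mixing behaviour; it is not the almost-linear-time solver described in the paper, and the function name and the example chain are illustrative assumptions.

```python
def stationary_distribution(p, iters=200):
    """Power iteration: repeatedly apply pi <- pi P for a row-stochastic matrix P."""
    n = len(p)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        pi = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
        total = sum(pi)
        pi = [v / total for v in pi]  # renormalize against rounding drift
    return pi


# Two-state chain; solving pi = pi P by hand gives pi = (5/6, 1/6).
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary_distribution(P)
```

The result can be checked directly against the fixed-point condition: applying `P` once more to `pi` should leave it unchanged up to rounding.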

Proceedings Article

Improved Parallel Algorithms for Spanners and Hopsets

Gary L. Miller¹, Richard Peng², Adrian Vladu², Shen Chen Xu¹

Carnegie Mellon University¹, Massachusetts Institute of Technology²

13 Jun 2015

TL;DR: The authors use exponential start time clustering to design faster parallel graph algorithms involving distances. They give linear-work parallel algorithms that construct spanners with $O(k)$ stretch and size $O(n^{1+1/k})$ in unweighted graphs and $O(n^{1+1/k} \log k)$ in weighted graphs, as well as hopsets that yield a parallel algorithm for approximating shortest paths in undirected graphs with $O(m \operatorname{poly}\log n)$ work.

Abstract: We use exponential start time clustering to design faster parallel graph algorithms involving distances. Previous algorithms usually rely on graph decomposition routines with strict restrictions on the diameters of the decomposed pieces. We weaken these bounds in favor of stronger local probabilistic guarantees. This allows more direct analyses of the overall process, giving: (1) linear work parallel algorithms that construct spanners with $O(k)$ stretch and size $O(n^{1+1/k})$ in unweighted graphs, and size $O(n^{1+1/k} \log k)$ in weighted graphs; (2) hopsets that lead to the first parallel algorithm for approximating shortest paths in undirected graphs with $O(m \operatorname{poly}\log n)$ work.

71 citations
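
The abstract above relies on exponential start time clustering without describing it. In the usual formulation, which is assumed here, every vertex u draws a random shift from an exponential distribution, and each vertex v joins the cluster of the vertex u that minimizes dist(u, v) minus the shift of u. The following is a sequential sketch for unweighted graphs using a multi-source Dijkstra search; the function name, the rate parameter `beta`, and the adjacency-list input are illustrative assumptions, and the parallel implementation that the paper is about is not attempted.

```python
import heapq
import random


def exp_start_time_clustering(adj, beta=0.5, seed=0):
    """Assign each vertex to the center u minimizing dist(u, v) - shift[u]."""
    rng = random.Random(seed)
    shift = {v: rng.expovariate(beta) for v in adj}
    # Every vertex starts as its own candidate center with key -shift[v].
    best = {v: -shift[v] for v in adj}
    center = {v: v for v in adj}
    heap = [(best[v], v) for v in adj]
    heapq.heapify(heap)
    done = set()
    while heap:
        d, v = heapq.heappop(heap)
        if v in done:
            continue  # stale heap entry
        done.add(v)
        for nb in adj[v]:
            # Unit edge lengths: reaching nb through v costs d + 1.
            if nb not in done and d + 1 < best[nb]:
                best[nb] = d + 1
                center[nb] = center[v]
                heapq.heappush(heap, (d + 1, nb))
    return center


# Path graph on six vertices, given as adjacency lists.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
center = exp_start_time_clustering(adj)
```

A larger `beta` gives smaller shifts on average and therefore more, smaller clusters; every cluster contains its own center, which is what the check `center[center[v]] == center[v]` expresses.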

Cited by

Proceedings Article

CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features

Sangdoo Yun¹, Dongyoon Han¹, Sanghyuk Chun¹, Seong Joon Oh, Youngjoon Yoo¹, Junsuk Choe²

Naver Corporation¹, Yonsei University²

07 Aug 2019

TL;DR: CutMix augments the training data by cutting and pasting patches among training images, with the ground truth labels mixed in proportion to the area of the patches.

Abstract: Regional dropout strategies have been proposed to enhance the performance of convolutional neural network classifiers. They have proved to be effective for guiding the model to attend to less discriminative parts of objects (e.g. leg as opposed to head of a person), thereby letting the network generalize better and have better object localization capabilities. On the other hand, current methods for regional dropout remove informative pixels on training images by overlaying a patch of either black pixels or random noise. Such removal is not desirable because it suffers from information loss and inefficiency during training. We therefore propose the CutMix augmentation strategy: patches are cut and pasted among training images where the ground truth labels are also mixed proportionally to the area of the patches. By making efficient use of training pixels and retaining the regularization effect of regional dropout, CutMix consistently outperforms state-of-the-art augmentation strategies on CIFAR and ImageNet classification tasks, as well as on the ImageNet weakly-supervised localization task. Moreover, unlike previous augmentation methods, our CutMix-trained ImageNet classifier, when used as a pretrained model, results in consistent performance gains in Pascal detection and MS-COCO image captioning benchmarks. We also show that CutMix can improve the model robustness against input corruptions and its out-of-distribution detection performance.

3,013 citations
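
The cut-and-paste step and the area-proportional label mixing described in the abstract above can be sketched as follows. Images are plain nested lists (height by width) and labels are one-hot lists so that the example needs only the standard library. Sampling the mixing ratio from a Beta distribution and sizing the box by the square root of the pasted fraction are assumptions of this sketch, as are all function and parameter names.

```python
import math
import random


def cutmix(img_a, label_a, img_b, label_b, alpha=1.0, rng=None):
    """Paste a random box from img_b into img_a and mix labels by pasted area."""
    rng = rng or random.Random(0)
    h, w = len(img_a), len(img_a[0])
    lam = rng.betavariate(alpha, alpha)  # target share kept from img_a
    cut_h = int(h * math.sqrt(1.0 - lam))
    cut_w = int(w * math.sqrt(1.0 - lam))
    cy, cx = rng.randrange(h), rng.randrange(w)  # box center
    top, bottom = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    left, right = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    mixed = [row[:] for row in img_a]
    for r in range(top, bottom):
        mixed[r][left:right] = img_b[r][left:right]
    # Recompute the ratio from the clipped box so labels match the pixels.
    lam_eff = 1.0 - (bottom - top) * (right - left) / (h * w)
    label = [lam_eff * a + (1.0 - lam_eff) * b for a, b in zip(label_a, label_b)]
    return mixed, label


img_a = [[0] * 8 for _ in range(8)]  # all-zero image, class 0
img_b = [[1] * 8 for _ in range(8)]  # all-one image, class 1
mixed, label = cutmix(img_a, [1.0, 0.0], img_b, [0.0, 1.0])
pasted_fraction = sum(sum(row) for row in mixed) / 64.0
```

Because the box is clipped at the image border, the label weight is recomputed from the actual pasted area rather than taken from the sampled ratio, so the mixed label always agrees with the pixel content.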

Posted Content

Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples

Anish Athalye¹, Nicholas Carlini², David Wagner²

Massachusetts Institute of Technology¹, University of California, Berkeley²

01 Feb 2018-arXiv: Learning

TL;DR: This work identifies obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples, and develops attack techniques to overcome this effect.

Abstract: We identify obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples. While defenses that cause obfuscated gradients appear to defeat iterative optimization-based attacks, we find defenses relying on this effect can be circumvented. We describe characteristic behaviors of defenses exhibiting the effect, and for each of the three types of obfuscated gradients we discover, we develop attack techniques to overcome it. In a case study, examining non-certified white-box-secure defenses at ICLR 2018, we find obfuscated gradients are a common occurrence, with 7 of 9 defenses relying on obfuscated gradients. Our new attacks successfully circumvent 6 completely, and 1 partially, in the original threat model each paper considers.

1,757 citations

Proceedings Article

Self-Training With Noisy Student Improves ImageNet Classification

Qizhe Xie¹, Minh-Thang Luong¹, Eduard Hovy², Quoc V. Le¹

Google¹, Carnegie Mellon University²

14 Jun 2020

TL;DR: A simple self-training method that achieves 88.4% top-1 accuracy on ImageNet, which is 2.0% better than the state-of-the-art model that requires 3.5B weakly labeled Instagram images.

Abstract: We present a simple self-training method that achieves 88.4% top-1 accuracy on ImageNet, which is 2.0% better than the state-of-the-art model that requires 3.5B weakly labeled Instagram images. On robustness test sets, it improves ImageNet-A top-1 accuracy from 61.0% to 83.7%, reduces ImageNet-C mean corruption error from 45.7 to 28.3, and reduces ImageNet-P mean flip rate from 27.8 to 12.2. To achieve this result, we first train an EfficientNet model on labeled ImageNet images and use it as a teacher to generate pseudo labels on 300M unlabeled images. We then train a larger EfficientNet as a student model on the combination of labeled and pseudo labeled images. We iterate this process by putting back the student as the teacher. During the generation of the pseudo labels, the teacher is not noised so that the pseudo labels are as accurate as possible. However, during the learning of the student, we inject noise such as dropout, stochastic depth and data augmentation via RandAugment to the student so that the student generalizes better than the teacher.

1,696 citations

Journal Article

Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey

Naveed Akhtar¹, Ajmal Mian¹

University of Western Australia¹

19 Feb 2018-IEEE Access

TL;DR: This paper presents a comprehensive survey of adversarial attacks on deep learning in computer vision, reviewing works that design adversarial attacks, analyze the existence of such attacks, and propose defenses against them.

Abstract: Deep learning is at the heart of the current rise of artificial intelligence. In the field of computer vision, it has become the workhorse for applications ranging from self-driving cars to surveillance and security. Whereas deep neural networks have demonstrated phenomenal success (often beyond human capabilities) in solving complex problems, recent studies show that they are vulnerable to adversarial attacks in the form of subtle perturbations to inputs that lead a model to predict incorrect outputs. For images, such perturbations are often too small to be perceptible, yet they completely fool the deep learning models. Adversarial attacks pose a serious threat to the success of deep learning in practice. This fact has recently led to a large influx of contributions in this direction. This paper presents the first comprehensive survey on adversarial attacks on deep learning in computer vision. We review the works that design adversarial attacks, analyze the existence of such attacks and propose defenses against them. To emphasize that adversarial attacks are possible in practical conditions, we separately review the contributions that evaluate adversarial attacks in real-world scenarios. Finally, drawing on the reviewed literature, we provide a broader outlook of this research direction.

1,542 citations

Posted Content

Computational Optimal Transport

Gabriel Peyré, Marco Cuturi

01 Mar 2018-arXiv: Machine Learning

TL;DR: This short book reviews OT with a bias toward numerical methods and their applications in data sciences, and sheds light on the theoretical properties of OT that make it particularly useful for some of these applications.

Abstract: Optimal transport (OT) theory can be informally described using the words of the French mathematician Gaspard Monge (1746-1818): A worker with a shovel in hand has to move a large pile of sand lying on a construction site. The goal of the worker is to erect with all that sand a target pile with a prescribed shape (for example, that of a giant sand castle). Naturally, the worker wishes to minimize her total effort, quantified for instance as the total distance or time spent carrying shovelfuls of sand. Mathematicians interested in OT cast that problem as that of comparing two probability distributions, two different piles of sand of the same volume. They consider all of the many possible ways to morph, transport or reshape the first pile into the second, and associate a "global" cost to every such transport, using the "local" consideration of how much it costs to move a grain of sand from one place to another. Recent years have witnessed the spread of OT in several fields, thanks to the emergence of approximate solvers that can scale to sizes and dimensions that are relevant to data sciences. Thanks to this newfound scalability, OT is being increasingly used to unlock various problems in imaging sciences (such as color or texture processing), computer vision and graphics (for shape manipulation) or machine learning (for regression, classification and density fitting). This short book reviews OT with a bias toward numerical methods and their applications in data sciences, and sheds light on the theoretical properties of OT that make it particularly useful for some of these applications.

1,355 citations
