Home
/
Authors
/
Saeed Ghadimi

Author

Saeed Ghadimi

Other affiliations: University of Waterloo, University of Florida

Bio: Saeed Ghadimi is an academic researcher from Princeton University. The author has contributed to research in topics: Stochastic optimization & Stochastic approximation. The author has an hindex of 17, co-authored 32 publications receiving 2361 citations. Previous affiliations of Saeed Ghadimi include University of Waterloo & University of Florida.

Papers

PDF

Open Access

More filters

Journal Article•DOI•

Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming

[...]

Saeed Ghadimi, Guanghui Lan

03 Dec 2013-Siam Journal on Optimization

TL;DR: The randomized stochastic gradient (RSG) algorithm as mentioned in this paper is a type of approximation algorithm for non-convex nonlinear programming problems, and it has a nearly optimal rate of convergence if the problem is convex.

...read moreread less

Abstract: In this paper, we introduce a new stochastic approximation type algorithm, namely, the randomized stochastic gradient (RSG) method, for solving an important class of nonlinear (possibly nonconvex) stochastic programming problems. We establish the complexity of this method for computing an approximate stationary point of a nonlinear programming problem. We also show that this method possesses a nearly optimal rate of convergence if the problem is convex. We discuss a variant of the algorithm which consists of applying a postoptimization phase to evaluate a short list of solutions generated by several independent runs of the RSG method, and we show that such modification allows us to improve significantly the large-deviation properties of the algorithm. These methods are then specialized for solving a class of simulation-based optimization problems in which only stochastic zeroth-order information is available.

...read moreread less

599 citations

Journal Article•DOI•

Accelerated gradient methods for nonconvex nonlinear and stochastic programming

[...]

Saeed Ghadimi¹, Guanghui Lan¹•Institutions (1)

University of Florida¹

01 Mar 2016-Mathematical Programming

TL;DR: The AG method is generalized to solve nonconvex and possibly stochastic optimization problems and it is demonstrated that by properly specifying the stepsize policy, the AG method exhibits the best known rate of convergence for solving general non Convex smooth optimization problems by using first-order information, similarly to the gradient descent method.

...read moreread less

Abstract: In this paper, we generalize the well-known Nesterov's accelerated gradient (AG) method, originally designed for convex smooth optimization, to solve nonconvex and possibly stochastic optimization problems. We demonstrate that by properly specifying the stepsize policy, the AG method exhibits the best known rate of convergence for solving general nonconvex smooth optimization problems by using first-order information, similarly to the gradient descent method. We then consider an important class of composite optimization problems and show that the AG method can solve them uniformly, i.e., by using the same aggressive stepsize policy as in the convex case, even if the problem turns out to be nonconvex. We demonstrate that the AG method exhibits an optimal rate of convergence if the composite problem is convex, and improves the best known rate of convergence if the problem is nonconvex. Based on the AG method, we also present new nonconvex stochastic approximation methods and show that they can improve a few existing rates of convergence for nonconvex stochastic optimization. To the best of our knowledge, this is the first time that the convergence of the AG method has been established for solving nonconvex nonlinear programming in the literature.

...read moreread less

578 citations

Journal Article•DOI•

Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization

[...]

Saeed Ghadimi¹, Guanghui Lan¹, Hongchao Zhang²•Institutions (2)

University of Florida¹, Louisiana State University²

01 Jan 2016-Mathematical Programming

TL;DR: In this paper, a randomized stochastic projected gradient (RSPG) algorithm was proposed to solve the convex composite optimization problem, in which proper mini-batch of samples are taken at each iteration depending on the total budget of stochiastic samples allowed, and a post-optimization phase was also proposed to reduce the variance of the solutions returned by the algorithm.

...read moreread less

Abstract: This paper considers a class of constrained stochastic composite optimization problems whose objective function is given by the summation of a differentiable (possibly nonconvex) component, together with a certain non-differentiable (but convex) component. In order to solve these problems, we propose a randomized stochastic projected gradient (RSPG) algorithm, in which proper mini-batch of samples are taken at each iteration depending on the total budget of stochastic samples allowed. The RSPG algorithm also employs a general distance function to allow taking advantage of the geometry of the feasible region. Complexity of this algorithm is established in a unified setting, which shows nearly optimal complexity of the algorithm for convex stochastic programming. A post-optimization phase is also proposed to significantly reduce the variance of the solutions returned by the algorithm. In addition, based on the RSPG algorithm, a stochastic gradient free algorithm, which only uses the stochastic zeroth-order information, has been also discussed. Some preliminary numerical results are also provided.

...read moreread less

375 citations

Journal Article•DOI•

Optimal Stochastic Approximation Algorithms for Strongly Convex Stochastic Composite Optimization I: A Generic Algorithmic Framework

[...]

Saeed Ghadimi¹, Guanghui Lan¹•Institutions (1)

University of Florida¹

06 Nov 2012-Siam Journal on Optimization

TL;DR: This paper investigates the AC-SA algorithms for solving strongly convex stochastic composite optimization problems in more detail by establishing the large-deviation results associated with the convergence rates and introducing an efficient validation procedure to check the accuracy of the generated solutions.

...read moreread less

Abstract: In this paper we present a generic algorithmic framework, namely, the accelerated stochastic approximation (AC-SA) algorithm, for solving strongly convex stochastic composite optimization (SCO) problems. While the classical stochastic approximation algorithms are asymptotically optimal for solving differentiable and strongly convex problems, the AC-SA algorithm, when employed with proper stepsize policies, can achieve optimal or nearly optimal rates of convergence for solving different classes of SCO problems during a given number of iterations. Moreover, we investigate these AC-SA algorithms in more detail, such as by establishing the large-deviation results associated with the convergence rates and introducing an efficient validation procedure to check the accuracy of the generated solutions.

...read moreread less

366 citations

Posted Content•

Stochastic First- and Zeroth-order Methods for Nonconvex Stochastic Programming

[...]

Saeed Ghadimi, Guanghui Lan

22 Sep 2013-arXiv: Optimization and Control

TL;DR: This paper discusses a variant of the algorithm which consists of applying a post-optimization phase to evaluate a short list of solutions generated by several independent runs of the RSG method, and shows that such modification allows to improve significantly the large-deviation properties of the algorithms.

...read moreread less

Abstract: In this paper, we introduce a new stochastic approximation (SA) type algorithm, namely the randomized stochastic gradient (RSG) method, for solving an important class of nonlinear (possibly nonconvex) stochastic programming (SP) problems. We establish the complexity of this method for computing an approximate stationary point of a nonlinear programming problem. We also show that this method possesses a nearly optimal rate of convergence if the problem is convex. We discuss a variant of the algorithm which consists of applying a post-optimization phase to evaluate a short list of solutions generated by several independent runs of the RSG method, and show that such modification allows to improve significantly the large-deviation properties of the algorithm. These methods are then specialized for solving a class of simulation-based optimization problems in which only stochastic zeroth-order information is available.

...read moreread less

293 citations

1
2
3
4
…
5
6
7
8

Collapse

Cited by

PDF

Open Access

More filters

Journal Article•DOI•

I and i

[...]

Kevin Barraclough

08 Dec 2001-BMJ

TL;DR: There is, I think, something ethereal about i —the square root of minus one, which seems an odd beast at that time—an intruder hovering on the edge of reality.

...read moreread less

Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

...read moreread less

33,785 citations

Proceedings Article•DOI•

ZOO: Zeroth Order Optimization Based Black-box Attacks to Deep Neural Networks without Training Substitute Models

[...]

Pin-Yu Chen¹, Huan Zhang², Yash Sharma¹, Jinfeng Yi¹, Cho-Jui Hsieh² - Show less +1 more•Institutions (2)

IBM¹, University of California, Davis²

03 Nov 2017

TL;DR: An effective black-box attack that also only has access to the input (images) and the output (confidence scores) of a targeted DNN is proposed, sparing the need for training substitute models and avoiding the loss in attack transferability.

...read moreread less

Abstract: Deep neural networks (DNNs) are one of the most prominent technologies of our time, as they achieve state-of-the-art performance in many machine learning tasks, including but not limited to image classification, text mining, and speech processing. However, recent research on DNNs has indicated ever-increasing concern on the robustness to adversarial examples, especially for security-critical tasks such as traffic sign identification for autonomous driving. Studies have unveiled the vulnerability of a well-trained DNN by demonstrating the ability of generating barely noticeable (to both human and machines) adversarial images that lead to misclassification. Furthermore, researchers have shown that these adversarial images are highly transferable by simply training and attacking a substitute model built upon the target model, known as a black-box attack to DNNs. Similar to the setting of training substitute models, in this paper we propose an effective black-box attack that also only has access to the input (images) and the output (confidence scores) of a targeted DNN. However, different from leveraging attack transferability from substitute models, we propose zeroth order optimization (ZOO) based attacks to directly estimate the gradients of the targeted DNN for generating adversarial examples. We use zeroth order stochastic coordinate descent along with dimension reduction, hierarchical attack and importance sampling techniques to efficiently attack black-box models. By exploiting zeroth order optimization, improved attacks to the targeted DNN can be accomplished, sparing the need for training substitute models and avoiding the loss in attack transferability. Experimental results on MNIST, CIFAR10 and ImageNet show that the proposed ZOO attack is as effective as the state-of-the-art white-box attack (e.g., Carlini and Wagner's attack) and significantly outperforms existing black-box attacks via substitute models.

...read moreread less

1,271 citations

Journal Article•

Riemann manifold Langevin and Hamiltonian Monte Carlo methods

[...]

Mark Girolami, Ben Calderhead

01 Jan 2011-Journal of the Royal Statistical Society

TL;DR: The methodology proposed automatically adapts to the local structure when simulating paths across this manifold, providing highly efficient convergence and exploration of the target density, and substantial improvements in the time‐normalized effective sample size are reported when compared with alternative sampling approaches.

...read moreread less

Abstract: The paper proposes Metropolis adjusted Langevin and Hamiltonian Monte Carlo sampling methods defined on the Riemann manifold to resolve the shortcomings of existing Monte Carlo algorithms when sampling from target densities that may be high dimensional and exhibit strong correlations. The methods provide fully automated adaptation mechanisms that circumvent the costly pilot runs that are required to tune proposal densities for Metropolis-Hastings or indeed Hamiltonian Monte Carlo and Metropolis adjusted Langevin algorithms. This allows for highly efficient sampling even in very high dimensions where different scalings may be required for the transient and stationary phases of the Markov chain. The methodology proposed exploits the Riemann geometry of the parameter space of statistical models and thus automatically adapts to the local structure when simulating paths across this manifold, providing highly efficient convergence and exploration of the target density. The performance of these Riemann manifold Monte Carlo methods is rigorously assessed by performing inference on logistic regression models, log-Gaussian Cox point processes, stochastic volatility models and Bayesian estimation of dynamic systems described by non-linear differential equations. Substantial improvements in the time-normalized effective sample size are reported when compared with alternative sampling approaches. MATLAB code that is available from http://www.ucl.ac.uk/statistics/research/rmhmc allows replication of all the results reported.

...read moreread less

1,031 citations

Proceedings Article•

A Stochastic Gradient Method with an Exponential Convergence _Rate for Finite Training Sets

[...]

Nicolas Le Roux¹, Mark Schmidt¹, Francis Bach¹•Institutions (1)

École Normale Supérieure¹

03 Dec 2012

TL;DR: In this paper, a new stochastic gradient method was proposed to optimize the sum of a finite set of smooth functions, where the sum is strongly convex, with a memory of previous gradient values in order to achieve a linear convergence rate.

...read moreread less

Abstract: We propose a new stochastic gradient method for optimizing the sum of a finite set of smooth functions, where the sum is strongly convex. While standard stochastic gradient methods converge at sublinear rates for this problem, the proposed method incorporates a memory of previous gradient values in order to achieve a linear convergence rate. In a machine learning context, numerical experiments indicate that the new algorithm can dramatically outperform standard algorithms, both in terms of optimizing the training error and reducing the test error quickly.

...read moreread less

781 citations

Proceedings Article•DOI•

ZOO: Zeroth Order Optimization based Black-box Attacks to Deep Neural Networks without Training Substitute Models

[...]

Pin-Yu Chen¹, Huan Zhang², Yash Sharma¹, Jinfeng Yi¹, Cho-Jui Hsieh² - Show less +1 more•Institutions (2)

IBM¹, University of California, Davis²

14 Aug 2017-arXiv: Machine Learning

TL;DR: Zeroth order optimization (ZOO) as discussed by the authors was proposed to estimate the gradients of the target DNN for generating adversarial examples, which was shown to be as effective as the state-of-the-art white-box attack.

...read moreread less

Abstract: Deep neural networks (DNNs) are one of the most prominent technologies of our time, as they achieve state-of-the-art performance in many machine learning tasks, including but not limited to image classification, text mining, and speech processing. However, recent research on DNNs has indicated ever-increasing concern on the robustness to adversarial examples, especially for security-critical tasks such as traffic sign identification for autonomous driving. Studies have unveiled the vulnerability of a well-trained DNN by demonstrating the ability of generating barely noticeable (to both human and machines) adversarial images that lead to misclassification. Furthermore, researchers have shown that these adversarial images are highly transferable by simply training and attacking a substitute model built upon the target model, known as a black-box attack to DNNs. Similar to the setting of training substitute models, in this paper we propose an effective black-box attack that also only has access to the input (images) and the output (confidence scores) of a targeted DNN. However, different from leveraging attack transferability from substitute models, we propose zeroth order optimization (ZOO) based attacks to directly estimate the gradients of the targeted DNN for generating adversarial examples. We use zeroth order stochastic coordinate descent along with dimension reduction, hierarchical attack and importance sampling techniques to efficiently attack black-box models. By exploiting zeroth order optimization, improved attacks to the targeted DNN can be accomplished, sparing the need for training substitute models and avoiding the loss in attack transferability. Experimental results on MNIST, CIFAR10 and ImageNet show that the proposed ZOO attack is as effective as the state-of-the-art white-box attack and significantly outperforms existing black-box attacks via substitute models.

...read moreread less

770 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse