Author

Stephen J. Wright

Bio: Stephen J. Wright is an academic researcher from the University of Wisconsin-Madison. The author has contributed to research in topics including Interior point method and Nonlinear programming. The author has an h-index of 61 and has co-authored 294 publications receiving 46,774 citations. Previous affiliations of Stephen J. Wright include Argonne National Laboratory and Birkbeck, University of London.


Papers
Book
01 Nov 2008
TL;DR: Numerical Optimization presents a comprehensive and up-to-date description of the most effective methods in continuous optimization, responding to the growing interest in optimization in engineering, science, and business by focusing on the methods that are best suited to practical problems.
Abstract: Numerical Optimization presents a comprehensive and up-to-date description of the most effective methods in continuous optimization. It responds to the growing interest in optimization in engineering, science, and business by focusing on the methods that are best suited to practical problems. For this new edition the book has been thoroughly updated throughout. There are new chapters on nonlinear interior methods and derivative-free methods for optimization, both of which are used widely in practice and the focus of much current research. Because of the emphasis on practical methods, as well as the extensive illustrations and exercises, the book is accessible to a wide audience. It can be used as a graduate text in engineering, operations research, mathematics, computer science, and business. It also serves as a handbook for researchers and practitioners in the field. The authors have strived to produce a text that is pleasant to read, informative, and rigorous - one that reveals both the beautiful nature of the discipline and its practical side.

17,420 citations

Journal ArticleDOI
TL;DR: This paper proposes gradient projection algorithms for the bound-constrained quadratic programming (BCQP) formulation of these problems and tests variants of this approach that select the line search parameters in different ways, including techniques based on the Barzilai-Borwein method.
Abstract: Many problems in signal processing and statistical inference involve finding sparse solutions to under-determined, or ill-conditioned, linear systems of equations. A standard approach consists in minimizing an objective function which includes a quadratic (squared ℓ2) error term combined with a sparseness-inducing regularization term. Basis pursuit, the least absolute shrinkage and selection operator (LASSO), wavelet-based deconvolution, and compressed sensing are a few well-known examples of this approach. This paper proposes gradient projection (GP) algorithms for the bound-constrained quadratic programming (BCQP) formulation of these problems. We test variants of this approach that select the line search parameters in different ways, including techniques based on the Barzilai-Borwein method. Computational experiments show that these GP approaches perform well in a wide range of applications, often being significantly faster (in terms of computation time) than competing methods. Although the performance of GP methods tends to degrade as the regularization term is de-emphasized, we show how they can be embedded in a continuation scheme to recover their efficient practical performance.
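As a rough illustration of the approach described above, the sketch below applies projected gradient steps with a Barzilai-Borwein step length to the BCQP obtained by splitting x = u - v with u, v >= 0. Function and parameter names are illustrative assumptions, not the authors' GPSR implementation.

```python
import numpy as np

def gp_bb_l1(A, y, tau, n_iter=200):
    # Minimize 0.5*||A(u - v) - y||^2 + tau*sum(u) + tau*sum(v) over u, v >= 0
    # by projected gradient steps with a Barzilai-Borwein step length.
    n = A.shape[1]
    u = np.zeros(n)
    v = np.zeros(n)
    alpha = 1.0                       # initial step length
    z_old = grad_old = None
    for _ in range(n_iter):
        g = A.T @ (A @ (u - v) - y)   # gradient of the quadratic term
        grad_u = g + tau              # gradient with respect to u
        grad_v = -g + tau             # gradient with respect to v
        z = np.concatenate([u, v])
        grad = np.concatenate([grad_u, grad_v])
        if z_old is not None:         # Barzilai-Borwein step: alpha = s's / s'yk
            s, yk = z - z_old, grad - grad_old
            denom = s @ yk
            if denom > 1e-12:
                alpha = (s @ s) / denom
        z_old, grad_old = z, grad
        u = np.maximum(u - alpha * grad_u, 0.0)   # projection onto u >= 0
        v = np.maximum(v - alpha * grad_v, 0.0)   # projection onto v >= 0
    return u - v                      # recovered sparse solution x = u - v
```

In practice the published method adds a line search and a continuation scheme over tau; this sketch keeps only the projected BB step to show the core idea.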

3,488 citations

Book
01 Jan 1987
TL;DR: This book develops primal-dual interior-point methods for linear programming, covering the central path, path-following, potential-reduction, and infeasible-interior-point algorithms, their complexity and superlinear convergence theory, extensions, and practical implementation.
Abstract: Preface. Notation.
1. Introduction: Linear Programming; Primal-Dual Methods; The Central Path; A Primal-Dual Framework; Path-Following Methods; Potential-Reduction Methods; Infeasible Starting Points; Superlinear Convergence; Extensions; Mehrotra's Predictor-Corrector Algorithm; Linear Algebra Issues; Karmarkar's Algorithm.
2. Background: Linear Programming and Interior-Point Methods. Standard Form; Optimality Conditions, Duality, and Solution Sets; The B ∪ N Partition and Strict Complementarity; A Strictly Interior Point; Rank of the Matrix A; Bases and Vertices; Farkas's Lemma and a Proof of the Goldman-Tucker Result; The Central Path; Background: Primal Methods; Primal-Dual Methods: Development of the Fundamental Ideas; Notes and References.
3. Complexity Theory: Polynomial versus Exponential, Worst Case versus Average Case; Storing the Problem Data: Dimension and Size; The Turing Machine and Rational Arithmetic; Primal-Dual Methods and Rational Arithmetic; Linear Programming and Rational Numbers; Moving to a Solution from an Interior Point; Complexity of Simplex, Ellipsoid, and Interior-Point Methods; Polynomial and Strongly Polynomial Algorithms; Beyond the Turing Machine Model; More on the Real-Number Model and Algebraic Complexity; A General Complexity Theorem for Path-Following Methods; Notes and References.
4. Potential-Reduction Methods: A Primal-Dual Potential-Reduction Algorithm; Reducing Φ_ρ Forces Convergence; A Quadratic Estimate of Φ_ρ along a Feasible Direction; Bounding the Coefficients in the Quadratic Approximation; An Estimate of the Reduction in Φ_ρ and Polynomial Complexity; What About Centrality?; Choosing ρ and α; Notes and References.
5. Path-Following Algorithms: The Short-Step Path-Following Algorithm; Technical Results; The Predictor-Corrector Method; A Long-Step Path-Following Algorithm; Limit Points of the Iteration Sequence; Proof of Lemma 5.3; Notes and References.
6. Infeasible-Interior-Point Algorithms: The Algorithm; Convergence of Algorithm IPF; Technical Results I: Bounds on ν_k ‖(x^k, s^k)‖; Technical Results II: Bounds on (D^k)^{-1} Δx^k and D^k Δs^k; Technical Results III: A Uniform Lower Bound on α_k; Proofs of Theorems 6.1 and 6.2; Limit Points of the Iteration Sequence.
7. Superlinear Convergence and Finite Termination: Affine-Scaling Steps; An Estimate of (Δx, Δs): The Feasible Case; An Estimate of (Δx, Δs): The Infeasible Case; Algorithm PC Is Superlinear; Nearly Quadratic Methods; Convergence of Algorithm LPF+; Convergence of the Iteration Sequence; ε(A,b,c) and Finite Termination; A Finite Termination Strategy; Recovering an Optimal Basis; More on ε(A,b,c); Notes and References.
8. Extensions: The Monotone LCP; Mixed and Horizontal LCP; Strict Complementarity and LCP; Convex QP; Convex Programming; Monotone Nonlinear Complementarity and Variational Inequalities; Semidefinite Programming; Proof of Theorem 8.4; Notes and References.
9. Detecting Infeasibility: Self-Duality; The Simplified HSD Form; The HSDl Form; Identifying a Solution-Free Region; Implementations of the HSD Formulations; Notes and References.
10. Practical Aspects of Primal-Dual Algorithms: Motivation for Mehrotra's Algorithm; The Algorithm; Superquadratic Convergence; Second-Order Trajectory-Following Methods; Higher-Order Methods; Further Enhancements; Notes and References.
11. Implementations: Three Forms of the Step Equation; The Cholesky Factorization; Sparse Cholesky Factorization: Minimum-Degree Orderings; Other Orderings; Small Pivots in the Cholesky Factorization; Dense Columns in A; The Augmented System Formulation.
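To make the path-following framework in this outline concrete, the sketch below takes one damped Newton step on the perturbed KKT conditions of a standard-form LP (min c'x subject to Ax = b, x >= 0), with centering parameter σ and a fraction-to-the-boundary step length. It is a schematic illustration using dense linear algebra, not code from the book; all names are assumptions.

```python
import numpy as np

def primal_dual_step(A, b, c, x, lam, s, sigma=0.2):
    """One damped Newton step of a primal-dual path-following method."""
    m, n = A.shape
    mu = x @ s / n                        # duality measure
    # residuals of the perturbed KKT conditions
    r_c = A.T @ lam + s - c               # dual residual
    r_b = A @ x - b                       # primal residual
    r_xs = x * s - sigma * mu             # perturbed complementarity
    # assemble and solve the full Newton system
    X, S = np.diag(x), np.diag(s)
    J = np.block([
        [np.zeros((n, n)), A.T,              np.eye(n)],
        [A,                np.zeros((m, m)), np.zeros((m, n))],
        [S,                np.zeros((n, m)), X],
    ])
    d = np.linalg.solve(J, -np.concatenate([r_c, r_b, r_xs]))
    dx, dlam, ds = d[:n], d[n:n + m], d[n + m:]
    # step length keeping (x, s) strictly positive
    alpha = 1.0
    for v, dv in ((x, dx), (s, ds)):
        neg = dv < 0
        if neg.any():
            alpha = min(alpha, 0.99 * np.min(-v[neg] / dv[neg]))
    return x + alpha * dx, lam + alpha * dlam, s + alpha * ds
```

Practical codes instead eliminate variables to solve the normal-equations or augmented-system form with a sparse Cholesky factorization, as Chapter 11 of the outline indicates.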

2,277 citations

Book
28 Apr 2000
TL;DR: Numerical Optimization is a graduate text on continuous optimization that discusses algorithms and their practical performance in depth.
Abstract: Optimization is an important tool in decision science and in the analysis of physical systems, but its use requires substantial mathematical background. Numerical Optimization is a graduate text in continuous optimization that discusses algorithmic performance extensively, introducing new ideas progressively and treating the material thoroughly throughout.

2,193 citations

Proceedings Article
12 Dec 2011
TL;DR: In this paper, the authors present an update scheme called HOGWILD!, which gives processors unsynchronized access to shared memory with the possibility of overwriting each other's work, and show that it achieves a nearly optimal rate of convergence when the optimization problem is sparse.
Abstract: Stochastic Gradient Descent (SGD) is a popular algorithm that can achieve state-of-the-art performance on a variety of machine learning tasks. Several researchers have recently proposed schemes to parallelize SGD, but all require performance-destroying memory locking and synchronization. This work aims to show using novel theoretical analysis, algorithms, and implementation that SGD can be implemented without any locking. We present an update scheme called HOGWILD! which allows processors access to shared memory with the possibility of overwriting each other's work. We show that when the associated optimization problem is sparse, meaning most gradient updates only modify small parts of the decision variable, then HOGWILD! achieves a nearly optimal rate of convergence. We demonstrate experimentally that HOGWILD! outperforms alternative schemes that use locking by an order of magnitude.
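The update pattern described above can be sketched as follows: worker threads repeatedly apply sparse gradient updates to a shared weight vector with no locking. This is only a schematic illustration (Python's GIL prevents real parallel speedups here); the least-squares loss, sample format, and all names are assumptions, not the authors' implementation.

```python
import numpy as np
import threading

def worker(w, samples, lr=0.1):
    # Each sample is (idx, x_sparse, y): nonzero feature indices, their values,
    # and a target. Updates touch only the coordinates in idx, with no lock.
    for idx, x_sparse, y in samples:
        pred = w[idx] @ x_sparse              # prediction uses a few coordinates
        g = (pred - y) * x_sparse             # gradient of 0.5 * (pred - y)^2
        w[idx] -= lr * g                      # unsynchronized in-place update

rng = np.random.default_rng(0)
n_features, n_threads = 1000, 4
w = np.zeros(n_features)                      # shared parameter vector, no lock
threads = []
for _ in range(n_threads):
    samples = [(rng.choice(n_features, size=5, replace=False),
                rng.normal(size=5), rng.normal()) for _ in range(2000)]
    t = threading.Thread(target=worker, args=(w, samples))
    threads.append(t)
    t.start()
for t in threads:
    t.join()
```

Because each update modifies only a handful of coordinates, two threads rarely touch the same entry at the same time, which is the sparsity condition the paper's convergence analysis relies on.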

1,939 citations


Cited by
Book
18 Nov 2016
TL;DR: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts; it is used in many applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames.
Abstract: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

38,208 citations

Journal ArticleDOI
TL;DR: AutoDock Vina achieves an approximately two orders of magnitude speed‐up compared with the molecular docking software previously developed in the lab, while also significantly improving the accuracy of the binding mode predictions, judging by tests on the training set used in AutoDock 4 development.
Abstract: AutoDock Vina, a new program for molecular docking and virtual screening, is presented. AutoDock Vina achieves an approximately two orders of magnitude speed-up compared with the molecular docking software previously developed in our lab (AutoDock 4), while also significantly improving the accuracy of the binding mode predictions, judging by our tests on the training set used in AutoDock 4 development. Further speed-up is achieved from parallelism, by using multithreading on multicore machines. AutoDock Vina automatically calculates the grid maps and clusters the results in a way transparent to the user.

20,059 citations

Book
D.L. Donoho
01 Jan 2004
TL;DR: It is possible to design n = O(N log(m)) nonadaptive measurements allowing reconstruction with accuracy comparable to that attainable with direct knowledge of the N most important coefficients, and a good approximation to those N important coefficients is extracted from the n measurements by solving a linear program, known as Basis Pursuit in signal processing.
Abstract: Suppose x is an unknown vector in ℝ^m (a digital image or signal); we plan to measure n general linear functionals of x and then reconstruct. If x is known to be compressible by transform coding with a known transform, and we reconstruct via the nonlinear procedure defined here, the number of measurements n can be dramatically smaller than the size m. Thus, certain natural classes of images with m pixels need only n = O(m^{1/4} log^{5/2}(m)) nonadaptive nonpixel samples for faithful recovery, as opposed to the usual m pixel samples. More specifically, suppose x has a sparse representation in some orthonormal basis (e.g., wavelet, Fourier) or tight frame (e.g., curvelet, Gabor), so that the coefficients belong to an ℓ^p ball for 0 < p ≤ 1.
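A minimal sketch of the Basis Pursuit reconstruction mentioned above: min ‖x‖_1 subject to Ax = y is recast as a linear program in (u, v) with x = u - v and solved with an off-the-shelf LP solver. The helper name, problem sizes, and use of scipy.optimize.linprog are illustrative assumptions, not code from the paper.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, y):
    # min ||x||_1 s.t. A x = y, written as min sum(u) + sum(v)
    # s.t. A u - A v = y, u >= 0, v >= 0, with x = u - v.
    n = A.shape[1]
    c = np.ones(2 * n)                      # objective equals ||x||_1
    A_eq = np.hstack([A, -A])               # equality constraint A u - A v = y
    res = linprog(c, A_eq=A_eq, b_eq=y,
                  bounds=[(0, None)] * (2 * n), method="highs")
    u, v = res.x[:n], res.x[n:]
    return u - v

# Toy usage: recover a sparse vector from a few random linear measurements.
rng = np.random.default_rng(1)
n, k, m = 128, 5, 40                        # ambient dimension, sparsity, measurements
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.normal(size=k)
A = rng.normal(size=(m, n)) / np.sqrt(m)
x_hat = basis_pursuit(A, A @ x_true)        # should be close to x_true
```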

18,609 citations

Book
23 May 2011
TL;DR: It is argued that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas.
Abstract: Many problems of recent interest in statistics and machine learning can be posed in the framework of convex optimization. Due to the explosion in size and complexity of modern datasets, it is increasingly important to be able to solve problems with a very large number of features or training examples. As a result, both the decentralized collection or storage of these datasets as well as accompanying distributed solution methods are either necessary or at least highly desirable. In this review, we argue that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas. The method was developed in the 1970s, with roots in the 1950s, and is equivalent or closely related to many other algorithms, such as dual decomposition, the method of multipliers, Douglas–Rachford splitting, Spingarn's method of partial inverses, Dykstra's alternating projections, Bregman iterative algorithms for l1 problems, proximal methods, and others. After briefly surveying the theory and history of the algorithm, we discuss applications to a wide variety of statistical and machine learning problems of recent interest, including the lasso, sparse logistic regression, basis pursuit, covariance selection, support vector machines, and many others. We also discuss general distributed optimization, extensions to the nonconvex setting, and efficient implementation, including some details on distributed MPI and Hadoop MapReduce implementations.
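As one concrete instance of the method surveyed above, here is a minimal ADMM sketch for the lasso, min 0.5‖Ax - b‖² + λ‖x‖_1, using the splitting x = z: an x-update via a cached Cholesky solve, a z-update by soft-thresholding, and a dual update. Parameter names and defaults are illustrative assumptions, not the review's reference code.

```python
import numpy as np

def soft_threshold(v, kappa):
    # proximal operator of kappa * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def admm_lasso(A, b, lam, rho=1.0, n_iter=100):
    m, n = A.shape
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)                           # scaled dual variable
    # factor (A'A + rho I) once; it is reused by every x-update
    L = np.linalg.cholesky(A.T @ A + rho * np.eye(n))
    Atb = A.T @ b
    for _ in range(n_iter):
        # x-update: solve (A'A + rho I) x = A'b + rho (z - u)
        x = np.linalg.solve(L.T, np.linalg.solve(L, Atb + rho * (z - u)))
        # z-update: proximal step for the l1 term (soft-thresholding)
        z = soft_threshold(x + u, lam / rho)
        # dual update
        u = u + x - z
    return z
```

Caching the factorization is what makes ADMM attractive at scale here: each iteration costs only triangular solves plus elementwise operations, and the same pattern distributes naturally when the data are split across machines.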

17,433 citations
