Author

Brendan O'Donoghue

Bio: Brendan O'Donoghue is a researcher at Google whose work spans reinforcement learning and convex optimization. The author has an h-index of 22 and has co-authored 48 publications that have received 4,408 citations. Previous affiliations include Stanford University.

Papers

Journal Article•DOI•

Clinically applicable deep learning for diagnosis and referral in retinal disease

Jeffrey De Fauw, Joseph R. Ledsam, Bernardino Romera-Paredes, Stanislav Nikolov, Nenad Tomasev, Sam Blackwell, Harry Askham, Xavier Glorot, Brendan O'Donoghue, Daniel Visentin, George van den Driessche, Balaji Lakshminarayanan, Clemens Meyer, Faith Mackinder, Simon Bouton, Kareem Ayoub, Reena Chopra¹, Dominic King, Alan Karthikesalingam, Cian Hughes², Rosalind Raine², Julian Hughes¹, Dawn A Sim¹, Catherine A Egan¹, Adnan Tufail¹, Hugh Montgomery², Demis Hassabis, Geraint Rees², Trevor Back, Peng T. Khaw¹, Mustafa Suleyman, Julien Cornebise², Pearse A. Keane¹, Olaf Ronneberger•Institutions (2)

UCL Institute of Ophthalmology¹, University College London²

13 Aug 2018-Nature Medicine

TL;DR: A novel deep learning architecture performs device-independent tissue segmentation of clinical 3D retinal images followed by separate diagnostic classification that meets or exceeds human expert clinical diagnoses of retinal disease.

Abstract: The volume and complexity of diagnostic imaging is increasing at a pace faster than the availability of human expertise to interpret it. Artificial intelligence has shown great promise in classifying two-dimensional photographs of some common diseases and typically relies on databases of millions of annotated images. Until now, the challenge of reaching the performance of expert clinicians in a real-world clinical pathway with three-dimensional diagnostic scans has remained unsolved. Here, we apply a novel deep learning architecture to a clinically heterogeneous set of three-dimensional optical coherence tomography scans from patients referred to a major eye hospital. We demonstrate performance in making a referral recommendation that reaches or exceeds that of experts on a range of sight-threatening retinal diseases after training on only 14,884 scans. Moreover, we demonstrate that the tissue segmentations produced by our architecture act as a device-independent representation; referral accuracy is maintained when using tissue segmentations from a different type of device. Our work removes previous barriers to wider clinical use without prohibitive training data requirements across multiple pathologies in a real-world setting.

1,665 citations

Journal Article•DOI•

Fast Alternating Direction Optimization Methods

Tom Goldstein¹, Brendan O'Donoghue², Simon Setzer³, Richard G. Baraniuk⁴•Institutions (4)

University of Maryland, College Park¹, Stanford University², University of Mannheim³, Houston Methodist Hospital⁴

05 Aug 2014-SIAM Journal on Imaging Sciences

TL;DR: This paper considers accelerated variants of two common alternating direction methods, the alternating direction method of multipliers (ADMM) and the alternating minimization algorithm (AMA), using acceleration of the form first proposed by Nesterov for gradient descent methods.

Abstract: Alternating direction methods are a common tool for general mathematical programming and optimization. These methods have become particularly important in the field of variational image processing, which frequently requires the minimization of nondifferentiable objectives. This paper considers accelerated (i.e., fast) variants of two common alternating direction methods: the alternating direction method of multipliers (ADMM) and the alternating minimization algorithm (AMA). The proposed acceleration is of the form first proposed by Nesterov for gradient descent methods. In the case that the objective function is strongly convex, global convergence bounds are provided for both classical and accelerated variants of the methods. Numerical examples are presented to demonstrate the superior performance of the fast methods for a wide variety of problems.
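As a rough illustration of the alternating direction idea described in the abstract, the following Python sketch runs plain (unaccelerated) ADMM in scaled form on a toy scalar problem: minimize 0.5*(x - a)^2 + lam*|z| subject to x = z. The toy problem, the function names `soft_threshold` and `admm_scalar_lasso`, and the parameter values are assumptions made for this example and are not taken from the paper. The paper's fast variants add a Nesterov-type momentum step on top of updates of this kind.

```python
def soft_threshold(v, kappa):
    """Proximal operator of kappa * |.| evaluated at the scalar v."""
    if v > kappa:
        return v - kappa
    if v < -kappa:
        return v + kappa
    return 0.0


def admm_scalar_lasso(a, lam, rho=1.0, iters=100):
    """Plain scaled-form ADMM for: min 0.5*(x - a)^2 + lam*|z|  s.t.  x = z.

    Hypothetical toy solver, used only to show the three alternating updates.
    """
    x = z = u = 0.0
    for _ in range(iters):
        # x-update: minimize the smooth term plus the quadratic penalty.
        x = (a + rho * (z - u)) / (1.0 + rho)
        # z-update: minimize the nonsmooth term plus the quadratic penalty.
        z = soft_threshold(x + u, lam / rho)
        # Scaled dual update: accumulate the constraint residual x - z.
        u = u + x - z
    return z


solution = admm_scalar_lasso(a=3.0, lam=1.0)
```

For this toy problem the minimizer is known in closed form as the soft-thresholding of `a` at level `lam`, which gives a simple way to check the iteration.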

772 citations

Journal Article•DOI•

Adaptive Restart for Accelerated Gradient Schemes

Brendan O'Donoghue¹, Emmanuel J. Candès¹•Institutions (1)

Stanford University¹

01 Jun 2015-Foundations of Computational Mathematics

TL;DR: This paper proposes a simple heuristic adaptive restart technique that can dramatically improve the convergence rate of accelerated gradient schemes. The technique resets the momentum whenever periodic behavior is observed, exploiting the fact that both the period of the iterates and the optimal restart interval are proportional to the square root of the local condition number of the objective function.

Abstract: In this paper we introduce a simple heuristic adaptive restart technique that can dramatically improve the convergence rate of accelerated gradient schemes. The analysis of the technique relies on the observation that these schemes exhibit two modes of behavior depending on how much momentum is applied at each iteration. In what we refer to as the 'high momentum' regime the iterates generated by an accelerated gradient scheme exhibit a periodic behavior, where the period is proportional to the square root of the local condition number of the objective function. Separately, it is known that the optimal restart interval is proportional to this same quantity. This suggests a restart technique whereby we reset the momentum whenever we observe periodic behavior. We provide a heuristic analysis that suggests that in many cases adaptively restarting allows us to recover the optimal rate of convergence with no prior knowledge of function parameters.
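The restart idea in the abstract can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses a standard Nesterov-style momentum sequence and detects the onset of periodic behavior with a gradient-based test (restart when the gradient at the extrapolated point has a positive inner product with the step just taken). The function name `accelerated_gradient`, the two-dimensional quadratic, and the eigenvalues in `EIGS` are all hypothetical choices for illustration.

```python
import math


def accelerated_gradient(grad, x0, step, iters, restart=False):
    """Accelerated gradient descent with an optional adaptive restart."""
    x = list(x0)
    y = list(x0)
    theta = 1.0
    for _ in range(iters):
        g = grad(y)
        x_next = [yi - step * gi for yi, gi in zip(y, g)]
        # Assumed restart test: the step x_next - x points "uphill"
        # with respect to the gradient evaluated at y.
        uphill = sum(gi * (xn - xi) for gi, xn, xi in zip(g, x_next, x)) > 0.0
        if restart and uphill:
            # Reset the momentum and continue from the current iterate.
            theta = 1.0
            y = list(x_next)
        else:
            theta_next = 0.5 * (1.0 + math.sqrt(1.0 + 4.0 * theta * theta))
            beta = (theta - 1.0) / theta_next
            y = [xn + beta * (xn - xi) for xn, xi in zip(x_next, x)]
            theta = theta_next
        x = x_next
    return x


# Toy ill-conditioned quadratic f(x) = 0.5 * sum(e_i * x_i^2).
EIGS = (1.0, 100.0)


def f(x):
    return 0.5 * sum(e * xi * xi for e, xi in zip(EIGS, x))


def grad_f(x):
    return [e * xi for e, xi in zip(EIGS, x)]


x_plain = accelerated_gradient(grad_f, [1.0, 1.0], 1.0 / 100.0, 500)
x_restart = accelerated_gradient(grad_f, [1.0, 1.0], 1.0 / 100.0, 500, restart=True)
```

Comparing `f(x_plain)` with `f(x_restart)` on this toy problem shows the effect of the restarts: without them the iterates keep oscillating around the minimizer, while each restart discards the momentum that causes the overshoot.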

664 citations

Journal Article•DOI•

Conic Optimization via Operator Splitting and Homogeneous Self-Dual Embedding

Brendan O'Donoghue¹, Eric Chu¹, Neal Parikh¹, Stephen Boyd¹•Institutions (1)

Stanford University¹

01 Jun 2016-Journal of Optimization Theory and Applications

TL;DR: In this article, the alternating directions method of multipliers is used to solve the homogeneous self-dual embedding, an equivalent feasibility problem involving finding a nonzero point in the intersection of a subspace and a cone.

Abstract: We introduce a first-order method for solving very large convex cone programs. The method uses an operator splitting method, the alternating directions method of multipliers, to solve the homogeneous self-dual embedding, an equivalent feasibility problem involving finding a nonzero point in the intersection of a subspace and a cone. This approach has several favorable properties. Compared to interior-point methods, first-order methods scale to very large problems, at the cost of requiring more time to reach very high accuracy. Compared to other first-order methods for cone programs, our approach finds both primal and dual solutions when available or a certificate of infeasibility or unboundedness otherwise, is parameter free, and the per-iteration cost of the method is the same as applying a splitting method to the primal or dual alone. We discuss efficient implementation of the method in detail, including direct and indirect methods for computing projection onto the subspace, scaling the original problem data, and stopping criteria. We describe an open-source implementation, which handles the usual (symmetric) nonnegative, second-order, and semidefinite cones as well as the (non-self-dual) exponential and power cones and their duals. We report numerical results that show speedups over interior-point cone solvers for large problems, and scaling to very large general cone programs.

597 citations

Posted Content•

Adversarial Risk and the Dangers of Evaluating Against Weak Attacks

Jonathan Uesato¹, Brendan O'Donoghue², Aaron van den Oord², Pushmeet Kohli²•Institutions (2)

Massachusetts Institute of Technology¹, Google²

15 Feb 2018-arXiv: Learning

TL;DR: This paper motivates adversarial risk as an objective, although it cannot easily be computed exactly, and frames commonly used attacks and evaluation metrics as defining a tractable surrogate objective to the true adversarial risk.

Abstract: This paper investigates recently proposed approaches for defending against adversarial examples and evaluating adversarial robustness. The existence of adversarial examples in trained neural networks reflects the fact that expected risk alone does not capture the model's performance against worst-case inputs. We motivate the use of adversarial risk as an objective, although it cannot easily be computed exactly. We then frame commonly used attacks and evaluation metrics as defining a tractable surrogate objective to the true adversarial risk. This suggests that models may be obscured to adversaries by optimizing this surrogate rather than the true adversarial risk. We demonstrate that this is a significant problem in practice by repurposing gradient-free optimization techniques into adversarial attacks, which we use to decrease the accuracy of several recently proposed defenses to near zero. Our hope is that our formulations and results will help researchers to develop more powerful defenses.
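The distinction the abstract draws between expected risk, adversarial risk, and a surrogate defined by a specific attack can be written out as follows. The notation is an assumption introduced here for illustration (a model $f$, loss $\ell$, data distribution $\mathcal{D}$, allowed perturbation set $N(x)$, and an attack procedure $A$) and is not necessarily the notation of the paper.

```latex
\begin{align*}
  R(f) &= \mathbb{E}_{(x,y)\sim\mathcal{D}}\bigl[\ell(f(x), y)\bigr]
    && \text{(expected risk)} \\
  R_{\mathrm{adv}}(f) &= \mathbb{E}_{(x,y)\sim\mathcal{D}}
    \Bigl[\sup_{x' \in N(x)} \ell(f(x'), y)\Bigr]
    && \text{(adversarial risk)} \\
  \widehat{R}_{A}(f) &= \mathbb{E}_{(x,y)\sim\mathcal{D}}
    \bigl[\ell(f(A(x)), y)\bigr] \le R_{\mathrm{adv}}(f),
    \quad A(x) \in N(x)
    && \text{(surrogate from attack } A\text{)}
\end{align*}
```

Because any concrete attack only returns one point in $N(x)$, the surrogate is a lower bound on the adversarial risk, so a defense that looks robust against a weak attack may still have a high true adversarial risk.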

207 citations

Cited by

On a Few Developments in Convex Analysis

徹丸山

01 Feb 1977

5,933 citations

Book•

Proximal Algorithms

Neal Parikh¹, Stephen Boyd¹•Institutions (1)

Stanford University¹

27 Nov 2013

TL;DR: This monograph discusses the many different interpretations of proximal operators and algorithms, describes their connections to many other topics in optimization and applied mathematics, surveys some popular algorithms, and provides a large number of examples of proximal operators that commonly arise in practice.

Abstract: This monograph is about a class of optimization algorithms called proximal algorithms. Much like Newton's method is a standard tool for solving unconstrained smooth optimization problems of modest size, proximal algorithms can be viewed as an analogous tool for nonsmooth, constrained, large-scale, or distributed versions of these problems. They are very generally applicable, but are especially well-suited to problems of substantial recent interest involving large or high-dimensional datasets. Proximal methods sit at a higher level of abstraction than classical algorithms like Newton's method: the base operation is evaluating the proximal operator of a function, which itself involves solving a small convex optimization problem. These subproblems, which generalize the problem of projecting a point onto a convex set, often admit closed-form solutions or can be solved very quickly with standard or simple specialized methods. Here, we discuss the many different interpretations of proximal operators and algorithms, describe their connections to many other topics in optimization and applied mathematics, survey some popular algorithms, and provide a large number of examples of proximal operators that commonly arise in practice.
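Two of the closed-form cases mentioned in the abstract can be sketched in Python. This is an illustrative sketch with hypothetical function names: `prox_l1` evaluates the proximal operator of `lam * ||x||_1` (soft-thresholding, applied elementwise), and `project_box` evaluates the proximal operator of the indicator function of a box, which is just the projection onto that box.

```python
def prox_l1(v, lam):
    """Proximal operator of lam * ||x||_1: elementwise soft-thresholding."""
    out = []
    for vi in v:
        if vi > lam:
            out.append(vi - lam)
        elif vi < -lam:
            out.append(vi + lam)
        else:
            out.append(0.0)
    return out


def project_box(v, lo, hi):
    """Proximal operator of the indicator of [lo, hi]^n: elementwise clipping."""
    return [min(max(vi, lo), hi) for vi in v]


shrunk = prox_l1([3.0, -0.5, -2.0], 1.0)
clipped = project_box([-2.0, 0.3, 5.0], 0.0, 1.0)
```

Both operators act coordinate by coordinate, which is one reason proximal methods scale well to high-dimensional problems.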

3,627 citations

Posted Content•

SGDR: Stochastic Gradient Descent with Warm Restarts

Ilya Loshchilov¹, Frank Hutter¹•Institutions (1)

University of Freiburg¹

13 Aug 2016-arXiv: Learning

TL;DR: This paper proposes a simple warm restart technique for stochastic gradient descent to improve its anytime performance when training deep neural networks, and reports state-of-the-art results on both the CIFAR-10 and CIFAR-100 datasets.

Abstract: Restart techniques are common in gradient-free optimization to deal with multimodal functions. Partial warm restarts are also gaining popularity in gradient-based optimization to improve the rate of convergence in accelerated gradient schemes to deal with ill-conditioned functions. In this paper, we propose a simple warm restart technique for stochastic gradient descent to improve its anytime performance when training deep neural networks. We empirically study its performance on the CIFAR-10 and CIFAR-100 datasets, where we demonstrate new state-of-the-art results at 3.14% and 16.21%, respectively. We also demonstrate its advantages on a dataset of EEG recordings and on a downsampled version of the ImageNet dataset. Our source code is available at this https URL
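A warm restart schedule of this kind is commonly implemented as cosine annealing of the learning rate within each cycle, with the rate jumping back to its maximum at the start of the next cycle. The Python sketch below shows one such schedule at epoch granularity. The function name `sgdr_learning_rate` and the default values for `eta_min`, `eta_max`, `t0`, and `t_mult` are assumptions chosen for illustration and are not taken from the abstract.

```python
import math


def sgdr_learning_rate(epoch, eta_min=0.0, eta_max=0.1, t0=10, t_mult=2):
    """Cosine-annealed learning rate with warm restarts.

    t0 is the length of the first cycle in epochs; each later cycle is
    t_mult times longer than the one before it.
    """
    t_i = t0        # length of the current cycle
    t_cur = epoch   # epochs elapsed since the last restart
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= t_mult
    cosine = math.cos(math.pi * t_cur / t_i)
    return eta_min + 0.5 * (eta_max - eta_min) * (1.0 + cosine)


schedule = [sgdr_learning_rate(epoch) for epoch in range(30)]
```

With these defaults the rate decays over epochs 0 to 9, restarts at its maximum at epoch 10, and then decays more slowly over the following 20 epochs.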

3,497 citations

Posted Content•

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

Tuomas Haarnoja¹, Aurick Zhou¹, Pieter Abbeel¹, Sergey Levine¹•Institutions (1)

University of California, Berkeley¹

04 Jan 2018-arXiv: Learning

TL;DR: In this article, an off-policy actor-critic deep RL algorithm based on the maximum entropy reinforcement learning framework is proposed, where the actor aims to maximize expected reward while also maximizing entropy.

Abstract: Model-free deep reinforcement learning (RL) algorithms have been demonstrated on a range of challenging decision making and control tasks. However, these methods typically suffer from two major challenges: very high sample complexity and brittle convergence properties, which necessitate meticulous hyperparameter tuning. Both of these challenges severely limit the applicability of such methods to complex, real-world domains. In this paper, we propose soft actor-critic, an off-policy actor-critic deep RL algorithm based on the maximum entropy reinforcement learning framework. In this framework, the actor aims to maximize expected reward while also maximizing entropy. That is, to succeed at the task while acting as randomly as possible. Prior deep RL methods based on this framework have been formulated as Q-learning methods. By combining off-policy updates with a stable stochastic actor-critic formulation, our method achieves state-of-the-art performance on a range of continuous control benchmark tasks, outperforming prior on-policy and off-policy methods. Furthermore, we demonstrate that, in contrast to other off-policy algorithms, our approach is very stable, achieving very similar performance across different random seeds.
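The maximum entropy objective that the abstract refers to is usually written as the expected return augmented with an entropy bonus. The formulation below is a sketch in notation assumed for this example (policy $\pi$, reward $r$, state-action distribution $\rho_\pi$, entropy $\mathcal{H}$, and a temperature $\alpha$ that trades off reward against entropy); it follows the form standard in the maximum entropy reinforcement learning literature.

```latex
\begin{equation*}
  J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
    \Bigl[ r(s_t, a_t) + \alpha \, \mathcal{H}\bigl(\pi(\cdot \mid s_t)\bigr) \Bigr],
  \qquad
  \mathcal{H}\bigl(\pi(\cdot \mid s)\bigr)
    = -\mathbb{E}_{a \sim \pi(\cdot \mid s)} \bigl[ \log \pi(a \mid s) \bigr].
\end{equation*}
```

Setting $\alpha = 0$ recovers the conventional expected-reward objective, while larger values of $\alpha$ favor more random policies.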

3,141 citations

Journal Article•DOI•

A Survey on Mobile Edge Computing: The Communication Perspective

Yuyi Mao¹, Changsheng You², Jun Zhang¹, Kaibin Huang², Khaled Ben Letaief¹•Institutions (2)

Hong Kong University of Science and Technology¹, University of Hong Kong²

25 Aug 2017-IEEE Communications Surveys and Tutorials

TL;DR: A comprehensive survey of the state-of-the-art MEC research with a focus on joint radio-and-computational resource management is provided in this paper, where a set of issues, challenges, and future research directions for MEC are discussed.

Abstract: Driven by the visions of Internet of Things and 5G communications, recent years have seen a paradigm shift in mobile computing, from the centralized mobile cloud computing toward mobile edge computing (MEC). The main feature of MEC is to push mobile computing, network control and storage to the network edges (e.g., base stations and access points) so as to enable computation-intensive and latency-critical applications at the resource-limited mobile devices. MEC promises dramatic reduction in latency and mobile energy consumption, tackling the key challenges for materializing 5G vision. The promised gains of MEC have motivated extensive efforts in both academia and industry on developing the technology. A main thrust of MEC research is to seamlessly merge the two disciplines of wireless communications and mobile computing, resulting in a wide-range of new designs ranging from techniques for computation offloading to network architectures. This paper provides a comprehensive survey of the state-of-the-art MEC research with a focus on joint radio-and-computational resource management. We also discuss a set of issues, challenges, and future research directions for MEC research, including MEC system deployment, cache-enabled MEC, mobility management for MEC, green MEC, as well as privacy-aware MEC. Advancements in these directions will facilitate the transformation of MEC from theory to practice. Finally, we introduce recent standardization efforts on MEC as well as some typical MEC application scenarios.

2,992 citations
