Home
/
Authors
/
Alban Desmaison

Author

Alban Desmaison

Other affiliations: École Centrale Paris

Bio: Alban Desmaison is an academic researcher from University of Oxford. The author has contributed to research in topics: Conditional random field & Graphical model. The author has an hindex of 12, co-authored 26 publications receiving 21773 citations. Previous affiliations of Alban Desmaison include École Centrale Paris.

Topics: Conditional random field, Graphical model, Compiler, CRFS, Linear programming ...read more

Papers

PDF

Open Access

More filters

Automatic differentiation in PyTorch

[...]

Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Z. Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, Adam Lerer - Show less +6 more

28 Oct 2017

TL;DR: An automatic differentiation module of PyTorch is described — a library designed to enable rapid research on machine learning models that focuses on differentiation of purely imperative programs, with a focus on extensibility and low overhead.

...read moreread less

Abstract: In this article, we describe an automatic differentiation module of PyTorch — a library designed to enable rapid research on machine learning models. It builds upon a few projects, most notably Lua Torch, Chainer, and HIPS Autograd [4], and provides a high performance environment with easy access to automatic differentiation of models executed on different devices (CPU and GPU). To make prototyping easier, PyTorch does not follow the symbolic approach used in many other deep learning frameworks, but focuses on differentiation of purely imperative programs, with a focus on extensibility and low overhead. Note that this preprint is a draft of certain sections from an upcoming paper covering all PyTorch features.

...read moreread less

13,268 citations

Posted Content•

PyTorch: An Imperative Style, High-Performance Deep Learning Library

[...]

Adam Paszke¹, Sam Gross², Francisco Massa², Adam Lerer², James Bradbury³, Gregory Chanan², Trevor Killeen⁴, Zeming Lin², Natalia Gimelshein⁵, Luca Antiga⁶, Alban Desmaison⁷, Andreas Kopf⁸, Edward Z. Yang², Zachary DeVito⁹, Martin Raison², Alykhan Tejani¹⁰, Sasank Chilamkurthy, Benoit Steiner², Lu Fang², Junjie Bai², Soumith Chintala² - Show less +17 more•Institutions (10)

University of Warsaw¹, Facebook², Salesforce.com³, University of Washington⁴, Nvidia⁵, Mario Negri Institute for Pharmacological Research⁶, University of Oxford⁷, ETH Zurich⁸, Stanford University⁹, Twitter¹⁰

03 Dec 2019-arXiv: Learning

TL;DR: PyTorch as discussed by the authors is a machine learning library that provides an imperative and Pythonic programming style that makes debugging easy and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs.

...read moreread less

Abstract: Deep learning frameworks have often focused on either usability or speed, but not both. PyTorch is a machine learning library that shows that these two goals are in fact compatible: it provides an imperative and Pythonic programming style that supports code as a model, makes debugging easy and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs. In this paper, we detail the principles that drove the implementation of PyTorch and how they are reflected in its architecture. We emphasize that every aspect of PyTorch is a regular Python program under the full control of its user. We also explain how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance. We demonstrate the efficiency of individual subsystems, as well as the overall speed of PyTorch on several common benchmarks.

...read moreread less

12,767 citations

Proceedings Article•

PyTorch: An Imperative Style, High-Performance Deep Learning Library

[...]

Adam Paszke¹, Sam Gross², Francisco Massa², Adam Lerer², James Bradbury³, Gregory Chanan², Trevor Killeen⁴, Zeming Lin², Natalia Gimelshein⁵, Luca Antiga⁶, Alban Desmaison⁷, Andreas Kopf⁸, Edward Z. Yang², Zachary DeVito⁹, Martin Raison², Alykhan Tejani¹⁰, Sasank Chilamkurthy, Benoit Steiner², Lu Fang¹¹, Junjie Bai², Soumith Chintala² - Show less +17 more•Institutions (11)

01 Jan 2019

TL;DR: This paper details the principles that drove the implementation of PyTorch and how they are reflected in its architecture, and explains how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance.

...read moreread less

Abstract: Deep learning frameworks have often focused on either usability or speed, but not both. PyTorch is a machine learning library that shows that these two goals are in fact compatible: it was designed from first principles to support an imperative and Pythonic programming style that supports code as a model, makes debugging easy and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs. In this paper, we detail the principles that drove the implementation of PyTorch and how they are reflected in its architecture. We emphasize that every aspect of PyTorch is a regular Python program under the full control of its user. We also explain how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance. We demonstrate the efficiency of individual subsystems, as well as the overall speed of PyTorch on several commonly used benchmarks.

...read moreread less

10,045 citations

Proceedings Article•DOI•

Learning Disentangled Representations with Semi-Supervised Deep Generative Models

[...]

N. Siddharth¹, Brooks Paige², Brooks Paige³, Jan-Willem van de Meent⁴, Alban Desmaison¹, Noah D. Goodman⁵, Pushmeet Kohli⁶, Frank Wood¹, Philip H. S. Torr¹ - Show less +5 more•Institutions (6)

University of Oxford¹, University of Cambridge², The Turing Institute³, Northeastern University⁴, Stanford University⁵, Google⁶

04 Sep 2017

TL;DR: The authors propose to learn disentangled representations using model architectures that generalise from standard VAEs, employing a general graphical model structure in the encoder and decoder, which allows to train partially-specified models that make relatively strong assumptions about a subset of interpretable variables and rely on the flexibility of neural networks for the remaining variables.

...read moreread less

Abstract: Variational autoencoders (VAEs) learn representations of data by jointly training a probabilistic encoder and decoder network. Typically these models encode all features of the data into a single variable. Here we are interested in learning disentangled representations that encode distinct aspects of the data into separate variables. We propose to learn such representations using model architectures that generalise from standard VAEs, employing a general graphical model structure in the encoder and decoder. This allows us to train partially-specified models that make relatively strong assumptions about a subset of interpretable variables and rely on the flexibility of neural networks to learn representations for the remaining variables. We further define a general objective for semi-supervised learning in this model class, which can be approximated using an importance sampling procedure. We evaluate our framework's ability to learn disentangled representations, both by qualitative exploration of its generative capacity, and quantitative evaluation of its discriminative ability on a variety of models and datasets.

...read moreread less

223 citations

Posted Content•

Learning Disentangled Representations with Semi-Supervised Deep Generative Models

[...]

University of Oxford¹, University of Cambridge², The Turing Institute³, Northeastern University⁴, Stanford University⁵, Google⁶

01 Jun 2017-arXiv: Machine Learning

...read moreread less

207 citations

1
2
3
4
…
5
6

Collapse

Cited by

PDF

Open Access

More filters

Posted Content•

RoBERTa: A Robustly Optimized BERT Pretraining Approach

[...]

Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Michael Lewis, Luke Zettlemoyer, Veselin Stoyanov - Show less +6 more

26 Jul 2019-arXiv: Computation and Language

TL;DR: It is found that BERT was significantly undertrained, and can match or exceed the performance of every model published after it, and the best model achieves state-of-the-art results on GLUE, RACE and SQuAD.

...read moreread less

Abstract: Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging. Training is computationally expensive, often done on private datasets of different sizes, and, as we will show, hyperparameter choices have significant impact on the final results. We present a replication study of BERT pretraining (Devlin et al., 2019) that carefully measures the impact of many key hyperparameters and training data size. We find that BERT was significantly undertrained, and can match or exceed the performance of every model published after it. Our best model achieves state-of-the-art results on GLUE, RACE and SQuAD. These results highlight the importance of previously overlooked design choices, and raise questions about the source of recently reported improvements. We release our models and code.

...read moreread less

13,994 citations

Journal Article•DOI•

Machine learning

[...]

Thomas G. Dietterich¹•Institutions (1)

Oregon State University¹

01 Dec 1996-ACM Computing Surveys

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

...read moreread less

Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

...read moreread less

13,246 citations

Posted Content•

PyTorch: An Imperative Style, High-Performance Deep Learning Library

[...]

03 Dec 2019-arXiv: Learning

...read moreread less

12,767 citations

Proceedings Article•

PyTorch: An Imperative Style, High-Performance Deep Learning Library

[...]

Adam Paszke¹, Sam Gross², Francisco Massa², Adam Lerer², James Bradbury³, Gregory Chanan², Trevor Killeen⁴, Zeming Lin², Natalia Gimelshein⁵, Luca Antiga⁶, Alban Desmaison⁷, Andreas Kopf⁸, Edward Z. Yang², Zachary DeVito⁹, Martin Raison², Alykhan Tejani¹⁰, Sasank Chilamkurthy, Benoit Steiner², Lu Fang¹¹, Junjie Bai², Soumith Chintala² - Show less +17 more•Institutions (11)

01 Jan 2019

...read moreread less

Abstract: Deep learning frameworks have often focused on either usability or speed, but not both. PyTorch is a machine learning library that shows that these two goals are in fact compatible: it was designed from first principles to support an imperative and Pythonic programming style that supports code as a model, makes debugging easy and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs. In this paper, we detail the principles that drove the implementation of PyTorch and how they are reflected in its architecture. We emphasize that every aspect of PyTorch is a regular Python program under the full control of its user. We also explain how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance. We demonstrate the efficiency of individual subsystems, as well as the overall speed of PyTorch on several commonly used benchmarks.

...read moreread less

10,045 citations

Journal Article•DOI•

Array programming with NumPy

[...]

Charles R. Harris, K. Jarrod Millman¹, Stefan van der Walt², Stefan van der Walt¹, Ralf Gommers, Pauli Virtanen³, David Cournapeau, Eric Wieser⁴, Julian Taylor, Sebastian Berg¹, Nathaniel J. Smith, Robert Kern, Matti Picus¹, Stephan Hoyer⁵, Marten H. van Kerkwijk⁶, Matthew Brett⁷, Matthew Brett¹, Allan Haldane⁸, Jaime Fernández del Río⁵, Mark Wiebe⁹, Mark Wiebe¹⁰, Pearu Peterson, Pierre Gérard-Marchant¹¹, Kevin Sheppard¹², Tyler Reddy¹³, Warren Weckesser¹, Hameer Abbasi, Christoph Gohlke¹⁴, Travis E. Oliphant - Show less +25 more•Institutions (14)

University of California, Berkeley¹, Stellenbosch University², University of Jyväskylä³, University of Cambridge⁴, Google⁵, University of Toronto⁶, University of Birmingham⁷, Temple University⁸, Amazon.com⁹, University of British Columbia¹⁰, University of Georgia¹¹, University of Oxford¹², Los Alamos National Laboratory¹³, University of California, Irvine¹⁴

16 Sep 2020-Nature

TL;DR: In this paper, the authors review how a few fundamental array concepts lead to a simple and powerful programming paradigm for organizing, exploring and analysing scientific data, and their evolution into a flexible interoperability layer between increasingly specialized computational libraries is discussed.

...read moreread less

Abstract: Array programming provides a powerful, compact and expressive syntax for accessing, manipulating and operating on data in vectors, matrices and higher-dimensional arrays. NumPy is the primary array programming library for the Python language. It has an essential role in research analysis pipelines in fields as diverse as physics, chemistry, astronomy, geoscience, biology, psychology, materials science, engineering, finance and economics. For example, in astronomy, NumPy was an important part of the software stack used in the discovery of gravitational waves1 and in the first imaging of a black hole2. Here we review how a few fundamental array concepts lead to a simple and powerful programming paradigm for organizing, exploring and analysing scientific data. NumPy is the foundation upon which the scientific Python ecosystem is constructed. It is so pervasive that several projects, targeting audiences with specialized needs, have developed their own NumPy-like interfaces and array objects. Owing to its central position in the ecosystem, NumPy increasingly acts as an interoperability layer between such array computation libraries and, together with its application programming interface (API), provides a flexible framework to support the next decade of scientific and industrial analysis. NumPy is the primary array programming library for Python; here its fundamental concepts are reviewed and its evolution into a flexible interoperability layer between increasingly specialized computational libraries is discussed.

...read moreread less

7,624 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse