Author

Matt J. Kusner

Other affiliations: University of Warwick, University of Oxford, Washington University in St. Louis

Bio: Matt J. Kusner is an academic researcher at University College London. The author has contributed to research on topics including generative models and computer science. The author has an h-index of 28 and has co-authored 71 publications receiving 4,975 citations. Previous affiliations of Matt J. Kusner include the University of Warwick and the University of Oxford.

Papers published on a yearly basis: 2012 to 2023

Papers

Proceedings Article•

From Word Embeddings To Document Distances

Matt J. Kusner¹, Yu Sun¹, Nicholas I. Kolkin¹, Kilian Q. Weinberger¹•Institutions (1)

Washington University in St. Louis¹

06 Jul 2015

TL;DR: It is demonstrated on eight real world document classification data sets, in comparison with seven state-of-the-art baselines, that the Word Mover's Distance metric leads to unprecedented low k-nearest neighbor document classification error rates.

Abstract: We present the Word Mover's Distance (WMD), a novel distance function between text documents. Our work is based on recent results in word embeddings that learn semantically meaningful representations for words from local cooccurrences in sentences. The WMD distance measures the dissimilarity between two text documents as the minimum amount of distance that the embedded words of one document need to "travel" to reach the embedded words of another document. We show that this distance metric can be cast as an instance of the Earth Mover's Distance, a well studied transportation problem for which several highly efficient solvers have been developed. Our metric has no hyperparameters and is straight-forward to implement. Further, we demonstrate on eight real world document classification data sets, in comparison with seven state-of-the-art baselines, that the WMD metric leads to unprecedented low k-nearest neighbor document classification error rates.
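The "travel" cost described in the abstract can be illustrated with a minimal sketch. It makes a strong simplifying assumption that is not in the source: both documents have the same number of words and every word carries equal weight, in which case the optimal transport plan reduces to a one-to-one matching that can be found by brute force. The function name `wmd_equal_length` and the two-dimensional `TOY_EMBEDDINGS` are hypothetical; real use would rely on learned word embeddings and a proper Earth Mover's Distance solver.

```python
from itertools import permutations
from math import dist

# Hypothetical 2-D "embeddings" for illustration only; real embeddings are learned.
TOY_EMBEDDINGS = {
    "obama": (1.0, 0.0),
    "president": (1.0, 0.2),
    "speaks": (0.0, 1.0),
    "greets": (0.0, 1.2),
    "banana": (5.0, 5.0),
    "ripens": (6.0, 5.0),
}


def wmd_equal_length(doc_a, doc_b, embeddings):
    """Transport cost between two equal-length documents with uniform word weights.

    Assumption: with n words per document and weight 1/n per word, an optimal
    transport plan is a one-to-one matching, so trying every permutation is exact.
    This is only practical for very short documents.
    """
    if len(doc_a) != len(doc_b):
        raise ValueError("this sketch only handles documents of equal length")
    n = len(doc_a)
    # Pairwise Euclidean distances between embedded words.
    cost = [[dist(embeddings[a], embeddings[b]) for b in doc_b] for a in doc_a]
    # Cheapest way to match every word in doc_a to a distinct word in doc_b.
    best = min(
        sum(cost[i][perm[i]] for i in range(n))
        for perm in permutations(range(n))
    )
    # Each word carries mass 1/n.
    return best / n


near = wmd_equal_length(["obama", "speaks"], ["president", "greets"], TOY_EMBEDDINGS)
far = wmd_equal_length(["obama", "speaks"], ["banana", "ripens"], TOY_EMBEDDINGS)
print(near, far)
```

Under these toy embeddings, the semantically close pair of documents gets a much smaller distance than the unrelated pair, even though the two close documents share no words.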

1,786 citations

Proceedings Article•

Counterfactual fairness

Matt J. Kusner¹, Joshua R. Loftus², Chris Russell³, Ricardo Silva⁴•Institutions (4)

University of Warwick¹, New York University², University of Surrey³, University College London⁴

04 Dec 2017

TL;DR: This paper develops a framework for modeling fairness using tools from causal inference and demonstrates the framework on a real-world problem of fair prediction of success in law school.

Abstract: Machine learning can impact people with legal or ethical consequences when it is used to automate decisions in areas such as insurance, lending, hiring, and predictive policing. In many of these scenarios, previous decisions have been made that are unfairly biased against certain subpopulations, for example those of a particular race, gender, or sexual orientation. Since this past data may be biased, machine learning predictors must account for this to avoid perpetuating or creating discriminatory practices. In this paper, we develop a framework for modeling fairness using tools from causal inference. Our definition of counterfactual fairness captures the intuition that a decision is fair towards an individual if it is the same in (a) the actual world and (b) a counterfactual world where the individual belonged to a different demographic group. We demonstrate our framework on a real-world problem of fair prediction of success in law school.

975 citations

Posted Content•

Counterfactual Fairness

Matt J. Kusner¹, Joshua R. Loftus², Chris Russell³, Ricardo Silva⁴•Institutions (4)

University of Warwick¹, New York University², University of Surrey³, University College London⁴

20 Mar 2017-arXiv: Machine Learning

TL;DR: This article presents a framework for modeling fairness using tools from causal inference. The authors focus on counterfactual fairness, which captures the intuition that a decision is fair towards an individual if it is the same in (a) the actual world and (b) a counterfactual world where the individual belonged to a different demographic group.

Abstract: Machine learning can impact people with legal or ethical consequences when it is used to automate decisions in areas such as insurance, lending, hiring, and predictive policing. In many of these scenarios, previous decisions have been made that are unfairly biased against certain subpopulations, for example those of a particular race, gender, or sexual orientation. Since this past data may be biased, machine learning predictors must account for this to avoid perpetuating or creating discriminatory practices. In this paper, we develop a framework for modeling fairness using tools from causal inference. Our definition of counterfactual fairness captures the intuition that a decision is fair towards an individual if it is the same in (a) the actual world and (b) a counterfactual world where the individual belonged to a different demographic group. We demonstrate our framework on a real-world problem of fair prediction of success in law school.

400 citations

Proceedings Article•

Bayesian Optimization with Inequality Constraints

Jacob R. Gardner¹, Matt J. Kusner¹, Zhixiang¹, Kilian Q. Weinberger¹, John P. Cunningham²•Institutions (2)

Washington University in St. Louis¹, Columbia University²

21 Jun 2014

TL;DR: This work presents constrained Bayesian optimization, which places a prior distribution on both the objective and the constraint functions, and evaluates this method on simulated and real data, demonstrating that constrained Bayesian optimization can quickly find optimal and feasible points, even when small feasible regions cause standard methods to fail.

Abstract: Bayesian optimization is a powerful framework for minimizing expensive objective functions while using very few function evaluations. It has been successfully applied to a variety of problems, including hyperparameter tuning and experimental design. However, this framework has not been extended to the inequality-constrained optimization setting, particularly the setting in which evaluating feasibility is just as expensive as evaluating the objective. Here we present constrained Bayesian optimization, which places a prior distribution on both the objective and the constraint functions. We evaluate our method on simulated and real data, demonstrating that constrained Bayesian optimization can quickly find optimal and feasible points, even when small feasible regions cause standard methods to fail.
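The abstract does not spell out how the two priors are combined when choosing the next point to evaluate. One widely used choice in constrained Bayesian optimization, sketched below as an assumption rather than as the paper's exact formulation, is to weight expected improvement on the objective by the probability that the constraint is satisfied. The sketch assumes independent Gaussian posteriors at a candidate point, a minimization objective, and a constraint of the form c(x) <= threshold; all function and parameter names are hypothetical.

```python
from math import erf, exp, pi, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def normal_pdf(z):
    """Standard normal probability density function."""
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)


def expected_improvement(mu, sigma, best):
    """Expected improvement over `best` for minimization, given posterior N(mu, sigma^2)."""
    if sigma <= 0.0:
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    return (best - mu) * normal_cdf(z) + sigma * normal_pdf(z)


def probability_feasible(mu_c, sigma_c, threshold=0.0):
    """Probability that the constraint value is at most `threshold`."""
    if sigma_c <= 0.0:
        return 1.0 if mu_c <= threshold else 0.0
    return normal_cdf((threshold - mu_c) / sigma_c)


def constrained_acquisition(mu_f, sigma_f, best_feasible, mu_c, sigma_c, threshold=0.0):
    """Assumed acquisition: expected improvement times probability of feasibility."""
    return expected_improvement(mu_f, sigma_f, best_feasible) * probability_feasible(
        mu_c, sigma_c, threshold
    )


# Same objective posterior, different constraint posteriors.
likely_feasible = constrained_acquisition(0.0, 1.0, 0.0, mu_c=-2.0, sigma_c=1.0)
likely_infeasible = constrained_acquisition(0.0, 1.0, 0.0, mu_c=10.0, sigma_c=1.0)
print(likely_feasible, likely_infeasible)
```

With this weighting, a candidate that looks promising on the objective but is almost certainly infeasible receives an acquisition value close to zero, which is one way to keep the search inside small feasible regions.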

333 citations

Posted Content•

Grammar Variational Autoencoder

Matt J. Kusner¹, Brooks Paige², José Miguel Hernández-Lobato²•Institutions (2)

University of Warwick¹, University of Cambridge²

06 Mar 2017-arXiv: Machine Learning

TL;DR: Surprisingly, it is shown that not only does the model more often generate valid outputs, it also learns a more coherent latent space in which nearby points decode to similar discrete outputs.

Abstract: Deep generative models have been wildly successful at learning coherent latent representations for continuous data such as video and audio. However, generative modeling of discrete data such as arithmetic expressions and molecular structures still poses significant challenges. Crucially, state-of-the-art methods often produce outputs that are not valid. We make the key observation that frequently, discrete data can be represented as a parse tree from a context-free grammar. We propose a variational autoencoder which encodes and decodes directly to and from these parse trees, ensuring the generated outputs are always valid. Surprisingly, we show that not only does our model more often generate valid outputs, it also learns a more coherent latent space in which nearby points decode to similar discrete outputs. We demonstrate the effectiveness of our learned models by showing their improved performance in Bayesian optimization for symbolic regression and molecular synthesis.

308 citations

Cited by

Proceedings Article•DOI•

Densely Connected Convolutional Networks

Gao Huang¹, Zhuang Liu², Laurens van der Maaten³, Kilian Q. Weinberger¹•Institutions (3)

Cornell University¹, Tsinghua University², Facebook³

21 Jul 2017

TL;DR: This paper proposes DenseNet, which connects each layer to every other layer in a feed-forward fashion; this alleviates the vanishing-gradient problem, strengthens feature propagation, encourages feature reuse, and substantially reduces the number of parameters.

Abstract: Recent work has shown that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output. In this paper, we embrace this observation and introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with L layers have L connections—one between each layer and its subsequent layer—our network has L(L+1)/2 direct connections. For each layer, the feature-maps of all preceding layers are used as inputs, and its own feature-maps are used as inputs into all subsequent layers. DenseNets have several compelling advantages: they alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters. We evaluate our proposed architecture on four highly competitive object recognition benchmark tasks (CIFAR-10, CIFAR-100, SVHN, and ImageNet). DenseNets obtain significant improvements over the state-of-the-art on most of them, whilst requiring less memory and computation to achieve high performance. Code and pre-trained models are available at https://github.com/liuzhuang13/DenseNet.
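The connection counts quoted in the abstract can be checked with a short sketch. It assumes that index 0 stands for the network input and that layers are numbered 1 to L, so layer l receives the input plus the outputs of layers 1 to l-1. The function names are hypothetical, and the sketch only enumerates connectivity; it does not build a network.

```python
def traditional_connections(num_layers):
    """Each layer is connected only to the one before it (index 0 is the input)."""
    return [(layer - 1, layer) for layer in range(1, num_layers + 1)]


def dense_connections(num_layers):
    """Each layer receives the input and the outputs of all preceding layers."""
    return [
        (source, layer)
        for layer in range(1, num_layers + 1)
        for source in range(layer)
    ]


L = 5
print(len(traditional_connections(L)))  # L connections
print(len(dense_connections(L)))        # L * (L + 1) / 2 connections
```

Layer l contributes l incoming connections, so the dense total is 1 + 2 + ... + L, which equals L(L+1)/2.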

27,821 citations

Journal Article•DOI•

A Comprehensive Survey on Graph Neural Networks

Zonghan Wu¹, Shirui Pan², Fengwen Chen¹, Guodong Long¹, Chengqi Zhang¹, Philip S. Yu³•Institutions (3)

University of Technology, Sydney¹, Monash University, Clayton campus², University of Illinois at Chicago³

01 Jan 2021-IEEE Transactions on Neural Networks

TL;DR: This article provides a comprehensive overview of graph neural networks (GNNs) in data mining and machine learning fields and proposes a new taxonomy to divide the state-of-the-art GNNs into four categories, namely, recurrent GNNs, convolutional GNNs, graph autoencoders, and spatial–temporal GNNs.

Abstract: Deep learning has revolutionized many machine learning tasks in recent years, ranging from image classification and video processing to speech recognition and natural language understanding. The data in these tasks are typically represented in the Euclidean space. However, there is an increasing number of applications, where data are generated from non-Euclidean domains and are represented as graphs with complex relationships and interdependency between objects. The complexity of graph data has imposed significant challenges on the existing machine learning algorithms. Recently, many studies on extending deep learning approaches for graph data have emerged. In this article, we provide a comprehensive overview of graph neural networks (GNNs) in data mining and machine learning fields. We propose a new taxonomy to divide the state-of-the-art GNNs into four categories, namely, recurrent GNNs, convolutional GNNs, graph autoencoders, and spatial–temporal GNNs. We further discuss the applications of GNNs across various domains and summarize the open-source codes, benchmark data sets, and model evaluation of GNNs. Finally, we propose potential research directions in this rapidly growing field.

4,584 citations

Journal Article•DOI•

Taking the Human Out of the Loop: A Review of Bayesian Optimization

Bobak Shahriari¹, Kevin Swersky², Ziyu Wang³, Ryan P. Adams⁴, Nando de Freitas³•Institutions (4)

University of British Columbia¹, University of Toronto², University of Oxford³, Harvard University⁴

01 Jan 2016

TL;DR: This review paper introduces Bayesian optimization, highlights some of its methodological aspects, and showcases a wide range of applications.

Abstract: Big Data applications are typically associated with systems involving large numbers of users, massive complex software systems, and large-scale heterogeneous computing and storage architectures. The construction of such systems involves many distributed design choices. The end products (e.g., recommendation systems, medical analysis tools, real-time game engines, speech recognizers) thus involve many tunable configuration parameters. These parameters are often specified and hard-coded into the software by various developers or teams. If optimized jointly, these parameters can result in significant improvements. Bayesian optimization is a powerful tool for the joint optimization of design choices that is gaining great popularity in recent years. It promises greater automation so as to increase both product quality and human productivity. This review paper introduces Bayesian optimization, highlights some of its methodological aspects, and showcases a wide range of applications.

3,703 citations

Journal Article•DOI•

Federated Machine Learning: Concept and Applications

Qiang Yang¹, Yang Liu, Tianjian Chen, Yongxin Tong²•Institutions (2)

Hong Kong University of Science and Technology¹, Beihang University²

28 Jan 2019-ACM Transactions on Intelligent Systems and Technology

TL;DR: This work introduces a comprehensive secure federated-learning framework, which includes horizontal federated learning, vertical federated learning, and federated transfer learning, and provides a comprehensive survey of existing works on this subject.

Abstract: Today’s artificial intelligence still faces two major challenges. One is that, in most industries, data exists in the form of isolated islands. The other is the strengthening of data privacy and security. We propose a possible solution to these challenges: secure federated learning. Beyond the federated-learning framework first proposed by Google in 2016, we introduce a comprehensive secure federated-learning framework, which includes horizontal federated learning, vertical federated learning, and federated transfer learning. We provide definitions, architectures, and applications for the federated-learning framework, and provide a comprehensive survey of existing works on this subject. In addition, we propose building data networks among organizations based on federated mechanisms as an effective solution to allowing knowledge to be shared without compromising user privacy.

2,593 citations

Proceedings Article•DOI•

Membership Inference Attacks Against Machine Learning Models

Reza Shokri¹, Marco Stronati², Congzheng Song¹, Vitaly Shmatikov¹•Institutions (2)

Cornell University¹, French Institute for Research in Computer Science and Automation²

22 May 2017

TL;DR: This work quantitatively investigates how machine learning models leak information about the individual data records on which they were trained and empirically evaluates the inference techniques on classification models trained by commercial "machine learning as a service" providers such as Google and Amazon.

Abstract: We quantitatively investigate how machine learning models leak information about the individual data records on which they were trained. We focus on the basic membership inference attack: given a data record and black-box access to a model, determine if the record was in the model's training dataset. To perform membership inference against a target model, we make adversarial use of machine learning and train our own inference model to recognize differences in the target model's predictions on the inputs that it trained on versus the inputs that it did not train on. We empirically evaluate our inference techniques on classification models trained by commercial "machine learning as a service" providers such as Google and Amazon. Using realistic datasets and classification tasks, including a hospital discharge dataset whose membership is sensitive from the privacy perspective, we show that these models can be vulnerable to membership inference attacks. We then investigate the factors that influence this leakage and evaluate mitigation strategies.

2,059 citations
