Home
/
Authors
/
Jean Pouget-Abadie

Author

Jean Pouget-Abadie

Other affiliations: Google, Harvard University

Bio: Jean Pouget-Abadie is an academic researcher from Université de Montréal. The author has contributed to research in topics: Graph (abstract data type) & Computer science. The author has an hindex of 10, co-authored 21 publications receiving 32708 citations. Previous affiliations of Jean Pouget-Abadie include Google & Harvard University.

Papers

PDF

Open Access

More filters

Journal Article•DOI•

Causal Inference with Bipartite Designs

[...]

Nick Doudchenko, Minzhengxiong Zhang¹, Evgeni Drynkin, Edoardo M. Airoldi², Vahab Mirrokni³, Jean Pouget-Abadie - Show less +2 more•Institutions (3)

Temple University¹, Harvard University², Google³

16 Nov 2020-Social Science Research Network

TL;DR: The authors leverage the generalized propensity score literature to obtain unbiased estimates of causal effects for bipartite experiments under a standard set of assumptions, and also discuss the construction of confidence sets with proper coverage probabilities.

...read moreread less

Abstract: Bipartite experiments are a recent object of study in causal inference, whereby treatment is applied to one set of units and outcomes of interest are measured on a different set of units. These experiments are particularly useful in settings where strong interference effects occur between units of a bipartite graph. In market experiments, for example, assigning treatment at the seller-level and measuring outcomes at the buyer-level (or vice-versa) may lead to causal models that better account for the interference that naturally occurs between buyers and sellers. While bipartite experiments have been shown to improve the estimation of causal effects in certain settings, the analysis must be done carefully so as to not introduce unnecessary bias. We leverage the generalized propensity score literature to show that we can obtain unbiased estimates of causal effects for bipartite experiments under a standard set of assumptions. We also discuss the construction of confidence sets with proper coverage probabilities. We evaluate these methods using a bipartite graph from a publicly available dataset studied in previous work on bipartite experiments, showing through simulations a significant bias reduction and improved coverage.

...read moreread less

8 citations

Posted Content•

Optimizing cluster-based randomized experiments under a monotonicity assumption

[...]

Jean Pouget-Abadie, David C. Parkes, Vahab Mirrokni, Edoardo M. Airoldi

07 Mar 2018-arXiv: Methodology

TL;DR: A monotonicity condition is introduced under which a novel two-stage experimental design allows us to determine which of two cluster-based designs yields the least biased estimator.

...read moreread less

Abstract: Cluster-based randomized experiments are popular designs for mitigating the bias of standard estimators when interference is present and classical causal inference and experimental design assumptions (such as SUTVA or ITR) do not hold. Without an exact knowledge of the interference structure, it can be challenging to understand which partitioning of the experimental units is optimal to minimize the estimation bias. In the paper, we introduce a monotonicity condition under which a novel two-stage experimental design allows us to determine which of two cluster-based designs yields the least biased estimator. We then consider the setting of online advertising auctions and show that reserve price experiments verify the monotonicity condition and the proposed framework and methodology applies. We validate our findings on an advertising auction dataset.

...read moreread less

3 citations

Posted Content•

Randomized Experimental Design via Geographic Clustering.

[...]

David Rolnick¹, Kevin Aydin², Jean Pouget-Abadie², Shahab Kamali², Vahab Mirrokni², Amir Najmi² - Show less +2 more•Institutions (2)

University of Pennsylvania¹, Google²

11 Nov 2016-arXiv: Social and Information Networks

TL;DR: This paper presents GeoCUTS, a novel algorithm that forms geographical clusters to minimize interference while preserving balance in cluster size, and uses a random sample of anonymized traffic from Google Search to form a graph representing user movements, then constructs a geographically coherent clustering of the graph.

...read moreread less

Abstract: Web-based services often run randomized experiments to improve their products. A popular way to run these experiments is to use geographical regions as units of experimentation, since this does not require tracking of individual users or browser cookies. Since users may issue queries from multiple geographical locations, geo-regions cannot be considered independent and interference may be present in the experiment. In this paper, we study this problem, and first present GeoCUTS, a novel algorithm that forms geographical clusters to minimize interference while preserving balance in cluster size. We use a random sample of anonymized traffic from Google Search to form a graph representing user movements, then construct a geographically coherent clustering of the graph. Our main technical contribution is a statistical framework to measure the effectiveness of clusterings. Furthermore, we perform empirical evaluations showing that the performance of GeoCUTS is comparable to hand-crafted geo-regions with respect to both novel and existing metrics.

...read moreread less

3 citations

Dissertation•

Dealing with Interference on Experimentation Platforms

[...]

Jean Pouget-Abadie

16 Sep 2018

TL;DR: This work explores how multi-level designs, Experiment-of-Experiments, can allow us to detect and mitigate the effects of interference on experimentation platforms, and develops a design-based statistical test for the no-interference assumption.

...read moreread less

Abstract: The theory of causal inference, as formalized by the potential outcomes framework, relies on an assumption that the experimental units are independent. When independence is not tenable, we say there is interference, and the core results of causal inference can no longer be guaranteed. Recent research efforts have focused on extending the theory to a setting where interference is present. The many advantages of experimentation platforms over more traditional settings of causal inference—no issue of non-compliance, large number of experimental units, ease of collecting outcomes over the course of an experiment—make them an ideal setting for studying causality with interference. With this setting in mind, we explore how multi-level designs, Experiment-of-Experiments, can allow us to detect and mitigate the effects of interference on experimentation platforms. In particular, we develop a design-based statistical test for the no-interference assumption. We further design an empirical procedure for comparing the effectiveness of cluster-based randomized designs. Finally, we show that randomized saturation designs can be optimized to improve the bias and variance of standard estimators, and extend these results to a new category of randomized designs: optimized saturation designs.

...read moreread less

2 citations

Posted Content•

Design and Analysis of Bipartite Experiments under a Linear Exposure-Response Model

[...]

Christopher Harshaw, Fredrik Sävje, David Eisenstat, Vahab Mirrokni, Jean Pouget-Abadie - Show less +1 more

11 Mar 2021-arXiv: Methodology

TL;DR: In this paper, an unbiased linear estimator of the average treatment effect in the bipartite experimental framework is proposed, which is consistent and asymptotically normal, provided that the graph is sufficiently sparse.

...read moreread less

Abstract: In a bipartite experiment, units that are assigned treatments differ from the units for which we measure outcomes. The two groups of units are connected by a bipartite graph, governing how the treated units can affect the outcome units. Often motivated by experiments in marketplaces, the bipartite experimental framework has been used for example to investigate the causal effects of supply-side changes on demand-side behavior. In this paper, we consider the problem of estimating the average total treatment effect in the bipartite experimental framework under a linear exposure-response model. We introduce the Exposure Reweighted Linear (ERL) Estimator, an unbiased linear estimator of the average treatment effect in this setting. We show that the estimator is consistent and asymptotically normal, provided that the bipartite graph is sufficiently sparse. We derive a variance estimator which facilitates confidence intervals based on a normal approximation. In addition, we introduce Exposure-Design, a cluster-based design which aims to increase the precision of the ERL estimator by realizing desirable exposure distributions. Finally, we demonstrate the effectiveness of the described estimator and design with an application using a publicly available Amazon user-item review graph.

...read moreread less

2 citations

…
1
2
3
4
5

Cited by

PDF

Open Access

More filters

Proceedings Article•

Neural Machine Translation by Jointly Learning to Align and Translate

[...]

Dzmitry Bahdanau¹, Kyunghyun Cho², Yoshua Bengio²•Institutions (2)

Jacobs University Bremen¹, Université de Montréal²

01 Jan 2015

TL;DR: It is conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and it is proposed to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.

...read moreread less

Abstract: Neural machine translation is a recently proposed approach to machine translation. Unlike the traditional statistical machine translation, the neural machine translation aims at building a single neural network that can be jointly tuned to maximize the translation performance. The models proposed recently for neural machine translation often belong to a family of encoder-decoders and consists of an encoder that encodes a source sentence into a fixed-length vector from which a decoder generates a translation. In this paper, we conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and propose to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly. With this new approach, we achieve a translation performance comparable to the existing state-of-the-art phrase-based system on the task of English-to-French translation. Furthermore, qualitative analysis reveals that the (soft-)alignments found by the model agree well with our intuition.

...read moreread less

20,027 citations

Posted Content•

Neural Machine Translation by Jointly Learning to Align and Translate

[...]

Dzmitry Bahdanau¹, Kyunghyun Cho², Yoshua Bengio²•Institutions (2)

Jacobs University Bremen¹, Université de Montréal²

01 Sep 2014-arXiv: Computation and Language

TL;DR: In this paper, the authors propose to use a soft-searching model to find the parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.

...read moreread less

14,077 citations

Proceedings Article•

Sequence to Sequence Learning with Neural Networks

[...]

Ilya Sutskever¹, Oriol Vinyals¹, Quoc V. Le¹•Institutions (1)

Google¹

08 Dec 2014

TL;DR: The authors used a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector.

...read moreread less

Abstract: Deep Neural Networks (DNNs) are powerful models that have achieved excellent performance on difficult learning tasks. Although DNNs work well whenever large labeled training sets are available, they cannot be used to map sequences to sequences. In this paper, we present a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure. Our method uses a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector. Our main result is that on an English to French translation task from the WMT-14 dataset, the translations produced by the LSTM achieve a BLEU score of 34.8 on the entire test set, where the LSTM's BLEU score was penalized on out-of-vocabulary words. Additionally, the LSTM did not have difficulty on long sentences. For comparison, a phrase-based SMT system achieves a BLEU score of 33.3 on the same dataset. When we used the LSTM to rerank the 1000 hypotheses produced by the aforementioned SMT system, its BLEU score increases to 36.5, which is close to the previous state of the art. The LSTM also learned sensible phrase and sentence representations that are sensitive to word order and are relatively invariant to the active and the passive voice. Finally, we found that reversing the order of the words in all source sentences (but not target sentences) improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.

...read moreread less

12,299 citations

Proceedings Article•DOI•

Image-to-Image Translation with Conditional Adversarial Networks

[...]

Phillip Isola¹, Jun-Yan Zhu¹, Tinghui Zhou¹, Alexei A. Efros¹•Institutions (1)

University of California, Berkeley¹

21 Jul 2017

TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.

...read moreread less

Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since the release of the pix2pix software associated with this paper, hundreds of twitter users have posted their own artistic experiments using our system. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without handengineering our loss functions either.

...read moreread less

11,958 citations

Posted Content•

Sequence to Sequence Learning with Neural Networks

[...]

Ilya Sutskever¹, Oriol Vinyals¹, Quoc V. Le¹•Institutions (1)

Google¹

10 Sep 2014-arXiv: Computation and Language

TL;DR: This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.

...read moreread less

Abstract: Deep Neural Networks (DNNs) are powerful models that have achieved excellent performance on difficult learning tasks. Although DNNs work well whenever large labeled training sets are available, they cannot be used to map sequences to sequences. In this paper, we present a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure. Our method uses a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector. Our main result is that on an English to French translation task from the WMT'14 dataset, the translations produced by the LSTM achieve a BLEU score of 34.8 on the entire test set, where the LSTM's BLEU score was penalized on out-of-vocabulary words. Additionally, the LSTM did not have difficulty on long sentences. For comparison, a phrase-based SMT system achieves a BLEU score of 33.3 on the same dataset. When we used the LSTM to rerank the 1000 hypotheses produced by the aforementioned SMT system, its BLEU score increases to 36.5, which is close to the previous best result on this task. The LSTM also learned sensible phrase and sentence representations that are sensitive to word order and are relatively invariant to the active and the passive voice. Finally, we found that reversing the order of the words in all source sentences (but not target sentences) improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.

...read moreread less

11,936 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse