Open Access Journal Article
Counterfactual reasoning and learning systems: the example of computational advertising
Léon Bottou, Jonas Peters, Joaquin Quiñonero-Candela, Denis X. Charles, D. Max Chickering, Elon Portugaly, Dipankar Ray, Patrice Y. Simard, Ed Snelson +8 more
TLDR
This work shows how to leverage causal inference to understand the behavior of complex learning systems interacting with their environment and to predict the consequences of changes to the system, so that both humans and algorithms can select the changes that would have improved system performance.
Abstract:
This work shows how to leverage causal inference to understand the behavior of complex learning systems interacting with their environment and to predict the consequences of changes to the system. Such predictions allow both humans and algorithms to select the changes that would have improved the system performance. This work is illustrated by experiments on the ad placement system associated with the Bing search engine.
Citations
Posted Content
Causal Bandits: Learning Good Interventions via Causal Inference
TL;DR: This paper studies the problem of using causal models to improve the rate at which good interventions can be learned online in a stochastic environment. It proposes a new algorithm that exploits the causal feedback, and a bound on the algorithm's simple regret is proved.
Proceedings Article
Effective Evaluation Using Logged Bandit Feedback from Multiple Loggers
TL;DR: In this article, the authors address the question of how to estimate the performance of a new target policy when log data from multiple historic policies is available, and show that combining data from different logging policies can be highly suboptimal.
Proceedings Article
Improving a Neural Semantic Parser by Counterfactual Learning from Human Bandit Feedback
Carolin Lawrence, Stefan Riezler +1 more
TL;DR: The authors propose counterfactual learning from human bandit feedback to improve neural semantic parsing, where user feedback on the quality of outputs of a historic system is logged and used to improve a target system.
Proceedings Article
Temporal-Contextual Recommendation in Real-Time
TL;DR: This work presents a black-box recommender system that can adapt to a diverse set of scenarios without the need for manual tuning, and introduces a compact model, called hierarchical recurrent network with meta data (HRNN-meta), to address the real-time and diverse metadata needs.
Proceedings Article
Doubly robust off-policy evaluation with shrinkage
TL;DR: This work proposes a new framework for designing estimators for off-policy evaluation in contextual bandits. The estimators are based on the asymptotically optimal doubly robust estimator, but shrink the importance weights to minimize a bound on the mean squared error, which results in a better bias-variance tradeoff in finite samples.
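To make the idea in the entry above concrete, here is a minimal sketch of a doubly robust off-policy value estimate whose importance weights are capped. It is an illustration under stated assumptions, not the cited paper's estimator: simple clipping stands in for shrinkage, and the function name `dr_clipped`, its arguments, and the `clip` threshold are all hypothetical.

```python
import numpy as np

def dr_clipped(rewards, logged_propensity, target_prob, q_logged, q_target, clip):
    """Doubly robust value estimate with clipped importance weights (sketch).

    rewards           : observed reward for each logged action, shape (n,)
    logged_propensity : probability the logging policy assigned to the logged action
    target_prob       : probability the target policy assigns to the logged action
    q_logged          : reward-model prediction for the logged action
    q_target          : reward-model prediction averaged over the target policy's actions
    clip              : upper bound on the importance weights (hypothetical tuning knob)
    """
    rewards = np.asarray(rewards, dtype=float)
    # Plain importance weights: how much more (or less) likely the target
    # policy is to take the logged action than the logging policy was.
    w = np.asarray(target_prob, dtype=float) / np.asarray(logged_propensity, dtype=float)
    # Assumption: clipping is used here as one simple way to shrink large weights.
    w_shrunk = np.minimum(w, clip)
    # Model-based baseline plus a weighted correction on the model's residual.
    correction = w_shrunk * (rewards - np.asarray(q_logged, dtype=float))
    return float(np.mean(np.asarray(q_target, dtype=float) + correction))

# Toy logged data (made up for illustration only).
rewards = np.array([1.0, 0.0, 1.0, 0.0])
mu = np.array([0.5, 0.5, 0.25, 0.25])   # logging-policy propensities
pi = np.array([1.0, 0.0, 0.5, 0.0])     # target-policy probabilities
zeros = np.zeros(4)                      # a trivial reward model

unclipped = dr_clipped(rewards, mu, pi, zeros, zeros, clip=np.inf)
clipped = dr_clipped(rewards, mu, pi, zeros, zeros, clip=1.5)
```

With a zero reward model and no clipping, the sketch reduces to plain inverse propensity scoring. Lowering `clip` trades some bias for lower variance, which is the tradeoff the entry describes.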
References
Book
Reinforcement Learning: An Introduction
TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Monograph
Causality: models, reasoning, and inference
TL;DR: The art and science of cause and effect have long been studied in the social sciences; see, e.g., the theory of inferred causation, causal diagrams, and the identification of causal effects.
Journal Article
Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning
TL;DR: This article presents a general class of associative reinforcement learning algorithms for connectionist networks containing stochastic units. These algorithms are shown to make weight adjustments in a direction that lies along the gradient of expected reinforcement in both immediate-reinforcement tasks and certain limited forms of delayed-reinforcement tasks, and they do this without explicitly computing gradient estimates.
Book
Introduction to Reinforcement Learning
TL;DR: In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning.