
Showing papers by "Yishay Mansour published in 2013"


Proceedings ArticleDOI
06 Jan 2013
TL;DR: A regret minimization algorithm for setting the reserve price in a sequence of second-price auctions, under the assumption that all bids are independently drawn from the same unknown and arbitrary distribution, achieves a regret of Õ(√T) in a sequence of T auctions.
Abstract: We show a regret minimization algorithm for setting the reserve price in second-price auctions. We make the assumption that all bidders draw their bids from the same unknown and arbitrary distribution. Our algorithm is computationally efficient, and achieves a regret of Õ(√T), even when the number of bidders is stochastic with a known distribution.
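To make the setting concrete, the sketch below runs a plain UCB rule over a discretized grid of reserve prices in simulated second-price auctions. The grid, bid distribution, and constants are illustrative assumptions; this is a naive baseline for intuition, not the paper's computationally efficient algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_bidders = 5000, 5
grid = np.linspace(0, 1, 21)          # candidate reserve prices (hypothetical grid)
counts = np.zeros(len(grid))
means = np.zeros(len(grid))

def revenue(reserve, bids):
    """Seller revenue of a second-price auction with a reserve price."""
    b = np.sort(bids)[::-1]
    if b[0] < reserve:
        return 0.0                    # no sale
    return max(b[1], reserve)         # winner pays max(second bid, reserve)

for t in range(T):
    ucb = means + np.sqrt(2 * np.log(t + 1) / np.maximum(counts, 1))
    ucb[counts == 0] = np.inf         # try each candidate reserve once
    a = int(np.argmax(ucb))
    bids = rng.beta(2, 5, size=n_bidders)   # unknown-to-the-learner distribution
    r = revenue(grid[a], bids)
    counts[a] += 1
    means[a] += (r - means[a]) / counts[a]

print("best empirical reserve:", grid[int(np.argmax(means))])
```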

130 citations


Proceedings Article
05 Dec 2013
TL;DR: A characterization of regret in the directed observability model is given in terms of the dominating and independence numbers of the observability graph (which must be accessible before selecting an action); in the undirected case, it is shown that the learner can achieve optimal regret without even accessing the observability graph before selecting an action.
Abstract: We consider the partial observability model for multi-armed bandits, introduced by Mannor and Shamir [14]. Our main result is a characterization of regret in the directed observability model in terms of the dominating and independence numbers of the observability graph (which must be accessible before selecting an action). In the undirected case, we show that the learner can achieve optimal regret without even accessing the observability graph before selecting an action. Both results are shown using variants of the Exp3 algorithm operating on the observability graph in a time-efficient manner.
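As a rough illustration of the approach, the sketch below runs an Exp3-style update in which the loss of every arm observed through the graph is importance-weighted by its probability of being observed. The random graph, losses, and learning rate are illustrative assumptions; the paper's variants differ in their exact estimators and tuning.

```python
import numpy as np

rng = np.random.default_rng(1)
K, T, eta = 5, 3000, 0.05
# obs[i, j] = True means that playing arm i also reveals the loss of arm j.
obs = np.eye(K, dtype=bool)                 # self-loops: own loss is always seen
obs |= rng.random((K, K)) < 0.4             # hypothetical extra feedback edges

w = np.zeros(K)                             # log-weights
for t in range(T):
    p = np.exp(w - w.max()); p /= p.sum()
    i = rng.choice(K, p=p)
    losses = rng.random(K) * np.linspace(0.2, 1.0, K)   # stand-in for an adversary
    q = obs.T.astype(float) @ p             # q[j] = Pr(loss of arm j is observed)
    for j in np.flatnonzero(obs[i]):        # every arm revealed by playing i
        w[j] -= eta * losses[j] / q[j]      # importance-weighted loss estimate
print("final play distribution:", np.round(p, 3))
```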

71 citations


Book ChapterDOI
21 Aug 2013
TL;DR: In this paper, a polylogarithmic local computation matching algorithm is presented which guarantees a (1 - ε)-approximation to the maximum matching in graphs of bounded degree.
Abstract: We present a polylogarithmic local computation matching algorithm which guarantees a (1 - ε)-approximation to the maximum matching in graphs of bounded degree.

57 citations


Journal ArticleDOI
TL;DR: In this article, a theory of how well-motivated multiagent dynamics can make use of global information about the game, which might be common knowledge or injected into the system by a helpful central agency, is initiated.
Abstract: Many natural games have a dramatic difference between the quality of their best and worst Nash equilibria, even in pure strategies. Yet, nearly all results to date on dynamics in games show only convergence to some equilibrium, especially within a polynomial number of steps. In this work we initiate a theory of how well-motivated multiagent dynamics can make use of global information about the game---which might be common knowledge or injected into the system by a helpful central agency---and show that in a wide range of interesting games this can allow the dynamics to quickly reach (within a polynomial number of steps) states of cost comparable to the best Nash equilibrium. We present several natural models for dynamics that can use such additional information and analyze their ability to reach low-cost states for two important and widely studied classes of potential games: network design with fair cost-sharing and party affiliation games (which include consensus and cut games).
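A toy instance of fair cost-sharing on two parallel edges, sketched below, shows the phenomenon: unaided best-response dynamics can be stuck at a high-cost equilibrium, while seeding a few players using global information lets the same dynamics reach the low-cost state. The instance and the seeding rule are illustrative assumptions, not the paper's models.

```python
n = 20
cost = {"cheap": 1.0, "dear": n / 2}   # two parallel edges, costs shared fairly

def cost_if(i, edge, choices):
    """Cost player i would pay for choosing `edge`, given the others' choices."""
    others = sum(1 for j, c in enumerate(choices) if c == edge and j != i)
    return cost[edge] / (others + 1)

def best_response_dynamics(choices):
    changed = True
    while changed:                      # terminates: the potential decreases
        changed = False
        for i in range(len(choices)):
            best = min(cost, key=lambda e: cost_if(i, e, choices))
            if best != choices[i]:
                choices[i] = best
                changed = True
    return choices

bad = best_response_dynamics(["dear"] * n)   # stuck: a high-cost Nash equilibrium
good = best_response_dynamics(["cheap"] * 3 + ["dear"] * (n - 3))  # seeded start
for name, ch in [("unseeded", bad), ("seeded", good)]:
    print(name, "-> total cost:", sum(cost[e] for e in set(ch)))
```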

37 citations


Posted Content
TL;DR: In this article, the authors considered the partial observability model for multi-armed bandits, introduced by Mannor and Shamir, and characterized regret in directed observability in terms of the dominating and independence numbers of the observability graph.
Abstract: We consider the partial observability model for multi-armed bandits, introduced by Mannor and Shamir. Our main result is a characterization of regret in the directed observability model in terms of the dominating and independence numbers of the observability graph. We also show that in the undirected case, the learner can achieve optimal regret without even accessing the observability graph before selecting an action. Both results are shown using variants of the Exp3 algorithm operating on the observability graph in a time-efficient manner.

29 citations


Proceedings Article
16 Jun 2013
TL;DR: It is shown in this model that an ontology, which specifies the relationships between multiple outputs, in some cases is sufficient to completely learn a classification using a large unlabeled data source.
Abstract: We present and analyze a theoretical model designed to understand and explain the effectiveness of ontologies for learning multiple related tasks from primarily unlabeled data. We present both information-theoretic results as well as efficient algorithms. We show in this model that an ontology, which specifies the relationships between multiple outputs, in some cases is sufficient to completely learn a classification using a large unlabeled data source.

28 citations


Posted Content
TL;DR: A polylogarithmic local computation matching algorithm which guarantees a (1 - ε)-approximation to the maximum matching in graphs of bounded degree is presented.
Abstract: We present a polylogarithmic local computation matching algorithm which guarantees a $(1-\epsilon)$-approximation to the maximum matching in graphs of bounded degree.

22 citations


Proceedings Article
13 Jun 2013
TL;DR: This work studies regret minimization bounds in which the dependence on the number of experts is replaced by measures of the realized complexity of the expert class, defined in retrospect given the realized losses.
Abstract: We study regret minimization bounds in which the dependence on the number of experts is replaced by measures of the realized complexity of the expert class. The measures we consider are defined in retrospect given the realized losses. We concentrate on two interesting cases. In the first, our measure of complexity is the number of different “leading experts”, namely, experts that were best at some point in time. We derive regret bounds that depend only on this measure, independent of the total number of experts. We also consider a case where all experts remain grouped in just a few clusters in terms of their realized cumulative losses. Here too, our regret bounds depend only on the number of clusters determined in retrospect, which serves as a measure of complexity. Our results are obtained as special cases of a more general analysis for a setting of branching experts, where the set of experts may grow over time according to a tree-like structure, determined by an adversary. For this setting of branching experts, we give algorithms and analysis that cover both the full information and the bandit scenarios.
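For reference, the sketch below runs the standard exponential-weights (Hedge) baseline, whose regret scales with the log of the number of experts, and counts the distinct "leading experts" in hindsight, the retrospective quantity that replaces that number in the paper's bounds. Losses and constants are synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, eta = 50, 2000, 0.1
losses = rng.random((T, N))
losses[:, 0] -= 0.2                    # make one expert eventually dominant

w = np.zeros(N)                        # log-weights for Hedge
cum = np.zeros(N)                      # cumulative losses per expert
learner_loss = 0.0
leaders = set()
for t in range(T):
    p = np.exp(w - w.max()); p /= p.sum()
    l = np.clip(losses[t], 0, 1)
    learner_loss += p @ l              # expected loss of the learner this round
    w -= eta * l
    cum += l
    leaders.add(int(np.argmin(cum)))   # the expert leading after round t

print("learner loss:", round(learner_loss, 1),
      "best expert loss:", round(cum.min(), 1),
      "distinct leaders:", len(leaders))
```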

20 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider a non-myopic version of Cournot competition, where each firm selects either profit maximization (as in the classical model) or revenue maximization by masquerading as a firm with zero production costs.

19 citations


Posted Content
TL;DR: In this article, a simple decomposition of the expected distortion is presented, showing that K-means and EM must implicitly manage a trade-off between how similar the data assigned to each cluster are, and how the data are balanced among the clusters.
Abstract: Assignment methods are at the heart of many algorithms for unsupervised learning and clustering - in particular, the well-known K-means and Expectation-Maximization (EM) algorithms. In this work, we study several different methods of assignment, including the "hard" assignments used by K-means and the "soft" assignments used by EM. While it is known that K-means minimizes the distortion on the data and EM maximizes the likelihood, little is known about the systematic differences of behavior between the two algorithms. Here we shed light on these differences via an information-theoretic analysis. The cornerstone of our results is a simple decomposition of the expected distortion, showing that K-means (and its extension for inferring general parametric densities from unlabeled sample data) must implicitly manage a trade-off between how similar the data assigned to each cluster are, and how the data are balanced among the clusters. How well the data are balanced is measured by the entropy of the partition defined by the hard assignments. In addition to letting us predict and verify systematic differences between K-means and EM on specific examples, the decomposition allows us to give a rather general argument showing that K-means will consistently find densities with less "overlap" than EM. We also study a third natural assignment method that we call posterior assignment, that is close in spirit to the soft assignments of EM, but leads to a surprisingly different algorithm.
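The sketch below contrasts the two assignment rules on an unbalanced one-dimensional mixture, assuming unit variances and equal priors for brevity, and prints the entropy of the induced partition, the balance term in the decomposition (for EM, its soft analogue).

```python
import numpy as np

rng = np.random.default_rng(3)
# Unbalanced 1-D mixture of two unit-variance Gaussians (30% / 70%):
x = np.concatenate([rng.normal(-1.0, 1.0, 300), rng.normal(1.0, 1.0, 700)])

def fit(x, soft_assign, iters=100):
    mu = np.array([-3.0, 3.0])                    # deliberately poor init
    for _ in range(iters):
        d2 = (x[:, None] - mu[None, :]) ** 2
        if soft_assign:                           # EM-style responsibilities
            r = np.exp(-0.5 * d2)
            r /= r.sum(axis=1, keepdims=True)
        else:                                     # K-means hard assignments
            r = (d2 == d2.min(axis=1, keepdims=True)).astype(float)
        mu = (r * x[:, None]).sum(axis=0) / r.sum(axis=0)
    # Entropy of the induced partition (the "balance" term in the decomposition):
    p = r.sum(axis=0) / len(x)
    return mu, -(p * np.log(p)).sum()

for name, soft in [("K-means", False), ("EM", True)]:
    mu, h = fit(x, soft)
    print(f"{name}: means={np.round(mu, 2)}, partition entropy={h:.3f}")
```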

17 citations


Proceedings ArticleDOI
16 Jun 2013
TL;DR: The algorithmic problem of maximizing revenue in a network using differential pricing, where the prices offered to neighboring vertices cannot be substantially different, is introduced and it is shown that the optimal pricing can be computed efficiently, even for arbitrary revenue functions.
Abstract: We introduce and study the algorithmic problem of maximizing revenue in a network using differential pricing, where the prices offered to neighboring vertices cannot be substantially different. Our most surprising result is that the optimal pricing can be computed efficiently, even for arbitrary revenue functions. In contrast, we show that if one is allowed to introduce discontinuities (by deleting vertices) the optimization problem becomes computationally hard, and we exhibit algorithms for special classes of graphs. We also study a stochastic model, and show that a similar contrast exists there: For pricing without discontinuities the benefit of differential pricing over a single price is negligible, while for differential pricing with discontinuities the difference is substantial.

Posted Content
TL;DR: A frequentist regret bound is proved for Thompson sampling in a very general setting involving parameter, action and observation spaces and a likelihood function over them, and the first nontrivial regret bounds for nonlinear MAX reward feedback from subsets are derived.
Abstract: We consider stochastic multi-armed bandit problems with complex actions over a set of basic arms, where the decision maker plays a complex action rather than a basic arm in each round. The reward of the complex action is some function of the basic arms' rewards, and the feedback observed may not necessarily be the reward per-arm. For instance, when the complex actions are subsets of the arms, we may only observe the maximum reward over the chosen subset. Thus, feedback across complex actions may be coupled due to the nature of the reward function. We prove a frequentist regret bound for Thompson sampling in a very general setting involving parameter, action and observation spaces and a likelihood function over them. The bound holds for discretely-supported priors over the parameter space and without additional structural properties such as closed-form posteriors, conjugate prior structure or independence across arms. The regret bound scales logarithmically with time but, more importantly, with an improved constant that non-trivially captures the coupling across complex actions due to the structure of the rewards. As applications, we derive improved regret bounds for classes of complex bandit problems involving selecting subsets of arms, including the first nontrivial regret bounds for nonlinear MAX reward feedback from subsets.
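The sketch below instantiates the setting in miniature: a discretely-supported prior over candidate parameter vectors, subsets as complex actions, and only the MAX reward observed. The exact Bayes update over the discrete support is what makes the coupled feedback tractable here; the candidate set, subset size, and horizon are illustrative assumptions.

```python
import itertools
import numpy as np

rng = np.random.default_rng(4)
K, m, T = 4, 2, 3000
subsets = list(itertools.combinations(range(K), m))

# Discretely-supported prior over the parameter space: each candidate is a
# full vector of Bernoulli means for the K basic arms.
candidates = [rng.random(K) for _ in range(40)]
true_theta = candidates[7]                       # realizable case for simplicity
log_post = np.zeros(len(candidates))             # uniform prior

def p_max_one(theta, S):
    """P(max of the chosen arms' Bernoulli rewards equals 1)."""
    return 1.0 - np.prod([1.0 - theta[i] for i in S])

for t in range(T):
    post = np.exp(log_post - log_post.max()); post /= post.sum()
    theta = candidates[rng.choice(len(candidates), p=post)]   # posterior sample
    S = max(subsets, key=lambda s: p_max_one(theta, s))       # greedy on sample
    x = int(rng.random() < p_max_one(true_theta, S))          # observe MAX only
    for c, cand in enumerate(candidates):                     # exact Bayes update
        q = p_max_one(cand, S)
        log_post[c] += np.log(q if x == 1 else 1.0 - q)

best = max(subsets, key=lambda s: p_max_one(true_theta, s))
print("chosen subset:", S, "optimal subset:", best)
```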

Posted Content
TL;DR: Local Computation Mechanism Design, as introduced by the authors, is a technique for designing game-theoretic mechanisms which run in polylogarithmic time and space, where the mechanism replies to each query in polylogarithmic time and the replies to different queries are consistent with the same global feasible solution.
Abstract: We introduce the notion of Local Computation Mechanism Design - designing game theoretic mechanisms which run in polylogarithmic time and space. Local computation mechanisms reply to each query in polylogarithmic time and space, and the replies to different queries are consistent with the same global feasible solution. In addition, the computation of the payments is also done in polylogarithmic time and space. Furthermore, the mechanisms need to maintain incentive compatibility with respect to the allocation and payments. We present local computation mechanisms for a variety of classical game-theoretical problems: 1. stable matching, 2. job scheduling, 3. combinatorial auctions for unit-demand and k-minded bidders, and 4. the housing allocation problem. For stable matching, some of our techniques may have general implications. Specifically, we show that when the men's preference lists are bounded, we can achieve an arbitrarily good approximation to the stable matching within a fixed number of iterations of the Gale-Shapley algorithm.
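For the stable matching result, the sketch below is a standard men-proposing Gale-Shapley routine with an optional cap on the number of proposal steps, the truncation knob the approximation statement refers to. The toy preference lists are illustrative.

```python
from collections import deque

def gale_shapley(men_prefs, women_prefs, max_steps=None):
    """Men-proposing deferred acceptance, optionally truncated after a fixed
    number of proposal steps."""
    rank = {w: {m: r for r, m in enumerate(ps)} for w, ps in women_prefs.items()}
    next_idx = {m: 0 for m in men_prefs}
    engaged = {}                              # woman -> man
    free, steps = deque(men_prefs), 0
    while free and (max_steps is None or steps < max_steps):
        steps += 1
        m = free.popleft()
        if next_idx[m] >= len(men_prefs[m]):
            continue                          # bounded list exhausted; m unmatched
        w = men_prefs[m][next_idx[m]]
        next_idx[m] += 1
        if w not in engaged:
            engaged[w] = m
        elif rank[w][m] < rank[w][engaged[w]]:
            free.append(engaged[w])           # w trades up; old partner re-enters
            engaged[w] = m
        else:
            free.append(m)                    # rejected; m tries his next choice
    return {m: w for w, m in engaged.items()}

men = {"a": ["X", "Y"], "b": ["X", "Y"], "c": ["Y"]}     # bounded preference lists
women = {"X": ["a", "b", "c"], "Y": ["c", "a", "b"]}
print(gale_shapley(men, women, max_steps=10))
```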

Proceedings ArticleDOI
16 Jun 2013
TL;DR: In this article, the authors study a mechanism design model in which agents arrive sequentially and each in turn chooses one action from a set of actions with unknown rewards, and characterize the optimal disclosure policy of a planner whose goal is to maximize social welfare.
Abstract: We study a novel mechanism design model in which agents arrive sequentially and each in turn chooses one action from a set of actions with unknown rewards. The information revealed by the principal affects the incentives of an agent to explore and generate new information. We characterize the optimal disclosure policy of a planner whose goal is to maximize social welfare. One interpretation for our result is the implementation of what is known as the 'wisdom of the crowd'. This topic has become more relevant with the rapid adoption of the Internet over the past decade.

Posted Content
TL;DR: A simple generalization of finite-horizon value iteration that computes a Nash strategy for each player in general-sum stochastic games and an algorithm for computing near-Nash equilibria in large or infinite state spaces.
Abstract: Stochastic games generalize Markov decision processes (MDPs) to a multiagent setting by allowing the state transitions to depend jointly on all player actions, and having rewards determined by multiplayer matrix games at each state. We consider the problem of computing Nash equilibria in stochastic games, the analogue of planning in MDPs. We begin by providing a generalization of finite-horizon value iteration that computes a Nash strategy for each player in general-sum stochastic games. The algorithm takes an arbitrary Nash selection function as input, which allows the translation of local choices between multiple Nash equilibria into the selection of a single global Nash equilibrium. Our main technical result is an algorithm for computing near-Nash equilibria in large or infinite state spaces. This algorithm builds on our finite-horizon value iteration algorithm, and adapts the sparse sampling methods of Kearns, Mansour and Ng (1999) to stochastic games. We conclude by describing a counterexample showing that infinite-horizon discounted value iteration, which was shown by Shapley to converge in the zero-sum case (a result we extend slightly here), does not converge in the general-sum case.
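The sketch below implements finite-horizon backward induction for the zero-sum special case, where each state's stage game is solved by a minimax linear program; the paper's general-sum algorithm plugs an arbitrary Nash selection function into the same recursion. The random game instance is synthetic.

```python
import numpy as np
from scipy.optimize import linprog

def minimax(A):
    """Value and maximizer strategy of the zero-sum matrix game A (row max)."""
    n, m = A.shape
    # Variables: x (row strategy) and v. Maximize v s.t. (A^T x)_j >= v, x simplex.
    c = np.zeros(n + 1); c[-1] = -1.0                 # minimize -v
    A_ub = np.hstack([-A.T, np.ones((m, 1))])         # v - (A^T x)_j <= 0
    b_ub = np.zeros(m)
    A_eq = np.hstack([np.ones((1, n)), np.zeros((1, 1))])
    res = linprog(c, A_ub, b_ub, A_eq, [1.0],
                  bounds=[(0, None)] * n + [(None, None)])
    return -res.fun, res.x[:n]

rng = np.random.default_rng(5)
S, nA, nB, H = 3, 2, 2, 10
R = rng.random((S, nA, nB))                           # player-1 stage rewards
P = rng.dirichlet(np.ones(S), size=(S, nA, nB))       # transition kernel

V = np.zeros(S)
for h in range(H):                                    # backward induction
    V_new = np.zeros(S)
    for s in range(S):
        Q = R[s] + P[s] @ V                           # stage game at state s
        V_new[s], _ = minimax(Q)                      # "Nash selection" = minimax
    V = V_new
print("horizon-%d values:" % H, np.round(V, 3))
```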

Posted Content
TL;DR: This work examines some restricted settings in which the hidden structure can be perfectly reconstructed solely on the basis of observed sample data.
Abstract: In the literature on graphical models, there has been increased attention paid to the problems of learning hidden structure (see Heckerman [H96] for survey) and causal mechanisms from sample data [H96, P88, S93, P95, F98]. In most settings we should expect the former to be difficult, and the latter potentially impossible without experimental intervention. In this work, we examine some restricted settings in which we can perfectly reconstruct the hidden structure solely on the basis of observed sample data.

Posted Content
TL;DR: In this article, the authors prove the first non-trivial, worst-case, upper bound on the number of iterations required by policy iteration to converge to the optimal policy.
Abstract: Decision-making problems in uncertain or stochastic domains are often formulated as Markov decision processes (MDPs). Policy iteration (PI) is a popular algorithm for searching over policy-space, the size of which is exponential in the number of states. We are interested in bounds on the complexity of PI that do not depend on the value of the discount factor. In this paper we prove the first such non-trivial, worst-case, upper bounds on the number of iterations required by PI to converge to the optimal policy. Our analysis also sheds new light on the manner in which PI progresses through the space of policies.
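A minimal policy iteration loop on a random MDP, sketched below, shows the objects the bound concerns: exact policy evaluation by a linear solve, greedy improvement, and the iteration count until the policy is stable. The instance and discount factor are synthetic.

```python
import numpy as np

rng = np.random.default_rng(6)
S, A, gamma = 6, 3, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))    # P[s, a] = next-state distribution
R = rng.random((S, A))

pi = np.zeros(S, dtype=int)
for it in range(1, 1000):
    # Policy evaluation: solve (I - gamma * P_pi) V = R_pi exactly.
    P_pi = P[np.arange(S), pi]
    R_pi = R[np.arange(S), pi]
    V = np.linalg.solve(np.eye(S) - gamma * P_pi, R_pi)
    # Policy improvement: greedy one-step lookahead.
    Q = R + gamma * (P @ V)
    new_pi = Q.argmax(axis=1)
    if np.array_equal(new_pi, pi):
        break
    pi = new_pi
print(f"converged after {it} iterations; policy = {pi}")
```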

Posted Content
TL;DR: In this paper, the authors formulate a general model which unifies the treatment of probe scheduling mechanisms, stochastic or deterministic, and different cost objectives - minimizing average detection time (SUM) or worst-case detection times (MAX).
Abstract: Most discovery systems for silent failures work in two phases: a continuous monitoring phase that detects presence of failures through probe packets and a localization phase that pinpoints the faulty element(s). This separation is important because localization requires significantly more resources than detection and should be initiated only when a fault is present. We focus on improving the efficiency of the detection phase, where the goal is to balance the overhead with the cost associated with longer failure detection times. We formulate a general model which unifies the treatment of probe scheduling mechanisms, stochastic or deterministic, and different cost objectives - minimizing average detection time (SUM) or worst-case detection time (MAX). We then focus on two classes of schedules. Memoryless schedules -- a subclass of stochastic schedules which is simple and suitable for distributed deployment. We show that the optimal memoryless schedulers can be efficiently computed by convex programs (for SUM objectives) or linear programs (for MAX objectives), and surprisingly perhaps, are guaranteed to have expected detection times that are not too far off the (NP-hard) stochastic optima. Deterministic schedules allow us to bound the maximum (rather than expected) cost of undetected faults, but like stochastic schedules, are NP-hard to optimize. We develop novel efficient deterministic schedulers with provable approximation ratios. An extensive simulation study on real networks demonstrates significant performance gains of our memoryless and deterministic schedulers over previous approaches. Our unified treatment also facilitates a clear comparison between different objectives and scheduling mechanisms.
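The SUM-objective convex program for memoryless schedules admits a direct numerical sketch: with probe rates p, an element covered at total rate q is detected in expected time 1/q, so one minimizes the priority-weighted sum of these times over the probability simplex. The toy instance and the use of a generic SLSQP solver are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

# cover[t, e] = 1 if probe/test t detects a fault on element e (toy instance).
cover = np.array([[1, 1, 0, 0],
                  [0, 1, 1, 0],
                  [0, 0, 1, 1]], dtype=float)
w = np.array([1.0, 2.0, 1.0, 0.5])            # element priorities

def expected_sum_detection(p):
    q = cover.T @ p                           # q[e] = per-step detection prob. of e
    return float(np.sum(w / q))               # E[detection time of e] = 1 / q[e]

n = cover.shape[0]                            # number of tests
res = minimize(expected_sum_detection, np.full(n, 1.0 / n),
               method="SLSQP",
               bounds=[(1e-6, 1.0)] * n,
               constraints=[{"type": "eq", "fun": lambda p: p.sum() - 1.0}])
print("optimal memoryless probe rates:", np.round(res.x, 3),
      "expected cost:", round(res.fun, 3))
```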

Posted Content
TL;DR: This work analyzes the behavior of agents that incrementally adapt their strategy through gradient ascent on expected payoff, in the simple setting of two-player, two-action, iterated general-sum games, and shows that either the agents will converge to Nash equilibrium, or if the strategies themselves do not converge, then their average payoffs will nevertheless converge to the payoffs of a Nash equilibrium.
Abstract: Multi-agent games are becoming an increasing prevalent formalism for the study of electronic commerce and auctions. The speed at which transactions can take place and the growing complexity of electronic marketplaces makes the study of computationally simple agents an appealing direction. In this work, we analyze the behavior of agents that incrementally adapt their strategy through gradient ascent on expected payoff, in the simple setting of two-player, two-action, iterated general-sum games, and present a surprising result. We show that either the agents will converge to Nash equilibrium, or if the strategies themselves do not converge, then their average payoffs will nevertheless converge to the payoffs of a Nash equilibrium.
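The sketch below simulates projected gradient ascent with a small finite step in matching pennies, taken here as the classic 2x2 example where strategies cycle; the paper's analysis is for infinitesimal step sizes. The average payoff is seen to approach the equilibrium payoff (0 for player 1) even though the strategies themselves do not converge.

```python
import numpy as np

# General-sum 2x2 game; payoffs indexed by (row action i, col action j).
A = np.array([[1.0, -1.0], [-1.0, 1.0]])   # player 1 (matching pennies here)
B = -A                                      # player 2

def grad_x(x, y):   # d u1 / dx at mixed strategies (x, 1-x) and (y, 1-y)
    return y * (A[0, 0] - A[1, 0]) + (1 - y) * (A[0, 1] - A[1, 1])

def grad_y(x, y):   # d u2 / dy
    return x * (B[0, 0] - B[0, 1]) + (1 - x) * (B[1, 0] - B[1, 1])

x, y, eta = 0.9, 0.2, 0.01
avg_payoff, T = 0.0, 20000
for t in range(1, T + 1):
    u1 = (x * y * A[0, 0] + x * (1 - y) * A[0, 1]
          + (1 - x) * y * A[1, 0] + (1 - x) * (1 - y) * A[1, 1])
    avg_payoff += (u1 - avg_payoff) / t     # running average of player 1's payoff
    x_new = np.clip(x + eta * grad_x(x, y), 0.0, 1.0)   # projected ascent step
    y_new = np.clip(y + eta * grad_y(x, y), 0.0, 1.0)
    x, y = x_new, y_new

print("final (x, y):", (round(x, 3), round(y, 3)),
      "average payoff:", round(avg_payoff, 4))
```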

Book ChapterDOI
21 Aug 2013
TL;DR: The modeling considered both SUM and MAX objectives, which correspond to average or worst-case cover times over elements (weighted by priority), and both one-time testing, where the goal is to detect if a fault is currently present, and continuous testing, performed in the background in order to detect presence of failures soon after they occur.
Abstract: A test scheduling instance is specified by a set of elements, a set of tests, which are subsets of elements, and numeric priorities assigned to elements. The schedule is a sequence of test invocations with the goal of covering all elements. This formulation has been used to model problems in multiple application domains, from network failure detection to broadcast scheduling. The modeling considered both SUM and MAX objectives, which correspond to average or worst-case cover times over elements (weighted by priority), and both one-time testing, where the goal is to detect if a fault is currently present, and continuous testing, performed in the background in order to detect presence of failures soon after they occur. Since all variants are NP-hard, the focus is on approximations.

Proceedings ArticleDOI
06 May 2013
TL;DR: The robustness of trading agents to deviations from the game's specified environment is investigated and it is indicated that most agents, especially the top-scoring ones, are surprisingly robust.
Abstract: We study the empirical behavior of trading agents participating in the Ad-Auction game of the Trading Agent Competition (TAC-AA). Aiming to understand the applicability of optimal trading strategies in synthesized environments to real-life settings, we investigate the robustness of the agents to deviations from the game's specified environment. Our results indicate that most agents, especially the top-scoring ones, are surprisingly robust. In addition, using the game logs, we derive for each agent a strategic fingerprint and show that it almost uniquely identifies it. Finally, we show that although the Machine Learning modeling in TAC-AA is inherently inaccurate, further improvement in modeling accuracy is likely to have only a limited contribution to the overall performance of TAC-AA agents.