
Showing papers by "Yishay Mansour published in 2012"


16 Apr 2012
TL;DR: In this article, the authors consider the problem of PAC-learning from distributed data and analyze fundamental communication complexity questions involved, providing general upper and lower bounds on the amount of communication needed to learn well, showing that in addition to VC-dimension and covering number, quantities such as the teaching-dimension and mistake-bound of a class play an important role.
Abstract: We consider the problem of PAC-learning from distributed data and analyze fundamental communication complexity questions involved. We provide general upper and lower bounds on the amount of communication needed to learn well, showing that in addition to VC-dimension and covering number, quantities such as the teaching-dimension and mistake-bound of a class play an important role. We also present tight results for a number of common concept classes including conjunctions, parity functions, and decision lists. For linear separators, we show that for non-concentrated distributions, we can use a version of the Perceptron algorithm to learn with much less communication than the number of updates given by the usual margin bound. We also show how boosting can be performed in a generic manner in the distributed setting to achieve communication with only logarithmic dependence on 1/ε for any concept class, and demonstrate how recent work on agnostic learning from class-conditional queries can be used to achieve low communication in agnostic settings as well. We additionally present an analysis of privacy, considering both differential privacy and a notion of distributional privacy that is especially appealing in this context.

133 citations


Posted Content
TL;DR: In this paper, the authors consider PAC-learning from distributed data and analyze fundamental communication complexity questions involved, and provide general upper and lower bounds on the amount of communication needed to learn well, showing that in addition to VC-dimension and covering number, quantities such as the teaching-dimension and mistake-bound of a class play an important role.
Abstract: We consider the problem of PAC-learning from distributed data and analyze fundamental communication complexity questions involved. We provide general upper and lower bounds on the amount of communication needed to learn well, showing that in addition to VC-dimension and covering number, quantities such as the teaching-dimension and mistake-bound of a class play an important role. We also present tight results for a number of common concept classes including conjunctions, parity functions, and decision lists. For linear separators, we show that for non-concentrated distributions, we can use a version of the Perceptron algorithm to learn with much less communication than the number of updates given by the usual margin bound. We also show how boosting can be performed in a generic manner in the distributed setting to achieve communication with only logarithmic dependence on 1/epsilon for any concept class, and demonstrate how recent work on agnostic learning from class-conditional queries can be used to achieve low communication in agnostic settings as well. We additionally present an analysis of privacy, considering both differential privacy and a notion of distributional privacy that is especially appealing in this context.

94 citations


Proceedings Article
03 Dec 2012
TL;DR: This work considers a setting with a very large number of related tasks and few examples from each individual task, and proposes learning a small pool of shared hypotheses to which tasks are mapped; VC-dimension generalization bounds are derived for the model based on the number of tasks, the number of shared hypotheses, and the VC dimension of the hypothesis class.
Abstract: In this work we consider a setting where we have a very large number of related tasks with few examples from each individual task. Rather than either learning each task individually (and having a large generalization error) or learning all the tasks together using a single hypothesis (and suffering a potentially large inherent error), we consider learning a small pool of shared hypotheses. Each task is then mapped to a single hypothesis in the pool (hard association). We derive VC dimension generalization bounds for our model, based on the number of tasks, the number of shared hypotheses, and the VC dimension of the hypothesis class. We conducted experiments with both synthetic problems and sentiment analysis of reviews, which strongly support our approach.
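The hard-association scheme described above can be sketched as an alternating minimization, much like k-means over tasks. The following is a toy illustration only (the hypothesis class, a constant predictor per hypothesis, and the initialization are my simplifying assumptions, not the paper's setup): alternately map each task to its best hypothesis, then refit each hypothesis on the pooled examples of its assigned tasks.

```python
def pool_of_hypotheses(tasks, k, iters=20):
    """Toy hard-association alternating minimization: each hypothesis is a
    constant predictor (a mean); each task is mapped to the single
    hypothesis with the lowest squared error on its examples."""
    hyps = [tasks[i][0] for i in range(k)]  # initialize from the data
    assign = [0] * len(tasks)
    for _ in range(iters):
        # hard association: map each task to its best hypothesis
        assign = [min(range(k), key=lambda j: sum((y - hyps[j]) ** 2 for y in t))
                  for t in tasks]
        # refit each hypothesis on the pooled examples of its tasks
        for j in range(k):
            pooled = [y for t, a in zip(tasks, assign) if a == j for y in t]
            if pooled:
                hyps[j] = sum(pooled) / len(pooled)
    return hyps, assign
```

With many tasks and few examples each, pooling the examples of tasks that share a hypothesis is what recovers the statistical strength that per-task learning lacks.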

50 citations


Posted Content
TL;DR: In this paper, the authors introduce a general representation of large-population games in which each player's influence on the others is centralized and limited, but may otherwise be arbitrary, which significantly generalizes the class known as congestion games in a natural way.
Abstract: We introduce a general representation of large-population games in which each player's influence on the others is centralized and limited, but may otherwise be arbitrary. This representation significantly generalizes the class known as congestion games in a natural way. Our main results are provably correct and efficient algorithms for computing and learning approximate Nash equilibria in this general framework.

49 citations


Posted Content
TL;DR: A general method is proposed for converting online algorithms to local computation algorithms by selecting a random permutation of the input and simulating the online algorithm; this yields a local computation algorithm for maximal matching in graphs of bounded degree that runs in time and space O(log^3 n).
Abstract: We propose a general method for converting online algorithms to local computation algorithms by selecting a random permutation of the input, and simulating running the online algorithm. We bound the number of steps of the algorithm using a query tree, which models the dependencies between queries. We improve previous analyses of query trees on graphs of bounded degree, and extend the analysis to the cases where the degrees are distributed binomially, and to a special case of bipartite graphs. Using this method, we give a local computation algorithm for maximal matching in graphs of bounded degree, which runs in time and space O(log^3 n). We also show how to convert a large family of load balancing algorithms (related to balls and bins problems) to local computation algorithms. This gives several local load balancing algorithms which achieve the same approximation ratios as the online algorithms, but run in O(log n) time and space. Finally, we modify existing local computation algorithms for hypergraph 2-coloring and k-CNF and use our improved analysis to obtain better time and space bounds, of O(log^4 n), removing the dependency on the maximal degree of the graph from the exponent.
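The random-permutation idea behind the maximal-matching result can be sketched compactly. This is a minimal illustration, not the paper's algorithm: assign each edge a random rank (a random permutation), and note that an edge is in the greedy matching iff no adjacent edge of strictly lower rank is. A query then recurses only on lower-ranked neighbors, so each query explores a small local neighborhood, which is exactly what the query-tree analysis bounds.

```python
import random

def local_matching_oracle(adj, seed=0):
    """Sketch of an online-to-local conversion for greedy maximal matching.
    `adj` maps each edge to its adjacent edges. A random rank over the
    edges simulates a random arrival order; an edge is matched iff no
    adjacent edge of strictly lower rank is matched. Memoization plus
    strictly decreasing ranks guarantee the recursion terminates."""
    rng = random.Random(seed)
    rank = {e: rng.random() for e in adj}
    memo = {}
    def in_matching(e):
        if e not in memo:
            memo[e] = all(not in_matching(f)
                          for f in adj[e] if rank[f] < rank[e])
        return memo[e]
    return in_matching
```

Each call answers "is this edge in the matching?" without ever materializing the whole matching, which is the defining property of a local computation algorithm.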

47 citations


Book ChapterDOI
09 Jul 2012
TL;DR: In this paper, the authors propose a general method for converting online algorithms to local computation algorithms, by selecting a random permutation of the input, and simulating running the online algorithm.
Abstract: We propose a general method for converting online algorithms to local computation algorithms, by selecting a random permutation of the input, and simulating running the online algorithm. We bound the number of steps of the algorithm using a query tree, which models the dependencies between queries. We improve previous analyses of query trees on graphs of bounded degree, and extend this improved analysis to the cases where the degrees are distributed binomially, and to a special case of bipartite graphs. Using this method, we give a local computation algorithm for maximal matching in graphs of bounded degree, which runs in time and space O(log^3 n). We also show how to convert a large family of load balancing algorithms (related to balls and bins problems) to local computation algorithms. This gives several local load balancing algorithms which achieve the same approximation ratios as the online algorithms, but run in O(log n) time and space. Finally, we modify existing local computation algorithms for hypergraph 2-coloring and k-CNF and use our improved analysis to obtain better time and space bounds, of O(log^4 n), removing the dependency on the maximal degree of the graph from the exponent.

42 citations


Posted Content
TL;DR: This work describes the challenges of designing a suitable auction, and presents a simple auction called the Optional Second Price (OSP) auction that is currently used in the DoubleClick Ad Exchange.
Abstract: Display advertisements on the web are sold via ad exchanges that use real-time auctions. We describe the challenges of designing a suitable auction, and present a simple auction called the Optional Second Price (OSP) auction that is currently used in the DoubleClick Ad Exchange.

42 citations


Journal ArticleDOI
TL;DR: In this article, the authors formalize the notion and study properties of reliable classifiers in the spirit of agnostic learning, a PAC-like model where no assumption is made on the function being learned.

37 citations


01 Jan 2012
TL;DR: The authors derived a generalization bound for domain adaptation by using the properties of robust algorithms based on λ-shift, a measure of prior knowledge regarding the similarity of source and target domain distributions.
Abstract: We derive a generalization bound for domain adaptation by using the properties of robust algorithms. Our new bound depends on λ-shift, a measure of prior knowledge regarding the similarity of source and target domain distributions. Based on the generalization bound, we design SVM variants as binary classification and regression domain adaptation algorithms.

27 citations


Journal ArticleDOI
TL;DR: This work presents a randomized competitive online algorithm for the weighted case with general capacity (namely, where sets may have different values, and elements arrive with different multiplicities), and proves a matching lower bound on the competitive ratio for any randomized online algorithm.
Abstract: In online set packing (OSP), elements arrive online, announcing which sets they belong to, and the algorithm needs to assign each element, upon arrival, to one of its sets. The goal is to maximize the number of sets that are assigned all their elements: a set that misses even a single element is deemed worthless. This is a natural online optimization problem that abstracts allocation of scarce compound resources, e.g., multipacket data frames in communication networks. We present a randomized competitive online algorithm for the weighted case with general capacity (namely, where sets may have different values, and elements arrive with different multiplicities). We prove a matching lower bound on the competitive ratio for any randomized online algorithm. Our bounds are expressed in terms of the maximum set size and the maximum number of sets an element belongs to. We also present refined bounds that depend on the uniformity of these parameters.

26 citations


Proceedings ArticleDOI
25 Mar 2012
TL;DR: This paper considers a networking application where multiple commodities compete over the capacity of a network; previously, the only known way of finding a max-min fair allocation required an iterative solution of multiple linear programs.
Abstract: Often one would like to allocate shared resources in a fair way. A common and well studied notion of fairness is Max-Min Fairness, where we first maximize the smallest allocation, and subject to that the second smallest, and so on. We consider a networking application where multiple commodities compete over the capacity of a network. In our setting each commodity has multiple possible paths to route its demand (for example, a network using MPLS tunneling). In this setting, the only known way of finding a max-min fair allocation requires an iterative solution of multiple linear programs. Such an approach, although polynomial time, scales badly with the size of the network, the number of demands, and the number of paths. More importantly, a network operator has limited control and understanding of the inner working of the algorithm. Finally, this approach is inherently centralized and cannot be implemented via a distributed protocol.

Proceedings Article
16 Jun 2012
TL;DR: In this paper, the authors consider PAC-learning from distributed data and analyze fundamental communication complexity questions involved, and provide general upper and lower bounds on the amount of communication needed to learn well, showing that in addition to VC-dimension and covering number, quantities such as the teaching-dimension and mistake-bound of a class play an important role.
Abstract: We consider the problem of PAC-learning from distributed data and analyze fundamental communication complexity questions involved. We provide general upper and lower bounds on the amount of communication needed to learn well, showing that in addition to VC-dimension and covering number, quantities such as the teaching-dimension and mistake-bound of a class play an important role. We also present tight results for a number of common concept classes including conjunctions, parity functions, and decision lists. For linear separators, we show that for non-concentrated distributions, we can use a version of the Perceptron algorithm to learn with much less communication than the number of updates given by the usual margin bound. We also show how boosting can be performed in a generic manner in the distributed setting to achieve communication with only logarithmic dependence on 1/epsilon for any concept class, and demonstrate how recent work on agnostic learning from class-conditional queries can be used to achieve low communication in agnostic settings as well. We additionally present an analysis of privacy, considering both differential privacy and a notion of distributional privacy that is especially appealing in this context.

Book ChapterDOI
09 Jul 2012
TL;DR: A novel, secure, and highly efficient method is presented for validating the correctness of a transaction's output while keeping input values secret, employing a highly efficient method for preparing any number of committed-to split representations of the n input values.
Abstract: Zero Knowledge Proofs (ZKPs) are one of the most striking innovations in theoretical computer science. In practice, the prevalent ZKP methods are, at times, too complicated to be useful for real-life applications. In this paper we present a practically efficient method for ZKPs which has a wide range of applications. Specifically, motivated by the need to provide efficient, on-demand validation of various financial transactions (e.g., high-volume Internet auctions), we have developed a novel, secure, and highly efficient method for validating the correctness of the output of a transaction while keeping input values secret. The method applies to input values which are publicly committed to by employing generic commitment functions (even input values submitted using tamper-proof hardware with input/output access only can be used). We call these strictly black box (SBB) commitments. These commitments are typically much faster than public-key ones, and are the only cryptographic/security tool we give the poly-time players throughout. The general problem we solve in this work is: let SLC be a publicly known straight-line computation on n input values taken from a finite field and having k output values. The inputs are publicly committed to in an SBB manner. An Evaluator performs the SLC on the inputs and announces the output values. Upon demand the Evaluator, or a Prover acting on his behalf, can present to a Verifier a proof of correctness of the announced output values. This is done in a manner such that (1) the input values as well as all intermediate values of the SLC remain information-theoretically secret; (2) the probability that the Verifier will accept a false claim of correctness of the output values can be made exponentially small; (3) the Prover can supply any required number of proofs of correctness to multiple Verifiers; (4) the method is highly efficient. The application to financial processes is straightforward.
To this end (1) we first use a novel technique for representation of values from a finite field which we call "split representation", the two coordinates of the split representation are generically committed to; (2) next, the SLC is augmented by the Prover into a "translation" which is presented to the Verifier as a sequence of generically committed split representations of values; (3) using the translation, the Prover and Verifier conduct a secrecy preserving proof of correctness of the announced SLC output values; (4) in order to exponentially reduce the probability of cheating by the Prover and also to enable multiple proofs, a novel highly efficient method for preparation of any number of committed-to split representations of the n input values is employed. The extreme efficiency of these ZK methods is of decisive importance for large volume applications. Secrecy preserving validation of announced results of Vickrey auctions is our demonstrative example.

Journal ArticleDOI
TL;DR: A simple online distributed randomized algorithm is presented, and it is proved that in any scenario, its expected goodput is ?

Journal ArticleDOI
01 Jan 2012-Networks
TL;DR: Each client suffers a delay equal to the sum of the network delay and the congestion delay at its server, where the congestion delay is a nondecreasing function of the number of clients assigned to that server.
Abstract: Problems dealing with assignment of clients to servers have been widely studied. However, they usually do not model the fact that the delay incurred by a client is a function of both the distance to the assigned server and the load on this server, under a given assignment. We study a problem referred to as the load-distance balancing (LDB) problem, where the objective is assigning a set of clients to a set of given servers. Each client suffers a delay, that is, the sum of the network delay (which is proportional to the distance to its server) and the congestion delay at this server, a nondecreasing function of the number of clients assigned to the server. We address two flavors of LDB—the first one seeking to minimize the maximum incurred delay, and the second one targeted for minimizing the average delay. For the first variation, we present hardness results, a best possible approximation algorithm, and an optimal algorithm for a special case of linear placement of clients and servers. For the second one, we show the problem is NP-hard in general, and present a 2-approximation for concave delay functions and an exact algorithm, if the delay function is convex. We also consider the game theoretic version of the second problem and show the price of stability of the game is at most 2 and at least 4/3. © 2011 Wiley Periodicals, Inc. NETWORKS, 2012
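The delay model above is simple enough to state in a few lines. The following sketch (the data layout and the linear congestion function are illustrative assumptions, not from the paper) computes each client's delay under a given assignment; the two LDB objectives are then just the maximum and the average of this vector.

```python
def ldb_delays(dist, assign, congestion):
    """Delays under the load-distance balancing (LDB) model: each client's
    delay is its network delay (distance to the assigned server) plus the
    congestion delay at that server, a nondecreasing function of the
    server's load. `dist[c][s]` is the client-server distance, `assign[c]`
    the server of client c."""
    load = {}
    for s in assign:
        load[s] = load.get(s, 0) + 1
    return [dist[c][s] + congestion(load[s]) for c, s in enumerate(assign)]
```

Minimizing `max(ldb_delays(...))` over assignments is the first (min-max) flavor studied in the paper, and minimizing the sum is the second (min-average) flavor.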

Book ChapterDOI
29 Oct 2012
TL;DR: This work lower-bounds the individual sequence anytime regret of a large family of online algorithms and shows that bounds on anytime regret imply a lower bound on the price of "at the money" call options in an arbitrage-free market.
Abstract: In this work, we lower bound the individual sequence anytime regret of a large family of online algorithms. This bound depends on the quadratic variation of the sequence, Q_T, and the learning rate. Nevertheless, we show that any learning rate that guarantees a regret upper bound of $O(\sqrt{Q_T})$ necessarily implies an $\Omega(\sqrt{Q_T})$ anytime regret on any sequence with quadratic variation Q_T. The algorithms we consider are linear forecasters whose weight vector at time t+1 is the gradient of a concave potential function of cumulative losses at time t. We show that these algorithms include all linear Regularized Follow the Leader algorithms. We prove our result for the case of potentials with negative definite Hessians, and potentials for the best expert setting satisfying some natural regularity conditions. In the best expert setting, we give our result in terms of the translation-invariant relative quadratic variation. We apply our lower bounds to Randomized Weighted Majority and to linear cost Online Gradient Descent. We show that bounds on anytime regret imply a lower bound on the price of "at the money" call options in an arbitrage-free market. Given a lower bound Q on the quadratic variation of a stock price, we give an $\Omega(\sqrt{Q})$ lower bound on the option price, for Q<0.5. This lower bound has the same asymptotic behavior as the Black-Scholes pricing and improves a previous Ω(Q) result given in [4].

Book ChapterDOI
04 Jun 2012
TL;DR: Brand advertising through web display ads, aimed at driving up brand awareness and purchase intentions, is the cornerstone of the Internet economic ecosystem and gave rise to an array of new interconnected entities offering added value to publishers and advertisers.
Abstract: Brand advertising through web display ads, aimed at driving up brand awareness and purchase intentions, is the cornerstone of the Internet economic ecosystem. The ever-increasing penetration of the Internet and recent technological advances allow for cost-effective targeting and gave rise to an array of new interconnected entities - e.g., the pivotal Ad Exchange (AdX) - offering added value to publishers and advertisers.

Posted Content
TL;DR: In this paper, the authors extend previous multiple source loss guarantees based on distribution weighted combinations to arbitrary target distributions P, not necessarily mixtures of the source distributions, and prove a lower bound.
Abstract: This paper presents a novel theoretical study of the general problem of multiple source adaptation using the notion of Rényi divergence. Our results build on our previous work [12], but significantly broaden the scope of that work in several directions. We extend previous multiple source loss guarantees based on distribution weighted combinations to arbitrary target distributions P, not necessarily mixtures of the source distributions, analyze both known and unknown target distribution cases, and prove a lower bound. We further extend our bounds to deal with the case where the learner receives an approximate distribution for each source instead of the exact one, and show that similar loss guarantees can be achieved depending on the divergence between the approximate and true distributions. We also analyze the case where the labeling functions of the source domains are somewhat different. Finally, we report the results of experiments with both an artificial data set and a sentiment analysis task, showing the performance benefits of the distribution weighted combinations and the quality of our bounds based on the Rényi divergence.
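A distribution weighted combination of the kind the guarantees above apply to can be sketched directly from its definition, h(x) = Σ_i λ_i D_i(x) h_i(x) / Σ_j λ_j D_j(x): each source hypothesis is weighted by how much its own source distribution "owns" the point x. The densities, weights, and zero-support fallback below are illustrative assumptions.

```python
def distribution_weighted_hypothesis(lambdas, densities, hyps):
    """Distribution-weighted combination of source hypotheses:
    h(x) = sum_i lam_i * D_i(x) * h_i(x) / sum_j lam_j * D_j(x),
    where D_i is the density of source i and lam_i a mixture weight."""
    def h(x):
        weights = [lam * D(x) for lam, D in zip(lambdas, densities)]
        total = sum(weights)
        if total == 0.0:
            return 0.0  # x lies outside the support of every source
        return sum(w * hi(x) for w, hi in zip(weights, hyps)) / total
    return h
```

On a point covered by only one source, the combination simply returns that source's hypothesis; on overlapping regions it interpolates, which is what makes it robust to target distributions that are not mixtures of the sources.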

Proceedings ArticleDOI
17 Jan 2012
TL;DR: The properties of Nash Equilibria of non-myopic Cournot competition with linear demand functions are studied and it is shown that simple best response dynamics will produce such an equilibrium, and that for some natural dynamics this convergence is within linear time.
Abstract: A Nash Equilibrium is a joint strategy profile at which each agent myopically plays a best response to the other agents' strategies, ignoring the possibility that deviating from the equilibrium could lead to an avalanche of successive changes by other agents. However, such changes could potentially be beneficial to the agent, creating incentive to act non-myopically, so as to take advantage of others' responses. To study this phenomenon, we consider a non-myopic Cournot competition, where each firm selects whether it wants to maximize profit (as in the classical Cournot competition) or to maximize revenue (by masquerading as a firm with zero production costs). The key observation is that profit may actually be higher when acting to maximize revenue, (1) which will depress market prices, (2) which will reduce the production of other firms, (3) which will gain market share for the revenue maximizing firm, (4) which will, overall, increase profits for the revenue maximizing firm. Implicit in this line of thought is that one might take other firms' responses into account when choosing a market strategy. The Nash Equilibria of the non-myopic Cournot competition capture this action/response issue appropriately, and this work is a step towards understanding the impact of such strategic manipulative play in markets. We study the properties of Nash Equilibria of non-myopic Cournot competition with linear demand functions and show existence of pure Nash Equilibria, that simple best response dynamics will produce such an equilibrium, and that for some natural dynamics this convergence is within linear time. This is in contrast to the well known fact that best response dynamics need not converge in the standard myopic Cournot competition. Furthermore, we compare the outcome of the non-myopic Cournot competition with that of the standard myopic Cournot competition.
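The masquerading effect described above is easy to see numerically in the classical linear Cournot game. The sketch below is illustrative only (alternating best responses for the *myopic* game, with inverse demand P(Q) = a - bQ and constant marginal costs; the paper's object of study is the non-myopic variant): firm i's best response to the others' total output R is q_i = max(0, (a - c_i - bR)/(2b)), and a firm that reports cost 0 ends up producing more and gaining market share.

```python
def cournot_best_response(a, b, costs, iters=200):
    """Alternating best-response dynamics for a linear Cournot game with
    inverse demand P(Q) = a - b*Q and constant marginal costs. Returns
    the quantity profile after `iters` sweeps; for a duopoly this
    converges to the Nash quantities."""
    q = [0.0] * len(costs)
    for _ in range(iters):
        for i, c in enumerate(costs):
            rest = sum(q) - q[i]  # total output of the other firms
            q[i] = max(0.0, (a - c - b * rest) / (2 * b))
    return q
```

With a = 10, b = 1 and symmetric cost 1, each firm's Nash quantity is (a - c)/(3b) = 3; rerunning with `costs = [0.0, 1.0]` mimics firm 0 masquerading as zero-cost and shows it out-producing its rival, the market-share effect driving the paper's analysis.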
Not surprisingly, perhaps, prices in the non-myopic game are lower and the firms, in total, produce more and have a lower aggregate utility.

Posted Content
TL;DR: In this paper, a simple simultaneous first price auction for multiple items in a complete information setting is considered, where one agent is single-minded and the other is unit demand, and the goal is to characterize the mixed equilibria in this setting.
Abstract: We consider a simple simultaneous first price auction for multiple items in a complete information setting. Our goal is to completely characterize the mixed equilibria in this setting, for a simple, yet highly interesting, AND-OR game, where one agent is single-minded and the other is unit demand.

Posted Content
TL;DR: In this paper, it was shown that POMDPs can be represented by multiplicity automata with no increase in the representation size, and that the size of the automaton is equal to the rank of the predictive state representation.
Abstract: Planning and learning in Partially Observable MDPs (POMDPs) are among the most challenging tasks in both the AI and Operations Research communities. Although solutions to these problems are intractable in general, there might be special cases, such as structured POMDPs, which can be solved efficiently. A natural and possibly efficient way to represent a POMDP is through the predictive state representation (PSR), a representation which recently has been receiving increasing attention. In this work, we relate POMDPs to multiplicity automata, showing that POMDPs can be represented by multiplicity automata with no increase in the representation size. Furthermore, we show that the size of the multiplicity automaton is equal to the rank of the predictive state representation. Therefore, we relate both the predictive state representation and POMDPs to the well-founded multiplicity automata literature. Based on the multiplicity automata representation, we provide a planning algorithm which is exponential only in the multiplicity automata rank rather than the number of states of the POMDP. As a result, whenever the predictive state representation is logarithmic in the standard POMDP representation, our planning algorithm is efficient.

Book ChapterDOI
04 Jun 2012
TL;DR: A model-free approach to bidding in the Ad-Auctions Trading Agents Competition is described: a simple and robust yet high-performing agent using a Regret Minimization optimization algorithm for the 2010 competition, and the top-performing agent for the 2011 competition, still using simplified modeling and optimization methods.
Abstract: We describe a model-free approach to bidding in the Ad-Auctions Trading Agents Competition: first, a simple and robust yet high-performing agent using a Regret Minimization optimization algorithm for the 2010 competition, followed by our top-performing agent for the 2011 competition, still using simplified modeling and optimization methods. Specifically, we model the user populations using particle filters, but base the observations on a Nearest Neighbor estimator (instead of game specific parameters). We implement a simple and effective bid optimization algorithm by applying the equimarginal principle combined with perplexity-based regularization. The implementation of our 2011 agent also remains model-free in the sense that we do not attempt to model the competing agents' behavior for estimating costs and associated game parameters.