
Showing papers by "Richard Cole" published in 2014


Proceedings ArticleDOI
31 May 2014
TL;DR: In this article, the authors consider a single-item auction in which bidders' valuations are drawn independently from unknown, non-identical distributions; the seller is given m samples from each distribution "for free" and chooses an auction to run on a fresh sample.
Abstract: In the design and analysis of revenue-maximizing auctions, auction performance is typically measured with respect to a prior distribution over inputs. The most obvious source for such a distribution is past data. The goal of this paper is to understand how much data is necessary and sufficient to guarantee near-optimal expected revenue. Our basic model is a single-item auction in which bidders' valuations are drawn independently from unknown and non-identical distributions. The seller is given m samples from each of these distributions "for free" and chooses an auction to run on a fresh sample. How large does m need to be, as a function of the number k of bidders and ε > 0, so that a (1 − ε)-approximation of the optimal revenue is achievable? We prove that, under standard tail conditions on the underlying distributions, m = poly(k, 1/ε) samples are necessary and sufficient. Our lower bound stands in contrast to many recent results on simple and prior-independent auctions and fundamentally involves the interplay between bidder competition, non-identical distributions, and a very close (but still constant) approximation of the optimal revenue. It effectively shows that the only way to achieve a sufficiently good constant approximation of the optimal revenue is through a detailed understanding of bidders' valuation distributions. Our upper bound is constructive and applies in particular to a variant of the empirical Myerson auction, the natural auction that runs the revenue-maximizing auction with respect to the empirical distributions of the samples. To capture how our sample complexity upper bound depends on the set of allowable distributions, we introduce α-strongly regular distributions, which interpolate between the well-studied classes of regular (α = 0) and MHR (α = 1) distributions. We give evidence that this definition is of independent interest.
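As a simple illustration of the empirical approach in the single-bidder case, where the optimal auction reduces to a posted price, the following Python sketch (our own illustrative code with assumed names, not the paper's multi-bidder mechanism) picks the price that maximizes revenue against the empirical distribution of the m samples.

```python
import random

def empirical_reserve_price(samples):
    """Posted price maximizing revenue p * Pr[value >= p] under the
    empirical distribution of the samples (single-bidder case)."""
    m = len(samples)
    vals = sorted(samples, reverse=True)
    best_price, best_revenue = 0.0, 0.0
    # Only sample values need to be tried as candidate prices: the i-th
    # largest value sells to an (i/m)-fraction of the empirical mass.
    for i, p in enumerate(vals, start=1):
        revenue = p * i / m
        if revenue > best_revenue:
            best_price, best_revenue = p, revenue
    return best_price

# Example: m = 1000 samples from an exponential distribution (an MHR
# distribution); the true monopoly price is 1 / rate = 1.
samples = [random.expovariate(1.0) for _ in range(1000)]
print(empirical_reserve_price(samples))
```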

202 citations


Posted Content
TL;DR: This work provides a version of asynchronous gradient descent (AGD) in which communication between cores is minimal and for which there is little synchronization overhead, and gives the first amortized analysis of AGD on convex functions.
Abstract: Gradient descent is an important class of iterative algorithms for minimizing convex functions. Classically, gradient descent has been a sequential and synchronous process. Distributed and asynchronous variants of gradient descent have been studied since the 1980s, and they have been experiencing a resurgence due to demand from large-scale machine learning problems running on multi-core processors. We provide a version of asynchronous gradient descent (AGD) in which communication between cores is minimal and for which there is little synchronization overhead. We also propose a new timing model for its analysis. With this model, we give the first amortized analysis of AGD on convex functions. The amortization allows for bad updates (updates that increase the value of the convex function); in contrast, most prior work makes the strong assumption that every update must be significantly improving. Typically, the step sizes used in AGD are smaller than those used in its synchronous counterpart. We provide a method to determine the step sizes in AGD based on the Hessian entries for the convex function. In certain circumstances, the resulting step sizes are a constant fraction of those used in the corresponding synchronous algorithm, enabling the overall performance of AGD to improve linearly with the number of cores. We give two applications of our amortized analysis: we show that our AGD algorithm can be applied to two classes of problems which have huge problem sizes in applications and consequently can benefit substantially from parallelism. The first class of problems is to solve linear systems Ap = b, where A is a symmetric positive definite matrix. The second class of problems is to minimize convex functions of the form ∑_{i=1}^{n} f_i(p_i) + (1/2)‖Ap − b‖², where the f_i are convex differentiable univariate functions.
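A minimal sketch of the Hessian-based step-size idea for the first application: to solve Ap = b with A symmetric positive definite, one can minimize the convex function f(p) = (1/2) p^T A p − b^T p, updating one coordinate at a time with step size 1/A[i][i], the inverse of the corresponding diagonal Hessian entry. The code below is a synchronous, single-threaded illustration of our own, not the paper's asynchronous algorithm or its timing model.

```python
import numpy as np

def coordinate_descent_spd(A, b, sweeps=100):
    """Minimize f(p) = 0.5 * p^T A p - b^T p for symmetric positive
    definite A (whose minimizer solves Ap = b), updating one coordinate
    at a time with step size 1 / A[i, i], the inverse of the i-th
    diagonal entry of the Hessian A."""
    n = len(b)
    p = np.zeros(n)
    for _ in range(sweeps):
        for i in range(n):
            grad_i = A[i] @ p - b[i]   # i-th coordinate of the gradient Ap - b
            p[i] -= grad_i / A[i, i]   # Hessian-diagonal step size
    return p

# Small SPD example; the result should agree with the direct solve.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(coordinate_descent_spd(A, b), np.linalg.solve(A, b))
```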

13 citations


Journal ArticleDOI
TL;DR: Two algorithms are presented that solve the two-dimensional parameterized matching problem; they are faster than the O(n^2 m log^2 m log log m) time algorithm of Amir et al. for this problem.
Abstract: Two equal-length strings, or two equal-sized two-dimensional texts, parameterize match (p-match) if there is a one-one mapping (relative to the alphabet) of their characters. Two-dimensional parameterized matching is the task of finding all m × m substrings of an n × n text that p-match an m × m pattern. This models searching for color images under changes of color maps, for example. We present two algorithms that solve the two-dimensional parameterized matching problem. The time complexities of our algorithms are O(n^2 log^2 m) and O(n^2 + m^2.5 polylog(m)). Our algorithms are faster than the O(n^2 m log^2 m log log m) time algorithm of Amir et al. [2006] for this problem. A key step in both of our algorithms is to count the number of distinct characters in every m × m substring of an n × n string. We show how to solve this problem in O(n^2) time. This result may be of independent interest.
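For reference, the p-match condition is easy to verify directly for a single pair of equal-length strings: some one-to-one character mapping must send the first string to the second. The brute-force check below is our own linear-time illustration of the definition, not one of the paper's algorithms.

```python
def p_match(s, t):
    """Return True if the equal-length strings s and t parameterize-match,
    i.e. some one-to-one character mapping sends s to t."""
    if len(s) != len(t):
        return False
    forward, backward = {}, {}
    for a, b in zip(s, t):
        # Each character of s must map to a single character of t, and
        # vice versa (the mapping must be one-one).
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True

# "aab" p-matches "xxy" (a -> x, b -> y) but not "xyy" (a would need to
# map to both x and y).
print(p_match("aab", "xxy"), p_match("aab", "xyy"))
```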

7 citations


Book ChapterDOI
08 Jul 2014
TL;DR: The algorithm of Karloff and Shirley that generates optimal summary trees was only pseudo-polynomial (and worked only for integral weights); the authors left open the existence of a polynomial-time algorithm.
Abstract: Karloff and Shirley recently proposed "summary trees" as a new way to visualize large rooted trees (EuroVis 2013) and gave algorithms for generating a maximum-entropy k-node summary tree of an input n-node rooted tree. However, their algorithm generating optimal summary trees was only pseudo-polynomial (and worked only for integral weights); the authors left open the existence of a polynomial-time algorithm. In addition, the authors provided an additive approximation algorithm and a greedy heuristic, both working on real weights.

1 citation


Posted Content
TL;DR: In this article, the authors present an O(k^2 n + n log n) time algorithm to generate a maximum-entropy k-node summary tree of an input n-node rooted tree.
Abstract: Karloff and Shirley recently proposed summary trees as a new way to visualize large rooted trees (EuroVis 2013) and gave algorithms for generating a maximum-entropy k-node summary tree of an input n-node rooted tree. However, the algorithm generating optimal summary trees was only pseudo-polynomial (and worked only for integral weights); the authors left open the existence of a polynomial-time algorithm. In addition, the authors provided an additive approximation algorithm and a greedy heuristic, both working on real weights. This paper shows how to construct maximum-entropy k-node summary trees in time O(k^2 n + n log n) for real weights (indeed, as small as the time bound for the greedy heuristic given previously); how to speed up the approximation algorithm so that it runs in time O(n + (k^4/ε) log(k/ε)); and how to speed up the greedy algorithm so that it runs in time O(kn + n log n). Altogether, these results make summary trees a much more practical tool than before.
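The objective maximized here is the Shannon entropy of the weight distribution over the k nodes of the summary tree. The sketch below (function name ours) computes just that objective; it is not the O(k^2 n + n log n) construction.

```python
from math import log2

def summary_tree_entropy(node_weights):
    """Entropy of the normalized weight distribution over the nodes of a
    summary tree; a maximum-entropy summary tree maximizes this value
    over all k-node summary trees of the input tree."""
    total = sum(node_weights)
    probs = [w / total for w in node_weights if w > 0]
    return -sum(p * log2(p) for p in probs)

# Three equally weighted nodes attain the maximum log2(3) ~ 1.58 bits;
# an unbalanced summary tree scores lower.
print(summary_tree_entropy([10, 10, 10]), summary_tree_entropy([28, 1, 1]))
```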

1 citation