
Showing papers in "Journal of the ACM in 2009"


Journal ArticleDOI
Oded Regev
TL;DR: A quantum reduction from worst-case lattice problems such as GapSVP and SIVP to a natural learning problem, together with a (classical) public-key cryptosystem whose security rests on the hardness of that learning problem.
Abstract: Our main result is a reduction from worst-case lattice problems such as GapSVP and SIVP to a certain learning problem. This learning problem is a natural extension of the “learning from parity with error” problem to higher moduli. It can also be viewed as the problem of decoding from a random linear code. This, we believe, gives a strong indication that these problems are hard. Our reduction, however, is quantum. Hence, an efficient solution to the learning problem implies a quantum algorithm for GapSVP and SIVP. A main open question is whether this reduction can be made classical (i.e., nonquantum). We also present a (classical) public-key cryptosystem whose security is based on the hardness of the learning problem. By the main result, its security is also based on the worst-case quantum hardness of GapSVP and SIVP. The new cryptosystem is much more efficient than previous lattice-based cryptosystems: the public key is of size O(n^2) and encrypting a message increases its size by a factor of O(n) (in previous cryptosystems these values are O(n^4) and O(n^2), respectively). In fact, under the assumption that all parties share a random bit string of length O(n^2), the size of the public key can be reduced to O(n).
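
The encryption scheme is simple enough to sketch from the description above. Below is a toy Python rendition of LWE-style bit-by-bit encryption; the parameters, noise distribution, and subset choice are illustrative stand-ins (sized so the toy never decrypts incorrectly), far too small to be secure and not the regime analyzed in the paper.

```python
import random

# Toy parameters, for illustration only -- far too small to be secure.
n, m, q = 8, 64, 401   # secret dimension, number of samples, modulus
# q is chosen so the accumulated noise (at most m) never flips a bit.

def keygen():
    s = [random.randrange(q) for _ in range(n)]                     # secret key
    A = [[random.randrange(q) for _ in range(n)] for _ in range(m)]
    # public key: m noisy inner products b_i = <a_i, s> + e_i (mod q)
    b = [(sum(x * y for x, y in zip(a, s)) + random.choice([-1, 0, 1])) % q
         for a in A]
    return s, (A, b)

def encrypt(pk, bit):
    A, b = pk
    S = [i for i in range(m) if random.random() < 0.5]    # random sample subset
    u = [sum(A[i][j] for i in S) % q for j in range(n)]
    v = (sum(b[i] for i in S) + bit * (q // 2)) % q       # bit hidden near q/2
    return u, v

def decrypt(s, ct):
    u, v = ct
    d = (v - sum(x * y for x, y in zip(u, s))) % q
    return 1 if q // 4 < d < 3 * q // 4 else 0            # nearer q/2 than 0?

sk, pk = keygen()
assert all(decrypt(sk, encrypt(pk, bit)) == bit for bit in [0, 1] * 20)
```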

1,599 citations


Journal ArticleDOI
TL;DR: It is shown that the PSPACE upper bounds cannot be substantially improved without a breakthrough on long-standing open problems: the square-root sum problem and an arithmetic circuit decision problem that captures P-time on the unit-cost rational arithmetic RAM model.
Abstract: We define Recursive Markov Chains (RMCs), a class of finitely presented denumerable Markov chains, and we study algorithms for their analysis. Informally, an RMC consists of a collection of finite-state Markov chains with the ability to invoke each other in a potentially recursive manner. RMCs offer a natural abstract model for probabilistic programs with procedures. They generalize, in a precise sense, a number of well-studied stochastic models, including Stochastic Context-Free Grammars (SCFG) and Multi-Type Branching Processes (MT-BP). We focus on algorithms for reachability and termination analysis for RMCs: what is the probability that an RMC started from a given state reaches another target state, or that it terminates? These probabilities are in general irrational, and they arise as (least) fixed point solutions to certain (monotone) systems of nonlinear equations associated with RMCs. We address both the qualitative problem of determining whether the probabilities are 0, 1 or in-between, and the quantitative problems of comparing the probabilities with a given bound, or approximating them to desired precision. We show that all these problems can be solved in PSPACE using a decision procedure for the Existential Theory of Reals. We provide a more practical algorithm, based on a decomposed version of multi-variate Newton's method, and prove that it always converges monotonically to the desired probabilities. We show this method applies more generally to any monotone polynomial system. We obtain polynomial-time algorithms for various special subclasses of RMCs. Among these: for SCFGs and MT-BPs (equivalently, for 1-exit RMCs) the qualitative problem can be solved in P-time; for linearly recursive RMCs the probabilities are rational and can be computed exactly in P-time. We show that our PSPACE upper bounds cannot be substantially improved without a breakthrough on long-standing open problems: the square-root sum problem and an arithmetic circuit decision problem that captures P-time on the unit-cost rational arithmetic RAM model. We show that these problems reduce to the qualitative problem and to the approximation problem (to within any nontrivial error) for termination probabilities of general RMCs, and to the quantitative decision problem for termination (extinction) of SCFGs (MT-BPs).
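
The fixed-point view is easy to see in the simplest special case. The Python sketch below runs Newton's method on a one-variable monotone system, the termination probability of a single-type branching process; it illustrates the monotone-convergence claim but is not the paper's general decomposed multivariate algorithm.

```python
# A process that spawns two copies with probability p and dies otherwise
# terminates with probability equal to the least nonnegative solution of
# x = p*x^2 + (1 - p), the simplest instance of the monotone systems above.

def termination_prob(p, iters=100):
    x = 0.0                                       # start below the least fixed point
    for _ in range(iters):
        f, df = p * x * x + (1 - p), 2 * p * x    # f(x) and f'(x)
        x = x + (f - x) / (1 - df)                # Newton step on f(x) - x = 0
    return x                                      # converges monotonically from below

print(termination_prob(0.3))   # subcritical: 1.0
print(termination_prob(0.7))   # supercritical: (1 - p) / p = 0.42857...
```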

632 citations


Journal ArticleDOI
TL;DR: Bimatrix, the problem of finding a Nash equilibrium in a two-player game, is proved complete for the complexity class PPAD (Polynomial Parity Argument, Directed version) introduced by Papadimitriou in 1991.
Abstract: We prove that Bimatrix, the problem of finding a Nash equilibrium in a two-player game, is complete for the complexity class PPAD (Polynomial Parity Argument, Directed version) introduced by Papadimitriou in 1991. Our result, building upon the work of Daskalakis et al. [2006a] on the complexity of four-player Nash equilibria, settles a long-standing open problem in algorithmic game theory. It also serves as a starting point for a series of results concerning the complexity of two-player Nash equilibria. In particular, we prove the following theorems: —Bimatrix does not have a fully polynomial-time approximation scheme unless every problem in PPAD is solvable in polynomial time. —The smoothed complexity of the classic Lemke-Howson algorithm and, in fact, of any algorithm for Bimatrix is not polynomial unless every problem in PPAD is solvable in randomized polynomial time. Our results also have a complexity implication in mathematical economics: —Arrow-Debreu market equilibria are PPAD-hard to compute.
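
For concreteness, the object whose computation is PPAD-complete can be exhibited on a 2x2 example, where the indifference conditions give a closed form. The Python snippet below solves matching pennies this way; the shortcut assumes a fully mixed equilibrium and is specific to 2x2 games, while the hardness above concerns general bimatrix games.

```python
# payoff matrices: A[i][j] for the row player, B[i][j] for the column player
A = [[1, -1], [-1, 1]]        # matching pennies
B = [[-1, 1], [1, -1]]

# In a fully mixed equilibrium, the column player's probability q on column 0
# makes the row player indifferent between the two rows:
#   A[0][0]*q + A[0][1]*(1-q) = A[1][0]*q + A[1][1]*(1-q)
q = (A[1][1] - A[0][1]) / (A[0][0] - A[0][1] - A[1][0] + A[1][1])
# Symmetrically, p on row 0 makes the column player indifferent.
p = (B[1][1] - B[1][0]) / (B[0][0] - B[1][0] - B[0][1] + B[1][1])
print(p, q)   # 0.5 0.5, the unique equilibrium of matching pennies
```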

497 citations


Journal ArticleDOI
TL;DR: An interesting and natural “approximate certificate” for a graph's expansion, which involves embedding an n-node expander in it with appropriate dilation and congestion, is described.
Abstract: We give an O(√log n)-approximation algorithm for the sparsest cut, edge expansion, balanced separator, and graph conductance problems. This improves the O(log n)-approximation of Leighton and Rao (1988). We use a well-known semidefinite relaxation with triangle inequality constraints. Central to our analysis is a geometric theorem about projections of point sets in R^d, whose proof makes essential use of a phenomenon called measure concentration. We also describe an interesting and natural “approximate certificate” for a graph's expansion, which involves embedding an n-node expander in it with appropriate dilation and congestion. We call this an expander flow.
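
For reference, here is a brute-force Python evaluation of one of the objectives above, graph conductance, on an invented toy graph; enumerating all cuts is exponential in the number of vertices, which is exactly what an approximation algorithm must avoid.

```python
from itertools import combinations

def conductance(edges, S, V):
    """Fraction of edge volume crossing the cut (S, V - S)."""
    S = set(S)
    cut = sum(1 for u, v in edges if (u in S) != (v in S))
    vol = lambda X: sum((u in X) + (v in X) for u, v in edges)
    return cut / min(vol(S), vol(V - S))

V = set(range(6))
E = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]  # two triangles and a bridge
best = min(conductance(E, S, V) for r in range(1, 6) for S in combinations(V, r))
print(best)   # 1/7 = 0.1428..., achieved by cutting the bridge edge
```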

483 citations


Journal ArticleDOI
TL;DR: In this paper, the authors define nested words, which generalize both words and ordered trees and allow both word and tree operations, together with nested word automata, and show that the resulting class of regular languages of nested words has all the appealing theoretical properties that classical regular word languages enjoy: deterministic nested word automata are as expressive as their non-deterministic counterparts; the class is closed under union, intersection, complementation, concatenation, Kleene-*, prefixes, and language homomorphisms; and membership, emptiness, language inclusion, and language equivalence are all decidable.
Abstract: We propose the model of nested words for representation of data with both a linear ordering and a hierarchically nested matching of items. Examples of data with such dual linear-hierarchical structure include executions of structured programs, annotated linguistic data, and HTML/XML documents. Nested words generalize both words and ordered trees, and allow both word and tree operations. We define nested word automata—finite-state acceptors for nested words, and show that the resulting class of regular languages of nested words has all the appealing theoretical properties that the classical regular word languages enjoy: deterministic nested word automata are as expressive as their nondeterministic counterparts; the class is closed under union, intersection, complementation, concatenation, Kleene-*, prefixes, and language homomorphisms; membership, emptiness, language inclusion, and language equivalence are all decidable; and definability in monadic second order logic corresponds exactly to finite-state recognizability. We also consider regular languages of infinite nested words and show that the closure properties, MSO-characterization, and decidability of decision problems carry over. The linear encodings of nested words give the class of visibly pushdown languages of words, and this class lies between balanced languages and deterministic context-free languages. We argue that for algorithmic verification of structured programs, instead of viewing the program as a context-free language over words, one should view it as a regular language of nested words (or equivalently, a visibly pushdown language), and this would allow model checking of many properties (such as stack inspection, pre-post conditions) that are not expressible in existing specification logics. We also study the relationship between ordered trees and nested words, and the corresponding automata: while the analysis complexity of nested word automata is the same as that of classical tree automata, they combine both bottom-up and top-down traversals, and enjoy expressiveness and succinctness benefits over tree automata.
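
The key feature of nested-word (equivalently, visibly pushdown) automata, that the input symbol alone dictates whether the stack is pushed or popped, fits in a few lines. Below is a toy Python acceptor for an invented property over an invented call/return/internal alphabet; it illustrates the model, not a construction from the paper.

```python
# The alphabet is split into calls, returns, and internal symbols; the stack is
# pushed only on a call and popped only on a return. This toy acceptor checks
# that every '<' is matched by '>' and that 'x' never occurs at nesting depth 0.

CALLS, RETURNS, INTERNALS = {"<"}, {">"}, {"x", "y"}

def accepts(word):
    stack = []
    for c in word:
        if c in CALLS:
            stack.append(c)             # push: a hierarchical edge opens
        elif c in RETURNS:
            if not stack:               # unmatched return
                return False
            stack.pop()                 # pop: the hierarchical edge closes
        elif c in INTERNALS:
            if c == "x" and not stack:  # 'x' forbidden at top level
                return False
        else:
            raise ValueError(f"symbol {c!r} not in the visibly pushdown alphabet")
    return not stack                    # all calls matched

assert accepts("<xy<x>>y")
assert not accepts("x")                 # internal 'x' at depth 0
assert not accepts("<<>")               # pending call
```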

395 citations


Journal ArticleDOI
TL;DR: A new, self-contained construction of randomness extractors that is optimal up to constant factors, while being much simpler than the previous construction of Lu et al.
Abstract: We give an improved explicit construction of highly unbalanced bipartite expander graphs with expansion arbitrarily close to the degree (which is polylogarithmic in the number of vertices). Both the degree and the number of right-hand vertices are polynomially close to optimal, whereas the previous constructions of Ta-Shma et al. [2007] required at least one of these to be quasipolynomial in the optimal. Our expanders have a short and self-contained description and analysis, based on the ideas underlying the recent list-decodable error-correcting codes of Parvaresh and Vardy [2005]. Our expanders can be interpreted as near-optimal “randomness condensers” that reduce the task of extracting randomness from sources of arbitrary min-entropy rate to extracting randomness from sources of min-entropy rate arbitrarily close to 1, which is a much easier task. Using this connection, we obtain a new, self-contained construction of randomness extractors that is optimal up to constant factors, while being much simpler than the previous construction of Lu et al. [2003] and improving upon it when the error parameter is small (e.g., 1/poly(n)).

304 citations


Journal ArticleDOI
TL;DR: The idea is that a smarter measure may capture behaviors of the algorithm that a standard measure might not be able to exploit, and hence lead to a significantly better worst-case time analysis, stepping beyond the limitations of current exponential-time algorithm analysis.
Abstract: For more than 40 years, Branch & Reduce exponential-time backtracking algorithms have been among the most common tools used for finding exact solutions of NP-hard problems. Despite that, the way to analyze such recursive algorithms is still far from producing tight worst-case running time bounds. Motivated by this, we use an approach that we call “Measure & Conquer” as an attempt to step beyond such limitations. The approach is based on the careful design of a nonstandard measure of the subproblem size; this measure is then used to lower bound the progress made by the algorithm at each branching step. The idea is that a smarter measure may capture behaviors of the algorithm that a standard measure might not be able to exploit, and hence lead to a significantly better worst-case time analysis. In order to show the potential of Measure & Conquer, we consider two well-studied NP-hard problems: minimum dominating set and maximum independent set. For the first problem, we consider the current best algorithm, and prove (thanks to a better measure) a much tighter running time bound for it. For the second problem, we describe a new, simple algorithm, and show that its running time is competitive with the current best time bounds, achieved with far more complicated algorithms (and standard analysis). Our examples show that a good choice of the measure, made in the very first stages of exact algorithm design, can have a tremendous impact on the running time bounds achievable.
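
To ground the discussion, here is a bare-bones branch and reduce for maximum independent set in Python. It is a generic textbook variant rather than the paper's algorithm; the point of Measure & Conquer is that the improvement comes from analyzing such an algorithm with a cleverer measure (for example, weighting vertices by degree), not from changing the code.

```python
def mis(graph):
    """graph: dict vertex -> set of neighbors. Returns the size of a maximum
    independent set. Standard analyses charge each branching step against the
    number of vertices removed; Measure & Conquer would weight vertices instead."""
    if not graph:
        return 0
    v = max(graph, key=lambda u: len(graph[u]))   # branch on a max-degree vertex
    if len(graph[v]) <= 1:
        # reduction: a vertex of degree <= 1 is in some optimal solution
        return 1 + mis(remove(graph, {v} | graph[v]))
    # either v is out ...
    best = mis(remove(graph, {v}))
    # ... or v is in, and its whole neighborhood is out
    return max(best, 1 + mis(remove(graph, {v} | graph[v])))

def remove(graph, drop):
    return {u: graph[u] - drop for u in graph if u not in drop}

cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
print(mis(cycle5))   # a 5-cycle has a maximum independent set of size 2
```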

284 citations


Journal ArticleDOI
TL;DR: There is a single approach to nearest neighbor searching, which both improves upon existing results and spans the spectrum of space-time tradeoffs, and new algorithms for constructing AVDs and tools for analyzing their total space requirements are provided.
Abstract: Nearest neighbor searching is the problem of preprocessing a set of n points in d-dimensional space so that, given any query point q, it is possible to report the closest point to q rapidly. In approximate nearest neighbor searching, a parameter ε > 0 is given, and a multiplicative error of (1 + ε) is allowed. We assume that the dimension d is a constant and treat n and ε as asymptotic quantities. Numerous solutions have been proposed, ranging from low-space solutions having space O(n) and query time O(log n + 1/ε^{d−1}) to high-space solutions having space roughly O((n log n)/ε^d) and query time O(log(n/ε)). We show that there is a single approach to this fundamental problem, which both improves upon existing results and spans the spectrum of space-time tradeoffs. Given a tradeoff parameter γ, where 2 ≤ γ ≤ 1/ε, we show that there exists a data structure of space O(nγ^{d−1} log(1/ε)) that can answer queries in time O(log(nγ) + 1/(εγ)^{(d−1)/2}). When γ = 2, this yields a data structure of space O(n log(1/ε)) that can answer queries in time O(log n + 1/ε^{(d−1)/2}). When γ = 1/ε, it provides a data structure of space O((n/ε^{d−1}) log(1/ε)) that can answer queries in time O(log(n/ε)). Our results are based on a data structure called a (t,ε)-AVD, which is a hierarchical quadtree-based subdivision of space into cells. Each cell stores up to t representative points of the set, such that for any query point q in the cell at least one of these points is an approximate nearest neighbor of q. We provide new algorithms for constructing AVDs and tools for analyzing their total space requirements. We also establish lower bounds on the space complexity of AVDs, and show that, up to a factor of O(log(1/ε)), our space bounds are asymptotically tight in the two extremes, γ = 2 and γ = 1/ε.

266 citations


Journal ArticleDOI
TL;DR: For the first time, by using the properties of the XBW-transform, compressed indexes go beyond the information-theoretic lower bound, and support navigational and path-search operations over labeled trees within (near-)optimal time bounds and entropy-bounded space.
Abstract: Consider an ordered, static tree T where each node has a label from alphabet Σ. Tree T may be of arbitrary degree and shape. Our goal is to design a compressed storage scheme of T that supports basic navigational operations among the immediate neighbors of a node (i.e. parent, ith child, or any child with some label,…) as well as more sophisticated path-based search operations over its labeled structure. We present a novel approach to this problem by designing what we call the XBW-transform of the tree in the spirit of the well-known Burrows-Wheeler transform for strings [1994]. The XBW-transform uses path-sorting to linearize the labeled tree T into two coordinated arrays, one capturing the structure and the other the labels. For the first time, by using the properties of the XBW-transform, our compressed indexes go beyond the information-theoretic lower bound, and support navigational and path-search operations over labeled trees within (near-)optimal time bounds and entropy-bounded space. Our XBW-transform is simple and likely to spur new results in the theory of tree compression and indexing, as well as interesting application contexts. As an example, we use the XBW-transform to design and implement a compressed index for XML documents whose compression ratio is significantly better than the one achievable by state-of-the-art tools, and its query time performance is orders of magnitude faster.
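
The transform itself can be demonstrated in a few lines. The Python sketch below builds the two XBW arrays as described: every node contributes its label, a last-child flag, and the string of ancestor labels from its parent up to the root, and the rows are stably sorted by that upward path. It is a didactic re-implementation of the definition, not the authors' code, and it omits all compression and indexing machinery.

```python
def xbw(children, label, root):
    """children: dict node -> ordered list of children; label: dict node -> str."""
    rows = []
    def visit(u, pi, is_last):
        rows.append((pi, is_last, label[u]))    # (upward path, last-child?, label)
        kids = children.get(u, [])
        for i, c in enumerate(kids):
            visit(c, label[u] + pi, i == len(kids) - 1)
    visit(root, "", True)
    rows.sort(key=lambda r: r[0])               # stable path-sorting
    return [last for _, last, _ in rows], [a for _, _, a in rows]

#      A         the two coordinated arrays for this 4-node tree:
#     / \
#    B   C
#        |
#        B
children = {0: [1, 2], 2: [3]}
label = {0: "A", 1: "B", 2: "C", 3: "B"}
print(xbw(children, label, 0))   # ([True, False, True, True], ['A', 'B', 'C', 'B'])
```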

189 citations


Journal ArticleDOI
TL;DR: A beacon-based embedding algorithm is given that achieves constant distortion on a 1 − ε fraction of distances; this provides some theoretical justification for the success of the recent Global Network Positioning algorithm of Ng and Zhang.
Abstract: Concurrent with recent theoretical interest in the problem of metric embedding, a growing body of research in the networking community has studied the distance matrix defined by node-to-node latencies in the Internet, resulting in a number of recent approaches that approximately embed this distance matrix into low-dimensional Euclidean space. There is a fundamental distinction, however, between the theoretical approaches to the embedding problem and this recent Internet-related work: in addition to computational limitations, Internet measurement algorithms operate under the constraint that it is only feasible to measure distances for a linear (or near-linear) number of node pairs, and typically in a highly structured way. Indeed, the most common framework for Internet measurements of this type is a beacon-based approach: one chooses uniformly at random a constant number of nodes (“beacons”) in the network, each node measures its distance to all beacons, and one then has access to only these measurements for the remainder of the algorithm. Moreover, beacon-based algorithms are often designed not for embedding but for the more basic problem of triangulation, in which one uses the triangle inequality to infer the distances that have not been measured. Here we give algorithms with provable performance guarantees for beacon-based triangulation and embedding. We show that in addition to multiplicative error in the distances, performance guarantees for beacon-based algorithms typically must include a notion of slack—a certain fraction of all distances may be arbitrarily distorted. For metric spaces of bounded doubling dimension (which have been proposed as a reasonable abstraction of Internet latencies), we show that triangulation-based distance reconstruction with a constant number of beacons can achieve multiplicative error 1 + δ on a 1 − ε fraction of distances, for arbitrarily small constants δ and ε. For this same class of metric spaces, we give a beacon-based embedding algorithm that achieves constant distortion on a 1 − ε fraction of distances; this provides some theoretical justification for the success of the recent Global Network Positioning algorithm of Ng and Zhang [2002], and it forms an interesting contrast with lower bounds showing that it is not possible to embed all distances in a doubling metric space with constant distortion. We also give results for other classes of metric spaces, as well as distributed algorithms that require only a sparse set of distances but do not place too much measurement load on any one node.
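
The triangulation primitive is just the triangle inequality applied beacon by beacon: measured beacon distances sandwich every unmeasured pairwise distance. A small Python sketch on an invented one-dimensional toy metric (standing in for Internet latencies):

```python
import random

def beacon_bounds(dist_to_beacons, u, v):
    """Lower/upper bounds on d(u, v) from each node's distances to shared beacons."""
    du, dv = dist_to_beacons[u], dist_to_beacons[v]
    lower = max(abs(a - b) for a, b in zip(du, dv))   # |d(u,b) - d(v,b)| <= d(u,v)
    upper = min(a + b for a, b in zip(du, dv))        # d(u,v) <= d(u,b) + d(v,b)
    return lower, upper

# toy metric: points on a line, three randomly placed beacons
points = {name: random.uniform(0, 100) for name in "abcdef"}
beacons = [random.uniform(0, 100) for _ in range(3)]
d2b = {name: [abs(x - b) for b in beacons] for name, x in points.items()}

lo, hi = beacon_bounds(d2b, "a", "b")
print(lo, abs(points["a"] - points["b"]), hi)   # the true distance lies in [lo, hi]
```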

169 citations


Journal ArticleDOI
TL;DR: The use of supervised machine learning is proposed to build models that predict an algorithm's runtime given a problem instance, and techniques are described for interpreting these models to gain understanding of the characteristics that cause instances to be hard or easy.
Abstract: Is it possible to predict how long an algorithm will take to solve a previously-unseen instance of an NP-complete problem? If so, what uses can be found for models that make such predictions? This article provides answers to these questions and evaluates the answers experimentally. We propose the use of supervised machine learning to build models that predict an algorithm's runtime given a problem instance. We discuss the construction of these models and describe techniques for interpreting them to gain understanding of the characteristics that cause instances to be hard or easy. We also present two applications of our models: building algorithm portfolios that outperform their constituent algorithms, and generating test distributions that emphasize hard problems. We demonstrate the effectiveness of our techniques in a case study of the combinatorial auction winner determination problem. Our experimental results show that we can build very accurate models of an algorithm's running time, interpret our models, build an algorithm portfolio that strongly outperforms the best single algorithm, and tune a standard benchmark suite to generate much harder problem instances.
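
A minimal sketch of such an empirical hardness model, in Python with scikit-learn (assumed available); the features and runtimes are synthetic placeholders, not the instance features or data used in the article.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# placeholder features per instance (e.g., #goods, #bids, a structural statistic)
X = rng.uniform(size=(500, 3))
# placeholder log-runtimes: some nonlinear function of the features plus noise
y = 3 * X[:, 0] * X[:, 1] + np.sin(5 * X[:, 2]) + rng.normal(0, 0.1, 500)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[:400], y[:400])                       # train on 400 instances

pred = model.predict(X[400:])                     # predict held-out runtimes
print("held-out RMSE:", float(np.sqrt(np.mean((pred - y[400:]) ** 2))))
print("feature importances:", model.feature_importances_)  # model interpretation
```

Given one such model per algorithm, a portfolio simply runs, on each instance, the algorithm with the smallest predicted runtime, which is how a portfolio can outperform any single constituent.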

Journal ArticleDOI
TL;DR: It is shown that satisfiability for two-variable first-order logic is decidable if the tree structure can be accessed only through the child and the next sibling predicates and the access to data values is restricted to equality tests.
Abstract: Motivated by reasoning tasks for XML languages, the satisfiability problem of logics on data trees is investigated. The nodes of a data tree have a label from a finite set and a data value from a possibly infinite set. It is shown that satisfiability for two-variable first-order logic is decidable if the tree structure can be accessed only through the child and the next sibling predicates and the access to data values is restricted to equality tests. From this main result, decidability of satisfiability and containment for a data-aware fragment of XPath and of the implication problem for unary key and inclusion constraints is concluded.

Journal ArticleDOI
TL;DR: It is proved that any multilinear arithmetic formula for the permanent or the determinant of an n × n matrix is of size super-polynomial in n.
Abstract: An arithmetic formula is multilinear if the polynomial computed by each of its subformulas is multilinear. We prove that any multilinear arithmetic formula for the permanent or the determinant of an n × n matrix is of size super-polynomial in n. Previously, super-polynomial lower bounds were not known (for any explicit function) even for the special case of multilinear formulas of constant depth.

Journal ArticleDOI
TL;DR: The first correct O(n log n) algorithm for finding a maximum st-flow in a directed planar graph, which, after a shortest-path preprocessing step in the dual, repeatedly saturates the leftmost residual s-to-t path.
Abstract: We give the first correct O(n log n) algorithm for finding a maximum st-flow in a directed planar graph. After a preprocessing step that consists of finding single-source shortest-path distances in the dual, the algorithm consists of repeatedly saturating the leftmost residual s-to-t path.

Journal ArticleDOI
TL;DR: This work shows that the sparsest cut in graphs with n vertices and m edges can be approximated within an O(log^2 n) factor in O(m + n^{3/2}) time using polylogarithmic single-commodity max-flow computations, which iteratively embed an expander flow and thus provide a certificate of expansion.
Abstract: We show that the sparsest cut in graphs with n vertices and m edges can be approximated within an O(log^2 n) factor in O(m + n^{3/2}) time using polylogarithmic single commodity max-flow computations. Previous algorithms are based on multicommodity flows that take time O(m + n^2). Our algorithm iteratively employs max-flow computations to embed an expander flow, thus providing a certificate of expansion. Our technique can also be extended to yield an O(log^2 n)-(pseudo-)approximation algorithm for the edge-separator problem with a similar running time.

Journal ArticleDOI
TL;DR: Theirs is the first protocol for password-only authentication that is both practical and provably secure under standard cryptographic assumptions, and it is remarkably efficient, requiring computation only about 4 times greater than “classical” Diffie-Hellman key exchange, which provides no authentication at all.
Abstract: Mutual authentication and authenticated key exchange are fundamental techniques for enabling secure communication over public, insecure networks. It is well known how to design secure protocols for achieving these goals when parties share high-entropy cryptographic keys in advance of the authentication stage. Unfortunately, it is much more common for users to share weak, low-entropy passwords which furthermore may be chosen from a known space of possibilities (say, a dictionary of English words). In this case, the problem becomes much more difficult as one must ensure that protocols are immune to off-line dictionary attacks in which an adversary exhaustively enumerates all possible passwords in an attempt to determine the correct one. We propose a 3-round protocol for password-only authenticated key exchange, and provide a rigorous proof of security for our protocol based on the decisional Diffie-Hellman assumption. The protocol assumes only public parameters—specifically, a “common reference string”—which can be “hard-coded” into an implementation of the protocol; in particular, and in contrast to some previous work, our protocol does not require either party to pre-share a public key. The protocol is also remarkably efficient, requiring computation only (roughly) 4 times greater than “classical” Diffie-Hellman key exchange that provides no authentication at all. Ours is the first protocol for password-only authentication that is both practical and provably secure using standard cryptographic assumptions.
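
For a sense of the efficiency baseline, here is the textbook unauthenticated Diffie-Hellman exchange in Python; the group parameters are illustrative only, not a secure deployment choice, and the protocol in the article costs roughly four such exponentiations per party while also resisting dictionary attacks.

```python
import secrets

p = 2**127 - 1          # a Mersenne prime; illustrative, NOT a secure group choice
g = 3

a = secrets.randbelow(p - 2) + 1              # Alice's secret exponent
b = secrets.randbelow(p - 2) + 1              # Bob's secret exponent
A, B = pow(g, a, p), pow(g, b, p)             # one exponentiation each, exchanged
k_alice, k_bob = pow(B, a, p), pow(A, b, p)   # one more each to derive the key
assert k_alice == k_bob                       # shared key, but no authentication
```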

Journal ArticleDOI
TL;DR: It is proven that determining whether a hypergraph H admits a tree projection with respect to a hypergraph G is NP-complete, and the new Component Hypertree Decomposition method is defined, which is tractable and strictly more general than other approximations to GHD published so far.
Abstract: The generalized hypertree width GHW(H) of a hypergraph H is a measure of its cyclicity. Classes of conjunctive queries or constraint satisfaction problems whose associated hypergraphs have bounded GHW are known to be solvable in polynomial time. However, it has been an open problem for several years if for a fixed constant k and input hypergraph H it can be determined in polynomial time whether GHW(H) ≤ k. Here, this problem is settled by proving that even for k = 3 the problem is already NP-hard. On the way to this result, another long-standing open problem, originally raised by Goodman and Shmueli [1984] in the context of join optimization, is solved. It is proven that determining whether a hypergraph H admits a tree projection with respect to a hypergraph G is NP-complete. Our intractability results on generalized hypertree width motivate further research on more restrictive tractable hypergraph decomposition methods that approximate generalized hypertree decomposition (GHD). We show that each such method is dominated by a tractable decomposition method definable through a function that associates a set of partial edges to a hypergraph. By using one particular such function, we define the new Component Hypertree Decomposition method, which is tractable and strictly more general than other approximations to GHD published so far.

Journal ArticleDOI
TL;DR: In this article, the authors present a near-optimal reduction from approximately counting the cardinality of a discrete set to approximately sampling elements of the set, which can be used to approximate the partition function Z of the Ising model, matchings or colorings of a graph.
Abstract: We present a near-optimal reduction from approximately counting the cardinality of a discrete set to approximately sampling elements of the set. An important application of our work is to approximating the partition function Z of a discrete system, such as the Ising model, matchings or colorings of a graph. The typical approach to estimating the partition function Z(β*) at some desired inverse temperature β* is to define a sequence, which we call a cooling schedule, β0 = 0 < β1 < ⋯ < βℓ = β*, where Z(0) is trivial to compute and the ratios Z(βi+1)/Z(βi) are easy to estimate by sampling from the distribution corresponding to Z(βi).
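
The telescoping-product estimator behind cooling schedules can be shown directly on a toy system whose states are enumerable, so exact Gibbs sampling is trivial and the answer can be checked. The schedule below is fixed; choosing it adaptively and near-optimally is the article's contribution, which this sketch does not attempt.

```python
import math, random

H = [0, 1, 1, 2, 3, 3, 3, 5]                       # toy energies, one per state
Z = lambda b: sum(math.exp(-b * h) for h in H)     # exact, for checking only

def gibbs_sample(b):
    w = [math.exp(-b * h) for h in H]
    return random.choices(H, weights=w)[0]         # exact sampling (toy only)

def estimate_Z(schedule, samples=20000):
    est = len(H)                                   # Z(0) = number of states
    for b0, b1 in zip(schedule, schedule[1:]):
        # E under Gibbs(b0) of exp(-(b1 - b0) * H) equals Z(b1)/Z(b0)
        est *= sum(math.exp(-(b1 - b0) * gibbs_sample(b0))
                   for _ in range(samples)) / samples
    return est

schedule = [0.0, 0.5, 1.0, 1.5, 2.0]
print(estimate_Z(schedule), "vs exact", Z(2.0))
```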

Journal ArticleDOI
TL;DR: The probabilistic gene evolution model is introduced, which describes how a gene tree evolves within a given species tree with respect to speciation, gene duplication, and gene loss, and is a canonical generalization of the classical linear birth-death process.
Abstract: Phylogeny is both a fundamental tool in biology and a rich source of fascinating modeling and algorithmic problems. Today's wealth of sequenced genomes makes it increasingly important to understand evolutionary events such as duplications, losses, transpositions, inversions, lateral transfers, and domain shuffling. We focus on the gene duplication event, which constitutes a major force in the creation of genes with new function [Ohno 1970; Lynch and Force 2000] and, thereby also, of biodiversity. We introduce the probabilistic gene evolution model, which describes how a gene tree evolves within a given species tree with respect to speciation, gene duplication, and gene loss. The actual relation between gene tree and species tree is captured by a reconciliation, a concept which we generalize for more expressiveness. The model is a canonical generalization of the classical linear birth-death process, obtained by replacing the interval where the process takes place by a tree. For the gene evolution model, we derive efficient algorithms for some associated probability distributions: the probability of a reconciled tree, the probability of a gene tree, the maximum probability reconciliation, the posterior probability of a reconciliation, and sampling reconciliations with respect to the posterior probability. These algorithms provide the basis for several applications, including species tree construction, reconciliation analysis, orthology analysis, biogeography, and host-parasite co-evolution.
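
A generative sketch of the model in Python: along each species-tree branch, gene lineages follow a linear birth-death process (duplication rate lam, loss rate mu), and at each speciation every surviving lineage continues into both daughter branches. The tree and rates are invented, and this simulates gene counts for intuition only; the article computes reconciliation probabilities exactly rather than by simulation.

```python
import random

def evolve_edge(k, lam, mu, t):
    """Lineage count after time t of a linear birth-death process started at k."""
    while k > 0:
        t -= random.expovariate(k * (lam + mu))    # time to the next event
        if t <= 0:
            return k
        k += 1 if random.random() < lam / (lam + mu) else -1   # duplication/loss
    return 0

def gene_counts(tree, node, k, lam, mu):
    """tree: dict node -> (branch length, children). Returns leaf -> gene count."""
    length, kids = tree[node]
    k = evolve_edge(k, lam, mu, length)
    if not kids:
        return {node: k}
    out = {}
    for c in kids:        # speciation: surviving lineages enter both daughters
        out.update(gene_counts(tree, c, k, lam, mu))
    return out

species = {"root": (0.0, ["x", "y"]), "x": (1.0, []), "y": (1.0, [])}
print(gene_counts(species, "root", 1, lam=0.3, mu=0.2))
```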

Journal ArticleDOI
TL;DR: This work puts the observations of Lakhina et al. on a rigorous footing and extends them to nearly arbitrary degree distributions, proving that traceroute sampling finds power-law degree distributions in both δ-regular and Poisson-distributed random graphs.
Abstract: Understanding the graph structure of the Internet is a crucial step for building accurate network models and designing efficient algorithms for Internet applications. Yet, obtaining this graph structure can be a surprisingly difficult task, as edges cannot be explicitly queried. For instance, empirical studies of the network of Internet Protocol (IP) addresses typically rely on indirect methods like traceroute to build what are approximately single-source, all-destinations, shortest-path trees. These trees only sample a fraction of the network's edges, and a paper by Lakhina et al. [2003] found empirically that the resulting sample is intrinsically biased. Further, in simulations, they observed that the degree distribution under traceroute sampling exhibits a power law even when the underlying degree distribution is Poisson. In this article, we study the bias of traceroute sampling mathematically and, for a very general class of underlying degree distributions, explicitly calculate the distribution that will be observed. As example applications of our machinery, we prove that traceroute sampling finds power-law degree distributions in both δ-regular and Poisson-distributed random graphs. Thus, our work puts the observations of Lakhina et al. on a rigorous footing, and extends them to nearly arbitrary degree distributions.
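
The bias is easy to reproduce in simulation. The Python sketch below builds a sparse G(n, p) graph (degrees approximately Poisson), takes a single-source BFS tree as a stand-in for traceroute's shortest-path tree, and compares degree histograms; all parameters are arbitrary, and this illustrates the phenomenon rather than the article's calculation.

```python
import random
from collections import Counter, deque

def gnp(n, mean_degree):
    p, adj = mean_degree / n, {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if random.random() < p:
                adj[u].add(v); adj[v].add(u)
    return adj

def bfs_tree_degrees(adj, root):
    parent, queue = {root: None}, deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    deg = Counter()
    for v, p in parent.items():
        if p is not None:
            deg[v] += 1; deg[p] += 1      # each tree edge counts at both ends
    return deg

adj = gnp(2000, 4.0)
root = max(adj, key=lambda v: len(adj[v]))   # monitors are typically well connected
observed = bfs_tree_degrees(adj, root)
print("true degrees:    ", dict(sorted(Counter(len(adj[v]) for v in adj).items())))
print("observed degrees:", dict(sorted(Counter(observed.values()).items())))
# the sampled tree shows far more degree-1 nodes and a heavier apparent tail
# than the underlying (approximately Poisson) graph
```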

Journal ArticleDOI
TL;DR: In this paper, the authors study the problem of sublinear authentication, in which the user wants to encode and store the file in a way that allows him to verify that it has not been corrupted, but without reading the entire file.
Abstract: We consider the problem of storing a large file on a remote and unreliable server. To verify that the file has not been corrupted, a user could store a small private (randomized) “fingerprint” on his own computer. This is the setting for the well-studied authentication problem in cryptography, and the required fingerprint size is well understood. We study the problem of sublinear authentication: suppose the user would like to encode and store the file in a way that allows him to verify that it has not been corrupted, but without reading the entire file. If the user only wants to read q bits of the file, how large does the size s of the private fingerprint need to be? We define this problem formally, and show a tight lower bound on the relationship between s and q when the adversary is not computationally bounded, namely: s × q = Ω(n), where n is the file size. This is an easier case of the online memory checking problem, introduced by Blum et al. [1991], and hence the same (tight) lower bound applies also to that problem. It was previously shown that, when the adversary is computationally bounded, under the assumption that one-way functions exist, it is possible to construct much better online memory checkers. The same is also true for sublinear authentication schemes. We show that the existence of one-way functions is also a necessary condition: even slightly breaking the s × q = Ω(n) lower bound in a computational setting implies the existence of one-way functions.
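
The computationally secure checkers alluded to in the last paragraph are typified by a Merkle hash tree: the trusted side keeps only the root hash (s = O(1)) and verifies each read against O(log n) hashes, beating the information-theoretic s × q = Ω(n) bound by relying on collision resistance. A textbook Python sketch, offered for intuition and not as this article's construction:

```python
import hashlib

H = lambda *parts: hashlib.sha256(b"|".join(parts)).digest()

def build(leaves):
    """Returns the tree as a list of levels, leaves first (power-of-two count)."""
    levels = [[H(x) for x in leaves]]
    while len(levels[-1]) > 1:
        lvl = levels[-1]
        levels.append([H(lvl[i], lvl[i + 1]) for i in range(0, len(lvl), 2)])
    return levels

def prove(levels, i):
    path = []
    for lvl in levels[:-1]:
        path.append(lvl[i ^ 1])      # sibling hash at this level
        i //= 2
    return path

def verify(root, i, value, path):
    h = H(value)
    for sib in path:                 # recompute the root from the read value
        h = H(h, sib) if i % 2 == 0 else H(sib, h)
        i //= 2
    return h == root

data = [b"blk%d" % i for i in range(8)]
levels = build(data)
root = levels[-1][0]                 # the only trusted state: s = O(1)
assert verify(root, 5, b"blk5", prove(levels, 5))
assert not verify(root, 5, b"corrupted", prove(levels, 5))
```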

Journal ArticleDOI
TL;DR: The best previously known lower and upper bounds on γ2, due to Dancik and Paterson (approximately 0.773911 and 0.837623), are improved to 0.788071 and 0.826280.
Abstract: It has long been known [Chvatal and Sankoff 1975] that the average length of the longest common subsequence of two random strings of length n over an alphabet of size k is asymptotic to γkn for some constant γk depending on k. The value of these constants remains unknown, and a number of papers have proved upper and lower bounds on them. We discuss techniques, involving numerical calculations with recurrences on many variables, for determining lower and upper bounds on these constants. To our knowledge, the previous best-known lower and upper bounds for γ2 were those of Dancik and Paterson, approximately 0.773911 and 0.837623 [Dancik 1994; Dancik and Paterson 1995]. We improve these to 0.788071 and 0.826280. This upper bound is less than the γ2 given by Steele's old conjecture (see Steele [1997, page 3]) that γ2 = 2/(1 + √2) ≈ 0.828427. (As Steele points out, experimental evidence had already suggested that this conjectured value was too high.) Finally, we show that the upper bound technique described here could be used to produce, for any k, a sequence of upper bounds converging to γk, though the computation time grows very quickly as better bounds are guaranteed.
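
The constant is easy to estimate, though not to bound rigorously, by simulation; the bounds above come from analyzing recurrences, not sampling. A quick Python sketch averaging the standard LCS dynamic program over random binary strings:

```python
import random

def lcs_len(a, b):
    """Standard O(len(a) * len(b)) dynamic program, kept to one row at a time."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

n, trials = 1000, 20
est = sum(lcs_len([random.randint(0, 1) for _ in range(n)],
                  [random.randint(0, 1) for _ in range(n)])
          for _ in range(trials)) / (trials * n)
print(est)   # roughly 0.80-0.81; finite-size averages run slightly below the
             # asymptotic constant, which lies in [0.788071, 0.826280]
```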

Journal ArticleDOI
TL;DR: A general deterministic technique decouples the algorithmic allocation problem from the strategic aspects, via a procedure that converts any algorithm into a dominant-strategy ascending mechanism; for single-value multi-minded bidders this yields a polynomial-time mechanism whose approximation to the social welfare is close to the best achievable in polynomial time.
Abstract: In this article, we are interested in general techniques for designing mechanisms that approximate the social welfare in the presence of selfish rational behavior. We demonstrate our results in the setting of Combinatorial Auctions (CA). Our first result is a general deterministic technique to decouple the algorithmic allocation problem from the strategic aspects, by a procedure that converts any algorithm to a dominant-strategy ascending mechanism. This technique works for any single value domain, in which each agent has the same value for each desired outcome, and this value is the only private information. In particular, for “single-value CAs”, where each player desires any one of several different bundles but has the same value for each of them, our technique converts any approximation algorithm to a dominant strategy mechanism that almost preserves the original approximation ratio. Our second result provides the first computationally efficient deterministic mechanism for the case of single-value multi-minded bidders (with private value and private desired bundles). The mechanism achieves an approximation to the social welfare which is close to the best possible in polynomial time (unless P=NP). This mechanism is an algorithmic implementation in undominated strategies, a notion that we define and justify, and is of independent interest.

Journal ArticleDOI
TL;DR: A single rounding algorithm for scheduling on unrelated parallel machines that works well with the known linear programming-, quadratic programming-, and convex programming-relaxations for scheduling to minimize completion time, makespan, and other well-studied objective functions.
Abstract: We develop a single rounding algorithm for scheduling on unrelated parallel machines; this algorithm works well with the known linear programming-, quadratic programming-, and convex programming-relaxations for scheduling to minimize completion time, makespan, and other well-studied objective functions. This algorithm leads to the following applications for the general setting of unrelated parallel machines: (i) a bicriteria algorithm for a schedule whose weighted completion-time and makespan simultaneously exhibit the current-best individual approximations for these criteria; (ii) better-than-two approximation guarantees for scheduling to minimize the Lp norm of the vector of machine-loads, for all 1 < p < ∞.

Journal ArticleDOI
TL;DR: The substate theorem is derived: if the relative entropy of ρ and σ is small, then there is a state ρ′ close to ρ (i.e., with small trace distance) that, when scaled down by a suitable factor, ‘sits inside’, or becomes a ‘substate’ of, σ.
Abstract: We prove the following information-theoretic property about quantum states. Substate theorem: Let ρ and σ be quantum states in the same Hilbert space with relative entropy S(ρ ‖ σ) := Tr ρ (log ρ − log σ) = c. Then for all ϵ > 0, there is a state ρ′ such that the trace distance ‖ρ′ − ρ‖tr := Tr √((ρ′ − ρ)²) ≤ ϵ, and ρ′/2^{O(c/ϵ²)} ≤ σ. It states that if the relative entropy of ρ and σ is small, then there is a state ρ′ close to ρ, i.e. with small trace distance ‖ρ′ − ρ‖tr, that when scaled down by a factor 2^{O(c)} ‘sits inside’, or becomes a ‘substate’ of, σ. This result has several applications in quantum communication complexity and cryptography. Using the substate theorem, we derive a privacy trade-off for the set membership problem in the two-party quantum communication model. Here Alice is given a subset A ⊆ [n], Bob an input i ∈ [n], and they need to determine if i ∈ A. Privacy trade-off for set membership: In any two-party quantum communication protocol for the set membership problem, if Bob reveals only k bits of information about his input, then Alice must reveal at least n/2^{O(k)} bits of information about her input. We also discuss relationships between various information theoretic quantities that arise naturally in the context of the substate theorem.

Journal ArticleDOI
TL;DR: This work introduces novel measures for quantifying efficiency loss in cost-sharing mechanisms and proves simultaneous approximate budget-balance and approximate efficiency guarantees for mechanisms for a wide range of cost- sharing problems, including all submodular and Steiner tree problems.
Abstract: In a cost-sharing problem, several participants with unknown preferences vie to receive some good or service, and each possible outcome has a known cost. A cost-sharing mechanism is a protocol that decides which participants are allocated a good and at what prices. Three desirable properties of a cost-sharing mechanism are: incentive-compatibility, meaning that participants are motivated to bid their true private value for receiving the good; budget-balance, meaning that the mechanism recovers its incurred cost with the prices charged; and economic efficiency, meaning that the cost incurred and the value to the participants are traded off in an optimal way. These three goals have been known to be mutually incompatible for thirty years. Nearly all the work on cost-sharing mechanism design by the economics and computer science communities has focused on achieving two of these goals while completely ignoring the third. We introduce novel measures for quantifying efficiency loss in cost-sharing mechanisms and prove simultaneous approximate budget-balance and approximate efficiency guarantees for mechanisms for a wide range of cost-sharing problems, including all submodular and Steiner tree problems. Our key technical tool is an exact characterization of worst-case efficiency loss in Moulin mechanisms, the dominant paradigm in cost-sharing mechanism design.
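
A Moulin mechanism fits in a dozen lines: given a cross-monotonic cost-sharing method, repeatedly drop the participants whose currently offered share exceeds their bid. The Python sketch below uses an invented toy instance, a public good of fixed cost shared equally (equal division is cross-monotonic), to show both budget balance and the efficiency loss being traded off.

```python
def moulin(bids, cost_share):
    """bids: dict player -> reported value; cost_share(S, i): i's share if S served.
    Cross-monotonicity (shares only rise as players leave) makes this truthful."""
    S = set(bids)
    while S:
        refusers = {i for i in S if cost_share(S, i) > bids[i]}
        if not refusers:
            break
        S -= refusers
    return S, {i: cost_share(S, i) for i in S}

C = 12.0                                 # cost of serving any nonempty set
equal_split = lambda S, i: C / len(S)    # cross-monotonic cost-sharing method

served, prices = moulin({"a": 7, "b": 6.5, "c": 2}, equal_split)
print(served, prices)   # {'a', 'b'} pay 6 each: the cost is exactly recovered
# (budget balance), but c is excluded even though serving everyone would have
# higher total welfare; such exclusions are the efficiency loss quantified above
```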

Journal ArticleDOI
TL;DR: It is established that the worst-case operation time complexity of obstruction-free implementations is high, even in the absence of step contention, and it is shown that lock-based implementations are not subject to some of the time-complexity lower bounds the authors present.
Abstract: Obstruction-free implementations of concurrent objects are optimized for the common case where there is no step contention, and were recently advocated as a solution to the costs associated with synchronization without locks. In this article, we study this claim, which requires precisely defining the notions of obstruction-freedom and step contention. We consider several classes of obstruction-free implementations, present corresponding generic object implementations, and prove lower bounds on their complexity. Viewed collectively, our results establish that the worst-case operation time complexity of obstruction-free implementations is high, even in the absence of step contention. We also show that lock-based implementations are not subject to some of the time-complexity lower bounds we present.

Journal ArticleDOI
TL;DR: This article presents a method for constructing hardware structures that perform a fixed permutation on streaming data and provides an algorithm that completely specifies the datapath and control logic given the desired permutation and streaming width.
Abstract: This article presents a method for constructing hardware structures that perform a fixed permutation on streaming data. The method applies to permutations that can be represented as linear mappings on the bit-level representation of the data locations. This subclass includes many important permutations such as stride permutations (corner turn, perfect shuffle, etc.), the bit reversal, the Hadamard reordering, and the Gray code reordering. The datapath for performing the streaming permutation consists of several independent banks of memory and two interconnection networks. These structures are built for a given streaming width (i.e., number of inputs and outputs per cycle) and operate at full throughput for this streaming width. We provide an algorithm that completely specifies the datapath and control logic given the desired permutation and streaming width. Further, we provide lower bounds on the achievable cost of a solution and show that for an important subclass of permutations our solution is optimal. We apply our algorithm to derive datapaths for several important permutations, including a detailed example that carefully illustrates each aspect of the design process. Lastly, we compare our permutation structures to those of Jarvinen et al. [2004], which are specialized for stride permutations.
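
The class of permutations covered, linear mappings on the bits of the index, includes the perfect shuffle, which is a cyclic rotation of the address bits. A Python sketch of just that address arithmetic; the article's actual subject, banking the memories and building the interconnect for a streaming datapath, is not modeled here.

```python
def rotate_bits(i, n):
    """Cyclic left rotation of the n-bit address i."""
    return ((i << 1) | (i >> (n - 1))) & ((1 << n) - 1)

def perfect_shuffle(data):
    n = (len(data) - 1).bit_length()
    assert len(data) >= 2 and len(data) == 1 << n, "length must be a power of two"
    out = [None] * len(data)
    for i, x in enumerate(data):
        out[rotate_bits(i, n)] = x      # the permutation is linear on index bits
    return out

print(perfect_shuffle(list(range(8))))  # [0, 4, 1, 5, 2, 6, 3, 7]
```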

Journal ArticleDOI
TL;DR: This article considers extensions of CoreXPath, the navigational core of XPath 1.0, with operators that are part of or inspired by XPath 2.0: path intersection, path equality, path complementation, for-loops, and transitive closure, and determines the complexity of query containment.
Abstract: XPath is a prominent W3C standard for navigating XML documents that has stimulated a lot of research into query answering and static analysis. In particular, query containment has been studied extensively for fragments of the 1.0 version of this standard, whereas little is known about query containment in (fragments of) the richer language XPath 2.0. In this article, we consider extensions of CoreXPath, the navigational core of XPath 1.0, with operators that are part of or inspired by XPath 2.0: path intersection, path equality, path complementation, for-loops, and transitive closure. For each combination of these operators, we determine the complexity of query containment, both with and without DTDs. It turns out to range from ExpTime (for extensions with path equality) and 2-ExpTime (for extensions with path intersection) to non-elementary (for extensions with path complementation or for-loops). In almost all cases, adding transitive closure on top has no further impact on the complexity. We also investigate the effect of dropping the upward and/or sibling axes, and show that this sometimes leads to a reduction in complexity. Since the languages we study include negation and conjunction in filters, our complexity results can equivalently be stated in terms of satisfiability. We also analyze the above languages in terms of succinctness.

Journal ArticleDOI
TL;DR: This paper examines finite-memory monitoring algorithms that toss coins to make decisions on the behavior they are observing, and gives a number of results that characterize, topologically as well as with respect to their computational power, the sets of sequences the monitors permit.
Abstract: In this article, we introduce the model of finite state probabilistic monitors (FPM), which are finite state automata on infinite strings that have probabilistic transitions and an absorbing reject state. FPMs are a natural automata model that can be seen as either randomized run-time monitoring algorithms or as models of open, probabilistic reactive systems that can fail. We give a number of results that characterize, topologically as well as with respect to their computational power, the sets of languages recognized by FPMs. We also study the emptiness and universality problems for such automata and give exact complexity bounds for these problems.
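
A toy FPM in Python: an invented two-state monitor that, on each ‘b’, moves to the absorbing reject state with probability 1/2. FPMs run on infinite strings; simulating a finite prefix, as below, estimates the rejection probability accumulated so far.

```python
import random

REJECT = "reject"

def run(monitor, prefix, state="q0"):
    """monitor: state -> symbol -> list of (next state, probability)."""
    for c in prefix:
        r, acc = random.random(), 0.0
        for nxt, pr in monitor[state][c]:
            acc += pr
            if r < acc:
                state = nxt
                break
        if state == REJECT:              # absorbing: no escape from reject
            return REJECT
    return state

toy = {"q0": {"a": [("q0", 1.0)],
              "b": [("q0", 0.5), (REJECT, 0.5)]}}
trials = 10000
print(sum(run(toy, "abab") == REJECT for _ in range(trials)) / trials)  # about 0.75
```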