
Showing papers in "SIAM Journal on Computing in 2008"


Journal ArticleDOI
TL;DR: In this article, the authors provide formal definitions and efficient secure techniques for turning noisy information into keys usable for any cryptographic application, and, in particular, reliably and securely authenticating biometric data.
Abstract: We provide formal definitions and efficient secure techniques for turning noisy information into keys usable for any cryptographic application, and, in particular, reliably and securely authenticating biometric data. Our techniques apply not just to biometric information, but to any keying material that, unlike traditional cryptographic keys, is (1) not reproducible precisely and (2) not distributed uniformly. We propose two primitives: a fuzzy extractor reliably extracts nearly uniform randomness $R$ from its input; the extraction is error-tolerant in the sense that $R$ will be the same even if the input changes, as long as it remains reasonably close to the original. Thus, $R$ can be used as a key in a cryptographic application. A secure sketch produces public information about its input $w$ that does not reveal $w$ and yet allows exact recovery of $w$ given another value that is close to $w$. Thus, it can be used to reliably reproduce error-prone biometric inputs without incurring the security risk inherent in storing them. We define the primitives to be both formally secure and versatile, generalizing much prior work. In addition, we provide nearly optimal constructions of both primitives for various measures of “closeness” of input data, such as Hamming distance, edit distance, and set difference.
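
To make the two primitives concrete, here is a minimal Python sketch of the code-offset idea in the Hamming metric, assuming a toy 3x repetition code and a hash-based key-derivation step standing in for a proper strong extractor; it is an illustration of the concept, not the paper's construction.

```python
import secrets
import hashlib

# Toy code-offset secure sketch / fuzzy extractor in the Hamming metric.
# Illustrative only: a real construction would use a good linear code
# (e.g., BCH) instead of the 3x repetition code assumed here.

R = 3          # repetition factor: corrects < R/2 bit flips per block
K = 16         # number of information bits

def encode(bits):
    """Repetition-code encoder: each bit is repeated R times."""
    return [b for b in bits for _ in range(R)]

def decode(bits):
    """Majority-vote decoder for the repetition code."""
    return [int(sum(bits[i*R:(i+1)*R]) * 2 > R) for i in range(len(bits) // R)]

def sketch(w):
    """SS(w): publish s = w XOR c for a random codeword c; s hides most of w."""
    c = encode([secrets.randbits(1) for _ in range(K)])
    return [wi ^ ci for wi, ci in zip(w, c)]

def recover(w_noisy, s):
    """Rec(w', s): decode w' XOR s back to the codeword, then unmask."""
    c_noisy = [wi ^ si for wi, si in zip(w_noisy, s)]
    c = encode(decode(c_noisy))
    return [ci ^ si for ci, si in zip(c, s)]

def extract(w, seed):
    """Key derivation: hash w with a public seed (stand-in for a strong extractor)."""
    return hashlib.sha256(seed + bytes(w)).hexdigest()

if __name__ == "__main__":
    n = K * R
    w = [secrets.randbits(1) for _ in range(n)]           # enrollment reading
    s, seed = sketch(w), secrets.token_bytes(16)          # public helper data
    w_noisy = list(w)
    w_noisy[5] ^= 1                                       # one bit of noise
    assert recover(w_noisy, s) == w
    assert extract(recover(w_noisy, s), seed) == extract(w, seed)
    print("recovered key:", extract(w, seed))
```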

1,279 citations


Journal ArticleDOI
TL;DR: It is established that the fair cost allocation protocol is in fact a useful mechanism for inducing strategic behavior to form near-optimal equilibria, and the results are extended to cases in which users seek to balance network design costs with latencies in the constructed network.
Abstract: Network design is a fundamental problem for which it is important to understand the effects of strategic behavior. Given a collection of self-interested agents who want to form a network connecting certain endpoints, the set of stable solutions—the Nash equilibria—may look quite different from the centrally enforced optimum. We study the quality of the best Nash equilibrium, and refer to the ratio of its cost to the optimum network cost as the price of stability. The best Nash equilibrium solution has a natural meaning of stability in this context—it is the optimal solution that can be proposed from which no user will defect. We consider the price of stability for network design with respect to one of the most widely studied protocols for network cost allocation, in which the cost of each edge is divided equally between users whose connections make use of it; this fair-division scheme can be derived from the Shapley value and has a number of basic economic motivations. We show that the price of stability for network design with respect to this fair cost allocation is $O(\log k)$, where $k$ is the number of users, and that a good Nash equilibrium can be achieved via best-response dynamics in which users iteratively defect from a starting solution. This establishes that the fair cost allocation protocol is in fact a useful mechanism for inducing strategic behavior to form near-optimal equilibria. We discuss connections to the class of potential games defined by Monderer and Shapley, and extend our results to cases in which users are seeking to balance network design costs with latencies in the constructed network, with stronger results when the network has only delays and no construction costs. We also present bounds on the convergence time of best-response dynamics, and discuss extensions to a weighted game.
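
As an illustration of best-response dynamics under fair (Shapley) cost sharing, the following Python sketch runs unilateral improvement steps on a small hypothetical instance in which each player picks one of a few fixed edge sets; the instance, strategies, and starting profile are made up for the demo and are not from the paper.

```python
# Toy best-response dynamics for a Shapley (fair) cost-sharing network design game.
from collections import Counter

# edge -> construction cost (hypothetical instance)
edge_cost = {"a": 1.0, "b": 1.0, "c": 3.0}

# player -> list of candidate strategies (each a set of edges, e.g., a path)
strategies = {
    0: [frozenset({"a"}), frozenset({"c"})],
    1: [frozenset({"b"}), frozenset({"c"})],
    2: [frozenset({"a", "b"}), frozenset({"c"})],
}

def player_cost(player, choice, profile):
    """Shapley cost share: each edge's cost divided by its number of users."""
    usage = Counter(e for p, s in profile.items() if p != player for e in s)
    for e in choice:
        usage[e] += 1
    return sum(edge_cost[e] / usage[e] for e in choice)

def best_response_dynamics(profile):
    """Iterate unilateral improvements until no player wants to deviate."""
    improved = True
    while improved:
        improved = False
        for p, options in strategies.items():
            best = min(options, key=lambda s: player_cost(p, s, profile))
            if player_cost(p, best, profile) < player_cost(p, profile[p], profile) - 1e-12:
                profile[p] = best
                improved = True
    return profile

if __name__ == "__main__":
    # hypothetical starting profile: player 0 on edge a, players 1 and 2 on edge c
    start = {0: frozenset({"a"}), 1: frozenset({"c"}), 2: frozenset({"c"})}
    eq = best_response_dynamics(dict(start))
    total = sum(edge_cost[e] for e in set().union(*eq.values()))
    print("equilibrium strategies:", eq, "total network cost:", total)
```

Because the game is a potential game, the improvement loop always terminates; on this instance it ends at the cheap equilibrium using edges a and b.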

855 citations


Journal ArticleDOI
TL;DR: It is proved that a quantum circuit with $T$ gates whose underlying graph has treewidth $d$ can be simulated deterministically in $T^{O(1)}\exp[O(d)]$ time, which, in particular, is polynomial in $T$ if $d=O(\log T)$.
Abstract: The treewidth of a graph is a useful combinatorial measure of how close the graph is to a tree. We prove that a quantum circuit with $T$ gates whose underlying graph has a treewidth $d$ can be simulated deterministically in $T^{O(1)}\exp[O(d)]$ time, which, in particular, is polynomial in $T$ if $d=O(\log T)$. Among many implications, we show efficient simulations for log-depth circuits whose gates apply to nearby qubits only, a natural constraint satisfied by most physical implementations. We also show that one-way quantum computation of Raussendorf and Briegel (Phys. Rev. Lett., 86 (2001), pp. 5188-5191), a universal quantum computation scheme with promising physical implementations, can be efficiently simulated by a randomized algorithm if its quantum resource is derived from a small-treewidth graph with a constant maximum degree. (The requirement on the maximum degree was removed in [I. L. Markov and Y. Shi, preprint:quant-ph/0511069].)

409 citations


Journal ArticleDOI
TL;DR: This paper provides a self-contained and complete proof of universal fault-tolerant quantum computation in the presence of local noise, and shows that local noise is in principle not an obstacle for scalable quantum computation.
Abstract: This paper shows that quantum computation can be made fault-tolerant against errors and inaccuracies when $\eta$, the probability for an error in a qubit or a gate, is smaller than a constant threshold $\eta_c$. This result improves on Shor's result [Proceedings of the 37th Symposium on the Foundations of Computer Science, IEEE, Los Alamitos, CA, 1996, pp. 56-65], which shows how to perform fault-tolerant quantum computation when the error rate $\eta$ decays polylogarithmically with the size of the computation, an assumption which is physically unreasonable. The cost of making the quantum circuit fault-tolerant in our construction is polylogarithmic in time and space. Our result holds for a very general local noise model, which includes probabilistic errors, decoherence, amplitude damping, depolarization, and systematic inaccuracies in the gates. Moreover, we allow exponentially decaying correlations between the errors both in space and in time. Fault-tolerant computation can be performed with any universal set of gates. The result also holds for quantum particles with $p>2$ states, namely, $p$-qudits, and is also generalized to one-dimensional quantum computers with only nearest-neighbor interactions. No measurements, or classical operations, are required during the quantum computation. We estimate the threshold of our construction to be $\eta_c\simeq 10^{-6}$, in the best case. By this we show that local noise is in principle not an obstacle for scalable quantum computation. The main ingredient of our proof is the computation on states encoded by a quantum error correcting code (QECC). To this end we introduce a special class of Calderbank-Shor-Steane (CSS) codes, called polynomial codes (the quantum analogue of Reed-Solomon codes). Their nice algebraic structure allows all of the encoded gates to be transversal. We also provide another version of the proof which uses more general CSS codes, but its encoded gates are slightly less elegant. To achieve fault tolerance, we encode the quantum circuit by another circuit by using one of these QECCs. This step is repeated polyloglog many times, each step slightly improving the effective error rate, to achieve the desired reliability. The resulting circuit exhibits a hierarchical structure, and for the analysis of its robustness we borrow terminology from Khalfin and Tsirelson [Found. Phys., 22 (1992), pp. 879-948] and Gacs [Advances in Computing Research: A Research Annual: Randomness and Computation, JAI Press, Greenwich, CT, 1989]. The paper is to a large extent self-contained. In particular, we provide simpler proofs for many of the known results we use, such as the fact that it suffices to correct for bit-flips and phase-flips, the correctness of CSS codes, and the fact that two-qubit gates are universal, together with their extensions to higher-dimensional particles. We also provide full proofs of the universality of the sets of gates we use (the proof of universality was missing in Shor's paper). This paper thus provides a self-contained and complete proof of universal fault-tolerant quantum computation in the presence of local noise.

361 citations


Journal ArticleDOI
TL;DR: This work gives the first algorithm that (under distributional assumptions) efficiently learns halfspaces in the notoriously difficult agnostic framework of Kearns, Schapire, & Sellie, where a learner is given access to labeled examples drawn from a distribution, without restriction on the labels.
Abstract: We give a computationally efficient algorithm that learns (under distributional assumptions) a halfspace in the difficult agnostic framework of Kearns, Schapire, and Sellie [Mach. Learn., 17 (1994), pp. 115-141], where a learner is given access to a distribution on labelled examples but where the labelling may be arbitrary (similar to malicious noise). It constructs a hypothesis whose error rate on future examples is within an additive $\epsilon$ of the optimal halfspace, in time poly$(n)$ for any constant $\epsilon>0$, for the uniform distribution over $\{-1,1\}^n$ or unit sphere in $\mathbb R^n,$ as well as any log-concave distribution in $\mathbb R^n$. It also agnostically learns Boolean disjunctions in time $2^{\tilde{O}(\sqrt{n})}$ with respect to any distribution. Our algorithm, which performs $L_1$ polynomial regression, is a natural noise-tolerant arbitrary-distribution generalization of the well-known “low-degree” Fourier algorithm of Linial, Mansour, and Nisan. We observe that significant improvements on the running time of our algorithm would yield the fastest known algorithm for learning parity with noise, a challenging open problem in computational learning theory.
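
The following Python sketch illustrates the $L_1$ polynomial regression idea at toy scale: expand examples from $\{-1,1\}^n$ into monomials of degree at most $d$, minimize the empirical $L_1$ error with a linear program, and round with the sign function (a simplification of the thresholding used in the analysis). The dimensions, degree, and noise rate below are arbitrary demo choices.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def monomial_features(X, d):
    """All multilinear monomials of degree <= d (sufficient over {-1,1}^n)."""
    n = X.shape[1]
    cols = [np.ones(len(X))]
    for k in range(1, d + 1):
        for S in itertools.combinations(range(n), k):
            cols.append(np.prod(X[:, list(S)], axis=1))
    return np.column_stack(cols)

def l1_regression(Phi, y):
    """min_c ||Phi c - y||_1 as a linear program with slack variables t."""
    m, k = Phi.shape
    # variables: [c (k), t (m)]; minimize sum(t) subject to |Phi c - y| <= t
    obj = np.concatenate([np.zeros(k), np.ones(m)])
    A_ub = np.block([[Phi, -np.eye(m)], [-Phi, -np.eye(m)]])
    b_ub = np.concatenate([y, -y])
    res = linprog(obj, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * k + [(0, None)] * m)
    return res.x[:k]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m, d = 8, 400, 2
    X = rng.choice([-1, 1], size=(m, n)).astype(float)
    w = rng.normal(size=n)
    y = np.sign(X @ w)                      # labels from a halfspace
    flip = rng.random(m) < 0.1              # 10% arbitrary label noise
    y[flip] *= -1
    Phi = monomial_features(X, d)
    c = l1_regression(Phi, y)
    preds = np.sign(Phi @ c)
    print("training error:", np.mean(preds != y))
```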

299 citations


Journal ArticleDOI
TL;DR: This work develops approximation algorithms for the problem of placing replicated data in arbitrary networks, where the nodes may both issue requests for data objects and have capacity for storing them, so as to minimize the average data-access cost.
Abstract: We develop approximation algorithms for the problem of placing replicated data in arbitrary networks, where the nodes may both issue requests for data objects and have capacity for storing data objects so as to minimize the average data-access cost. We introduce the data placement problem to model this problem. We have a set of caches $\mathcal{F}$, a set of clients $\mathcal{D}$, and a set of data objects $\mathcal{O}$. Each cache $i$ can store at most $u_i$ data objects. Each client $j\in\mathcal{D}$ has demand $d_j$ for a specific data object $o(j)\in\mathcal{O}$ and has to be assigned to a cache that stores that object. Storing an object $o$ in cache $i$ incurs a storage cost of $f_i^o$, and assigning client $j$ to cache $i$ incurs an access cost of $d_jc_{ij}$. The goal is to find a placement of the data objects to caches respecting the capacity constraints, and an assignment of clients to caches so as to minimize the total storage and client access costs. We present a 10-approximation algorithm for this problem. Our algorithm is based on rounding an optimal solution to a natural linear-programming relaxation of the problem. One of the main technical challenges encountered during rounding is to preserve the cache capacities while incurring only a constant-factor increase in the solution cost. We also introduce the connected data placement problem to capture settings where write-requests are also issued for data objects, so that one requires a mechanism to maintain consistency of data. We model this by requiring that all caches containing a given object be connected by a Steiner tree to a root for that object, which issues a multicast message upon a write to (any copy of) that object. The total cost now includes the cost of these Steiner trees. We devise a 14-approximation algorithm for this problem. We show that our algorithms can be adapted to handle two variants of the problem: (a) a $k$-median variant, where there is a specified bound on the number of caches that may contain a given object, and (b) a generalization where objects have lengths and the total length of the objects stored in any cache must not exceed its capacity.

253 citations


Journal ArticleDOI
TL;DR: It is shown that embeddings into $L_1$ are insufficient but that the additional structure provided by many embedding theorems does suffice for the authors' purposes, and an optimal $O(\log k)$-approximate max-flow/min-vertex-cut theorem for arbitrary vertex-capacitated multicommodity flow instances on $k$ terminals is proved.
Abstract: We develop the algorithmic theory of vertex separators and its relation to the embeddings of certain metric spaces. Unlike in the edge case, we show that embeddings into $L_1$ (and even Euclidean embeddings) are insufficient but that the additional structure provided by many embedding theorems does suffice for our purposes. We obtain an $O(\sqrt{\log n})$ approximation for minimum ratio vertex cuts in general graphs, based on a new semidefinite relaxation of the problem, and a tight analysis of the integrality gap which is shown to be $\Theta(\sqrt{\log n})$. We also prove an optimal $O(\log k)$-approximate max-flow/min-vertex-cut theorem for arbitrary vertex-capacitated multicommodity flow instances on $k$ terminals. For uniform instances on any excluded-minor family of graphs, we improve this to $O(1)$, and this yields a constant-factor approximation for minimum ratio vertex cuts in such graphs. Previously, this was known only for planar graphs, and for general excluded-minor families the best known ratio was $O(\log n)$. These results have a number of applications. We exhibit an $O(\sqrt{\log n})$ pseudoapproximation for finding balanced vertex separators in general graphs. In fact, we achieve an approximation ratio of $O(\sqrt{\log {opt}})$, where ${opt}$ is the size of an optimal separator, improving over the previous best bound of $O(\log {opt})$. Likewise, we obtain improved approximation ratios for treewidth: in any graph of treewidth $k$, we show how to find a tree decomposition of width at most $O(k \sqrt{\log k})$, whereas previous algorithms yielded $O(k \log k)$. For graphs excluding a fixed graph as a minor (which includes, e.g., bounded genus graphs), we give a constant-factor approximation for the treewidth. This in turn can be used to obtain polynomial-time approximation schemes for several problems in such graphs.

238 citations


Journal ArticleDOI
TL;DR: It is shown that the CSP dichotomy for digraphs with no sources or sinks agrees with the algebraic characterization conjectured by Bulatov, Jeavons, and Krokhin in 2005.
Abstract: Bang-Jensen and Hell conjectured in 1990 (using the language of graph homomorphisms) a constraint satisfaction problem (CSP) dichotomy for digraphs with no sources or sinks. The conjecture states that the CSP for such a digraph is tractable if each component of its core is a cycle and is $NP$-complete otherwise. In this paper we prove this conjecture and, as a consequence, a conjecture of Bang-Jensen, Hell, and MacGillivray from 1995 classifying hereditarily hard digraphs. Further, we show that the CSP dichotomy for digraphs with no sources or sinks agrees with the algebraic characterization conjectured by Bulatov, Jeavons, and Krokhin in 2005.

209 citations


Journal ArticleDOI
TL;DR: It is shown how to convert the problem of verifying the satisfaction of a circuit by a given assignment to the task of verifying that a given function is close to being a Reed-Solomon codeword, i.e., a univariate polynomial of specified degree.
Abstract: We give constructions of probabilistically checkable proofs (PCPs) of length $n \cdot polylog n$ proving satisfiability of circuits of size $n$ that can be verified by querying $polylog n$ bits of the proof. We also give analogous constructions of locally testable codes (LTCs) mapping $n$ information bits to $n\cdot polylog n$ bit long codewords that are testable with $polylog n$ queries. Our constructions rely on new techniques revolving around properties of codes based on relatively high-degree polynomials in one variable, i.e., Reed-Solomon codes. In contrast, previous constructions of short PCPs, beginning with [L. Babai, L. Fortnow, L. Levin, and M. Szegedy, Checking computations in polylogarithmic time, in Proceedings of the 23rd ACM Symposium on Theory of Computing, ACM, New York, 1991, pp. 21-31] and until the recent [E. Ben-Sasson, O. Goldreich, P. Harsha, M. Sudan, and S. Vadhan, Robust PCPs of proximity, shorter PCPs, and applications to coding, in Proceedings of the 36th ACM Symposium on Theory of Computing, ACM, New York, 2004, pp. 13-15], relied extensively on properties of low-degree polynomials in many variables. We show how to convert the problem of verifying the satisfaction of a circuit by a given assignment to the task of verifying that a given function is close to being a Reed-Solomon codeword, i.e., a univariate polynomial of specified degree. This reduction also gives an alternative to using the “sumcheck protocol” [C. Lund, L. Fortnow, H. Karloff, and N. Nisan, J. ACM, 39 (1992), pp. 859-868]. We then give a new PCP for the special task of proving that a function is close to being a Reed-Solomon codeword. The resulting PCPs are not only shorter than previous ones but also arguably simpler. In fact, our constructions are also more natural in that they yield locally testable codes first, which are then converted to PCPs. In contrast, most recent constructions go in the opposite direction of getting locally testable codes from PCPs.

200 citations


Journal ArticleDOI
TL;DR: This paper presents the Forgetron family of kernel-based online classification algorithms, which overcome the problem that the memory required to store the online hypothesis may grow unboundedly, by restricting themselves to a predefined memory budget.
Abstract: The Perceptron algorithm, despite its simplicity, often performs well in online classification tasks. The Perceptron becomes especially effective when it is used in conjunction with kernel functions. However, a common difficulty encountered when implementing kernel-based online algorithms is the amount of memory required to store the online hypothesis, which may grow unboundedly as the algorithm progresses. Moreover, the running time of each online round grows linearly with the amount of memory used to store the hypothesis. In this paper, we present the Forgetron family of kernel-based online classification algorithms, which overcome this problem by restricting themselves to a predefined memory budget. We obtain different members of this family by modifying the kernel-based Perceptron in various ways. We also prove a unified mistake bound for all of the Forgetron algorithms. To our knowledge, this is the first online kernel-based learning paradigm which, on one hand, maintains a strict limit on the amount of memory it uses and, on the other hand, entertains a relative mistake bound. We conclude with experiments using real datasets, which underscore the merits of our approach.
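
A minimal budgeted kernel Perceptron sketch in Python is given below; it keeps the Perceptron's mistake-driven update but simply evicts the oldest support vector when the budget is exceeded, a cruder rule than the Forgetron's shrink-then-remove update, so it carries no mistake-bound guarantee.

```python
import numpy as np
from collections import deque

class BudgetKernelPerceptron:
    def __init__(self, kernel, budget):
        self.kernel = kernel
        self.budget = budget
        self.support = deque()          # stored (x, y) pairs, oldest first

    def predict(self, x):
        s = sum(y * self.kernel(xi, x) for xi, y in self.support)
        return 1 if s >= 0 else -1

    def update(self, x, y):
        """One online round: predict, and on a mistake add a support vector."""
        mistake = self.predict(x) != y
        if mistake:
            self.support.append((x, y))
            if len(self.support) > self.budget:
                self.support.popleft()  # stay within the memory budget
        return mistake

def rbf(a, b, gamma=0.5):
    return np.exp(-gamma * np.sum((a - b) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    clf = BudgetKernelPerceptron(rbf, budget=50)
    mistakes = 0
    for _ in range(2000):
        x = rng.uniform(-1, 1, size=2)
        y = 1 if x[0] ** 2 + x[1] ** 2 < 0.5 else -1   # nonlinear concept
        mistakes += clf.update(x, y)
    print("online mistakes:", mistakes, "support size:", len(clf.support))
```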

192 citations


Journal ArticleDOI
TL;DR: It is shown that a graph property ${\cal P}$ has an oblivious one-sided error tester if and only if ${\cal P}$ is semihereditary, and it is inferred that some of the most well-studied graph properties, both in graph theory and computer science, are testable with one-sided error.
Abstract: The problem of characterizing all the testable graph properties is considered by many to be the most important open problem in the area of property testing. Our main result in this paper is a solution of an important special case of this general problem: Call a property tester oblivious if its decisions are independent of the size of the input graph. We show that a graph property ${\cal P}$ has an oblivious one-sided error tester if and only if ${\cal P}$ is semihereditary. We stress that any “natural” property that can be tested (either with one-sided or with two-sided error) can be tested by an oblivious tester. In particular, all the testers studied thus far in the literature were oblivious. Our main result can thus be considered as a precise characterization of the natural graph properties, which are testable with one-sided error. One of the main technical contributions of this paper is in showing that any hereditary graph property can be tested with one-sided error. This general result contains as a special case all the previous results about testing graph properties with one-sided error. More importantly, as a special case of our main result, we infer that some of the most well-studied graph properties, both in graph theory and computer science, are testable with one-sided error. Some of these properties are the well-known graph properties of being perfect, chordal, interval, comparability, permutation, and more. None of these properties was previously known to be testable.

Journal ArticleDOI
TL;DR: This paper gives a near-linear expected-time randomized approximation algorithm for finding a disk covering the largest number of red points while avoiding all the blue points, and proves that approximate range counting has roughly the same time and space complexity as answering emptiness range queries.
Abstract: We study the question of finding a deepest point in an arrangement of regions and provide a fast algorithm for this problem using random sampling, showing it sufficient to solve this problem when the deepest point is shallow. This implies, among other results, a fast algorithm for approximately solving linear programming problems with violations. We also use this technique to approximate the disk covering the largest number of red points, while avoiding all the blue points, given two such sets in the plane. Using similar techniques implies that approximate range counting queries have roughly the same time and space complexity as emptiness range queries.

Journal ArticleDOI
TL;DR: This work uses a completely different, and elementary, approach to obtain a deterministic subexponential algorithm for the solution of parity games; the new algorithm is almost as fast as the known randomized subexponential algorithms.
Abstract: The existence of polynomial-time algorithms for the solution of parity games is a major open problem. The fastest known algorithms for the problem are randomized algorithms that run in subexponential time. These algorithms are all ultimately based on the randomized subexponential simplex algorithms of Kalai and of Matousek, Sharir, and Welzl. Randomness seems to play an essential role in these algorithms. We use a completely different, and elementary, approach to obtain a deterministic subexponential algorithm for the solution of parity games. The new algorithm, like the existing randomized subexponential algorithms, uses only polynomial space, and it is almost as fast as the randomized subexponential algorithms mentioned above.

Journal ArticleDOI
TL;DR: This paper constructs coresets and obtains an efficient two-stage sampling-based approximation algorithm for the very overconstrained ($n\gg d$) version of the classical $\ell_p$ regression problem, for all $p\in[1,\infty)$.
Abstract: The $\ell_p$ regression problem takes as input a matrix $A\in\mathbb{R}^{n\times d}$, a vector $b\in\mathbb{R}^n$, and a number $p\in[1,\infty)$, and it returns as output a number ${\cal Z}$ and a vector $x_{\text{{\sc opt}}}\in\mathbb{R}^d$ such that ${\cal Z}=\min_{x\in\mathbb{R}^d}\|Ax-b\|_p=\|Ax_{\text{{\sc opt}}}-b\|_p$. In this paper, we construct coresets and obtain an efficient two-stage sampling-based approximation algorithm for the very overconstrained ($n\gg d$) version of this classical problem, for all $p\in[1, \infty)$. The first stage of our algorithm nonuniformly samples $\hat{r}_1=O(36^p d^{\max\{p/2+1,p\}+1})$ rows of $A$ and the corresponding elements of $b$, and then it solves the $\ell_p$ regression problem on the sample; we prove this is an 8-approximation. The second stage of our algorithm uses the output of the first stage to resample $\hat{r}_1/\epsilon^2$ constraints, and then it solves the $\ell_p$ regression problem on the new sample; we prove this is a $(1+\epsilon)$-approximation. Our algorithm unifies, improves upon, and extends the existing algorithms for special cases of $\ell_p$ regression, namely, $p = 1,2$ [K. L. Clarkson, in Proceedings of the 16th Annual ACM-SIAM Symposium on Discrete Algorithms, ACM, New York, SIAM, Philadelphia, 2005, pp. 257-266; P. Drineas, M. W. Mahoney, and S. Muthukrishnan, in Proceedings of the 17th Annual ACM-SIAM Symposium on Discrete Algorithms, ACM, New York, SIAM, Philadelphia, 2006, pp. 1127-1136]. In the course of proving our result, we develop two concepts—well-conditioned bases and subspace-preserving sampling—that are of independent interest.
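
The sketch below illustrates the sampling idea for the $p=2$ special case only: rows are sampled with probabilities proportional to their leverage scores (row norms of an orthonormal basis, standing in for a well-conditioned basis) and a reweighted least-squares problem is solved on the sample. The two-stage resampling and the general $\ell_p$ machinery of the paper are omitted; sizes below are arbitrary.

```python
import numpy as np

def sampled_least_squares(A, b, sample_size, rng):
    Q, _ = np.linalg.qr(A)                       # orthonormal basis of range(A)
    lev = np.sum(Q ** 2, axis=1)                 # leverage scores (sum to d)
    probs = lev / lev.sum()
    idx = rng.choice(len(b), size=sample_size, replace=True, p=probs)
    # reweight sampled rows by 1/sqrt(s * p_i) so the sampled problem is an
    # unbiased sketch of the full least-squares objective
    w = 1.0 / np.sqrt(sample_size * probs[idx])
    x, *_ = np.linalg.lstsq(A[idx] * w[:, None], b[idx] * w, rcond=None)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 20000, 10                             # very overconstrained: n >> d
    A = rng.normal(size=(n, d))
    b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)
    x_full, *_ = np.linalg.lstsq(A, b, rcond=None)
    x_samp = sampled_least_squares(A, b, sample_size=500, rng=rng)
    full = np.linalg.norm(A @ x_full - b)
    samp = np.linalg.norm(A @ x_samp - b)
    print("relative excess residual:", samp / full - 1.0)
```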

Journal ArticleDOI
Petr Hliněný, Sang-il Oum
TL;DR: A new algorithm is presented that outputs a rank-decomposition of width at most $k$ of a graph if one exists, and runs in time $O(n^3)$, where $n$ is the number of vertices/elements of the input, for each constant value of $k$ and any fixed finite field.
Abstract: We present a new algorithm that can output the rank-decomposition of width at most $k$ of a graph if such exists. For that we use an algorithm that, for an input matroid represented over a fixed finite field, outputs its branch-decomposition of width at most $k$ if such exists. This algorithm works also for partitioned matroids. Both of these algorithms are fixed-parameter tractable, that is, they run in time $O(n^3)$ where $n$ is the number of vertices / elements of the input, for each constant value of $k$ and any fixed finite field. The previous best algorithm for construction of a branch-decomposition or a rank-decomposition of optimal width due to Oum and Seymour [J. Combin. Theory Ser. B, 97 (2007), pp. 385-393] is not fixed-parameter tractable.

Journal ArticleDOI
TL;DR: The results indicate that the fundamental lower bound problems in complexity theory are, in turn, intimately linked with explicit construction problems in algebraic geometry and representation theory.
Abstract: In [K. D. Mulmuley and M. Sohoni, SIAM J. Comput., 31 (2001), pp. 496-526], henceforth referred to as Part I, we suggested an approach to the $P$ vs. $NP$ and related lower bound problems in complexity theory through geometric invariant theory. In particular, it reduces the arithmetic (characteristic zero) version of the $NP \not\subseteq P$ conjecture to the problem of showing that a variety associated with the complexity class $NP$ cannot be embedded in a variety associated with the complexity class $P$. We shall call these class varieties associated with the complexity classes $P$ and $NP$. This paper develops this approach further, reducing these lower bound problems—which are all nonexistence problems—to some existence problems: specifically to proving the existence of obstructions to such embeddings among class varieties. It gives two results towards explicit construction of such obstructions. The first result is a generalization of the Borel-Weil theorem to a class of orbit closures, which include class varieties. The second result is a weaker form of a conjectured analogue of the second fundamental theorem of invariant theory for the class variety associated with the complexity class $NC$. These results indicate that the fundamental lower bound problems in complexity theory are, in turn, intimately linked with explicit construction problems in algebraic geometry and representation theory. The results here were announced in [K. D. Mulmuley and M. Sohoni, in Advances in Algebra and Geometry (Hyderabad, $2001$), Hindustan Book Agency, New Delhi, India, 2003, pp. 239-261].

Journal ArticleDOI
TL;DR: It is proved that the Euclidean traveling salesman problem lies in the counting hierarchy; it is conjectured that using transcendental constants provides no additional power beyond nonuniform reductions to PosSLP, and some preliminary results supporting this conjecture are presented.
Abstract: We study two quite different approaches to understanding the complexity of fundamental problems in numerical analysis: (a) the Blum-Shub-Smale model of computation over the reals; and (b) a problem we call the “generic task of numerical computation,” which captures an aspect of doing numerical computation in floating point, similar to the “long exponent model” that has been studied in the numerical computing community. We show that both of these approaches hinge on the question of understanding the complexity of the following problem, which we call PosSLP: Given a division-free straight-line program producing an integer $N$, decide whether $N>0$. In the Blum-Shub-Smale model, polynomial-time computation over the reals (on discrete inputs) is polynomial-time equivalent to PosSLP when there are only algebraic constants. We conjecture that using transcendental constants provides no additional power, beyond nonuniform reductions to PosSLP, and we present some preliminary results supporting this conjecture. The generic task of numerical computation is also polynomial-time equivalent to PosSLP. We prove that PosSLP lies in the counting hierarchy. Combining this with work of Tiwari, we obtain that the Euclidean traveling salesman problem lies in the counting hierarchy—the previous best upper bound for this important problem (in terms of classical complexity classes) being PSPACE. In the course of developing the context for our results on arithmetic circuits, we present some new observations on the complexity of the arithmetic circuit identity testing (ACIT) problem. In particular, we show that if $n!$ is not ultimately easy, then ACIT has subexponential complexity.
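
To make PosSLP concrete, here is a tiny Python interpreter for division-free straight-line programs that evaluates the output integer exactly and tests positivity. Note that the register values can have exponentially many bits in the program length (repeated squaring doubles the bit length at each step), which is exactly why this brute-force approach does not settle the problem's complexity. The sample program is an arbitrary illustration.

```python
def eval_slp(program):
    """program: list of instructions over registers, with r[0] = 1 initially.
    Each instruction is (op, i, j) with op in {'+', '-', '*'}; it appends
    r[i] op r[j] as a new register. Python ints are unbounded, so the
    evaluation is exact (but may take exponential time/space)."""
    r = [1]
    for op, i, j in program:
        if op == '+':
            r.append(r[i] + r[j])
        elif op == '-':
            r.append(r[i] - r[j])
        elif op == '*':
            r.append(r[i] * r[j])
        else:
            raise ValueError("division-free SLPs allow only +, -, *")
    return r[-1]

def pos_slp(program):
    """PosSLP: does the program output an integer N > 0?"""
    return eval_slp(program) > 0

if __name__ == "__main__":
    # compute 2^(2^4) by repeated squaring, then subtract it from 1
    prog = [('+', 0, 0)] + [('*', k, k) for k in range(1, 5)] + [('-', 0, 5)]
    print(eval_slp(prog), pos_slp(prog))
```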

Journal ArticleDOI
TL;DR: Lower bounds are proved for determining the length of the shortest cycle and other graph properties, and two general techniques are discussed for speeding up the per-edge computation time of streaming algorithms while increasing the space by only a small factor.
Abstract: We explore problems related to computing graph distances in the data-stream model. The goal is to design algorithms that can process the edges of a graph in an arbitrary order given only a limited amount of working memory. We are motivated by both the practical challenge of processing massive graphs such as the web graph and the desire for a better theoretical understanding of the data-stream model. In particular, we are interested in the trade-offs between model parameters such as per-data-item processing time, total space, and the number of passes that may be taken over the stream. These trade-offs are more apparent when considering graph problems than they were in previous streaming work that solved problems of a statistical nature. Our results include the following: (1) Spanner construction: There exists a single-pass, $\tilde{O}(tn^{1+1/t})$-space, $\tilde{O}(t^2n^{1/t})$-time-per-edge algorithm that constructs a $(2t+1)$-spanner. For $t=\Omega(\log n/{\log\log n})$, the algorithm satisfies the semistreaming space restriction of $O(n\operatorname{polylog}n)$ and has per-edge processing time $O(\operatorname{polylog}n)$. This resolves an open question from [J. Feigenbaum et al., Theoret. Comput. Sci., 348 (2005), pp. 207-216]. (2) Breadth-first-search (BFS) trees: For any even constant $k$, we show that any algorithm that computes the first $k$ layers of a BFS tree from a prescribed node with probability at least $2/3$ requires either greater than $k/2$ passes or $\tilde{\Omega}(n^{1+1/k})$ space. Since constructing BFS trees is an important subroutine in many traditional graph algorithms, this demonstrates the need for new algorithmic techniques when processing graphs in the data-stream model. (3) Graph-distance lower bounds: Any $t$-approximation of the distance between two nodes requires $\Omega(n^{1+1/t})$ space. We also prove lower bounds for determining the length of the shortest cycle and other graph properties. (4) Techniques for decreasing per-edge processing: We discuss two general techniques for speeding up the per-edge computation time of streaming algorithms while increasing the space by only a small factor.
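
For context, the following Python sketch builds a $(2t+1)$-spanner from an edge stream with the classical greedy rule (keep an edge only if its endpoints are currently more than $2t+1$ apart in the spanner built so far). It exhibits the stretch guarantee but not the paper's space- and time-efficient streaming construction, since it performs a truncated BFS per edge; the example graph is arbitrary.

```python
from collections import deque, defaultdict

def bounded_distance(adj, s, target, limit):
    """BFS distance from s to target, truncated at depth `limit`."""
    if s == target:
        return 0
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        if dist[u] == limit:
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                if v == target:
                    return dist[v]
                q.append(v)
    return float("inf")

def greedy_spanner(edge_stream, t):
    """Keep edge (u, v) only if the current spanner distance exceeds 2t+1;
    every discarded edge then has a detour of length <= 2t+1, so the kept
    edges form a (2t+1)-spanner."""
    adj = defaultdict(set)
    kept = []
    for u, v in edge_stream:
        if bounded_distance(adj, u, v, 2 * t + 1) > 2 * t + 1:
            adj[u].add(v)
            adj[v].add(u)
            kept.append((u, v))
    return kept

if __name__ == "__main__":
    import itertools, random
    nodes = range(60)
    edges = list(itertools.combinations(nodes, 2))   # a dense graph
    random.Random(0).shuffle(edges)                  # arbitrary arrival order
    spanner = greedy_spanner(edges, t=1)             # 3-spanner
    print(len(edges), "stream edges ->", len(spanner), "spanner edges")
```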

Journal ArticleDOI
TL;DR: It is shown that the inapproximability results extend to envy-free pricing, an important problem in computational economics, and that the (budgeted) unique coverage problem has close connections to other theoretical problems, including max cut, maximum coverage, and radio broadcasting.
Abstract: We prove semilogarithmic inapproximability for a maximization problem called unique coverage: given a collection of sets, find a subcollection that maximizes the number of elements covered exactly once. Specifically, assuming that $\mathrm{NP}\not\subseteq\operatorname{BPTIME}(2^{n^\varepsilon})$ for an arbitrary $\varepsilon>0$, we prove $O(1/\log^{\sigma}n)$ inapproximability for some constant $\sigma=\sigma(\varepsilon)$. We also prove $O(1/\log^{1/3-\varepsilon}n)$ inapproximability for any $\varepsilon>0$, assuming that refuting random instances of 3SAT is hard on average; and we prove $O(1/\log n)$ inapproximability under a plausible hypothesis concerning the hardness of another problem, balanced bipartite independent set. We establish an $\Omega(1/\log n)$-approximation algorithm, even for a more general (budgeted) setting, and obtain an $\Omega(1/\log B)$-approximation algorithm when every set has at most $B$ elements. We also show that our inapproximability results extend to envy-free pricing, an important problem in computational economics. We describe how the (budgeted) unique coverage problem, motivated by real-world applications, has close connections to other theoretical problems, including max cut, maximum coverage, and radio broadcasting.

Journal ArticleDOI
TL;DR: A deterministic oracle with constant query time is described for this problem; it uses $O(n^2\log n)$ space, where $n$ is the number of vertices in $G$, and its construction time is $O(mn^{2} + n^{3}\log n)$.
Abstract: We consider the problem of preprocessing an edge-weighted directed graph $G$ to answer queries that ask for the length and first hop of a shortest path from any given vertex $x$ to any given vertex $y$ avoiding any given vertex or edge. As a natural application, this problem models routing in networks subject to node or link failures. We describe a deterministic oracle with constant query time for this problem that uses $O(n^2\log n)$ space, where $n$ is the number of vertices in $G$. The construction time for our oracle is $O(mn^{2} + n^{3}\log n)$. However, if one is willing to settle for $\Theta (n^{2.5})$ space, we can improve the preprocessing time to $O(mn^{1.5}+n^{2.5}\log n)$ while maintaining the constant query time. Our algorithms can find the shortest path avoiding a failed node or link in time proportional to the length of the path.

Journal ArticleDOI
TL;DR: Any monotone graph property can be tested with one-sided error, and with query complexity depending only on $\epsilon$, which unifies several previous results in the area of property testing and implies the testability of well-studied graph properties that were previously not known to be testable.
Abstract: A graph property is called monotone if it is closed under removal of edges and vertices. Many monotone graph properties are some of the most well-studied properties in graph theory, and the abstract family of all monotone graph properties was also extensively studied. Our main result in this paper is that any monotone graph property can be tested with one-sided error, and with query complexity depending only on $\epsilon$. This result unifies several previous results in the area of property testing and also implies the testability of well-studied graph properties that were previously not known to be testable. At the heart of the proof is an application of a variant of Szemeredi's regularity lemma. The main ideas behind this application may be useful in characterizing all testable graph properties and in generally studying graph property testing. As a byproduct of our techniques we also obtain additional results in graph theory and property testing, which are of independent interest. One of these results is that the query complexity of testing testable graph properties with one-sided error may be arbitrarily large. Another result, which significantly extends previous results in extremal graph theory, is that for any monotone graph property ${\cal P}$, any graph that is $\epsilon$-far from satisfying ${\cal P}$ contains a subgraph of size depending on $\epsilon$ only, which does not satisfy ${\cal P}$. Finally, we prove the following compactness statement: If a graph $G$ is $\epsilon$-far from satisfying a (possibly infinite) set of monotone graph properties ${\cal P}$, then it is at least $\delta_{{\cal P}}(\epsilon)$-far from satisfying one of the properties.

Journal ArticleDOI
TL;DR: It is shown that every weighted connected graph $G$ contains as a subgraph a spanning tree into which the edges of $G$ can be embedded with average stretch $O(\log^{2} n \log \log n)$, and that this tree can be constructed in time $O(m \log n + n \log^2 n)$ in general and in time $O(m \log n)$ if the input graph is unweighted.
Abstract: We show that every weighted connected graph $G$ contains as a subgraph a spanning tree into which the edges of $G$ can be embedded with average stretch $O (\log^{2} n \log \log n)$. Moreover, we show that this tree can be constructed in time $O (m \log n + n \log^2 n)$ in general, and in time $O (m \log n)$ if the input graph is unweighted. The main ingredient in our construction is a novel graph decomposition technique. Our new algorithm can be immediately used to improve the running time of the recent solver for symmetric diagonally dominant linear systems of Spielman and Teng from $m\,2^{O(\sqrt{\log n\log\log n})}$ to $m \log^{O (1)}n$, and to $O ( n \log^{2} n \log \log n)$ when the system is planar. Our result can also be used to improve several earlier approximation algorithms that use low-stretch spanning trees.

Journal ArticleDOI
TL;DR: Several new dynamic algorithms are obtained for maintaining the transitive closure of a directed graph, along with several other algorithms for answering reachability queries without explicitly maintaining a transitive closure matrix.
Abstract: We obtain several new dynamic algorithms for maintaining the transitive closure of a directed graph and several other algorithms for answering reachability queries without explicitly maintaining a transitive closure matrix. Among our algorithms are: (i) A decremental algorithm for maintaining the transitive closure of a directed graph, through an arbitrary sequence of edge deletions, in $O(mn)$ total expected time, essentially the time needed for computing the transitive closure of the initial graph. Such a result was previously known only for acyclic graphs. (ii) Two fully dynamic algorithms for answering reachability queries. The first is deterministic and has an amortized insert/delete time of $O(m\sqrt{n})$, and worst-case query time of $O(\sqrt{n})$. The second is randomized and has an amortized insert/delete time of $O(m^{0.58}n)$ and worst-case query time of $O(m^{0.43})$. This significantly improves the query times of algorithms with similar update times. (iii) A fully dynamic algorithm for maintaining the transitive closure of an acyclic graph. The algorithm is deterministic and has a worst-case insert time of $O(m)$, constant amortized delete time of $O(1)$, and a worst-case query time of $O(n/\log n)$. Our algorithms are obtained by combining several new ideas, one of which is a simple sampling idea used for detecting decompositions of strongly connected components, with techniques of Even and Shiloach [J. ACM, 28 (1981), pp. 1-4], Italiano [Inform. Process. Lett., 28 (1988), pp. 5-11], Henzinger and King [Proceedings of the $36$th Annual Symposium on Foundations of Computer Science, Milwaukee, WI, 1995, pp. 664-672], and Frigioni et al. [ACM J. Exp. Algorithmics, 6 (2001), (electronic)].

Journal ArticleDOI
TL;DR: It is shown that neither resolution nor tree-like resolution is automatizable unless the class W[P] from the hierarchy of parameterized problems is fixed-parameter tractable by randomized algorithms with one-sided error.
Abstract: We show that neither resolution nor tree-like resolution is automatizable unless the class W[P] from the hierarchy of parameterized problems is fixed-parameter tractable by randomized algorithms with one-sided error.

Journal ArticleDOI
TL;DR: This work gives a $\operatorname{poly}(n/\epsilon)$-time algorithm for learning a mixture of $k$ arbitrary product distributions over the $n$-dimensional Boolean cube to accuracy $\epsilon$, and gives evidence that no polynomial-time algorithm can succeed when $k$ is superconstant.
Abstract: We consider the problem of learning mixtures of product distributions over discrete domains in the distribution learning framework introduced by Kearns et al. [Proceedings of the $26$th Annual Symposium on Theory of Computing (STOC), Montreal, QC, 1994, ACM, New York, pp. 273-282]. We give a $\operatorname{poly}(n/\epsilon)$-time algorithm for learning a mixture of $k$ arbitrary product distributions over the $n$-dimensional Boolean cube $\{0,1\}^n$ to accuracy $\epsilon$, for any constant $k$. Previous polynomial-time algorithms could achieve this only for $k = 2$ product distributions; our result answers an open question stated independently in [M. Cryan, Learning and Approximation Algorithms for Problems Motivated by Evolutionary Trees, Ph.D. thesis, University of Warwick, Warwick, UK, 1999] and [Y. Freund and Y. Mansour, Proceedings of the $12$th Annual Conference on Computational Learning Theory, 1999, pp. 183-192]. We further give evidence that no polynomial-time algorithm can succeed when $k$ is superconstant, by reduction from a difficult open problem in PAC (probably approximately correct) learning. Finally, we generalize our $\operatorname{poly}(n/\epsilon)$-time algorithm to learn any mixture of $k = O(1)$ product distributions over $\{0,1, \dots, b-1\}^n$, for any $b = O(1)$.

Journal ArticleDOI
TL;DR: A new type of computationally sound proof system called universal arguments is put forward, which adopts the instance-based prover-efficiency paradigm of CS proofs but follows the computational-soundness condition of argument systems.
Abstract: We put forward a new type of computationally sound proof system called universal arguments. Universal arguments are related but different from both CS proofs (as defined by Micali [SIAM J. Comput., 37 (2000), pp. 1253-1298]) and arguments (as defined by Brassard, Chaum, and Crepeau [J. Comput. System Sci., 37 (1988), pp. 156-189]). In particular, we adopt the instance-based prover-efficiency paradigm of CS proofs but follow the computational-soundness condition of argument systems (i.e., we consider only cheating strategies that are implementable by polynomial-size circuits). We show that universal arguments can be constructed based on standard intractability assumptions that refer to polynomial-size circuits (rather than based on assumptions that refer to subexponential-size circuits as used in the construction of CS proofs). Furthermore, these protocols have a constant number of rounds and are of the public-coin type. As an application of these universal arguments, we weaken the intractability assumptions used in the non-black-box zero-knowledge arguments of Barak [in Proceedings of the 42nd IEEE Symposium on Foundations of Computer Science, 2001]. Specifically, we only utilize intractability assumptions that refer to polynomial-size circuits (rather than assumptions that refer to circuits of some “nice” superpolynomial size).

Journal ArticleDOI
TL;DR: An improved “cooling schedule” for simulated annealing algorithms for combinatorial counting problems is presented, and an improved analysis of the Markov chain underlying the simulated annealing algorithm results in an $O(n^7\log^4 n)$ time algorithm for approximating the permanent of a 0/1 matrix.
Abstract: We present an improved “cooling schedule” for simulated annealing algorithms for combinatorial counting problems. Under our new schedule the rate of cooling accelerates as the temperature decreases. Thus, fewer intermediate temperatures are needed as the simulated annealing algorithm moves from the high temperature (easy region) to the low temperature (difficult region). We present applications of our technique to colorings and the permanent (perfect matchings of bipartite graphs). Moreover, for the permanent, we improve the analysis of the Markov chain underlying the simulated annealing algorithm. This improved analysis, combined with the faster cooling schedule, results in an $O(n^7\log^4{n})$ time algorithm for approximating the permanent of a $0/1$ matrix.
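
The sketch below illustrates annealing-based approximate counting with a schedule whose inverse-temperature increments grow as the temperature drops, in the spirit of an accelerating cooling schedule. The toy Hamiltonian (number of ones in a bit string) admits an exact partition function for checking; the chain lengths, sample sizes, and schedule shape are arbitrary demo choices, not those analyzed in the paper.

```python
import math
import random

def metropolis_sample(n, beta, steps, rng):
    """Single-bit-flip Metropolis chain for pi_beta(x) ~ exp(-beta * H(x)),
    where H(x) is the number of ones in x."""
    x = [rng.randrange(2) for _ in range(n)]
    for _ in range(steps):
        i = rng.randrange(n)
        delta = 1 - 2 * x[i]                # change in H if bit i is flipped
        if delta <= 0 or rng.random() < math.exp(-beta * delta):
            x[i] ^= 1
    return x

def estimate_log_Z(n, betas, samples=200, steps=1000, rng=None):
    """Telescoping product: Z(beta_k) = 2^n * prod_i E_{beta_i}[exp(-(b_{i+1}-b_i) H)]."""
    rng = rng or random.Random(0)
    logZ = n * math.log(2)                  # Z(0) = 2^n
    for b0, b1 in zip(betas, betas[1:]):
        ratios = [math.exp(-(b1 - b0) * sum(metropolis_sample(n, b0, steps, rng)))
                  for _ in range(samples)]
        logZ += math.log(sum(ratios) / samples)
    return logZ

if __name__ == "__main__":
    n, beta_max, m = 12, 2.0, 8
    # accelerating schedule: inverse-temperature increments double each step
    betas = [beta_max * (2 ** i - 1) / (2 ** m - 1) for i in range(m + 1)]
    est = estimate_log_Z(n, betas)
    exact = n * math.log(1.0 + math.exp(-beta_max))
    print("estimated log Z:", round(est, 3), "exact log Z:", round(exact, 3))
```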

Journal ArticleDOI
TL;DR: This paper presents an almost ideal solution to this problem: a hash function $h: U\rightarrow V$ that, on any set of $n$ inputs, behaves like a truly random function with high probability, can be evaluated in constant time on a RAM, and can be stored in $(1+\epsilon)n\log |V| + O(n+\log \log |U|)$ bits.
Abstract: Many algorithms and data structures employing hashing have been analyzed under the uniform hashing assumption, i.e., the assumption that hash functions behave like truly random functions. Starting with the discovery of universal hash functions, many researchers have studied to what extent this theoretical ideal can be realized by hash functions that do not take up too much space and can be evaluated quickly. In this paper we present an almost ideal solution to this problem: a hash function $h: U\rightarrow V$ that, on any set of $n$ inputs, behaves like a truly random function with high probability, can be evaluated in constant time on a RAM and can be stored in $(1+\epsilon)n\log |V| + O(n+\log\log |U|)$ bits. Here $\epsilon$ can be chosen to be any positive constant, so this essentially matches the entropy lower bound. For many hashing schemes this is the first hash function that makes their uniform hashing analysis come true, with high probability, without incurring overhead in time or space.
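
For contrast (this is not the paper's construction), here is simple tabulation hashing in Python, a classic fast family that is only 3-independent; it illustrates the constant-evaluation-time, table-lookup regime the paper works in, whereas the paper's function behaves like a truly random function on any fixed set of $n$ inputs with high probability.

```python
import random

class TabulationHash:
    """Hashes 32-bit keys by XOR-ing four random byte-indexed tables.
    Simple tabulation is 3-independent -- weaker than the full randomness
    guarantee discussed in the paper, but fast and space-efficient."""

    def __init__(self, seed=0, out_bits=32):
        rng = random.Random(seed)
        self.mask = (1 << out_bits) - 1
        self.tables = [[rng.getrandbits(out_bits) for _ in range(256)]
                       for _ in range(4)]

    def __call__(self, key):
        h = 0
        for i in range(4):
            h ^= self.tables[i][(key >> (8 * i)) & 0xFF]
        return h & self.mask

if __name__ == "__main__":
    h = TabulationHash(seed=42)
    for k in random.Random(1).sample(range(1 << 32), 5):
        print(k, "->", h(k))
```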

Journal ArticleDOI
TL;DR: The complexity of computing the partition function of an instance of a weighted Boolean constraint satisfaction problem is studied in this paper, where a dichotomy is shown: computing the sum of the weights of all configurations is $\text{{\sf FP}}^{\text{{\sf\#P}}}$-complete unless every function in $\mathcal{F}$ is of product type or pure affine, in which case it is in P.
Abstract: This paper gives a dichotomy theorem for the complexity of computing the partition function of an instance of a weighted Boolean constraint satisfaction problem. The problem is parameterized by a finite set $\mathcal{F}$ of nonnegative functions that may be used to assign weights to the configurations (feasible solutions) of a problem instance. Classical constraint satisfaction problems correspond to the special case of 0,1-valued functions. We show that computing the partition function, i.e., the sum of the weights of all configurations, is $\text{{\sf FP}}^{\text{{\sf\#P}}}$-complete unless either (1) every function in $\mathcal{F}$ is of “product type,” or (2) every function in $\mathcal{F}$ is “pure affine.” In the remaining cases, computing the partition function is in P.
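
A brute-force Python sketch of the quantity in question, the partition function of a small weighted Boolean CSP instance; the two nonnegative weight functions below are hypothetical members of $\mathcal{F}$ chosen only for illustration.

```python
from itertools import product

def partition_function(num_vars, constraints):
    """constraints: list of (function, variable-index tuple).  The partition
    function sums, over all 0/1 assignments, the product of the constraint
    weights; with 0/1-valued functions this counts satisfying assignments."""
    total = 0.0
    for assignment in product((0, 1), repeat=num_vars):
        weight = 1.0
        for f, scope in constraints:
            weight *= f(*(assignment[i] for i in scope))
        total += weight
    return total

# two nonnegative weight functions (hypothetical members of F)
def soft_eq(x, y):        # prefers x == y
    return 2.0 if x == y else 1.0

def implies(x, y):        # 0/1-valued: a classical constraint
    return 0.0 if (x, y) == (1, 0) else 1.0

if __name__ == "__main__":
    constraints = [(soft_eq, (0, 1)), (soft_eq, (1, 2)), (implies, (0, 2))]
    print("partition function:", partition_function(3, constraints))
```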

Journal ArticleDOI
TL;DR: The results show that consensus can be solved even in the presence of $O(n^2)$ moving omission and/or arbitrary link failures per round, provided that both the number of affected outgoing and incoming links of every process is bounded.
Abstract: We provide a suite of impossibility results and lower bounds for the required number of processes and rounds for synchronous consensus under transient link failures. Our results show that consensus can be solved even in the presence of $O(n^2)$ moving omission and/or arbitrary link failures per round, provided that both the number of affected outgoing and incoming links of every process is bounded. Providing a step further toward the weakest conditions under which consensus is solvable, our findings are applicable to a variety of dynamic phenomena such as transient communication failures and end-to-end delay variations. We also prove that our model surpasses alternative link failure modeling approaches in terms of assumption coverage.