
Showing papers in "Journal of the ACM in 1998"


Journal ArticleDOI
TL;DR: It is proved that (1 − o(1)) ln n is a threshold below which set cover cannot be approximated efficiently, unless NP has slightly superpolynomial time algorithms.
Abstract: Given a collection ℱ of subsets of S = {1,…,n}, set cover is the problem of selecting as few as possible subsets from ℱ such that their union covers S, and max k-cover is the problem of selecting k subsets from ℱ such that their union has maximum cardinality. Both these problems are NP-hard. We prove that (1 − o(1)) ln n is a threshold below which set cover cannot be approximated efficiently, unless NP has slightly superpolynomial time algorithms. This closes the gap (up to low-order terms) between the approximation ratio achievable by the greedy algorithm (which is (1 − o(1)) ln n) and previous results of Lund and Yannakakis, which showed hardness of approximation within a ratio of (log_2 n)/2 ≈ 0.72 ln n. For max k-cover, we show an approximation threshold of (1 − 1/e) (up to low-order terms), under the assumption that P ≠ NP.
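To make the greedy ratio concrete, here is a minimal Python sketch (our own illustration, not code from the paper) of the greedy algorithm whose (1 − o(1)) ln n guarantee this result shows is essentially the best achievable:

```python
def greedy_set_cover(universe, subsets):
    """Repeatedly pick the subset covering the most still-uncovered elements."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Index of the subset with maximum marginal coverage.
        best = max(range(len(subsets)),
                   key=lambda i: len(subsets[i] & uncovered))
        gain = subsets[best] & uncovered
        if not gain:
            raise ValueError("subsets do not cover the universe")
        chosen.append(best)
        uncovered -= gain
    return chosen
```

On {1,…,5} with subsets {1,2,3}, {2,4}, {3,4}, {4,5}, the greedy rule first takes {1,2,3} (three new elements) and then {4,5}, covering everything with two sets.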

2,941 citations


Journal ArticleDOI
TL;DR: In this paper, it was shown that, given an integer k ≥ 1, (1 + ϵ)-approximations to the k nearest neighbors of q can be computed in additional O(kd log n) time.
Abstract: Consider a set S of n data points in real d-dimensional space, R^d, where distances are measured using any Minkowski metric. In nearest neighbor searching, we preprocess S into a data structure so that, given any query point q ∈ R^d, the closest point of S to q can be reported quickly. Given any positive real ϵ, data point p is a (1 + ϵ)-approximate nearest neighbor of q if its distance from q is within a factor of (1 + ϵ) of the distance to the true nearest neighbor. We show that it is possible to preprocess a set of n points in R^d in O(dn log n) time and O(dn) space, so that given a query point q ∈ R^d and ϵ > 0, a (1 + ϵ)-approximate nearest neighbor of q can be computed in O(c_{d,ϵ} log n) time, where c_{d,ϵ} ≤ d⌈1 + 6d/ϵ⌉^d is a factor depending only on the dimension and ϵ. In general, we show that given an integer k ≥ 1, (1 + ϵ)-approximations to the k nearest neighbors of q can be computed in additional O(kd log n) time.
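The (1 + ϵ)-approximation guarantee is easy to state in code. The following brute-force checker (our illustration only; the paper's data structure answers queries far more efficiently) tests the definition directly:

```python
import math

def is_approx_nn(q, p, points, eps):
    """The definition above: p is a (1+eps)-approximate nearest neighbor
    of q iff dist(q, p) <= (1 + eps) * dist(q, true nearest neighbor)."""
    d_true = min(math.dist(q, x) for x in points)
    return math.dist(q, p) <= (1 + eps) * d_true
```

For example, with points (0,0) and (3,0) and query (1,0), the point (3,0) at distance 2 is a (1+ϵ)-approximate nearest neighbor only once ϵ ≥ 1.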

2,813 citations


Journal ArticleDOI
TL;DR: This work describes schemes that enable a user to access k replicated copies of a database and privately retrieve information stored in the database, so that each individual server gets no information on the identity of the item retrieved by the user.
Abstract: Publicly accessible databases are an indispensable resource for retrieving up-to-date information. But they also pose a significant risk to the privacy of the user, since a curious database operator can follow the user's queries and infer what the user is after. Indeed, in cases where the users' intentions are to be kept secret, users are often cautious about accessing the database. It can be shown that when accessing a single database, to completely guarantee the privacy of the user, the whole database should be downloaded; namely, n bits should be communicated (where n is the number of bits in the database). In this work, we investigate whether, by replicating the database, more efficient solutions to the private retrieval problem can be obtained. We describe schemes that enable a user to access k replicated copies of a database (k ≥ 2) and privately retrieve information stored in the database. This means that each individual server (holding a replicated copy of the database) gets no information on the identity of the item retrieved by the user. Our schemes use the replication to gain substantial savings. In particular, we present a two-server scheme with communication complexity O(n^{1/3}).
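To illustrate the privacy notion, here is a toy two-server scheme with O(n) communication — deliberately much weaker than the paper's O(n^{1/3}) construction, and not taken from it. Each server alone sees a uniformly random index set, so neither learns which bit the user wants, yet XORing the two answers recovers the bit:

```python
import random

def xor_of(db, indices):
    """A server's answer: XOR of the requested database bits."""
    ans = 0
    for j in indices:
        ans ^= db[j]
    return ans

def pir_retrieve(db, i, rng=random):
    """Toy 2-server PIR: each server alone sees a uniformly random index
    set (so it learns nothing about i); the two answers XOR to db[i]."""
    n = len(db)
    s1 = {j for j in range(n) if rng.random() < 0.5}
    s2 = s1 ^ {i}  # symmetric difference: differs from s1 exactly at index i
    return xor_of(db, s1) ^ xor_of(db, s2)
```

All bits shared by the two query sets cancel under XOR, leaving exactly db[i]; each query set in isolation is uniformly distributed, which is the privacy property.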

1,918 citations


Journal ArticleDOI
TL;DR: It is proved that no MAX SNP-hard problem has a polynomial time approximation scheme, unless NP = P, and there exists a positive ε such that approximating the maximum clique size in an N-vertex graph to within a factor of N^ε is NP-hard.
Abstract: We show that every language in NP has a probabilistic verifier that checks membership proofs for it using a logarithmic number of random bits and by examining a constant number of bits in the proof. If a string is in the language, then there exists a proof such that the verifier accepts with probability 1 (i.e., for every choice of its random string). For strings not in the language, the verifier rejects every provided “proof” with probability at least 1/2. Our result builds upon and improves a recent result of Arora and Safra [1998], whose verifiers examine a nonconstant number of bits in the proof (though this number is a very slowly growing function of the input length). As a consequence, we prove that no MAX SNP-hard problem has a polynomial time approximation scheme, unless NP = P. The class MAX SNP was defined by Papadimitriou and Yannakakis [1991], and hard problems for this class include vertex cover, maximum satisfiability, maximum cut, metric TSP, Steiner trees, and shortest superstring. We also improve upon the clique hardness results of Feige et al. [1996] and Arora and Safra [1998] and show that there exists a positive ε such that approximating the maximum clique size in an N-vertex graph to within a factor of N^ε is NP-hard.

1,501 citations


Journal ArticleDOI
TL;DR: It is shown that approximating Clique and Independent Set, even in a very weak sense, is NP-hard, and the class NP contains exactly those languages for which membership proofs can be verified probabilistically in polynomial time.
Abstract: We give a new characterization of NP: the class NP contains exactly those languages L for which membership proofs (a proof that an input x is in L) can be verified probabilistically in polynomial time using a logarithmic number of random bits and by reading a sublogarithmic number of bits from the proof. We discuss implications of this characterization; specifically, we show that approximating Clique and Independent Set, even in a very weak sense, is NP-hard.

1,261 citations


Journal ArticleDOI
TL;DR: The previous best approximation algorithm for the problem (due to Christofides) achieves a 3/2-approximation in polynomial time.
Abstract: We present a polynomial time approximation scheme for Euclidean TSP in fixed dimensions. For every fixed c > 1 and given any n nodes in ℛ^2, a randomized version of the scheme finds a (1 + 1/c)-approximation to the optimum traveling salesman tour in O(n (log n)^{O(c)}) time. When the nodes are in ℛ^d, the running time increases to O(n (log n)^{(O(√c))^{d−1}}). For every fixed c, d the running time is n · poly(log n), that is, nearly linear in n. The algorithm can be derandomized, but this increases the running time by a factor O(n^d). The previous best approximation algorithm for the problem (due to Christofides) achieves a 3/2-approximation in polynomial time. We also give similar approximation schemes for some other NP-hard Euclidean problems: Minimum Steiner Tree, k-TSP, and k-MST. (The running times of the algorithms for k-TSP and k-MST involve an additional multiplicative factor k.) The previous best approximation algorithms for all these problems achieved a constant-factor approximation. We also give efficient approximation schemes for Euclidean Min-Cost Matching, a problem that can be solved exactly in polynomial time. All our algorithms also work, with almost no modification, when distance is measured using any geometric norm (such as ℓ_p for p ≥ 1 or other Minkowski norms). They also have simple parallel (i.e., NC) implementations.

1,113 citations


Journal ArticleDOI
TL;DR: This paper considers the question of determining whether a function f has property P or is ε-far from any function with property P, and devise algorithms to test whether the underlying graph has properties such as being bipartite, k-Colorable, or having a clique of density p-Clique with respect to the vertex set.
Abstract: In this paper, we consider the question of determining whether a function f has property P or is ε-far from any function with property P. A property testing algorithm is given a sample of the value of f on instances drawn according to some distribution. In some cases, it is also allowed to query f on instances of its choice. We study this question for different properties and establish some connections to problems in learning theory and approximation. In particular, we focus our attention on testing graph properties. Given access to a graph G in the form of being able to query whether an edge exists or not between a pair of vertices, we devise algorithms to test whether the underlying graph has properties such as being bipartite, k-colorable, or having a p-clique (a clique of density p with respect to the vertex set). Our graph property testing algorithms are probabilistic and make assertions that are correct with high probability, while making a number of queries that is independent of the size of the graph. Moreover, the property testing algorithms can be used to efficiently (i.e., in time linear in the number of vertices) construct partitions of the graph that correspond to the property being tested, if it holds for the input graph.

1,027 citations


Journal ArticleDOI
Michael Kearns1
TL;DR: This paper formalizes a new but related model of learning from statistical queries, and demonstrates the generality of the statistical query model, showing that practically every class learnable in Valiant's model and its variants can also be learned in the new model (and thus can be learned in the presence of noise).
Abstract: In this paper, we study the problem of learning in the presence of classification noise in the probabilistic learning model of Valiant and its variants. In order to identify the class of “robust” learning algorithms in the most general way, we formalize a new but related model of learning from statistical queries. Intuitively, in this model a learning algorithm is forbidden to examine individual examples of the unknown target function, but is given access to an oracle providing estimates of probabilities over the sample space of random examples. One of our main results shows that any class of functions learnable from statistical queries is in fact learnable with classification noise in Valiant's model, with a noise rate approaching the information-theoretic barrier of 1/2. We then demonstrate the generality of the statistical query model, showing that practically every class learnable in Valiant's model and its variants can also be learned in the new model (and thus can be learned in the presence of noise). A notable exception to this statement is the class of parity functions, which we prove is not learnable from statistical queries, and for which no noise-tolerant algorithm is known.
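A statistical query oracle can be simulated from random examples, which is the intuition behind the noise-tolerance result. This sketch uses our own names (`chi`, `draw_example`), and the tolerance-to-sample-size constant is a rough Chernoff-style choice, not the paper's:

```python
import random

def stat_query(chi, draw_example, tol, rng=random):
    """Simulated STAT oracle (sketch): estimate Pr[chi(x, f(x)) = 1] over
    random labeled examples to within additive tolerance ~tol, using a
    rough Chernoff-style sample size of about 4/tol^2."""
    m = max(1, int(4.0 / tol ** 2))
    hits = sum(chi(*draw_example(rng)) for _ in range(m))
    return hits / m
```

The learner only ever sees these aggregate estimates, never individual examples — which is why independent classification noise, averaging out across the sample, can be tolerated.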

662 citations


Journal ArticleDOI
TL;DR: A new approach to the maximum flow problem is introduced, which also improves bounds for the Gomory-Hu tree problem, the parametric flow problem, and the approximate s-t cut problem.
Abstract: We introduce a new approach to the maximum flow problem. This approach is based on assigning arc lengths based on the residual flow value and the residual arc capacities. Our approach leads to an O(min(n^{2/3}, m^{1/2}) · m · log(n^2/m) · log U) time bound for a network with n vertices, m arcs, and integral arc capacities in the range [1, …, U]. This is a fundamental improvement over the previous time bounds. We also improve bounds for the Gomory-Hu tree problem, the parametric flow problem, and the approximate s-t cut problem.

493 citations


Journal ArticleDOI
TL;DR: A duality relationship is established between the value of the optimum solution to the authors' semidefinite program and the Lovász θ-function, and lower bounds are shown on the gap between the optimum of the semidefinite program and the actual chromatic number.
Abstract: We consider the problem of coloring k-colorable graphs with the fewest possible colors. We present a randomized polynomial time algorithm that colors a 3-colorable graph on n vertices with min{O(D^{1/3} log^{1/2} D log n), O(n^{1/4} log^{1/2} n)} colors, where D is the maximum degree of any vertex. Besides giving the best known approximation ratio in terms of n, this marks the first nontrivial approximation result as a function of the maximum degree D. This result can be generalized to k-colorable graphs to obtain a coloring using min{O(D^{1−2/k} log^{1/2} D log n), O(n^{1−3/(k+1)} log^{1/2} n)} colors. Our results are inspired by the recent work of Goemans and Williamson, who used an algorithm for semidefinite optimization problems, which generalize linear programs, to obtain improved approximations for the MAX CUT and MAX 2-SAT problems. An intriguing outcome of our work is a duality relationship established between the value of the optimum solution to our semidefinite program and the Lovász θ-function. We show lower bounds on the gap between the optimum solution of our semidefinite program and the actual chromatic number; by duality this also demonstrates interesting new facts about the θ-function.

492 citations


Journal ArticleDOI
TL;DR: The concept of a generalized tensor product is introduced and a number of lemmas concerning this product are proved, showing that a relatively small number of operations is sufficient in many practical cases of interest in which the automata contain functional and not simply constant transitions.
Abstract: This paper examines numerical issues in computing solutions to networks of stochastic automata. It is well-known that when the matrices that represent the automata contain only constant values, the cost of performing the operation basic to all iterative solution methods, that of matrix-vector multiply, is given by ρ_N = (∏_{i=1}^{N} n_i) × (∑_{i=1}^{N} n_i), where n_i is the number of states in the ith automaton and N is the number of automata in the network. We introduce the concept of a generalized tensor product and prove a number of lemmas concerning this product. The result of these lemmas allows us to show that this relatively small number of operations is sufficient in many practical cases of interest in which the automata contain functional and not simply constant transitions. Furthermore, we show how the automata should be ordered to achieve this.
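The cost comparison is easy to compute. This small helper (our illustration, not from the paper) contrasts ρ_N with the cost of a dense matrix-vector multiply on the flat global state space:

```python
from math import prod

def tensor_mv_cost(sizes):
    """rho_N: cost of one matrix-vector multiply exploiting the tensor
    (Kronecker) structure, vs. the naive cost on the flat state space."""
    n_global = prod(sizes)              # product of automata state counts
    structured = n_global * sum(sizes)  # rho_N = (prod n_i) * (sum n_i)
    flat = n_global * n_global          # dense matrix-vector product
    return structured, flat
```

For five automata with 10 states each, the structured multiply costs 100,000 × 50 = 5 × 10^6 operations versus 10^10 for the flat product — the gap the paper's lemmas preserve even with functional transitions.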

Journal ArticleDOI
TL;DR: The problem faced by a robot that must explore and learn an unknown room with obstacles in it is considered and a competitive algorithm for the case of a polygonal room with a bounded number of obstacles is given.
Abstract: We consider the problem faced by a robot that must explore and learn an unknown room with obstacles in it. We seek algorithms that achieve a bounded ratio of the worst-case distance traversed in order to see all visible points of the environment (thus creating a map), divided by the optimum distance needed to verify the map, if we had it in the beginning. The situation is complicated by the fact that the latter off-line problem (the problem of optimally verifying a map) is NP-hard. Although we show that there is no such “competitive” algorithm for general obstacle courses, we give a competitive algorithm for the case of a polygonal room with a bounded number of obstacles in it. We restrict ourselves to the rectilinear case, where each side of the obstacles and the room is parallel to one of the coordinates, and the robot must also move either parallel or perpendicular to the sides. (In a subsequent paper, we will discuss the extension to polygons of general shapes.) We also discuss the off-line problem for simple rectilinear polygons and find an optimal solution (in the L_1 metric) in polynomial time, in the case where the entry and the exit are different points.

Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of implementing shared objects that tolerate the failure of both processes and base objects from which they are implemented, and show how to implement a shared object of type T which is t-tolerant for responsive and non-responsive failures.
Abstract: Wait-free implementations of shared objects tolerate the failure of processes, but not the failure of base objects from which they are implemented. We consider the problem of implementing shared objects that tolerate the failure of both processes and base objects. We identify two classes of object failures: responsive and nonresponsive. With responsive failures, a faulty object responds to every operation, but its responses may be incorrect. With nonresponsive failures, a faulty object may also “hang” without responding. In each class, we define crash, omission, and arbitrary modes of failure. We show that all responsive failure modes can be tolerated. More precisely, for all responsive failure modes ℱ, object types T, and t ≥ 0, we show how to implement a shared object of type T which is t-tolerant for ℱ. Such an object remains correct and wait-free even if up to t base objects fail according to ℱ. In contrast to responsive failures, we show that even the most benign nonresponsive failure mode cannot be tolerated. We also show that randomization can be used to circumvent this impossibility result. Graceful degradation is a desirable property of fault-tolerant implementations: the implemented object never fails more severely than the base objects it is derived from, even if all the base objects fail. For several failure modes, we show whether this property can be achieved, and, if so, how.

Journal ArticleDOI
TL;DR: All search and update operations have guaranteed expected cost O(log n), but now irrespective of any assumption on the input distribution.
Abstract: In this paper, we present randomized algorithms over binary search trees such that: (a) the insertion of a set of keys, in any fixed order, into an initially empty tree always produces a random binary search tree; (b) the deletion of any key from a random binary search tree results in a random binary search tree; (c) the random choices made by the algorithms are based upon the sizes of the subtrees of the tree; this implies that we can support accesses by rank without additional storage requirements or modification of the data structures; and (d) the cost of any elementary operation, measured as the number of visited nodes, is the same as the expected cost of its standard deterministic counterpart; hence, all search and update operations have guaranteed expected cost O(log n), but now irrespective of any assumption on the input distribution.
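The key idea — an inserted key becomes the root of a size-n subtree with probability 1/(n+1) — can be sketched as follows (a simplified rendering of randomized insertion based on subtree sizes, assuming distinct keys):

```python
import random

def size(t):
    return t.size if t else 0

class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.size = 1 + size(left) + size(right)

def split(t, key):
    """Split tree into subtrees holding keys < key and keys > key."""
    if t is None:
        return None, None
    if key < t.key:
        l, r = split(t.left, key)
        t.left = r
    else:
        l, r = split(t.right, key)
        t.right = l
        l, r, t = t, r, None  # placeholder, reassigned below
        t, l = l, t           # keep t as the surviving subtree root
        t, r = t, r
        t.size = 1 + size(t.left) + size(t.right)
        return t, r
    t.size = 1 + size(t.left) + size(t.right)
    return l, t

def insert(t, key):
    """New key is made the subtree root with probability 1/(size+1),
    which keeps the tree a random BST for any insertion order."""
    n = size(t)
    if random.randrange(n + 1) == 0:
        left, right = split(t, key)
        return Node(key, left, right)
    if key < t.key:
        t.left = insert(t.left, key)
    else:
        t.right = insert(t.right, key)
    t.size = n + 1
    return t
```

Because the random choices depend only on subtree sizes (stored in each node), rank queries come for free and the O(log n) expected cost needs no distributional assumption on the keys.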

Journal ArticleDOI
TL;DR: The conjecture that recursive queries such as parity test and transitive closure cannot be expressed in the relational calculus augmented with polynomial inequality constraints over the reals is settled and a number of collapse results of the following form are established.
Abstract: The expressive power of first-order query languages with several classes of equality and inequality constraints is studied in this paper. We settle the conjecture that recursive queries such as parity test and transitive closure cannot be expressed in the relational calculus augmented with polynomial inequality constraints over the reals. Furthermore, noting that relational queries exhibit several forms of genericity, we establish a number of collapse results of the following form: the class of generic Boolean queries expressible in the relational calculus augmented with a given class of constraints coincides with the class of queries expressible in the relational calculus (with or without an order relation). We prove such results for both the natural and active-domain semantics. As a consequence, the relational calculus augmented with polynomial inequalities expresses the same classes of generic Boolean queries under both the natural and active-domain semantics. In the course of proving these results for the active-domain semantics, we establish Ramsey-type theorems saying that any query involving certain kinds of constraints coincides with a constraint-free query on databases whose elements come from a certain infinite subset of the domain. To prove the collapse results for the natural semantics, we make use of techniques from nonstandard analysis and from the model theory of ordered structures.

Journal ArticleDOI
TL;DR: The results imply a separation based on space complexity for synchronization primitives in randomized computation, and show that this separation differs from that implied by the deterministic “wait-free hierarchy.”
Abstract: The “wait-free hierarchy” provides a classification of multiprocessor synchronization primitives based on the values of n for which there are deterministic wait-free implementations of n-process consensus using instances of these objects and read-write registers. In a randomized wait-free setting, this classification is degenerate, since n-process consensus can be solved using only O(n) read-write registers. In this paper, we propose a classification of synchronization primitives based on the space complexity of randomized solutions to n-process consensus. A historyless object, such as a read-write register, a swap register, or a test&set register, is an object whose state depends only on the last nontrivial operation that was applied to it. We show that, using historyless objects, Ω(√n) object instances are necessary to solve n-process consensus. This lower bound holds even if the objects have unbounded size and the termination requirement is nondeterministic solo termination, a property strictly weaker than randomized wait-freedom. We then use this result to relate the randomized space complexity of basic multiprocessor synchronization primitives such as shared counters, fetch&add registers, and compare&swap registers. Viewed collectively, our results imply that there is a separation based on space complexity for synchronization primitives in randomized computation, and that this separation differs from that implied by the deterministic “wait-free hierarchy.”

Journal ArticleDOI
TL;DR: It is demonstrated that rewrite techniques considerably restrict variable chaining and that further restrictions are possible if the transitive relation under consideration satisfies additional properties, such as symmetry.
Abstract: We propose inference systems for binary relations that satisfy composition laws such as transitivity. Our inference mechanisms are based on standard techniques from term rewriting and represent a refinement of chaining methods as they are used in the context of resolution-type theorem proving. We establish the refutational completeness of these calculi and prove that our methods are compatible with the usual simplification techniques employed in refutational theorem provers, such as subsumption or tautology deletion. Various optimizations of the basic chaining calculus will be discussed for theories with equality and for total orderings. A key to the practicality of chaining methods is the extent to which so-called variable chaining can be avoided. We demonstrate that rewrite techniques considerably restrict variable chaining and that further restrictions are possible if the transitive relation under consideration satisfies additional properties, such as symmetry. But we also show that variable chaining cannot be completely avoided in general.

Journal ArticleDOI
TL;DR: It is proved that if a quick hitting set generator of logarithmic price exists then BPP = P; the new derandomization method is based on a deterministic algorithm that constructs a discrepancy set for a circuit C, and the discrepancy set depends on C.
Abstract: We show that quick hitting set generators can replace quick pseudorandom generators to derandomize any probabilistic two-sided error algorithm. Up to now, quick hitting set generators have been known as the general and uniform derandomization method for probabilistic one-sided error algorithms, while quick pseudorandom generators have been known as the general and uniform method to derandomize probabilistic two-sided error algorithms. Our method is based on a deterministic algorithm that, given a Boolean circuit C and given access to a hitting set generator, constructs a discrepancy set for C. The main novelty is that the discrepancy set depends on C, so the new derandomization method is not uniform (i.e., not oblivious). The algorithm works in time exponential in k(p(n)), where k(·) is the price of the hitting set generator and p(·) is a polynomial function in the size of C. We thus prove that if a quick hitting set generator with logarithmic price exists, then BPP = P.

Journal ArticleDOI
TL;DR: A new approach to analyzing simplicial complexes in Euclidean 3-space is described, using methods from topology to analyze triangulated 3-manifolds and to determine homology groups, and concrete representations of their generators, for a given complex.
Abstract: Recent developments in analyzing molecular structures and representing solid models using simplicial complexes have further enhanced the need for computing structural information about simplicial complexes in R^3. This paper develops basic techniques required to manipulate and analyze structures of complexes in R^3. A new approach to analyze simplicial complexes in Euclidean 3-space R^3 is described. First, methods from topology are used to analyze triangulated 3-manifolds in R^3. Then, it is shown that these methods can, in fact, be applied to arbitrary simplicial complexes in R^3 after (simulating) the process of thickening a complex to a 3-manifold homotopic to it. As a consequence, considerable structural information about the complex can be determined and certain discrete problems solved as well. For example, it is shown how to determine homology groups, as well as concrete representations of their generators, for a given complex in R^3.

Journal ArticleDOI
James Aspnes1
TL;DR: It is shown that, given an adaptive adversary, any t-resilient asynchronous consensus protocol requires Ω(t^2/log^2 t) local coin flips in any model that can be simulated deterministically using atomic registers, which gives the first nontrivial lower bound on the total work required by wait-free consensus.
Abstract: We examine a class of collective coin-flipping games that arises from randomized distributed algorithms with halting failures. In these games, a sequence of local coin flips is generated, which must be combined to form a single global coin flip. An adversary monitors the game and may attempt to bias its outcome by hiding the result of up to t local coin flips. We show that to guarantee at most constant bias, Ω(t^2) local coins are needed, even if (a) the local coins can have arbitrary distributions and ranges, (b) the adversary is required to decide immediately whether to hide or reveal each local coin, and (c) the game can detect which local coins have been hidden. If the adversary is permitted to control the outcome of the coin except for cases whose probability is polynomial in t, Ω(t^2/log^2 t) local coins are needed. Combining this fact with an extended version of the well-known Fischer-Lynch-Paterson impossibility proof of deterministic consensus, we show that given an adaptive adversary, any t-resilient asynchronous consensus protocol requires Ω(t^2/log^2 t) local coin flips in any model that can be simulated deterministically using atomic registers. This gives the first nontrivial lower bound on the total work required by wait-free consensus and is tight to within logarithmic factors.
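The connection between hidden coins and bias can be seen by simulation: hiding on the order of √n of n fair coins already biases a majority vote heavily, consistent with the Ω(t^2) coin requirement. This toy experiment (ours, not a construction from the paper) lets the adversary hide up to t of the coins that came up 1:

```python
import random

def majority_bias(n, t, trials=2000, seed=0):
    """Fraction of games an adversary pushing toward outcome 0 wins against
    simple majority voting, by hiding up to t of the local coins that came
    up 1 (illustrative simulation only)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        ones = sum(rng.randrange(2) for _ in range(n))
        hidden = min(t, ones)            # adversary hides 1-votes
        visible_ones = ones - hidden
        visible = n - hidden
        if visible_ones * 2 <= visible:  # majority of visible coins is 0
            wins += 1
    return wins / trials
```

With no hiding the adversary wins roughly half the games, but hiding 20 of 100 coins (t ≈ 2√n) lets it force outcome 0 almost always.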

Journal ArticleDOI
TL;DR: An object-based data model is developed whose structural part generalizes most of the known complex-object data models: cyclicity is allowed in both its schemas and instances.
Abstract: We demonstrate the power of object identities (oids) as a database query language primitive. We develop an object-based data model, whose structural part generalizes most of the known complex-object data models: cyclicity is allowed in both its schemas and instances. Our main contribution is the operational part of the data model, the query language IQL, which uses oids for three critical purposes: (1) to represent data structures with sharing and cycles, (2) to manipulate sets, and (3) to express any computable database query. IQL can be type checked, can be evaluated bottom-up, and naturally generalizes most popular rule-based languages. The model can also be extended to incorporate type inheritance, without changes to IQL. Finally, we investigate an analogous value-based data model, whose structural part is founded on regular infinite trees and whose operational part is IQL.

Journal ArticleDOI
TL;DR: The problem of whether a regular language given by a neuromaton (or by a Hopfield acceptor) is nonempty is proved to be PSPACE-complete, and the class of Hopfield languages is shown to be closed under union, intersection, concatenation, and complement.
Abstract: A finite automaton—the so-called neuromaton, realized by a finite discrete recurrent neural network working in parallel computation mode—is considered. Both the size of neuromata (i.e., the number of neurons) and their descriptional complexity (i.e., the number of bits in the neuromaton representation) are studied. It is proved that a constant time delay of the neuromaton output does not play a role within a polynomial descriptional complexity. It is shown that any regular language given by a regular expression of length n is recognized by a neuromaton with Θ(n) neurons. Further, it is proved that this network size is, in the worst case, optimal. On the other hand, generally there is not an equivalent polynomial-length regular expression for a given neuromaton. Then, two specialized constructions of neural acceptors of the optimal descriptional complexity Θ(n) for a single n-bit string recognition are described. They both require O(n^{1/2}) neurons and either O(n) connections with constant weights or O(n^{1/2}) edges with weights of size O(2^{√n}). Furthermore, the concept of Hopfield languages is introduced by means of so-called Hopfield neuromata (i.e., neural networks with symmetric weights). It is proved that the class of Hopfield languages is strictly contained in the class of regular languages. The necessary and sufficient so-called Hopfield condition, stating when a regular language is a Hopfield language, is formulated. A construction of a Hopfield neuromaton is presented for a regular language satisfying the Hopfield condition. The class of Hopfield languages is shown to be closed under union, intersection, concatenation, and complement, and it is not closed under iteration. Finally, the problem of whether a regular language given by a neuromaton (or by a Hopfield acceptor) is nonempty is proved to be PSPACE-complete. As a consequence, the same result is achieved for the neuromaton equivalence problem.

Journal ArticleDOI
TL;DR: This paper presents and proves correct a distributed algorithm that implements a sequentially consistent collection of shared read/update objects that is a generalization of one used in the Orca shared object system.
Abstract: This paper presents and proves correct a distributed algorithm that implements a sequentially consistent collection of shared read/update objects. This algorithm is a generalization of one used in the Orca shared object system. The algorithm caches objects in the local memory of processors according to application needs; each read operation accesses a single copy of the object, while each update accesses all copies. The algorithm uses broadcast communication when it sends messages to replicated copies of an object, and it uses point-to-point communication when a message is sent to a single copy, and when a reply is returned. Copies of all objects are kept consistent using a strategy based on sequence numbers for broadcasts. The algorithm is presented in two layers. The lower layer uses the given broadcast and point-to-point communication services, plus sequence numbers, to provide a new communication service called a context multicast channel. The higher layer uses a context multicast channel to manage the object replication in a consistent fashion. Both layers and their combination are described and verified formally, using the I/O automaton model for asynchronous concurrent systems.

Journal ArticleDOI
TL;DR: The construction implies that the complexity class Strong-RP defined by Sipser, equals RP, and gives the first polynomial-time simulation of RP algorithms using the output of any “η-minimally random” source.
Abstract: An (N, M, T)-OR-disperser is a bipartite multigraph G = (V, W, E) with |V| = N and |W| = M, having the following expansion property: any subset of V having at least T vertices has a neighbor set of size at least M/2. For any pair of constants x, l with 1 ≥ x > l ≥ 0, any sufficiently large N, and any T ≥ 2^((log N)^x) and M ≤ 2^((log N)^l), we give an explicit elementary construction of an (N, M, T)-OR-disperser such that the out-degree of any vertex in V is at most polylogarithmic in N. Combining this with known applications of OR-dispersers yields several results. First, our construction implies that the complexity class Strong-RP defined by Sipser equals RP. Second, for any fixed η > 0, we give the first polynomial-time simulation of RP algorithms using the output of any “η-minimally random” source. For any integral R > 0, such a source accepts a single request for an R-bit string and generates the string according to a distribution that assigns probability at most 2^(−Rη) to any string. It is minimally random in the sense that any weaker source is insufficient to do a black-box polynomial-time simulation of RP algorithms.

Journal ArticleDOI
TL;DR: This work analyzes how quickly a set of abstract processes competing for the use of a number of resources can access the given resource using a simple randomized strategy and obtains precise bounds on the performance of both strategies.
Abstract: Consider an on-line scheduling problem in which a set of abstract processes are competing for the use of a number of resources. Further assume that it is either prohibitively expensive or impossible for any two of the processes to directly communicate with one another. If several processes simultaneously attempt to allocate a particular resource (as may be expected to occur, since the processes cannot easily coordinate their allocations), then none succeed. In such a framework, it is a challenge to design efficient contention resolution protocols. Two recently proposed approaches to the problem of PRAM emulation give rise to scheduling problems of the above kind. In one approach, the resources (in this case, the shared memory cells) are duplicated and distributed randomly. We analyze a simple and efficient deterministic algorithm for accessing some subset of the duplicated resources. In the other approach, we analyze how quickly we can access the given (nonduplicated) resource using a simple randomized strategy. We obtain precise bounds on the performance of both strategies. We anticipate that our results will find other applications.
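The flavor of the randomized strategy is easy to simulate: in each round, every still-unserved process picks a resource uniformly at random, and a process succeeds only if no other process picked the same resource. The sketch below is illustrative only (the function name and the exact success rule are my simplifications, not the paper's model).

```python
import random

def contention_rounds(n_procs, n_resources, seed=0):
    """Simulate rounds of randomized contention resolution: each
    unserved process picks a resource uniformly at random; a resource
    grants access only if exactly one process chose it.  Returns the
    number of rounds until all processes have been served."""
    rng = random.Random(seed)
    unserved = n_procs
    rounds = 0
    while unserved > 0:
        rounds += 1
        picks = [rng.randrange(n_resources) for _ in range(unserved)]
        # A pick "wins" iff it is the unique choice of its resource.
        winners = sum(1 for r in set(picks) if picks.count(r) == 1)
        unserved -= winners
    return rounds

print(contention_rounds(16, 16))  # typically a small number of rounds
```

With n processes and n resources, a constant fraction of processes succeeds per round in expectation, so the expected number of rounds is logarithmic in n; the paper's contribution is making such bounds precise.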

Journal ArticleDOI
TL;DR: This paper develops the concepts and notations to verify some properties of a directory-based protocol designed for non-FIFO interconnection networks and compares the verification of the protocol with SSM and with the Stanford Murφ, a verification tool enumerating system states.
Abstract: Directory-based coherence protocols in shared-memory multiprocessors are so complex that verification techniques based on automated procedures are required to establish their correctness. State enumeration approaches are well suited to the verification of cache protocols, but they face the problem of state space explosion, leading to unacceptable verification time and memory consumption even for small system configurations. One way to manage this complexity and make the verification feasible is to map the system model to be verified onto a symbolic state model (SSM). Since the number of symbolic states is considerably less than the number of system states, an exhaustive state search becomes possible, even for large-scale systems and complex protocols. In this paper, we develop the concepts and notations to verify some properties of a directory-based protocol designed for non-FIFO interconnection networks. We compare the verification of the protocol with SSM and with the Stanford Murφ, a verification tool enumerating system states. We show that SSM is much more efficient in terms of verification time and memory consumption and therefore holds the promise of verifying much more complex protocols. A unique feature of SSM is that it verifies protocols for any system size and therefore provides reliable verification results in one run of the tool.
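The explicit state enumeration that tools like Murφ perform is, at its core, a breadth-first search over reachable states with an invariant checked at each one. The sketch below is a generic toy version of that idea (not SSM's symbolic abstraction, and the tiny "protocol" is purely hypothetical).

```python
from collections import deque

def check_invariant(initial, successors, invariant):
    """Explicit-state enumeration: breadth-first search over all states
    reachable from `initial` via `successors`, checking `invariant` at
    each.  Returns (True, number_of_states) if the invariant holds
    everywhere, or (False, counterexample_state) otherwise."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        s = queue.popleft()
        if not invariant(s):
            return False, s
        for t in successors(s):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return True, len(seen)

# Toy "protocol": a counter that cycles through 0, 1, 2 and must never
# reach the (unreachable) bad state 3.
ok, info = check_invariant(0, lambda s: [(s + 1) % 3], lambda s: s != 3)
assert ok and info == 3
```

The state explosion problem is visible even here: the reachable set grows with the product of all components' local states, which is what motivates mapping the model onto a much smaller symbolic state space.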

Journal ArticleDOI
TL;DR: An efficient algorithm for PAC-learning a very general class of geometric concepts over ℛ^d for fixed d, and a statistical query version of the algorithm that can tolerate random classification noise is presented.
Abstract: We present an efficient algorithm for PAC-learning a very general class of geometric concepts over ℛ^d for fixed d. More specifically, let 𝒯 be any set of s halfspaces. Let x = (x_1, …, x_d) be an arbitrary point in ℛ^d. With each t ∈ 𝒯 we associate a Boolean indicator function I_t(x) which is 1 if and only if x is in the halfspace t. The concept class 𝒞_d^s that we study consists of all concepts formed by any Boolean function over I_{t_1}, …, I_{t_s} for t_i ∈ 𝒯. This class is much more general than any geometric concept class known to be PAC-learnable. Our results can be extended easily to learn efficiently any Boolean combination of a polynomial number of concepts selected from any concept class 𝒞 over ℛ^d, given that the VC-dimension of 𝒞 depends only on d and there is a polynomial-time algorithm to determine whether there is a concept from 𝒞 consistent with a given set of labeled examples. We also present a statistical query version of our algorithm that can tolerate random classification noise. Finally, we present a generalization of the standard ε-net result of Haussler and Welzl [1987] and apply it to give an alternative noise-tolerant algorithm for d = 2 based on geometric subdivisions.
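A concept in this class is just a Boolean function applied to halfspace membership tests. The sketch below shows one such concept in d = 2: a triangle expressed as the AND of three halfspace indicators (the particular halfspaces and names are hypothetical, chosen only to illustrate the definition).

```python
def halfspace_indicator(w, b):
    """I_t(x) = 1 iff x lies in the halfspace t = {x : w.x + b >= 0}."""
    return lambda x: 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0

# Example concept in C_2^3: the triangle with vertices (0,0), (1,0),
# (0,1), written as a Boolean AND over three indicator functions.
t1 = halfspace_indicator((1, 0), 0)     # x1 >= 0
t2 = halfspace_indicator((0, 1), 0)     # x2 >= 0
t3 = halfspace_indicator((-1, -1), 1)   # x1 + x2 <= 1
concept = lambda x: t1(x) & t2(x) & t3(x)

assert concept((0.2, 0.3)) == 1   # inside the triangle
assert concept((0.9, 0.9)) == 0   # outside: violates x1 + x2 <= 1
```

Since the Boolean combiner is arbitrary (not just AND), the class also contains non-convex and even disconnected regions, which is what makes it so much more general than previously learnable geometric classes.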

Journal ArticleDOI
TL;DR: It is demonstrated that superfinite queries represent an interesting and nontrivial subclass within the class of all finite queries, and how a decision procedure for super-entailment can be used to enhance tests for query finiteness is shown.
Abstract: A database query is finite if its result consists of a finite set of tuples. For queries formulated as sets of pure Horn rules, the problem of determining finiteness is, in general, undecidable. In this paper, we consider superfiniteness—a stronger kind of finiteness, which applies to Horn queries whose function symbols are replaced by the abstraction of infinite relations with finiteness constraints (abbr., FCs). We show that superfiniteness is not only decidable but also axiomatizable, and the axiomatization yields an effective decision procedure. Although there are finite queries that are not superfinite, we demonstrate that superfinite queries represent an interesting and nontrivial subclass within the class of all finite queries. Then we turn to the issue of inference of finiteness constraints—an important practical problem that is instrumental in deciding whether a query is evaluable by a bottom-up algorithm. Although it is not known whether FC-entailment is decidable for sets of function-free Horn rules, we show that super-entailment, a stronger form of entailment, is decidable. We also show how a decision procedure for super-entailment can be used to enhance tests for query finiteness.

Journal ArticleDOI
TL;DR: Adaptive Gaussian approximation algorithms are given which improve upon the previously known best asymptotic bounds; the result is achieved by blending techniques from integral geometry, computational geometry, and numerical analysis.
Abstract: The following problems, which arise in the computation of electrostatic forces and in the Boundary Element Method, are considered. Given two convex interior-disjoint polyhedra in 3-space endowed with a volume charge density which is a polynomial in the Cartesian coordinates of R^3, compute the Coulomb force acting on them. Given two interior-disjoint polygons in 3-space endowed with a surface charge density which is a polynomial in the Cartesian coordinates of R^3, compute the normal component of the Coulomb force acting on them. For both problems, adaptive Gaussian approximation algorithms are given which, for n Gaussian points, in time O(n), achieve absolute error O(c^(−√n)) for a constant c > 1. This result improves upon the previously known best asymptotic bounds, and is achieved by blending techniques from integral geometry, computational geometry, and numerical analysis. In particular, integral geometry is used in order to represent the forces as integrals whose kernel is free from singularities.
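The building block of such algorithms is Gaussian quadrature applied adaptively: a panel is accepted when halving it barely changes the estimate, otherwise it is split. The sketch below is a generic textbook version of that idea using the 2-point Gauss–Legendre rule (it is not the paper's algorithm, and the tolerance-splitting heuristic is my own simplification).

```python
import math

# 2-point Gauss-Legendre nodes and weights on [-1, 1] (exact for cubics).
NODES = (-1 / math.sqrt(3), 1 / math.sqrt(3))
WEIGHTS = (1.0, 1.0)

def gauss2(f, a, b):
    """2-point Gaussian rule on [a, b]."""
    mid, half = (a + b) / 2, (b - a) / 2
    return half * sum(w * f(mid + half * x) for x, w in zip(NODES, WEIGHTS))

def adaptive_gauss(f, a, b, tol=1e-9):
    """Adaptive subdivision: accept a panel when halving it changes the
    estimate by less than tol, otherwise recurse on both halves with
    the tolerance split between them."""
    whole = gauss2(f, a, b)
    mid = (a + b) / 2
    halves = gauss2(f, a, mid) + gauss2(f, mid, b)
    if abs(whole - halves) < tol:
        return halves
    return adaptive_gauss(f, a, mid, tol / 2) + adaptive_gauss(f, mid, b, tol / 2)

# A smooth 1/r-type integrand, as in Coulomb integrals away from the
# singularity: integral of 1/(1+x) over [0, 1] equals ln 2.
approx = adaptive_gauss(lambda x: 1 / (1 + x), 0.0, 1.0)
assert abs(approx - math.log(2)) < 1e-8
```

The paper's key step is precisely the removal of the singularity from the kernel via integral geometry, after which Gaussian rules like the one above converge exponentially fast in the number of points.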

Journal ArticleDOI
TL;DR: It is shown that, uniformly on all instances of uniform AND/OR trees, the parallel AND/OR tree algorithm achieves an asymptotic linear speedup using a polynomial number of processors in the height of the tree.
Abstract: A class of parallel algorithms for evaluating game trees is presented. These algorithms parallelize a standard sequential algorithm for evaluating AND/OR trees and the α-β pruning procedure for evaluating MIN/MAX trees. It is shown that, uniformly on all instances of uniform AND/OR trees, the parallel AND/OR tree algorithm achieves an asymptotic linear speedup using a polynomial number of processors in the height of the tree. The analysis of linear speedup using more than a linear number of processors is due to J. Harting. A numerical lower bound rigorously establishes a good speedup for uniform AND/OR trees with parameters that are typical in practice. The performance of the parallel α-β algorithm on best-ordered MIN/MAX trees is analyzed.