
Showing papers in "Journal of the ACM in 2001"


Journal ArticleDOI
TL;DR: Optimal inapproximability results, up to an arbitrary ε > 0, are proved for Max-E k-Sat for k ≥ 3, for maximizing the number of satisfied linear equations in an over-determined system of linear equations modulo a prime p, and for Set Splitting.
Abstract: We prove optimal, up to an arbitrary ε > 0, inapproximability results for Max-E k-Sat for k ≥ 3, for maximizing the number of satisfied linear equations in an over-determined system of linear equations modulo a prime p, and for Set Splitting. As a consequence of these results we get improved lower bounds for the efficient approximability of many optimization problems studied previously, in particular Max-E2-Sat, Max-Cut, Max-di-Cut, and Vertex Cover.
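For context (this tightness argument is standard and not spelled out in the abstract): a uniformly random assignment already satisfies a 7/8 fraction of the clauses of any Max-E3-Sat instance in expectation, which is exactly the threshold the hardness result shows cannot be beaten. The one-line calculation:

```latex
\Pr[\ell_1 \vee \ell_2 \vee \ell_3 \text{ falsified}]
  = \Pr[\ell_1 = \ell_2 = \ell_3 = \text{false}]
  = \left(\tfrac{1}{2}\right)^{3} = \tfrac{1}{8},
\qquad
\mathbb{E}[\text{satisfied clauses}] = \tfrac{7}{8}\,m .
```

So no polynomial-time algorithm can guarantee a 7/8 + ε fraction for any fixed ε > 0 unless P = NP, while 7/8 is attainable in expectation by the trivial random assignment.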

1,938 citations


Journal ArticleDOI
TL;DR: A new extension of the primal-dual schema and the use of Lagrangian relaxation to derive approximation algorithms for the metric uncapacitated facility location problem and the metric k-median problem achieving guarantees of 3 and 6 respectively.
Abstract: We present approximation algorithms for the metric uncapacitated facility location problem and the metric k-median problem achieving guarantees of 3 and 6 respectively. The distinguishing feature of our algorithms is their low running time: O(m log m) and O(m log m (L + log n)) respectively, where n and m are the total number of vertices and edges in the underlying complete bipartite graph on cities and facilities. The main algorithmic ideas are a new extension of the primal-dual schema and the use of Lagrangian relaxation to derive approximation algorithms.

872 citations


Journal ArticleDOI
TL;DR: In this article, basic techniques to prove the unconditional security of quantum cryptography are applied to a quantum key distribution protocol proposed by Bennett and Brassard [1984], and a practical variation on the protocol is considered in which the channel is noisy and photons may be lost during the transmission.
Abstract: Basic techniques to prove the unconditional security of quantum cryptography are described. They are applied to a quantum key distribution protocol proposed by Bennett and Brassard [1984]. The proof considers a practical variation on the protocol in which the channel is noisy and photons may be lost during the transmission. Each individual signal sent into the channel must contain a single photon or any two-dimensional system in the exact state described in the protocol. No restriction is imposed on the detector used at the receiving side of the channel, except that whether or not the received system is detected must be independent of the basis used to measure this system.

858 citations


Journal ArticleDOI
TL;DR: This paper presents a combinatorial polynomial-time algorithm for minimizing submodular functions, answering an open question posed in 1981 by Grötschel, Lovász, and Schrijver.
Abstract: This paper presents a combinatorial polynomial-time algorithm for minimizing submodular functions, answering an open question posed in 1981 by Grötschel, Lovász, and Schrijver. The algorithm employs a scaling scheme that uses a flow in the complete directed graph on the underlying set with each arc capacity equal to the scaled parameter. The resulting algorithm runs in time bounded by a polynomial in the size of the underlying set and the length of the largest absolute function value. The paper also presents a strongly polynomial version in which the number of steps is bounded by a polynomial in the size of the underlying set, independent of the function values.
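To make the object being minimized concrete, here is a small illustration of the problem itself (our own toy example, not the paper's scaling algorithm, which avoids any exponential enumeration): a directed cut function minus a modular term is submodular, and on a tiny ground set its minimizer can be checked by brute force.

```python
from itertools import combinations

# Brute-force submodular minimization on a tiny ground set, for illustration only.
# f(S) = capacity of arcs leaving S minus a modular term; cut functions are a
# standard submodular example, and subtracting a modular function preserves
# submodularity while making the minimizer nontrivial.

def make_f(arcs, capacity, modular):
    def f(S):
        cut = sum(capacity[a] for a in arcs if a[0] in S and a[1] not in S)
        return cut - sum(modular[v] for v in S)
    return f

def brute_force_minimize(f, ground):
    best_set, best_val = set(), f(set())
    for r in range(1, len(ground) + 1):
        for subset in combinations(sorted(ground), r):
            val = f(set(subset))
            if val < best_val:
                best_set, best_val = set(subset), val
    return best_set, best_val

if __name__ == "__main__":
    ground = {1, 2, 3, 4}
    arcs = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]
    capacity = {a: 2.0 for a in arcs}
    modular = {1: 1.0, 2: 3.0, 3: -1.0, 4: -2.0}
    f = make_f(arcs, capacity, modular)
    print(brute_force_minimize(f, ground))   # a minimizer of cut(S) - modular(S)
```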

651 citations


Journal ArticleDOI
TL;DR: This work examines the number of queries to input variables that a quantum algorithm requires to compute Boolean functions on {0,1}^N in the black-box model and gives asymptotically tight characterizations of T for all symmetric f in the exact, zero-error, and bounded-error settings.
Abstract: We examine the number of queries to input variables that a quantum algorithm requires to compute Boolean functions on {0,1}^N in the black-box model. We show that the exponential quantum speed-up obtained for partial functions (i.e., problems involving a promise on the input) by Deutsch and Jozsa, Simon, and Shor cannot be obtained for any total function: if a quantum algorithm computes some total Boolean function f with small error probability using T black-box queries, then there is a classical deterministic algorithm that computes f exactly with O(T^6) queries. We also give asymptotically tight characterizations of T for all symmetric f in the exact, zero-error, and bounded-error settings. Finally, we give new precise bounds for AND, OR, and PARITY. Our results are a quantum extension of the so-called polynomial method, which has been successfully applied in classical complexity theory, and also a quantum extension of results by Nisan about a polynomial relationship between randomized and deterministic decision tree complexity.

590 citations


Journal ArticleDOI
TL;DR: This paper relates proof width to proof length (=size), in both general Resolution and its tree-like variant, and presents a family of tautologies on which the resulting width-based search procedure is exponentially faster.
Abstract: The width of a Resolution proof is defined to be the maximal number of literals in any clause of the proof. In this paper, we relate proof width to proof length (=size), in both general Resolution and its tree-like variant. The following consequences of these relations reveal width as a crucial "resource" of Resolution proofs. In one direction, the relations allow us to give simple, unified proofs for almost all known exponential lower bounds on size of resolution proofs, as well as several interesting new ones. They all follow from width lower bounds, and we show how these follow from a natural expansion property of clauses of the input tautology. In the other direction, the width-size relations naturally suggest a simple dynamic programming procedure for automated theorem proving, one which simply searches for small-width proofs. This relation guarantees that the running time (and thus the size of the produced proof) is at most quasi-polynomial in the smallest tree-like proof. This algorithm is never much worse than any of the recursive automated provers (such as DLL) used in practice. In contrast, we present a family of tautologies on which it is exponentially faster.
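A minimal sketch of the width-restricted search that the size-width relation motivates (our own naive rendering, not the paper's exact procedure): keep deriving resolvents with at most w literals until either the empty clause appears or nothing new can be added.

```python
# Clauses are frozensets of nonzero ints; -v denotes the negation of variable v.

def resolve(c1, c2, pivot):
    """Resolvent of c1 (containing pivot) and c2 (containing -pivot)."""
    return frozenset((c1 - {pivot}) | (c2 - {-pivot}))

def bounded_width_refute(clauses, w):
    """Return True iff a Resolution refutation using only clauses of width <= w exists."""
    derived = set(clauses)
    changed = True
    while changed:
        changed = False
        snapshot = list(derived)
        for i, c1 in enumerate(snapshot):
            for c2 in snapshot[i + 1:]:
                for pivot in c1:
                    if -pivot in c2:
                        r = resolve(c1, c2, pivot)
                        if len(r) <= w and r not in derived:
                            if not r:                       # empty clause: refutation found
                                return True
                            if any(-lit in r for lit in r): # skip tautological resolvents
                                continue
                            derived.add(r)
                            changed = True
    return False  # no width-w refutation exists

if __name__ == "__main__":
    # x1, (not x1 or x2), (not x2) is unsatisfiable; a width-2 refutation exists.
    cnf = {frozenset({1}), frozenset({-1, 2}), frozenset({-2})}
    print(bounded_width_refute(cnf, w=2))   # True
```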

560 citations


Journal ArticleDOI
TL;DR: Deterministic fully dynamic graph algorithms are presented for connectivity, minimum spanning tree, 2-edge connectivity, and biconnectivity.
Abstract: Deterministic fully dynamic graph algorithms are presented for connectivity, minimum spanning tree, 2-edge connectivity, and biconnectivity. Assuming that we start with no edges in a graph with n vertices, the amortized operation costs are O(log^2 n) for connectivity, O(log^4 n) for minimum spanning forest and 2-edge connectivity, and O(log^5 n) for biconnectivity.

501 citations


Journal ArticleDOI
TL;DR: A general framework for solving resource allocation and scheduling problems is presented; given a resource of fixed size, the algorithms approximate the maximum throughput or the minimum loss by a constant factor.
Abstract: We present a general framework for solving resource allocation and scheduling problems. Given a resource of fixed size, we present algorithms that approximate the maximum throughput or the minimum loss by a constant factor. Our approximation factors apply to many problems, among which are: (i) real-time scheduling of jobs on parallel machines, (ii) bandwidth allocation for sessions between two endpoints, (iii) general caching, (iv) dynamic storage allocation, and (v) bandwidth allocation on optical line and ring topologies. For some of these problems we provide the first constant factor approximation algorithm. Our algorithms are simple and efficient and are based on the local-ratio technique. We note that they can equivalently be interpreted within the primal-dual schema.
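As a hedged illustration of the local-ratio technique on its simplest special case (our own example, not code from the paper): selecting a maximum-weight set of non-overlapping intervals on one machine. Subtract the weight of the earliest-finishing interval from everything overlapping it, recurse on the residual weights, and add the interval back if it stays feasible; on this special case the decomposition argument even yields the optimum, and the paper uses the same schema to get constant factors for the harder variants listed above.

```python
# Local ratio on weighted interval selection.  Intervals are half-open
# (start, end, weight); the function returns the chosen (start, end) pairs.

def overlaps(a, b):
    return a[0] < b[1] and b[0] < a[1]

def local_ratio_select(jobs):
    jobs = [j for j in jobs if j[2] > 0]        # drop jobs with nonpositive weight
    if not jobs:
        return []
    pivot = min(jobs, key=lambda j: j[1])       # earliest-finishing interval
    # Subtract its weight from every interval that overlaps it (itself included).
    residual = [(s, e, w - pivot[2]) if overlaps((s, e), pivot) else (s, e, w)
                for (s, e, w) in jobs]
    chosen = local_ratio_select(residual)
    # Add the pivot back if it does not conflict with what the recursion chose.
    if all(not overlaps((pivot[0], pivot[1]), c) for c in chosen):
        chosen = chosen + [(pivot[0], pivot[1])]
    return chosen

if __name__ == "__main__":
    jobs = [(0, 3, 4), (2, 5, 6), (4, 7, 5), (6, 9, 2)]
    print(local_ratio_select(jobs))   # [(4, 7), (0, 3)]: total weight 9, the optimum here
```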

438 citations


Journal Article
TL;DR: It is shown that, using the simpler Nisan-Wigderson generator and standard error-correcting codes, one can build even better extractors with the additional advantage that both the construction and the analysis are simple and admit a short self-contained description.
Abstract: We introduce a new approach to constructing extractors. Extractors are algorithms that transform a “weakly random” distribution into an almost uniform distribution. Explicit constructions of extractors have a variety of important applications, and tend to be very difficult to obtain. We demonstrate an unsuspected connection between extractors and pseudorandom generators. In fact, we show that every pseudorandom generator of a certain kind is an extractor. A pseudorandom generator construction due to Impagliazzo and Wigderson, once reinterpreted via our connection, is already an extractor that beats most known constructions and solves an important open question. We also show that, using the simpler Nisan-Wigderson generator and standard error-correcting codes, one can build even better extractors with the additional advantage that both the construction and the analysis are simple and admit a short self-contained description.

394 citations


Journal ArticleDOI
TL;DR: Efficient techniques are described for a number of parties to jointly generate an RSA key; each party holds a share of the private exponent that enables threshold decryption.
Abstract: We describe efficient techniques for a number of parties to jointly generate an RSA key. At the end of the protocol an RSA modulus N = pq is publicly known. None of the parties know the factorization of N. In addition a public encryption exponent is publicly known and each party holds a share of the private exponent that enables threshold decryption. Our protocols are efficient in computation and communication. All results are presented in the honest but curious scenario (passive adversary).
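A hedged sketch of the threshold-decryption step only (the paper's actual contribution is generating N and the shares without any party ever learning the factorization; here a trusted dealer builds a toy key just to show how additive shares of the private exponent combine). The parameters and the naive sharing below are ours, illustrative, and insecure.

```python
import random

# Each party holds an additive share d_i of the private exponent d, computes
# c^{d_i} mod N on its own, and the partial results are multiplied together.

def toy_rsa_key():
    p, q, e = 1009, 1013, 65537
    n, phi = p * q, (p - 1) * (q - 1)
    d = pow(e, -1, phi)                      # modular inverse; requires Python 3.8+
    return n, e, d

def additive_shares(d, parties):
    shares, remaining = [], d
    for _ in range(parties - 1):
        s = random.randrange(remaining + 1)  # nonnegative shares summing to d
        shares.append(s)
        remaining -= s
    shares.append(remaining)
    return shares

def threshold_decrypt(ciphertext, shares, n):
    result = 1
    for s in shares:                         # each factor is one party's contribution
        result = (result * pow(ciphertext, s, n)) % n
    return result                            # equals ciphertext^d mod n

if __name__ == "__main__":
    n, e, d = toy_rsa_key()
    message = 123456
    ciphertext = pow(message, e, n)
    shares = additive_shares(d, parties=3)
    print(threshold_decrypt(ciphertext, shares, n) == message)   # True
```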

393 citations


Journal ArticleDOI
TL;DR: In this article, it was shown that the compression ratio of two Burrows-Wheeler-based compression algorithms can be bounded in terms of the kth order empirical entropy of the input string for any k ≥ 0.
Abstract: The Burrows-Wheeler Transform (also known as Block-Sorting) is at the base of compression algorithms that are the state of the art in lossless data compression. In this paper, we analyze two algorithms that use this technique. The first one is the original algorithm described by Burrows and Wheeler, which, despite its simplicity, outperforms the Gzip compressor. The second one uses an additional run-length encoding step to improve compression. We prove that the compression ratio of both algorithms can be bounded in terms of the kth order empirical entropy of the input string for any k ≥ 0. We make no assumptions on the input, and we obtain bounds which hold in the worst case, that is, for every possible input string. All previous results for Block-Sorting algorithms were concerned with the average compression ratio and have been established assuming that the input comes from a finite-order Markov source.
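A hedged sketch of the Block-Sorting pipeline analyzed here (our own toy rendering; real compressors follow these stages with an entropy coder): Burrows-Wheeler transform, then move-to-front coding, then run-length coding of the zero runs. It only illustrates why local repetitiveness in the input turns into long, cheaply codable runs.

```python
def bwt(s, sentinel="\x00"):
    s += sentinel                                    # unique end-of-string marker
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)

def move_to_front(s):
    alphabet = sorted(set(s))
    codes = []
    for ch in s:
        idx = alphabet.index(ch)
        codes.append(idx)
        alphabet.insert(0, alphabet.pop(idx))        # move the symbol to the front
    return codes

def run_length_zeros(codes):
    out, run = [], 0
    for c in codes:
        if c == 0:
            run += 1
            continue
        if run:
            out.append(("zero-run", run))
            run = 0
        out.append(c)
    if run:
        out.append(("zero-run", run))
    return out

if __name__ == "__main__":
    text = "abracadabra abracadabra"
    print(run_length_zeros(move_to_front(bwt(text))))
```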

Journal ArticleDOI
TL;DR: It is shown that DNNF is universal; supports a rich set of polynomial-time logical operations; is more space-efficient than OBDDs; and is very simple as far as its structure and algorithms are concerned.
Abstract: Knowledge compilation has been emerging recently as a new direction of research for dealing with the computational intractability of general propositional reasoning. According to this approach, the reasoning process is split into two phases: an off-line compilation phase and an on-line query-answering phase. In the off-line phase, the propositional theory is compiled into some target language, which is typically a tractable one. In the on-line phase, the compiled target is used to efficiently answer a (potentially) exponential number of queries. The main motivation behind knowledge compilation is to push as much of the computational overhead as possible into the off-line phase, in order to amortize that overhead over all on-line queries. Another motivation behind compilation is to produce very simple on-line reasoning systems, which can be embedded cost-effectively into primitive computational platforms, such as those found in consumer electronics. One of the key aspects of any compilation approach is the target language into which the propositional theory is compiled. Previous target languages included Horn theories, prime implicates/implicants and ordered binary decision diagrams (OBDDs). We propose in this paper a new target compilation language, known as decomposable negation normal form (DNNF), and present a number of its properties that make it of interest to the broad community. Specifically, we show that DNNF is universal; supports a rich set of polynomial-time logical operations; is more space-efficient than OBDDs; and is very simple as far as its structure and algorithms are concerned. Moreover, we present an algorithm for converting any propositional theory in clausal form into a DNNF and show that if the clausal form has a bounded treewidth, then its DNNF compilation has a linear size and can be computed in linear time (treewidth is a graph-theoretic parameter that measures the connectivity of the clausal form). We also propose two techniques for approximating the DNNF compilation of a theory when the size of such compilation is too large to be practical. One of the techniques generates a sound but incomplete compilation, while the other generates a complete but unsound compilation. Together, these approximations bound the exact compilation from below and above in terms of their ability to answer clausal entailment queries. Finally, we show that the class of polynomial-time DNNF operations is rich enough to support relatively complex AI applications, by proposing a specific framework for compiling model-based diagnosis systems.
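A hedged sketch of one of the polynomial-time DNNF operations mentioned above, clausal entailment, in a deliberately tiny encoding of our own (nodes are plain tuples, not the paper's data structure): condition the DNNF on the negation of the clause and test satisfiability, which is where decomposability does the work.

```python
# Nodes: ("lit", v) with -v for negation, ("and", [children]), ("or", [children]).

def condition(node, term):
    """Replace literals fixed by `term` (a set of literals) with constants."""
    kind = node[0]
    if kind == "lit":
        lit = node[1]
        if lit in term:
            return ("true",)
        if -lit in term:
            return ("false",)
        return node
    if kind in ("and", "or"):
        return (kind, [condition(c, term) for c in node[1]])
    return node

def satisfiable(node):
    kind = node[0]
    if kind in ("true", "lit"):
        return True
    if kind == "false":
        return False
    if kind == "and":                 # sound only because conjuncts share no variables
        return all(satisfiable(c) for c in node[1])
    return any(satisfiable(c) for c in node[1])     # "or"

def entails_clause(dnnf, clause):
    """dnnf |= clause  iff  dnnf AND (negated clause) is unsatisfiable."""
    negated = {-lit for lit in clause}
    return not satisfiable(condition(dnnf, negated))

if __name__ == "__main__":
    # (x1 AND x2) OR (NOT x1 AND x3): decomposable at every AND node.
    dnnf = ("or", [("and", [("lit", 1), ("lit", 2)]),
                   ("and", [("lit", -1), ("lit", 3)])])
    print(entails_clause(dnnf, {2, 3}))   # True: every model satisfies x2 or x3
    print(entails_clause(dnnf, {2}))      # False
```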

Journal ArticleDOI
TL;DR: It is shown that the IEEE standard's specification of operations involving the signed infinities, signed zeros, and the exact/inexact flag is such as to make a correct and optimal implementation more efficient.
Abstract: We start with a mathematical definition of a real interval as a closed, connected set of reals. Interval arithmetic operations (addition, subtraction, multiplication, and division) are likewise defined mathematically and we provide algorithms for computing these operations assuming exact real arithmetic. Next, we define interval arithmetic operations on intervals with IEEE 754 floating point endpoints to be sound and optimal approximations of the real interval operations and we show that the IEEE standard's specification of operations involving the signed infinities, signed zeros, and the exact/inexact flag is such as to make a correct and optimal implementation more efficient. From the resulting theorems, we derive data that are sufficiently detailed to convert directly to a program for efficiently implementing the interval operations. Finally, we extend these results to the case of general intervals, which are defined as connected sets of reals that are not necessarily closed.
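A hedged sketch of sound (but not optimal) interval addition and multiplication with floating-point endpoints, in our own approximation of the idea: Python does not expose the IEEE directed-rounding modes the paper relies on, so each endpoint is widened outward by one ulp with math.nextafter (Python 3.9+), which is conservative but looser than rounding toward -∞/+∞.

```python
import math

def round_down(x):
    return math.nextafter(x, -math.inf)   # widen the lower endpoint outward

def round_up(x):
    return math.nextafter(x, math.inf)    # widen the upper endpoint outward

def interval_add(a, b):
    (al, ah), (bl, bh) = a, b
    return (round_down(al + bl), round_up(ah + bh))

def interval_mul(a, b):
    (al, ah), (bl, bh) = a, b
    products = [al * bl, al * bh, ah * bl, ah * bh]
    return (round_down(min(products)), round_up(max(products)))

if __name__ == "__main__":
    x = (0.1, 0.2)              # the float endpoints already only approximate 1/10 and 2/10
    y = (1.0, 3.0)
    print(interval_add(x, y))   # encloses [1.1, 3.2]
    print(interval_mul(x, y))   # encloses [0.1, 0.6]
```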

Journal ArticleDOI
TL;DR: An adversarial theory of queuing is developed aimed at addressing some of the restrictions inherent in probabilistic analysis and queuing theory based on time-invariant stochastic generation.
Abstract: We consider packet routing when packets are injected continuously into a network. We develop an adversarial theory of queuing aimed at addressing some of the restrictions inherent in probabilistic analysis and queuing theory based on time-invariant stochastic generation. We examine the stability of queuing networks and policies when the arrival process is adversarial, and provide some preliminary results in this direction. Our approach sheds light on various queuing policies in simple networks, and paves the way for a systematic study of queuing with few or no probabilistic assumptions.

Journal ArticleDOI
TL;DR: It is shown that a type system based on the intuitionistic modal logic S4 provides an expressive framework for specifying and analyzing computation stages in the context of typed λ-calculi and functional languages.
Abstract: We show that a type system based on the intuitionistic modal logic S4 provides an expressive framework for specifying and analyzing computation stages in the context of typed λ-calculi and functional languages. We directly demonstrate the sense in which our λ→□-calculus captures staging, and also give a conservative embedding of Nielson and Nielson's two-level functional language in our functional language Mini-ML□, thus proving that binding-time correctness is equivalent to modal correctness on this fragment. In addition, Mini-ML□ can also express immediate evaluation and sharing of code across multiple stages, thus supporting run-time code generation as well as partial evaluation.
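The modal calculus itself needs a typed setting, but the staging idea it types, generating a piece of code once and reusing it across later stages, can be illustrated in any language. Below is the classic power example with Python closures standing in for quoted code; this is an illustration of run-time specialization only, not of the λ→□ type system.

```python
def gen_power(n):
    """Stage one: given the exponent now, emit code for the base later."""
    if n == 0:
        return lambda x: 1
    rest = gen_power(n - 1)          # generated once, shared by every later call
    return lambda x: x * rest(x)

if __name__ == "__main__":
    cube = gen_power(3)              # specialization happens here, at run time
    print(cube(2), cube(5))          # 8 125: stage two reuses the generated code
```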

Journal ArticleDOI
TL;DR: It is shown that for each property φ of structures that is definable in first-order logic and for each locally tree-decomposable class C of structures, there is a linear time algorithm deciding whether a given structure A ∈ C has property φ.
Abstract: We introduce the concept of a class of graphs, or more generally, relational structures, being locally tree-decomposable. There are numerous examples of locally tree-decomposable classes, among them the class of planar graphs and all classes of bounded valence or of bounded tree-width. We also consider a slightly more general concept of a class of structures having bounded local tree-width. We show that for each property φ of structures that is definable in first-order logic and for each locally tree-decomposable class C of structures, there is a linear time algorithm deciding whether a given structure A ∈ C has property φ. For classes C of bounded local tree-width, we show that for every k ≥ 1 there is an algorithm solving the same problem in time O(n^(1+1/k)) (where n is the cardinality of the input structure).

Journal ArticleDOI
TL;DR: This paper analyzes the behavior of packet-switched communication networks in which packets arrive dynamically at the nodes and are routed in discrete time steps across the edges, and provides the first examples of a protocol that is stable for all networks, and a protocol that is not stable for all networks.
Abstract: In this paper, we analyze the behavior of packet-switched communication networks in which packets arrive dynamically at the nodes and are routed in discrete time steps across the edges. We focus on a basic adversarial model of packet arrival and path determination for which the time-averaged arrival rate of packets requiring the use of any edge is limited to be less than 1. This model can reflect the behavior of connection-oriented networks with transient connections (such as ATM networks) as well as connectionless networks (such as the Internet). We concentrate on greedy (also known as work-conserving) contention-resolution protocols. A crucial issue that arises in such a setting is that of stability: will the number of packets in the system remain bounded, as the system runs for an arbitrarily long period of time? We study the universal stability of networks (i.e., stability under all greedy protocols) and universal stability of protocols (i.e., stability in all networks). Once the stability of a system is granted, we focus on the two main parameters that characterize its performance: maximum queue size required and maximum end-to-end delay experienced by any packet. Among other things, we show: (i) There exist simple greedy protocols that are stable for all networks. (ii) There exist other commonly used protocols (such as FIFO) and networks (such as arrays and hypercubes) that are not stable. (iii) The n-node ring is stable for all greedy routing protocols (with maximum queue-size and packet delay that is linear in n). (iv) There exists a simple distributed randomized greedy protocol that is stable for all networks and requires only polynomial queue size and polynomial delay. Our results resolve several questions posed by Borodin et al., and provide the first examples of (i) a protocol that is stable for all networks, and (ii) a protocol that is not stable for all networks.

Journal ArticleDOI
TL;DR: This paper shows that the problem of evaluating acyclic Boolean conjunctive queries is complete for LOGCFL, the class of decision problems that are logspace-reducible to a context-free language, and that the acyclic versions of several well-known database and AI problems are all LOGCFL-complete.
Abstract: This paper deals with the evaluation of acyclic Boolean conjunctive queries in relational databases. By well-known results of Yannakakis [1981], this problem is solvable in polynomial time; its precise complexity, however, has not been pinpointed so far. We show that the problem of evaluating acyclic Boolean conjunctive queries is complete for LOGCFL, the class of decision problems that are logspace-reducible to a context-free language. Since LOGCFL is contained in AC^1 and NC^2, the evaluation problem of acyclic Boolean conjunctive queries is highly parallelizable. We present a parallel database algorithm solving this problem with a logarithmic number of parallel join operations. The algorithm is generalized to computing the output of relevant classes of non-Boolean queries. We also show that the acyclic versions of the following well-known database and AI problems are all LOGCFL-complete: The Query Output Tuple problem for conjunctive queries, Conjunctive Query Containment, Clause Subsumption, and Constraint Satisfaction. The LOGCFL-completeness result is extended to the class of queries of bounded tree width and to other relevant query classes which are more general than the acyclic queries.
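For intuition about why acyclicity helps, here is a hedged sketch of the sequential core of Yannakakis's method for a Boolean acyclic query (our own toy encoding; the paper's contribution is the precise LOGCFL bound and a parallel algorithm): semi-join each relation with its children bottom-up along a join tree, and answer yes iff the root stays nonempty.

```python
# Each node is a dict with "attrs", "tuples" (list of dicts), and "children".

def semijoin(parent, child):
    """Keep parent tuples that agree with some child tuple on the shared attributes."""
    shared = [a for a in parent["attrs"] if a in child["attrs"]]
    child_keys = {tuple(t[a] for a in shared) for t in child["tuples"]}
    parent["tuples"] = [t for t in parent["tuples"]
                        if tuple(t[a] for a in shared) in child_keys]

def bottom_up(node):
    for child in node["children"]:
        bottom_up(child)
        semijoin(node, child)

def acyclic_boolean_query(root):
    bottom_up(root)
    return bool(root["tuples"])

if __name__ == "__main__":
    # Query: does there exist x, y, z with R(x, y) and S(y, z)?
    S = {"attrs": ["y", "z"], "tuples": [{"y": 2, "z": 5}], "children": []}
    R = {"attrs": ["x", "y"], "tuples": [{"x": 1, "y": 2}, {"x": 3, "y": 4}],
         "children": [S]}
    print(acyclic_boolean_query(R))   # True, witnessed by x=1, y=2, z=5
```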

Journal ArticleDOI
TL;DR: This paper discusses Markov random fields problems in the context of a representative application---the image segmentation problem and presents an algorithm that solves the problem in polynomial time when the deviation function is convex and separation function is linear.
Abstract: Problems of statistical inference involve the adjustment of sample observations so they fit some a priori rank requirements, or order constraints. In such problems, the objective is to minimize the deviation cost function that depends on the distance between the observed value and the modified value. In Markov random field problems, there is also a pairwise relationship between the objects. The objective in Markov random field problems is to minimize the sum of the deviation cost function and a penalty function that grows with the distance between the values of related pairs (the separation function). We discuss Markov random field problems in the context of a representative application, the image segmentation problem. In this problem, the goal is to modify color shades assigned to pixels of an image so that the penalty function consisting of one term due to the deviation from the initial color shade and a second term that penalizes differences in assigned values to neighboring pixels is minimized. We present here an algorithm that solves the problem in polynomial time when the deviation function is convex and the separation function is linear; and in strongly polynomial time when the deviation cost function is linear, quadratic or piecewise linear convex with few pieces (where "few" means a number exponential in a polynomial function of the number of variables and constraints). The complexity of the algorithm for a problem on n pixels or variables, m adjacency relations or constraints, and range of variable values (colors) U, is O(T(n,m) + n log U) where T(n,m) is the complexity of solving the minimum s,t-cut problem on a graph with n nodes and m arcs. Furthermore, other algorithms are shown to solve the problem with convex deviation and convex separation in running time O(mn log n log (nU)) and the problem with nonconvex deviation and convex separation in running time O(T(nU, mU)). The nonconvex separation problem is NP-hard even for fixed value of U. For the family of problems with convex deviation functions and linear separation function, the algorithm described here runs in polynomial time which is demonstrated to be the fastest possible.
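A hedged sketch of the cut reduction in the simplest, two-label case (our own toy instance, using networkx for max-flow, which is assumed to be available; the paper handles many labels, convex deviation, and linear separation): minimize sum_i D_i(x_i) + sum_{ij} w_ij |x_i - x_j| over x in {0,1}^n with a single s-t minimum cut.

```python
import networkx as nx   # any max-flow / min-cut routine would do

def binary_mrf_labels(deviation, weights):
    """deviation[i] = (cost if x_i = 0, cost if x_i = 1); weights[(i, j)] = w_ij >= 0."""
    G = nx.DiGraph()
    for i, (cost0, cost1) in deviation.items():
        G.add_edge("s", i, capacity=cost1)   # this edge is cut iff i is labeled 1
        G.add_edge(i, "t", capacity=cost0)   # this edge is cut iff i is labeled 0
    for (i, j), w in weights.items():
        G.add_edge(i, j, capacity=w)         # one of these is cut iff the labels differ
        G.add_edge(j, i, capacity=w)
    energy, (source_side, _) = nx.minimum_cut(G, "s", "t")
    labels = {i: (0 if i in source_side else 1) for i in deviation}
    return energy, labels

if __name__ == "__main__":
    # Three pixels in a row; noisy data pulls the middle one toward label 1,
    # but the separation term keeps the labeling smooth.
    deviation = {1: (0, 5), 2: (4, 1), 3: (0, 5)}
    weights = {(1, 2): 3, (2, 3): 3}
    print(binary_mrf_labels(deviation, weights))   # (4, {1: 0, 2: 0, 3: 0})
```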

Journal ArticleDOI
TL;DR: A provably good convex quadratic programming relaxation of strongly polynomial size is proposed for this problem of scheduling unrelated parallel machines subject to release dates so as to minimize the total weighted completion time of jobs.
Abstract: We consider the problem of scheduling unrelated parallel machines subject to release dates so as to minimize the total weighted completion time of jobs. The main contribution of this paper is a provably good convex quadratic programming relaxation of strongly polynomial size for this problem. The best previously known approximation algorithms are based on LP relaxations in time- or interval-indexed variables. Those LP relaxations, however, suffer from a huge number of variables. As a result of the convex quadratic programming approach we can give a very simple and easy to analyze 2-approximation algorithm which can be further improved to performance guarantee 3/2 in the absence of release dates. We also consider preemptive scheduling problems and derive approximation algorithms and results on the power of preemption which improve upon the best previously known results for these settings. Finally, for the special case of two machines we introduce a more sophisticated semidefinite programming relaxation and apply the random hyperplane technique introduced by Goemans and Williamson for the MaxCut problem; this leads to an improved 1.2752-approximation.

Journal Article
TL;DR: In this paper, the authors introduce a new approach to modeling uncertainty based on plausibility measures, which is easily seen to generalize other approaches to modelling uncertainty, such as probability measures, belief functions, and possibility measures.
Abstract: We introduce a new approach to modeling uncertainty based on plausibility measures. This approach is easily seen to generalize other approaches to modeling uncertainty, such as probability measures, belief functions, and possibility measures. We focus on one application of plausibility measures in this paper: default reasoning. In recent years, a number of different semantics for defaults have been proposed, such as preferential structures, ε-semantics, possibilistic structures, and κ-rankings, that have been shown to be characterized by the same set of axioms, known as the KLM properties. While this was viewed as a surprise, we show here that it is almost inevitable. In the framework of plausibility measures, we can give a necessary condition for the KLM axioms to be sound, and an additional condition necessary and sufficient to ensure that the KLM axioms are complete. This additional condition is so weak that it is almost always met whenever the axioms are sound. In particular, it is easily seen to hold for all the proposals made in the literature.

Journal ArticleDOI
TL;DR: The primary goal of this study is to promote an integration of methods and techniques for MIR by contributing a conceptual model that encompasses in a unified and coherent perspective the many efforts that are being produced under the label of MIR.
Abstract: Research on multimedia information retrieval (MIR) has recently witnessed a booming interest. A prominent feature of this research trend is its simultaneous but independent materialization within several fields of computer science. The resulting richness of paradigms, methods and systems may, in the long run, result in a fragmentation of efforts and slow down progress. The primary goal of this study is to promote an integration of methods and techniques for MIR by contributing a conceptual model that encompasses in a unified and coherent perspective the many efforts that are being produced under the label of MIR. The model offers a retrieval capability that spans two media, text and images, but also several dimensions: form, content and structure. In this way, it reconciles similarity-based methods with semantics-based ones, providing the guidelines for the design of systems that are able to provide a generalized multimedia retrieval service, in which the existing forms of retrieval not only coexist, but can be combined in any desired manner. The model is formulated in terms of a fuzzy description logic, which plays a twofold role: (1) it directly models semantics-based retrieval, and (2) it offers an ideal framework for the integration of the multimedia and multidimensional aspects of retrieval mentioned above. The model also accounts for relevance feedback in both text and image retrieval, integrating known techniques for taking into account user judgments. The implementation of the model is addressed by presenting a decomposition technique that reduces query evaluation to the processing of simpler requests, each of which can be solved by means of widely known methods for text and image retrieval, and semantic processing. A prototype for multidimensional image retrieval is presented that shows this decomposition technique at work in a significant case.

Journal ArticleDOI
TL;DR: A data structure that allows arbitrary insertions and deletions on a planar point set P and supports basic queries on the convex hull of P, such as membership and tangent-finding is given.
Abstract: We give a data structure that allows arbitrary insertions and deletions on a planar point set P and supports basic queries on the convex hull of P, such as membership and tangent-finding. Updates take O(log^(1+ε) n) amortized time and queries take O(log n) time each, where n is the maximum size of P and ε is any fixed positive constant. For some advanced queries such as bridge-finding, both our bounds increase to O(log^(3/2) n). The only previous fully dynamic solution was by Overmars and van Leeuwen from 1981 and required O(log^2 n) time per update and O(log n) time per query.
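The data structure itself is intricate; as a hedged illustration of the kind of query it answers in O(log n) time, here is the standard membership test against a static convex hull whose vertices are listed in counter-clockwise order (our own code, not the paper's structure).

```python
def cross(o, a, b):
    """Positive iff the turn o -> a -> b is counter-clockwise."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def in_convex_hull(hull, p):
    n = len(hull)
    if n < 3:
        return False
    # Reject points outside the fan spanned by the first and last edges at hull[0].
    if cross(hull[0], hull[1], p) < 0 or cross(hull[0], hull[-1], p) > 0:
        return False
    # Binary search for the wedge (hull[0], hull[lo], hull[lo+1]) containing p.
    lo, hi = 1, n - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if cross(hull[0], hull[mid], p) >= 0:
            lo = mid
        else:
            hi = mid
    return cross(hull[lo], hull[hi], p) >= 0

if __name__ == "__main__":
    square = [(0, 0), (4, 0), (4, 4), (0, 4)]       # CCW order
    print(in_convex_hull(square, (2, 2)))           # True
    print(in_convex_hull(square, (5, 1)))           # False
```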

Journal ArticleDOI
TL;DR: This work describes an on-line algorithm that greedily acknowledges exactly when the cost for an acknowledgment is less than the latency cost incurred by not acknowledging, and shows that for each objective function, at least one of the algorithms is optimal.
Abstract: We study an on-line problem that is motivated by the networking problem of dynamically adjusting acknowledgments in the Transmission Control Protocol (TCP). We provide a theoretical model for this problem in which the goal is to send acks at times that minimize a linear combination of the cost for the number of acknowledgments sent and the cost for the additional latency introduced by delaying acknowledgments. To study the usefulness of applying packet arrival time prediction to this problem, we assume there is an oracle that provides the algorithm with the times of the next L arrivals, for some L ≥ 0. We give two different objective functions for measuring the cost of a solution, each with its own measure of latency cost. For each objective function we first give an O(n^2)-time dynamic programming algorithm for optimally solving the off-line problem. Then we describe an on-line algorithm that greedily acknowledges exactly when the cost for an acknowledgment is less than the latency cost incurred by not acknowledging. We show that for this algorithm there is a sequence of n packet arrivals for which it is O(***)-competitive for the first objective function, 2-competitive for the second function for L = 0, and 1-competitive for the second function for L = 1. Next we present a second on-line algorithm which is a slight modification of the first, and we prove that it is 2-competitive for both objective functions for all L. We also give lower bounds on the competitive ratio for any deterministic on-line algorithm. These results show that for each objective function, at least one of our algorithms is optimal. Finally, we give some initial empirical results using arrival sequences from real network traffic where we compare the two methods used in TCP for acknowledgment delay with our two on-line algorithms. In all cases we examine performance with L = 0 and L = 1.
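A hedged sketch of the first greedy rule under a sum-of-delays latency cost with unit acknowledgment cost and no lookahead (L = 0); this is our reading of the rule stated in the abstract, and tie-breaking details may differ from the paper: hold arriving packets and acknowledge at the exact moment their accumulated latency reaches the cost of one acknowledgment.

```python
ACK_COST = 1.0

def greedy_acks(arrivals):
    """arrivals: sorted packet arrival times; returns the chosen ack times."""
    acks, pending = [], []
    for i, t in enumerate(arrivals):
        pending.append(t)
        # Time at which sum(tau - a for a in pending) equals ACK_COST.
        flush_time = (ACK_COST + sum(pending)) / len(pending)
        next_arrival = arrivals[i + 1] if i + 1 < len(arrivals) else float("inf")
        if flush_time <= next_arrival:       # threshold is reached before the next packet
            acks.append(flush_time)
            pending = []
    return acks

if __name__ == "__main__":
    print(greedy_acks([0.0, 0.1, 0.2, 5.0]))
    # [0.433..., 6.0]: the first three packets share one ack, sent when their total
    # delay reaches 1; the lone last packet is acknowledged once its delay reaches 1.
```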

Journal ArticleDOI
TL;DR: A restricted aggregate logic is considered that gives a tighter capture of database languages, and is also used to show that some questions on expressivity of aggregation cannot be answered without resolving some deep problems in complexity theory.
Abstract: We study adding aggregate operators, such as summing up elements of a column of a relation, to logics with counting mechanisms. The primary motivation comes from database applications, where aggregate operators are present in all real life query languages. Unlike other features of query languages, aggregates are not adequately captured by the existing logical formalisms. Consequently, all previous approaches to analyzing the expressive power of aggregation were only capable of producing partial results, depending on the allowed class of aggregate and arithmetic operations. We consider a powerful counting logic, and extend it with the set of all aggregate operators. We show that the resulting logic satisfies analogs of Hanf's and Gaifman's theorems, meaning that it can only express local properties. We consider a database query language that expresses all the standard aggregates found in commercial query languages, and show how it can be translated into the aggregate logic, thereby providing a number of expressivity bounds that do not depend on a particular class of arithmetic functions, and that subsume all those previously known. We consider a restricted aggregate logic that gives us a tighter capture of database languages, and also use it to show that some questions on expressivity of aggregation cannot be answered without resolving some deep problems in complexity theory.

Journal ArticleDOI
TL;DR: This paper resolves a long-standing open problem on whether the concurrent write capability of parallel random access machine (PRAM) is essential for solving fundamental graph problems like connected components and minimum spanning trees in logarithmic time.
Abstract: This paper resolves a long-standing open problem on whether the concurrent write capability of the parallel random access machine (PRAM) is essential for solving fundamental graph problems like connected components and minimum spanning trees in O(log n) time. Specifically, we present a new algorithm to solve these problems in O(log n) time using a linear number of processors on the exclusive-read exclusive-write PRAM. The logarithmic time bound is actually optimal since it is well known that even computing the "OR" of n bits requires Ω(log n) time on the exclusive-write PRAM. The efficiency achieved by the new algorithm is based on a new schedule which can exploit a high degree of parallelism.

Journal ArticleDOI
TL;DR: New theorems to analyze divide-and-conquer recurrences are presented, which improve other similar ones in several aspects and cover a wider set of toll functions and weight distributions, stochastic recurrence included.
Abstract: This paper presents new theorems to analyze divide-and-conquer recurrences, which improve other similar ones in several aspects. In particular, these theorems provide more information, free us almost completely from technicalities like floors and ceilings, and cover a wider set of toll functions and weight distributions, stochastic recurrences included.
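As an illustration of the kind of recurrence such theorems dispatch (a standard example of ours, not one taken from the paper): the expected number of comparisons made by quicksort satisfies a stochastic divide-and-conquer recurrence whose leading term falls out directly, with no floors or ceilings to track.

```latex
T(n) \;=\; n - 1 \;+\; \frac{2}{n}\sum_{i=0}^{n-1} T(i),
\qquad
T(n) \;=\; 2n\ln n + O(n) \;\approx\; 1.386\, n \log_2 n .
```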

Journal ArticleDOI
TL;DR: It is proved that, in many cases, order locality is equivalent to a clause set being saturated under ordered resolution, which provides a means of using standard resolution theorem provers for testing order locality and transforming non-local clause sets into local ones.
Abstract: We define order locality to be a property of clauses relative to a term ordering. This property generalizes the subformula property for proofs where the terms appearing in proofs can be bounded, under the given ordering, by terms appearing in the goal clause. We show that when a clause set is order local, then the complexity of its ground entailment problem is a function of its structure (e.g., full versus Horn clauses), and the ordering used. We prove that, in many cases, order locality is equivalent to a clause set being saturated under ordered resolution. This provides a means of using standard resolution theorem provers for testing order locality and transforming non-local clause sets into local ones. We have used the Saturate system to automatically establish complexity bounds for a number of nontrivial entailment problems relative to complexity classes which include polynomial and exponential time and co-NP.

Journal ArticleDOI
TL;DR: This work models data obsolescence, that is, the reduction of consistency over time between a relation and its replica, and provides several stochastic models for content evolution in the base relations of a database, taking referential integrity constraints into account.
Abstract: Recent trends in information management involve the periodic transcription of data onto secondary devices in a networked environment, and the proper scheduling of these transcriptions is critical for efficient data management. To assist in the scheduling process, we are interested in modeling data obsolescence, that is, the reduction of consistency over time between a relation and its replica. The modeling is based on techniques from the field of stochastic processes, and provides several stochastic models for content evolution in the base relations of a database, taking referential integrity constraints into account. These models are general enough to accommodate most of the common scenarios in databases, including batch insertions and lifespans both with and without memory. As an initial "proof of concept" of the applicability of our approach, we validate the insertion portion of our model framework via experiments with real data feeds. We also discuss a set of transcription protocols that make use of the proposed stochastic model.

Journal ArticleDOI
TL;DR: It is shown that both query-reachability and satisfiability are decidable for programs with stratified negation and that satisfiability is undecidable for datalog programs with unary IDB predicates, stratified negation and the interpreted predicate ≠.
Abstract: We consider the problems of containment, equivalence, satisfiability and query-reachability for datalog programs with negation. These problems are important for optimizing datalog programs. We show that both query-reachability and satisfiability are decidable for programs with stratified negation provided that negation is applied only to EDB predicates or that all EDB predicates are unary. In the latter case, we show that equivalence is also decidable. The algorithms we present can also be used to push constraints from a given query to the EDB predicates. In showing our decidability results we describe a powerful tool, the query-tree, which is used for several optimization problems for datalog programs. Finally, we show that satisfiability is undecidable for datalog programs with unary IDB predicates, stratified negation and the interpreted predicate ≠.