
Showing papers in "Journal of the ACM in 1996"


Journal ArticleDOI
TL;DR: It is proved that Consensus and Atomic Broadcast are reducible to each other in asynchronous systems with crash failures; thus, the above results also apply to Atomic Broadcast.
Abstract: We introduce the concept of unreliable failure detectors and study how they can be used to solve Consensus in asynchronous systems with crash failures. We characterise unreliable failure detectors in terms of two properties—completeness and accuracy. We show that Consensus can be solved even with unreliable failure detectors that make an infinite number of mistakes, and determine which ones can be used to solve Consensus despite any number of crashes, and which ones require a majority of correct processes. We prove that Consensus and Atomic Broadcast are reducible to each other in asynchronous systems with crash failures; thus, the above results also apply to Atomic Broadcast. A companion paper shows that one of the failure detectors introduced here is the weakest failure detector for solving Consensus [Chandra et al. 1992].
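
The completeness and accuracy properties above are often illustrated with heartbeat-and-timeout detectors in partially synchronous settings. The toy simulation below is not from the paper; process names, round counts, the delay model, and timeout values are invented for illustration. It shows the two properties informally: a crashed process stops sending heartbeats and is eventually suspected, while timeouts for slow-but-correct processes grow so that false suspicions become rare.

```python
# A toy, single-threaded simulation of the heartbeat/timeout idea behind
# unreliable failure detectors: completeness (a crashed process is eventually
# and permanently suspected) and eventual accuracy (timeouts grow after false
# suspicions, so correct processes are suspected less and less often).
# All constants and the delay model are arbitrary choices for illustration.
import random

PROCS = ["p1", "p2", "p3"]
CRASHED, CRASH_ROUND = {"p3"}, 10          # p3 stops sending heartbeats at round 10

last_heard = {p: 0 for p in PROCS}
timeout = {p: 2 for p in PROCS}            # per-process timeout, in rounds
suspected = set()

for rnd in range(1, 41):
    for p in PROCS:
        alive = not (p in CRASHED and rnd >= CRASH_ROUND)
        if alive and random.random() < 0.25:
            continue                       # heartbeat delayed this round
        if alive:
            last_heard[p] = rnd
            if p in suspected:             # false suspicion detected
                suspected.discard(p)
                timeout[p] += 1            # adapt: larger timeout from now on
    for p in PROCS:
        if rnd - last_heard[p] > timeout[p]:
            suspected.add(p)               # a crashed process stays here forever

print("suspected after 40 rounds:", suspected)   # typically {'p3'}
```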

2,718 citations


Journal ArticleDOI
TL;DR: This paper shows how to do an on-line simulation of an arbitrary RAM by a probabilistic oblivious RAM with a polylogarithmic slowdown in the running time, and shows that a logarithmic slowdown is a lower bound.
Abstract: Software protection is one of the most important issues concerning computer practice. There exist many heuristics and ad-hoc methods for protection, but the problem as a whole has not received the theoretical treatment it deserves. In this paper, we provide a theoretical treatment of software protection. We reduce the problem of software protection to the problem of efficient simulation on oblivious RAM. A machine is oblivious if the sequence in which it accesses memory locations is equivalent for any two inputs with the same running time. For example, an oblivious Turing Machine is one for which the movement of the heads on the tapes is identical for each computation. (Thus, the movement is independent of the actual input.) What is the slowdown in the running time of a machine, if it is required to be oblivious? In 1979, Pippenger and Fischer showed how a two-tape oblivious Turing Machine can simulate, on-line, a one-tape Turing Machine, with a logarithmic slowdown in the running time. We show an analogous result for the random-access machine (RAM) model of computation. In particular, we show how to do an on-line simulation of an arbitrary RAM by a probabilistic oblivious RAM with a polylogarithmic slowdown in the running time. On the other hand, we show that a logarithmic slowdown is a lower bound.
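
To make the notion of obliviousness concrete, here is a deliberately naive sketch (not the paper's construction, which achieves polylogarithmic overhead): every logical access scans all of memory, so the observable sequence of physical addresses depends only on how many accesses were made, never on which addresses were requested. The class name and sizes are invented for the example.

```python
# Naive oblivious simulation of a RAM: each logical read/write touches every
# physical cell in a fixed order, hiding the accessed address at an O(n) cost
# per access. The paper's probabilistic construction brings this overhead down
# to polylogarithmic; this sketch only illustrates the definition.
class NaiveObliviousRAM:
    def __init__(self, n):
        self.cells = [0] * n
        self.trace = []                      # observable physical access pattern

    def access(self, op, addr, value=None):
        result = None
        for i in range(len(self.cells)):     # always scan all cells, in order
            self.trace.append(i)
            if i == addr:
                if op == "read":
                    result = self.cells[i]
                else:                        # op == "write"
                    self.cells[i] = value
        return result

ram = NaiveObliviousRAM(8)
ram.access("write", 3, 42)
print(ram.access("read", 3))                 # 42
print(ram.trace[:8] == ram.trace[8:])        # True: the trace leaks nothing
```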

1,752 citations


Journal ArticleDOI
TL;DR: It is proved that to solve Consensus, any failure detector has to provide at least as much information as W, and W is indeed the weakest failure detector for solving Consensus in asynchronous systems with a majority of correct processes.
Abstract: We determine what information about failures is necessary and sufficient to solve Consensus in asynchronous distributed systems subject to crash failures. In Chandra and Toueg [1996], it is shown that W, a failure detector that provides surprisingly little information about which processes have crashed, is sufficient to solve Consensus in asynchronous systems with a majority of correct processes. In this paper, we prove that to solve Consensus, any failure detector has to provide at least as much information as W. Thus, W is indeed the weakest failure detector for solving Consensus in asynchronous systems with a majority of correct processes.

853 citations


Journal ArticleDOI
TL;DR: This paper investigates whether observation equivalence really does respect the branching structure of processes, and finds that in the presence of the unobservable action τ of CCS this is not the case, and the notion of branching bisimulation equivalence is introduced which strongly preserves the branching structures of processes.
Abstract: In comparative concurrency semantics, one usually distinguishes between linear time and branching time semantic equivalences. Milner's notion of observation equivalence is often mentioned as the standard example of a branching time equivalence. In this paper we investigate whether observation equivalence really does respect the branching structure of processes, and find that in the presence of the unobservable action τ of CCS this is not the case. Therefore, the notion of branching bisimulation equivalence is introduced which strongly preserves the branching structure of processes, in the sense that it preserves computations together with the potentials in all intermediate states that are passed through, even if silent moves are involved. On closed CCS-terms branching bisimulation congruence can be completely axiomatized by the single axiom scheme: a.(τ.(y+z)+y)=a.(y+z) (where a ranges over all actions) and the usual laws for strong congruence. We also establish that for sequential processes observation equivalence is not preserved under refinement of actions, whereas branching bisimulation is. For a large class of processes, it turns out that branching bisimulation and observation equivalence are the same. As far as we know, all protocols that have been verified in the setting of observation equivalence happen to fit in this class, and hence are also valid in the stronger setting of branching bisimulation equivalence.

851 citations


Journal ArticleDOI
TL;DR: In this article, a temporal language that can constrain the time difference between events with finite, yet arbitrary, precision is introduced and proved to be EXPSPACE-complete.
Abstract: The most natural, compositional, way of modeling real-time systems uses a dense domain for time. The satisfiability of timing constraints that are capable of expressing punctuality in this model, however, is known to be undecidable. We introduce a temporal language that can constrain the time difference between events only with finite, yet arbitrary, precision and show the resulting logic to be EXPSPACE-complete. This result allows us to develop an algorithm for the verification of timing properties of real-time systems with a dense semantics.

543 citations


Journal ArticleDOI
TL;DR: The connection between cliques and efficient multi-prover interactive proofs is shown to yield hardness results on the complexity of approximating the size of the largest clique in a graph.
Abstract: The contribution of this paper is two-fold. First, a connection is established between approximating the size of the largest clique in a graph and multi-prover interactive proofs. Second, an efficient multi-prover interactive proof for NP languages is constructed, where the verifier uses very few random bits and communication bits. Last, the connection between cliques and efficient multi-prover interactive proofs is shown to yield hardness results on the complexity of approximating the size of the largest clique in a graph. Of independent interest is our proof of correctness for the multilinearity test of functions.

527 citations


Journal ArticleDOI
TL;DR: A randomized, strongly polynomial algorithm that finds the minimum cut in an arbitrarily weighted undirected graph with high probability, a significant improvement over the previous time bounds based on maximum flows.
Abstract: This paper presents a new approach to finding minimum cuts in undirected graphs. The fundamental principle is simple: the edges in a graph's minimum cut form an extremely small fraction of the graph's edges. Using this idea, we give a randomized, strongly polynomial algorithm that finds the minimum cut in an arbitrarily weighted undirected graph with high probability. The algorithm runs in O(n^2 log^3 n) time, a significant improvement over the previous Õ(mn) time bounds based on maximum flows. It is simple and intuitive and uses no complex data structures. Our algorithm can be parallelized to run in RNC with n^2 processors; this gives the first proof that the minimum cut problem can be solved in RNC. The algorithm does more than find a single minimum cut; it finds all of them. With minor modifications, our algorithm solves two other problems of interest. Our algorithm finds all cuts with value within a multiplicative factor of α of the minimum cut's in expected Õ(n^(2α)) time, or in RNC with n^(2α) processors. The problem of finding a minimum multiway cut of a graph into r pieces is solved in expected Õ(n^(2(r-1))) time, or in RNC with n^(2(r-1)) processors. The “trace” of the algorithm's execution on these two problems forms a new compact data structure for representing all small cuts and all multiway cuts in a graph. This data structure can be efficiently transformed into the more standard cactus representation for minimum cuts.
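
The "extremely small fraction" observation drives the random-contraction procedure the algorithm builds on. The sketch below (unweighted, with an arbitrary trial count, and none of the paper's running-time machinery) shows why repeating random edge contractions finds a minimum cut with high probability.

```python
# Random contraction in miniature: repeatedly contract a random edge until two
# super-nodes remain; the edges still crossing them form a cut. Because a
# minimum cut contains few edges, it survives a single run with probability
# at least 2/(n(n-1)), so independent repetitions find it w.h.p. This is only
# the core idea, not the paper's O(n^2 log^3 n) algorithm.
import random

def contract_once(edges, n):
    parent = list(range(n))
    def find(x):                              # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    components = n
    while components > 2:
        u, v = random.choice(edges)           # self-loop picks are skipped below
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return sum(1 for u, v in edges if find(u) != find(v))

def min_cut(edges, n, trials=200):
    return min(contract_once(edges, n) for _ in range(trials))

# two triangles joined by a single bridge: the minimum cut has value 1
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(min_cut(edges, 6))                      # 1
```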

510 citations


Journal ArticleDOI
TL;DR: This work addresses all three problems for goal-oriented query evaluation of general logic programs by presenting tabled evaluation with delaying, called SLG resolution, which has three distinctive features: it has a polynomial time data complexity for well-founded negation of function-free programs.
Abstract: SLD resolution with negation as finite failure (SLDNF) reflects the procedural interpretation of predicate calculus as a programming language and forms the computational basis for Prolog systems. Despite its advantages for stack-based memory management, SLDNF is often not appropriate for query evaluation for three reasons: (a) it may not terminate due to infinite positive recursion; (b) it may not terminate due to infinite recursion through negation; and (c) it may repeatedly evaluate the same literal in a rule body, leading to unacceptable performance. We address all three problems for goal-oriented query evaluation of general logic programs by presenting tabled evaluation with delaying, called SLG resolution. It has three distinctive features: (i) SLG resolution is a partial deduction procedure, consisting of seven fundamental transformations. A query is transformed step by step into a set of answers. The use of transformations separates logical issues of query evaluation from procedural ones. SLG allows an arbitrary computation rule for selecting a literal from a rule body and an arbitrary control strategy for selecting transformations to apply. (ii) SLG resolution is sound and search space complete with respect to the well-founded partial model for all non-floundering queries, and preserves all three-valued stable models. To evaluate a query under different three-valued stable models, SLG resolution can be enhanced by further processing of the answers of subgoals relevant to a query. (iii) SLG resolution avoids both positive and negative loops and always terminates for programs with the bounded-term-size property. It has a polynomial time data complexity for well-founded negation of function-free programs. Through a delaying mechanism for handling ground negative literals involved in loops, SLG resolution avoids the repetition of any of its derivation steps. Restricted forms of SLG resolution are identified for definite, locally stratified, and modularly stratified programs, shedding light on the role each transformation plays.
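
The non-termination problem (a) is easy to see on a left-recursive program, and tabling is what removes it. The fragment below is only a memoized fixpoint for a definite (negation-free) reachability program with invented edge facts; it illustrates why remembering subgoal answers guarantees termination, but it implements none of SLG's transformations or its delaying mechanism for negation.

```python
# Tabling in miniature: the left-recursive program
#   path(X,Y) :- path(X,Z), edge(Z,Y).     path(X,Y) :- edge(X,Y).
# loops forever under SLDNF on a cyclic graph, but keeping a table of derived
# answers and iterating to a fixpoint terminates and never repeats work.
edge = {("a", "b"), ("b", "c"), ("c", "a")}   # a cycle

def path_answers():
    table = set(edge)                         # seeded by the base case
    changed = True
    while changed:
        changed = False
        for (x, z) in list(table):
            for (z2, y) in edge:
                if z == z2 and (x, y) not in table:
                    table.add((x, y))
                    changed = True
    return table

print(sorted(path_answers()))                 # all 9 pairs over {a, b, c}
```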

434 citations


Journal ArticleDOI
TL;DR: This algorithm improves the complexity of the asymptotically fastest algorithm for this problem known to date, and new and improved algorithms for deciding a sentence in the first order theory over real closed fields are obtained.
Abstract: In this paper, a new algorithm for performing quantifier elimination from first order formulas over real closed fields is given. This algorithm improves the complexity of the asymptotically fastest algorithm for this problem known to date. A new feature of this algorithm is that the roles of the algebraic part (the dependence on the degrees of the input polynomials) and the combinatorial part (the dependence on the number of polynomials) are separated. Another new feature is that the degrees of the polynomials in the equivalent quantifier-free formula that is output are independent of the number of input polynomials. As special cases of this algorithm, new and improved algorithms for deciding a sentence in the first order theory over real closed fields, and also for solving the existential problem in the first order theory over real closed fields, are obtained.

395 citations


Journal ArticleDOI
Bart Selman, Henry Kautz
TL;DR: It is shown how propositional logical theories can be compiled into Horn theories that approximate the original information, and the approximations bound the original theory from below and above in terms of logical strength.
Abstract: Computational efficiency is a central concern in the design of knowledge representation systems. In order to obtain efficient systems, it has been suggested that one should limit the form of the statements in the knowledge base or use an incomplete inference mechanism. The former approach is often too restrictive for practical applications, whereas the latter leads to uncertainty about exactly what can and cannot be inferred from the knowledge base. We present a third alternative, in which knowledge given in a general representation language is translated (compiled) into a tractable form, allowing for efficient subsequent query answering. We show how propositional logical theories can be compiled into Horn theories that approximate the original information. The approximations bound the original theory from below and above in terms of logical strength. The procedures are extended to other tractable languages (for example, binary clauses) and to the first-order case. Finally, we demonstrate the generality of our approach by compiling concept descriptions in a general frame-based language into a tractable form.
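
One ingredient of the Horn approximations described above can be shown in a few lines: a Horn-strengthening of a clause keeps at most one of its positive literals, and the resulting Horn theory logically implies the original one, i.e., it is a lower bound in logical strength. The sketch below makes an arbitrary choice of which positive literal to keep; computing the greatest Horn lower bound (and the upper bounds) as in the paper requires more work, and the clause encoding is invented for the example.

```python
# Horn-strengthening sketch: clauses are lists of literals, negative literals
# are prefixed with "-". A clause is Horn if it has at most one positive
# literal; dropping extra positive literals yields a (stronger) Horn theory.
def is_horn(clause):
    return sum(1 for lit in clause if not lit.startswith("-")) <= 1

def horn_strengthening(theory):
    horn = []
    for clause in theory:
        if is_horn(clause):
            horn.append(clause)
        else:
            pos = [lit for lit in clause if not lit.startswith("-")]
            neg = [lit for lit in clause if lit.startswith("-")]
            horn.append(neg + [pos[0]])       # keep one positive literal (arbitrary choice)
    return horn

theory = [["p", "q"], ["-p", "r"]]            # "p or q" is not Horn
print(horn_strengthening(theory))             # [['p'], ['-p', 'r']] implies the original theory
```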

348 citations


Journal ArticleDOI
TL;DR: It is proved that a simple algorithm can construct second-order recurrent neural networks with a sparse interconnection topology and sigmoidal discriminant function such that the internal DFA state representations are stable, that is, the constructed network correctly classifies strings of arbitrary length.
Abstract: Recurrent neural networks that are trained to behave like deterministic finite-state automata (DFAs) can show deteriorating performance when tested on long strings. This deteriorating performance can be attributed to the instability of the internal representation of the learned DFA states. The use of a sigmoidal discriminant function together with the recurrent structure contributes to this instability. We prove that a simple algorithm can construct second-order recurrent neural networks with a sparse interconnection topology and sigmoidal discriminant function such that the internal DFA state representations are stable, that is, the constructed network correctly classifies strings of arbitrary length. The algorithm is based on encoding strengths of weights directly into the neural network. We derive a relationship between the weight strength and the number of DFA states for robust string classification. For a DFA with n states and m input alphabet symbols, the constructive algorithm generates a “programmed” neural network with O(n) neurons and O(mn) weights. We compare our algorithm to other methods proposed in the literature.
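
The flavor of the "programming" construction can be reproduced in a few lines: make the second-order weight for (next state j, current state i, symbol k) strongly positive exactly when the DFA transition takes state i to state j on symbol k, and strongly negative otherwise, so the sigmoided state vector stays near one-hot. The constants, bias, and acceptance threshold below are ad hoc choices for a tiny parity DFA, not the bounds derived in the paper.

```python
# A simplified "programmed" second-order recurrent network for a 2-state DFA
# (accepts strings with an even number of 1s). W[j][i][k] = +H if reading
# symbol k in state i leads to state j, and -H otherwise; with H large enough
# the state vector stays near one-hot, which is the stability property the
# paper analyzes.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

n_states, n_symbols, start, accepting = 2, 2, 0, {0}
delta = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

H = 10.0
W = [[[H if delta[(i, k)] == j else -H for k in range(n_symbols)]
      for i in range(n_states)] for j in range(n_states)]

def run(string):
    S = [1.0 if i == start else 0.0 for i in range(n_states)]
    for ch in string:
        k = int(ch)
        S = [sigmoid(sum(W[j][i][k] * S[i] for i in range(n_states)) - H / 2)
             for j in range(n_states)]
    return any(S[j] > 0.5 for j in accepting)

for s in ["", "11", "101", "1111011"]:
    print(repr(s), run(s), s.count("1") % 2 == 0)   # network output matches the DFA
```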

Journal ArticleDOI
TL;DR: In this paper, a general theory for the use of negative premises in the rules of transition system specifications (TSSs) is presented, and a criterion that should be satisfied by a TSS in order to be meaningful, that is, to unequivocally define a transition relation.
Abstract: We present a general theory for the use of negative premises in the rules of Transition System Specifications (TSSs). We formulate a criterion that should be satisfied by a TSS in order to be meaningful, that is, to unequivocally define a transition relation. We also provide powerful techniques for proving that a TSS satisfies this criterion, meanwhile constructing this transition relation. Both the criterion and the techniques originate from logic programming [van Gelder et al. 1988; Gelfond and Lifschitz 1988] to which TSSs are close. In an appendix we provide an extensive comparison between them. As in Groote [1993], we show that the bisimulation relation induced by a TSS is a congruence, provided that it is in ntyft/ntyxt-format and can be proved meaningful using our techniques. We also considerably extend the conservativity theorems of Groote [1993] and Groote and Vaandrager [1992]. As a running example, we study the combined addition of priorities and abstraction to Basic Process Algebra (BPA). Under some reasonable conditions we show that this TSS is indeed meaningful, which could not be shown by other methods [Bloom et al. 1995; Groote 1993]. Finally, we provide a sound and complete axiomatization for this example.

Journal ArticleDOI
TL;DR: The main new results of the paper are a confluent weak calculus of substitutions, where no variable clashes can be feared, and the resolution of a conjecture raised in Abadi [1991]: the λσ-calculus is not confluent (it is confluent on ground terms only).
Abstract: Categorical combinators [Curien 1986/1993; Hardin 1989; Yokouchi 1989] and, more recently, λσ-calculus [Abadi 1991; Hardin and Lévy 1989], have been introduced to provide an explicit treatment of substitutions in the λ-calculus. We reintroduce here the ingredients of these calculi in a self-contained and stepwise way, with a special emphasis on confluence properties. The main new results of the paper with respect to Curien [1986/1993], Hardin [1989], Abadi [1991], and Hardin and Lévy [1989] are the following: (1) We present a confluent weak calculus of substitutions, where no variable clashes can be feared; (2) We solve a conjecture raised in Abadi [1991]: λσ-calculus is not confluent (it is confluent on ground terms only). This unfortunate result is “repaired” by presenting a confluent version of λσ-calculus, named the λEnv-calculus in Hardin and Lévy [1989], called here the confluent λσ-calculus.

Journal ArticleDOI
TL;DR: This work obtains searching algorithms that run in logarithmic expected time in the size of the text for a wide subclass of regular expressions, and in sublinear expected time for any regular expression.
Abstract: We present algorithms for efficient searching of regular expressions on preprocessed text, using a Patricia tree as a logical model for the index. We obtain searching algorithms that run in logarithmic expected time in the size of the text for a wide subclass of regular expressions, and in sublinear expected time for any regular expression. This is the first such algorithm to be found with this complexity.

Journal ArticleDOI
TL;DR: This paper applies a form of the competitive philosophy for the first time to the problem of prefetching to develop an optimal universal prefetcher in terms of fault rate, with particular applications to large-scale databases and hypertext systems.
Abstract: Caching and prefetching are important mechanisms for speeding up access time to data on secondary storage. Recent work in competitive online algorithms has uncovered several promising new algorithms for caching. In this paper, we apply a form of the competitive philosophy for the first time to the problem of prefetching to develop an optimal universal prefetcher in terms of fault rate, with particular applications to large-scale databases and hypertext systems. Our prediction algorithms for prefetching are novel in that they are based on data compression techniques that are both theoretically optimal and good in practice. Intuitively, in order to compress data effectively, you have to be able to predict future data well, and thus good data compressors should be able to predict well for purposes of prefetching. We show for powerful models such as Markov sources and mth-order Markov sources that the page fault rate incurred by our prefetching algorithms is optimal in the limit for almost all sequences of page requests.
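
The "predict well, prefetch well" intuition can be seen with the simplest possible predictor. The sketch below uses a first-order Markov model over page identifiers; the paper builds on stronger, compression-based predictors (higher-order Markov and Lempel-Ziv style models), and the page names and trace here are invented.

```python
# First-order Markov prefetcher: after serving page p, prefetch the page that
# has most often followed p so far. On a repeating reference pattern the
# prediction accuracy, and hence the fault rate of a cache that trusts it,
# improves as the model learns the transitions.
from collections import defaultdict, Counter

class MarkovPrefetcher:
    def __init__(self):
        self.counts = defaultdict(Counter)    # counts[prev][next] = frequency
        self.prev = None

    def access(self, page):
        if self.prev is not None:
            self.counts[self.prev][page] += 1
        self.prev = page
        if self.counts[page]:
            return self.counts[page].most_common(1)[0][0]   # page to prefetch
        return None

trace = ["A", "B", "C", "A", "B", "C", "A", "B"]
pf, hits = MarkovPrefetcher(), 0
for i, page in enumerate(trace):
    guess = pf.access(page)
    if i + 1 < len(trace) and guess == trace[i + 1]:
        hits += 1
print("correct prefetches:", hits, "of", len(trace) - 1)    # 4 of 7 on this short trace
```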

Journal ArticleDOI
TL;DR: This thesis formally described a substantial subset of the MC68020, a widely used microprocessor built by Motorola, within the mathematical logic of the automated reasoning system Nqthm, a.k.a. the Boyer-Moore Theorem Proving System, and mechanized a mathematical theory to facilitate automated reasoning about object code programs.
Abstract: Computing devices can be specified and studied mathematically. Formal specification of computing devices has many advantages--it provides a precise characterization of the computational model and allows for mathematical reasoning about models of the computing devices and programs executed on them. While there has been a large body of research on program proving, work has almost exclusively focused on programs written in high level programming languages. This thesis addresses the very important but largely ignored problem of machine code program proving. In this thesis we have formally described a substantial subset of the MC68020, a widely used microprocessor built by Motorola, within the mathematical logic of the automated reasoning system Nqthm, a.k.a. the Boyer-Moore Theorem Proving System. Based on this formal model, we have mechanized a mathematical theory to facilitate automated reasoning about object code programs. We then have mechanically checked the correctness of MC68020 object code programs for binary search, Hoare's Quick Sort, the Berkeley Unix C string library, and other well-known algorithms. The object code for these examples was generated using the Gnu C, the Verdix Ada, and the AKCL Common Lisp compilers.

Journal ArticleDOI
TL;DR: In this article, it was shown that for any ε > 0, monotone Boolean functions are PAC learnable with error ε under product distributions in time 2^(O((1/ε)√n)).
Abstract: In this paper, monotone Boolean functions are studied using harmonic analysis on the cube. The main result is that any monotone Boolean function has most of its power spectrum on its Fourier coefficients of degree at most O(√n) under any product distribution. This is similar to a result of Linial et al. [1993], which showed that AC^0 functions have almost all of their power spectrum on the coefficients of degree at most (log n)^(O(1)) under the uniform distribution. As a consequence of the main result, the following two corollaries are obtained: (1) For any ε > 0, monotone Boolean functions are PAC learnable with error ε under product distributions in time 2^(O((1/ε)√n)). (2) Any monotone Boolean function can be approximated within error ε under product distributions by a non-monotone Boolean circuit of size 2^(O((1/ε)√n)) and depth O((1/ε)√n). The learning algorithm runs in subexponential time as long as the required error is Ω(1/(√n log n)). It is shown that this is tight in the sense that for any subexponential-time algorithm there is a monotone Boolean function that this algorithm cannot approximate with error better than O(1/√n). The main result is also applied to other problems in learning and complexity theory. In learning theory, several polynomial-time algorithms for learning some classes of monotone Boolean functions, such as Boolean functions with O(log^2 n / log log n) relevant variables, are presented. In complexity theory, some questions regarding monotone NP-complete problems are addressed.
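
A tiny numerical check makes the spectral claim concrete: for the majority function on 5 bits (a monotone function), the even-degree Fourier weights vanish and about 70% of the total weight already sits at degree 1 under the uniform distribution, which is a special case of a product distribution. This is only an illustration of the kind of low-degree concentration the theorem asserts, not part of the paper's proof.

```python
# Numerical illustration: Fourier weight of 5-bit majority by degree, under
# the uniform distribution. The weights sum to 1 by Parseval; even degrees
# vanish and most of the mass is at low degree, as the main result predicts.
from itertools import product, combinations

n = 5
def maj(x):                                   # +1/-1 valued majority
    return 1 if sum(x) > n // 2 else -1

points = list(product([0, 1], repeat=n))
weight_by_degree = {}
for d in range(n + 1):
    w = 0.0
    for S in combinations(range(n), d):
        # Fourier coefficient: correlation of maj with the parity on S
        coeff = sum(maj(x) * (-1) ** sum(x[i] for i in S) for x in points) / 2 ** n
        w += coeff ** 2
    weight_by_degree[d] = round(w, 3)

print(weight_by_degree)   # {0: 0.0, 1: 0.703, 2: 0.0, 3: 0.156, 4: 0.0, 5: 0.141}
```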

Journal ArticleDOI
TL;DR: This paper considers optical networks with and without switches, and different types of routing in these networks, and presents optimal or near-optimal constructions of optical networks in these cases and algorithms for routing connections, specifically permutation routing for the networks constructed here.
Abstract: This paper studies the problem of dedicating routes to connections in optical networks. In optical networks, the vast bandwidth available in an optical fiber is utilized by partitioning it into several channels, each at a different optical wavelength. A connection between two nodes is assigned a specific wavelength, with the constraint that no two connections sharing a link in the network can be assigned the same wavelength. This paper considers optical networks with and without switches, and different types of routing in these networks. It presents optimal or near-optimal constructions of optical networks in these cases and algorithms for routing connections, specifically permutation routing for the networks constructed here.
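
The wavelength constraint in the second sentence can be made concrete on the simplest topology. In the sketch below, connections on a path network with fixed routes are assigned wavelengths greedily (first fit after sorting by left endpoint), which is just interval coloring; the paper's constructions for networks with switches and its permutation-routing algorithms go well beyond this. The node numbering and connection list are invented.

```python
# Greedy wavelength assignment on a path 0-1-2-...: a connection (s, t) uses
# the links s..t-1, and two connections sharing a link must get different
# wavelengths. Sorting by left endpoint and taking the first free wavelength
# is optimal for this interval-coloring special case.
def assign_wavelengths(connections):
    used = {}                                       # link -> wavelengths in use
    assignment = []
    for (s, t) in sorted(connections):
        links = range(min(s, t), max(s, t))
        w = 0
        while any(w in used.get(l, set()) for l in links):
            w += 1                                  # first wavelength free on every link
        for l in links:
            used.setdefault(l, set()).add(w)
        assignment.append(((s, t), w))
    return assignment

print(assign_wavelengths([(0, 3), (1, 2), (2, 4), (3, 5)]))
# [((0, 3), 0), ((1, 2), 1), ((2, 4), 1), ((3, 5), 0)] -- two wavelengths suffice
```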

Journal ArticleDOI
TL;DR: It is shown how a media-presentation can be generated by processing a sequence of queries, and furthermore it is shown that when these queries are extended to include constraints, then these queries can not only generate presentations, but also generate temporal synchronization properties and spatial layout properties for such presentations.
Abstract: Though numerous multimedia systems exist in the commercial market today, relatively little work has been done on developing the mathematical foundation of multimedia technology. We attempt to take some initial steps towards the development of a theoretical basis for a multimedia information system. To do so, we develop the notion of a structured multimedia database system. We begin by defining a mathematical model of a media-instance. A media-instance may be thought of as “glue” residing on top of a specific physical media-representation (such as video, audio, documents, etc). Using this “glue”, it is possible to define a general purpose logical query language to query multimedia data. This glue consists of a set of “states” (e.g., video frames, audio tracks, etc.) and “features”, together with relationships between states and/or features. A structured multimedia database system imposes a certain mathematical structure on the set of features/states. Using this notion of a structure, we are able to define indexing structures for processing queries, methods to relax queries when answers do not exist to those queries, as well as sound, complete and terminating procedures to answer such queries (and their relaxations, when appropriate). We show how a media-presentation can be generated by processing a sequence of queries, and furthermore we show that when these queries are extended to include constraints, then these queries can not only generate presentations, but also generate temporal synchronization properties and spatial layout properties for such presentations. We describe the architecture of a prototype multimedia database system based on the principles described in this paper.

Journal ArticleDOI
TL;DR: It is proved that the exponent of periodicity of a minimal solution of a word equation is of order 2^(1.07d), which implies an exponential improvement of known upper bounds on complexity of word-unification algorithms.
Abstract: The exponent of periodicity is an important factor in estimates of complexity of word-unification algorithms. We prove that the exponent of periodicity of a minimal solution of a word equation is of order 2^(1.07d), where d is the length of the equation. We also give a lower bound 2^(0.29d), so our upper bound is almost optimal and exponentially better than the original bound (6d)^(2^(2d^4)) + 2. Consequently, our result implies an exponential improvement of known upper bounds on complexity of word-unification algorithms.
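
For a single word (rather than a word equation) the exponent of periodicity is easy to compute by brute force, which makes the quantity being bounded concrete: it is the largest p such that some nonempty u has its p-th power u^p as a factor. The helper below only illustrates the definition and is not part of the paper's argument.

```python
# Exponent of periodicity of a word: the largest p such that u^p occurs as a
# factor for some nonempty u. Brute force over all starting positions and
# periods; fine for short words.
def exponent_of_periodicity(w):
    best = 1 if w else 0
    n = len(w)
    for start in range(n):
        for period in range(1, n - start + 1):
            i = start + period
            while i < n and w[i] == w[i - period]:
                i += 1
            best = max(best, (i - start) // period)   # repetitions of w[start:start+period]
    return best

print(exponent_of_periodicity("abaabaabab"))   # 3: "abaabaaba" = ("aba")^3
print(exponent_of_periodicity("abcd"))         # 1: square-free
```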

Journal ArticleDOI
TL;DR: It is shown that an honest class is exactly polynomial-query learnable if and only if it is learnable using an oracle for Σ^p_4, and a new relationship between query complexity and time complexity in exact learning is shown.
Abstract: We investigate the query complexity of exact learning in the membership and (proper) equivalence query model. We give a complete characterization of concept classes that are learnable with a polynomial number of polynomial sized queries in this model. We give applications of this characterization, including results on learning a natural subclass of DNF formulas, and on learning with membership queries alone. Query complexity has previously been used to prove lower bounds on the time complexity of exact learning. We show a new relationship between query complexity and time complexity in exact learning: If any “honest” class is exactly and properly learnable with polynomial query complexity, but not learnable in polynomial time, then P ≠ NP. In particular, we show that an honest class is exactly polynomial-query learnable if and only if it is learnable using an oracle for Σ^p_4.
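
One very small instance of the query model helps fix ideas: a monotone conjunction over n variables can be exactly identified with n membership queries, by turning off one bit of the all-ones assignment at a time. This toy example (target and oracle invented here) only illustrates what membership queries are; it is unrelated to the paper's Σ^p_4 characterization.

```python
# Exact learning of a monotone conjunction with n membership queries: flip each
# bit of the all-ones example to 0; the target rejects exactly when the flipped
# variable occurs in the conjunction.
def learn_monotone_conjunction(n, membership_oracle):
    relevant = []
    for i in range(n):
        x = [1] * n
        x[i] = 0
        if not membership_oracle(x):      # rejection => variable i is relevant
            relevant.append(i)
    return relevant

target = {1, 3}                           # hidden concept: x1 AND x3
oracle = lambda x: all(x[i] == 1 for i in target)
print(learn_monotone_conjunction(5, oracle))   # [1, 3]
```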

Journal ArticleDOI
TL;DR: In this article, the main construction of an n-user atomic variable directly from single-writer, single-reader atomic variables using O(n) control bits and O(n) accesses per Read/Write running in O(1) parallel time is presented.
Abstract: Sharing data between multiple asynchronous users—each of which can atomically read and write the data—is a feature that may help to increase the amount of parallelism in distributed systems. An algorithm implementing this feature is presented. The main construction of an n-user atomic variable directly from single-writer, single-reader atomic variables uses O(n) control bits and O(n) accesses per Read/Write running in O(1) parallel time.

Journal ArticleDOI
TL;DR: This work introduces a new primitive, the Resource Controller, which abstracts the problem of controlling the total amount of resources consumed by a distributed algorithm, and presents an efficient distributed algorithm to implement this abstraction.
Abstract: This paper introduces a new distributed data object called Resource Controller that provides an abstraction for managing the consumption of a global resource in a distributed system. Examples of resources that may be managed by such an object include: number of messages sent, number of nodes participating in the protocol, and total CPU time consumed. The Resource Controller object is accessed through a procedure that can be invoked at any node in the network. Before consuming a unit of resource at some node, the controlled algorithm should invoke the procedure at this node, requesting a permit or a rejection. The key characteristics of the Resource Controller object are the constraints that it imposes on the global resource consumption. An (M, W)-Controller guarantees that the total number of permits granted is at most M; it also ensures that, if a request is rejected, then at least M - W permits are eventually granted, even if no more requests are made after the rejected one. In this paper, we describe several message- and space-efficient implementations of the Resource Controller object. In particular, we present an (M, W)-Controller whose message complexity is O(n log^2 n log(M/(W + 1))), where n is the total number of nodes. This is in contrast to the O(nM) message complexity of a fully centralized controller which maintains a global counter of the number of granted permits at some distinguished node and relays all the requests to the node.
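
The fully centralized controller mentioned at the end of the abstract is easy to sketch, and it shows what the distributed implementations have to beat: the (M, W) guarantees hold trivially, but every request costs a message to one distinguished node. The class below is that baseline only, with invented method names; it does not model the paper's message-efficient distributed constructions.

```python
# Centralized (M, W)-controller baseline: a single counter at a distinguished
# node grants permits until M have been issued. At most M permits are granted,
# and a rejection can only happen after all M (hence at least M - W) permits
# were granted, so the (M, W) guarantees hold trivially; the cost is one
# message per request to the central node, which is the O(nM) message
# complexity quoted in the abstract.
class CentralizedController:
    def __init__(self, M):
        self.M = M
        self.granted = 0

    def request(self):
        if self.granted < self.M:
            self.granted += 1
            return "permit"
        return "reject"

ctrl = CentralizedController(M=3)
print([ctrl.request() for _ in range(5)])
# ['permit', 'permit', 'permit', 'reject', 'reject']
```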

Journal ArticleDOI
TL;DR: This paper presents a taxonomy of languages for multiparty interaction that covers all proposals of which it is aware, and presents a comprehensive analysis of the computational complexity of the multiparty interaction scheduling problem, the problem of scheduling multiparty interactions in a given execution environment.
Abstract: A multiparty interaction is a set of I/O actions executed jointly by a number of processes, each of which must be ready to execute its own action for any of the actions in the set to occur. An attempt to participate in an interaction delays a process until all other participants are available. Although a relatively new concept, the multiparty interaction has found its way into a number of distributed programming languages and algebraic models of concurrency. In this paper, we present a taxonomy of languages for multiparty interaction that covers all proposals of which we are aware. Based on this taxonomy, we then present a comprehensive analysis of the computational complexity of the multiparty interaction scheduling problem, the problem of scheduling multiparty interactions in a given execution environment.

Journal ArticleDOI
TL;DR: These applications provide the first examples of networks that can be embedded more efficiently in hypercubes than in butterflies, and show that analogues of these results hold for networks that are structurally related to the butterfly network.
Abstract: The power of butterfly-like networks as multicomputer interconnection networks is studied, by considering how efficiently the butterfly can emulate other networks. Emulations are studied formally via graph embeddings, so the topic here becomes: How efficiently can one embed the graph underlying a given interconnection network in the graph underlying the butterfly network? Within this framework, the slowdown incurred by an emulation is measured by the sum of the dilation and the congestion of the corresponding embedding (respectively, the maximum amount that the embedding stretches an edge of the guest graph, and the maximum traffic across any edge of the host graph); the efficiency of resource utilization in an emulation is measured by the expansion of the corresponding embedding (the ratio of the sizes of the host to guest graph). Three main results expose a number of optimal emulations by butterfly networks. Call a family of graphs balanced if complete binary trees can be embedded in the family with simultaneous dilation, congestion, and expansion O(1). (1) The family of butterfly graphs is balanced. (2) (a) Any graph G from a family of maxdegree-d graphs having a recursive separator of size S(x) can be embedded in any balanced graph family with simultaneous dilation O(log(d Σ_i S(2^-i |G|))) and expansion O(1). (b) Any dilation-D embedding of a maxdegree-d graph in a butterfly graph can be converted to an embedding having simultaneous dilation O(D) and congestion O(dD). (3) Any embedding of a planar graph G in a butterfly graph must have dilation Ω(log(Σ(G)/Φ(G))), where Σ(G) is the size of the smallest (1/3, 2/3)-node-separator of G, and Φ(G) is the size of G's largest interior face. Applications of these results include: (1) The n-node X-tree network can be emulated by the butterfly network with slowdown O(log log n) and expansion O(1); no embedding has dilation smaller than Ω(log log n), independent of expansion. (2) Every embedding of the n × n mesh in the butterfly graph has dilation Ω(log n); any expansion-O(1) embedding in the butterfly graph achieves dilation O(log n). These applications provide the first examples of networks that can be embedded more efficiently in hypercubes than in butterflies. We also show that analogues of these results hold for networks that are structurally related to the butterfly network. The upper bounds hold for the hypercube and the de Bruijn networks, possibly with altered constants. The lower bounds hold, at least in weakened form, for the de Bruijn network.

Journal ArticleDOI
TL;DR: This work presents optimal algorithms for sorting on parallel CREW and EREW versions of the pointer machine model based on a parallel mergesort using linked lists rather than arrays, and shows how to exploit the “locality” of the approach to solve the set expression evaluation problem.
Abstract: We present optimal algorithms for sorting on parallel CREW and EREW versions of the pointer machine model. Intuitively, one can view our methods as being based on a parallel mergesort using linked lists rather than arrays (the usual parallel data structure). We also show how to exploit the “locality” of our approach to solve the set expression evaluation problem, a problem with applications to database querying and logic-programming, in O(log n) time using O(n) processors. Interestingly, this is an asymptotic improvement over what seems possible using previous techniques.
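
The sequential skeleton of the method, mergesort carried out entirely on linked lists with no array indexing, is easy to state; the paper's contribution is scheduling these merges on CREW/EREW pointer machines, which the sketch below does not attempt. Node and helper names are invented.

```python
# Mergesort on singly linked lists: splitting with a slow/fast pointer and
# splicing nodes during the merge, so no arrays or random access are needed,
# which is what makes the approach natural on a pointer machine.
class Node:
    def __init__(self, value, nxt=None):
        self.value, self.next = value, nxt

def merge(a, b):
    dummy = tail = Node(None)
    while a and b:                            # splice the smaller head node
        if a.value <= b.value:
            tail.next, a = a, a.next
        else:
            tail.next, b = b, b.next
        tail = tail.next
    tail.next = a or b
    return dummy.next

def mergesort(head):
    if head is None or head.next is None:
        return head
    slow, fast = head, head.next              # locate the midpoint
    while fast and fast.next:
        slow, fast = slow.next, fast.next.next
    mid, slow.next = slow.next, None          # split into two halves
    return merge(mergesort(head), mergesort(mid))

def from_list(xs):
    head = None
    for x in reversed(xs):
        head = Node(x, head)
    return head

def to_list(head):
    out = []
    while head:
        out.append(head.value)
        head = head.next
    return out

print(to_list(mergesort(from_list([5, 2, 9, 1, 5, 6]))))   # [1, 2, 5, 5, 6, 9]
```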

Journal ArticleDOI
TL;DR: This work proposes these combinatorial conditions to be “balancing analogs” of the well-known Zero-One principle holding for sorting networks, and develops a combinatorial framework involving the transfer parameters, which precisely delimit the boundary between counting networks and sorting networks.
Abstract: Balancing networks, originally introduced by Aspnes et al. (Proceedings of the 23rd Annual ACM Symposium on Theory of Computing, pp. 348-358, May 1991), represent a new class of distributed, low-contention data structures suitable for solving many fundamental multi-processor coordination problems that can be expressed as balancing problems. In this work, we present a mathematical study of the combinatorial structure of balancing networks, and a variety of its applications. Our study identifies important combinatorial transfer parameters of balancing networks. In turn, necessary and sufficient combinatorial conditions are established, expressed in terms of transfer parameters, which precisely characterize many important and well-studied classes of balancing networks such as counting networks and smoothing networks. We propose these combinatorial conditions to be “balancing analogs” of the well-known Zero-One principle holding for sorting networks. Within the combinatorial framework we develop, our first application is in deriving combinatorial conditions, involving the transfer parameters, which precisely delimit the boundary between counting networks and sorting networks.

Journal ArticleDOI
TL;DR: A lower bound is proved that matches the upper bound above asymptotically as n ≥ m → ∞.
Abstract: Let M(m,n) be the minimum number of comparators needed in a comparator network that merges m elements x1 ≤ x2 ≤ … ≤ xm and n elements y1 ≤ y2 ≤ … ≤ yn, where n ≥ m. Batcher's odd-even merge yields the following upper bound: M(m,n) ≤ ½(m + n) log_2 m + O(n); in particular, M(n,n) ≤ n log_2 n + o(n). We prove the following lower bound that matches the upper bound above asymptotically as n ≥ m → ∞: M(m,n) ≥ ½(m + n) log_2 m - O(m); in particular, M(n,n) ≥ n log_2 n - O(n). Our proof technique extends to give similarly tight lower bounds for the size of monotone Boolean circuits for merging, and for the size of switching networks capable of realizing the set of permutations that arise from merging.
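
The upper bound comes from Batcher's odd-even merge, and its comparator count is easy to check by generating the network. The sketch below builds the network for two sorted runs of equal power-of-two length n, applies it, and counts its comparators, which come to n log_2 n + 1 for n = 4, consistent with the n log_2 n + o(n) bound quoted above. The recursive formulation is the standard one; only the matching lower bound is the paper's contribution.

```python
# Batcher's odd-even merge as a comparator network: generate the comparators,
# apply them to two sorted runs, and count them. For two runs of length n
# (n a power of two) the count works out to n*log2(n) + 1.
import math

def odd_even_merge(lo, n, r, comparators):
    """Emit the comparators merging the two sorted halves of positions
    lo..lo+n-1, working on the subsequence with stride r."""
    step = r * 2
    if step < n:
        odd_even_merge(lo, n, step, comparators)       # even subsequence
        odd_even_merge(lo + r, n, step, comparators)   # odd subsequence
        for i in range(lo + r, lo + n - r, step):
            comparators.append((i, i + r))
    else:
        comparators.append((lo, lo + r))

def merge_sorted(a, b):
    x = list(a) + list(b)          # precondition: a, b sorted, equal power-of-two length
    comps = []
    odd_even_merge(0, len(x), 1, comps)
    for i, j in comps:             # apply the network
        if x[i] > x[j]:
            x[i], x[j] = x[j], x[i]
    return x, len(comps)

merged, size = merge_sorted([1, 4, 6, 7], [2, 3, 5, 8])
n = 4
print(merged)                              # [1, 2, 3, 4, 5, 6, 7, 8]
print(size, int(n * math.log2(n)) + 1)     # 9 9
```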

Journal ArticleDOI
TL;DR: The optimization effect of the “magic sets” rewriting technique for datalog queries is analyzed and some supplementary or alternative techniques that avoid many shortcomings of the basic technique are presented.
Abstract: We analyze the optimization effect of the “magic sets” rewriting technique for datalog queries and present some supplementary or alternative techniques that avoid many shortcomings of the basic technique. Given a magic sets rewritten query, the set of facts generated for the original, nonmagic predicates by the seminaive bottom-up evaluation is characterized precisely. It is shown that, because of the additional magic facts, magic sets processing may result in generating an order of magnitude more facts than the straightforward naive evaluation. A refinement of magic sets into factorized magic sets is defined. These magic sets retain most of the efficiency of original magic sets in regards to the number of nonmagic facts generated and have the property that a linear-time bound with respect to seminaive evaluation is guaranteed in all cases. An alternative technique for magic sets, called envelopes, which has several desirable properties over magic sets, is introduced. Envelope predicates are never recursive with the original predicates; thus, envelopes can be computed as a preprocessing task. Envelopes also allow the utilization of multiple sideways information passing strategies (sips) for a rule. An envelope-transformed program may be “readorned” according to another choice of sips and reoptimized by magic sets (or envelopes), thus making possible an optimization effect that cannot be achieved by magic sets based on a particular choice of sips.

Journal ArticleDOI
TL;DR: A linear-time algorithm to decide for any fixed deterministic context-free language L and input string w whether w is a suffix of some string in L, which may be extended to produce syntactic structures (parses) without an increase in time complexity.
Abstract: We present a linear-time algorithm to decide for any fixed deterministic context-free language L and input string w whether w is a suffix of some string in L. In contrast to a previously published technique, the decision procedure may be extended to produce syntactic structures (parses) without an increase in time complexity. We also show how this algorithm may be applied to process incorrect input in linear time.