Showing papers in "Journal of the ACM in 1974"
TL;DR: An algorithm is presented which solves the string-to-string correction problem in time proportional to the product of the lengths of the two strings.
Abstract: The string-to-string correction problem is to determine the distance between two strings as measured by the minimum cost sequence of “edit operations” needed to change the one string into the other. The edit operations investigated allow changing one symbol of a string into another single symbol, deleting one symbol from a string, or inserting a single symbol into a string. An algorithm is presented which solves this problem in time proportional to the product of the lengths of the two strings. Possible applications are to the problems of automatic spelling correction and determining the longest subsequence of characters common to two strings.
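A minimal sketch of the dynamic-programming recurrence behind such an algorithm, assuming unit costs for the three edit operations (the paper allows arbitrary operation costs); the function name and defaults are illustrative:

```python
def edit_distance(a: str, b: str, ins=1, dele=1, sub=1):
    """O(len(a) * len(b)) string-to-string correction distance via dynamic programming."""
    m, n = len(a), len(b)
    # D[i][j] = minimum cost of changing a[:i] into b[:j]
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        D[i][0] = i * dele
    for j in range(1, n + 1):
        D[0][j] = j * ins
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            change = 0 if a[i - 1] == b[j - 1] else sub
            D[i][j] = min(D[i - 1][j] + dele,        # delete a symbol
                          D[i][j - 1] + ins,         # insert a symbol
                          D[i - 1][j - 1] + change)  # change one symbol into another
    return D[m][n]

print(edit_distance("kitten", "sitting"))  # 3 with unit costs
```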
3,252 citations
TL;DR: An efficient algorithm to determine whether an arbitrary graph G can be embedded in the plane is described, which uses depth-first search and has O(V) time and space bounds.
Abstract: This paper describes an efficient algorithm to determine whether an arbitrary graph G can be embedded in the plane. The algorithm may be viewed as an iterative version of a method originally proposed by Auslander and Parter and correctly formulated by Goldstein. The algorithm uses depth-first search and has O(V) time and space bounds, where V is the number of vertices in G. An ALGOL implementation of the algorithm successfully tested graphs with as many as 900 vertices in less than 12 seconds.
1,183 citations
TL;DR: It is shown that arithmetic expressions with n ≥ 1 variables and constants; operations of addition, multiplication, and division; and any depth of parenthesis nesting can be evaluated in time 4 log2 n + 10(n - 1)/p using p ≥ 1 processors which can independently perform arithmetic operations in unit time.
Abstract: It is shown that arithmetic expressions with n ≥ 1 variables and constants; operations of addition, multiplication, and division; and any depth of parenthesis nesting can be evaluated in time 4 log2 n + 10(n - 1)/p using p ≥ 1 processors which can independently perform arithmetic operations in unit time. This bound is within a constant factor of the best possible. A sharper result is given for expressions without the division operation, and the question of numerical stability is discussed.
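As a rough illustration of the stated bound (the values of n and p below are chosen only for the arithmetic; they are not taken from the paper):

```latex
% Hypothetical instance: n = 64 variables/constants, p = 4 processors,
% unit-time arithmetic operations assumed.
T \;\le\; 4\log_2 n + \frac{10(n-1)}{p}
   \;=\; 4\cdot 6 + \frac{10\cdot 63}{4}
   \;=\; 24 + 157.5 \;=\; 181.5\ \text{unit-time steps.}
```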
864 citations
TL;DR: It is proved that the optimal order of one general class of multipoint iterations is 2^(n-1) and that an upper bound on the order of a multipoint iteration based on n evaluations of ƒ (no derivatives) is 2^n.
Abstract: The problem is to calculate a simple zero of a nonlinear function ƒ by iteration. There is exhibited a family of iterations of order 2^(n-1) which use n evaluations of ƒ and no derivative evaluations, as well as a second family of iterations of order 2^(n-1) based on n - 1 evaluations of ƒ and one of ƒ′. In particular, with four evaluations an iteration of eighth order is constructed. The best previous result for four evaluations was fifth order. It is proved that the optimal order of one general class of multipoint iterations is 2^(n-1) and that an upper bound on the order of a multipoint iteration based on n evaluations of ƒ (no derivatives) is 2^n. It is conjectured that a multipoint iteration without memory based on n evaluations has optimal order 2^(n-1).
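The paper's eighth-order, four-evaluation scheme is not reproduced here; as a minimal sketch, the n = 2 member of such a derivative-free family is Steffensen-type iteration, which attains order 2^(2-1) = 2 with two evaluations of ƒ per step (the function, tolerance, and starting point below are illustrative):

```python
def steffensen(f, x0, tol=1e-12, max_iter=50):
    """Derivative-free iteration of order 2 using two evaluations of f per step
    (the n = 2 case of a 2^(n-1)-order multipoint family; a sketch, not the
    paper's eighth-order construction)."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        g = (f(x + fx) - fx) / fx   # divided difference approximating f'(x)
        x = x - fx / g              # Newton-like step without derivatives
    return x

# Example: a simple zero of f(x) = x^2 - 2 near x0 = 1.5
print(steffensen(lambda x: x * x - 2.0, 1.5))  # ~1.41421356
```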
664 citations
TL;DR: A technique is developed which improves all of the dynamic programming methods by a square root factor and can be incorporated into the more general 0-1 knapsack problem obtaining a square root improvement in the asymptotic behavior.
Abstract: Given r numbers s1, …, sr, algorithms are investigated for finding all possible combinations of these numbers which sum to M. This problem is a particular instance of the 0-1 unidimensional knapsack problem. All of the usual algorithms for this problem are investigated in terms of both asymptotic computing times and storage requirements, as well as average computing times. We develop a technique which improves all of the dynamic programming methods by a square root factor. Empirical studies indicate this new algorithm to be generally superior to all previously known algorithms. We then show how this improvement can be incorporated into the more general 0-1 knapsack problem obtaining a square root improvement in the asymptotic behavior. A new branch and search algorithm that is significantly faster than the Greenberg and Hegerich algorithm is also presented. The results of extensive empirical studies comparing these knapsack algorithms are given.
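A square-root improvement of this general kind can be obtained by splitting the numbers into two halves and combining their sorted lists of partial sums; the sketch below is a generic two-list (meet-in-the-middle) subset-sum test, assuming nonnegative integers, and is not the paper's specific algorithm:

```python
from bisect import bisect_left

def subset_sums(nums):
    """All 2^len(nums) subset sums of a list of numbers."""
    sums = [0]
    for x in nums:
        sums += [s + x for s in sums]
    return sums

def has_combination_summing_to(s, M):
    """Two-list test: does some subset of s sum to M?  Roughly O(2^(r/2)) work
    instead of the O(2^r) of naive enumeration."""
    half = len(s) // 2
    left = subset_sums(s[:half])
    right = sorted(subset_sums(s[half:]))
    for a in left:
        i = bisect_left(right, M - a)          # look for the complementary sum
        if i < len(right) and right[i] == M - a:
            return True
    return False

print(has_combination_summing_to([3, 34, 4, 12, 5, 2], 9))   # True (4 + 5)
print(has_combination_summing_to([3, 34, 4, 12, 5, 2], 30))  # False
```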
570 citations
TL;DR: An approach is presented for taking advantage of the structure of some special theories with simplifiers, commutativity, and associativity, which are valuable concepts to build in by means of a “natural” notation and/or new inference rules.
Abstract: To prove really difficult theorems, resolution principle programs need to make better inferences and to make them faster. An approach is presented for taking advantage of the structure of some special theories. These are theories with simplifiers, commutativity, and associativity, which are valuable concepts to build in, since they so frequently occur in important theories, for example, number theory (plus and times) and set theory (union and intersection). The object of the approach is to build in such concepts in a (refutation) complete, valid, efficient (in time) manner by means of a “natural” notation and/or new inference rules. Some of the many simplifiers that can be built in are axioms for (left and right) identities, inverses, and multiplication by zero. As for results, commutativity is built in by a straightforward modification to the unification (matching) algorithm. The results for simplifiers and associativity are more complicated. These theoretical results can be combined with one another and/or extended to either C-linear refutation completeness or theories with partial ordering, total ordering, or sets. How these results can serve as the basis of practical computer programs is discussed.
276 citations
TL;DR: This paper deals with a combinatorial minimization problem arising from studies on multimodule memory organizations; instead of searching for an optimum, a particular solution is proposed and shown to be close to optimum.
Abstract: This paper deals with a combinatorial minimization problem arising from studies on multimodule memory organizations. Instead of searching for an optimum solution, a particular solution is proposed and it is demonstrated that it is close to optimum. Lower bounds for the objective functions are obtained and compared with the corresponding values of the particular solution. The maximum percentage deviation of this solution from optimum is also established.
263 citations
TL;DR: A vector-valued normal process and its diffusion equation are introduced in order to obtain an approximate solution to the joint distribution of queue lengths in a general network of queues.
Abstract: The practical value of queueing theory in engineering applications such as in computer modeling has been limited, since the interest in mathematical tractability has almost always led to an oversimplified model. The diffusion process approximation is an attempt to break away from the vogue in queueing theory. The present paper introduces a vector-valued normal process and its diffusion equation in order to obtain an approximate solution to the joint distribution of queue lengths in a general network of queues. In this model, queueing processes of various service stations which interact with each other are approximated by a vector-valued Wiener process with some appropriate boundary conditions. Some numerical examples are presented and compared with Monte Carlo simulation results. A companion paper, Part II, discusses transient solutions via the diffusion approximation.
252 citations
TL;DR: Firm lower bounds are given to minimax measures of bits stored and bits accessed for each of four retrieval questions, and representations and algorithms for a bit-addressable machine which come within factors of two or three of attaining all four bounds at once for files of any size.
Abstract: We consider a set of static files or inventories, each consisting of the same number of entries, each entry a binary word of the same fixed length selected (with replacement) from the set of all binary sequences of that length, and the entries in each file sorted into lexical order. We also consider several retrieval questions of interest for each such file. One is to find the value of the jth entry, another to find the number of entries of value less than k. When a binary representation of such a file is stored in computer memory and an algorithm or machine which knows only the file parameters (i.e. number of entries, number of possible values per entry) accesses some of the stored bits to answer a retrieval question, the number of bits stored and the number of bits accessed per retrieval question are two cost measures for the storage and retrieval task which have been used by Minsky and Papert. Bits stored depends on the representation chosen: bits accessed also depends on the retrieval question asked and on the algorithm used. We give firm lower bounds to minimax measures of bits stored and bits accessed for each of four retrieval questions, and construct representations and algorithms for a bit-addressable machine which come within factors of two or three of attaining all four bounds at once for files of any size. All four factors approach one for large enough files.
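The flavor of a bits-stored lower bound can be seen from a simple counting argument (a sketch only, not the paper's minimax bounds): with n entries, each an L-bit word chosen with replacement and the file kept in sorted order, distinct files must receive distinct representations.

```latex
% Counting argument (illustrative): the number of possible files is the number of
% multisets of size n drawn from 2^L values, so any exact representation needs
\text{bits stored} \;\ge\; \left\lceil \log_2 \binom{2^{L} + n - 1}{n} \right\rceil .
```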
249 citations
TL;DR: An attempt is made to apply information-theoretic computational complexity to meta-mathematics by measuring the difficulty of proving a given set of theorems, in terms of the number of bits of axioms that are assumed and the size of the proofs needed to deduce the theorems from the axioms.
Abstract: An attempt is made to apply information-theoretic computational complexity to meta-mathematics. The paper studies the number of bits of instructions that must be given to a computer for it to perform finite and infinite tasks, and also the time it takes the computer to perform these tasks. This is applied to measuring the difficulty of proving a given set of theorems, in terms of the number of bits of axioms that are assumed, and the size of the proofs needed to deduce the theorems from the axioms.
208 citations
TL;DR: Given a graph H, a graph G is sought such that H is the line graph of G, if G exists, and the algorithm finds G within the order of E steps, in fact in E + O(N) steps.
Abstract: Given a graph H with E edges and N nodes, a graph G is sought such that H is the line graph of G, if G exists. The algorithm does this within the order of E steps, in fact in E + O(N) steps. This algorithm is optimal in its complexity.
TL;DR: The backward edges of a reducible flow graph are unique, and it is shown that there is a “natural” single-entry loop associated with each backward edge of a reducible flow graph.
Abstract: It is established that if G is a reducible flow graph, then edge (n, m) is backward (a back latch) if and only if either n = m or m dominates n in G. Thus, the backward edges of a reducible flow graph are unique. Further characterizations of reducibility are presented. In particular, the following are equivalent: (a) G = (N, E, n0) is reducible. (b) The “dag” of G is unique. (A dag of a flow graph G is a maximal acyclic flow graph which is a subgraph of G.) (c) E can be partitioned into two sets E1 and E2 such that E1 forms a dag D of G and each (n, m) in E2 has n = m or m dominates n in G. (d) Same as (c), except each (n, m) in E2 has n = m or m dominates n in D. (e) Same as (c), except E2 is the back edge set of a depth-first spanning tree for G. (f) Every cycle of G has a node which dominates the other nodes of the cycle. Finally, it is shown that there is a “natural” single-entry loop associated with each backward edge of a reducible flow graph.
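The dominance-based characterization lends itself to a direct check. The sketch below computes dominators with a textbook iterative fixpoint (not the paper's machinery) and then classifies an edge (n, m) as backward when n = m or m dominates n; the graph encoding and names are assumptions of the sketch:

```python
def dominators(succ, entry):
    """Dominator sets for a flow graph given as {node: [successors]},
    via the textbook iterative fixpoint over predecessor sets."""
    nodes = set(succ)
    for vs in succ.values():
        nodes.update(vs)
    pred = {v: set() for v in nodes}
    for u, vs in succ.items():
        for v in vs:
            pred[v].add(u)
    dom = {v: set(nodes) for v in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for v in nodes - {entry}:
            new = {v} | (set.intersection(*(dom[p] for p in pred[v])) if pred[v] else set())
            if new != dom[v]:
                dom[v], changed = new, True
    return dom

def backward_edges(succ, entry):
    """Edges (n, m) with n = m or m dominating n: the backward edges of the abstract."""
    dom = dominators(succ, entry)
    return [(n, m) for n, vs in succ.items() for m in vs if n == m or m in dom[n]]

# A small reducible flow graph: node 0 enters a loop 1 -> 2 -> 1 that exits to 3.
g = {0: [1], 1: [2], 2: [1, 3], 3: []}
print(backward_edges(g, 0))  # [(2, 1)]
```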
TL;DR: A communication system consisting of a number of buffered input terminals connected to a computer by a single channel is analyzed and the stationary distributions of the length of the waiting line and the queueing delay are calculated.
Abstract: A communication system consisting of a number of buffered input terminals connected to a computer by a single channel is analyzed. The terminals are polled in sequence and the data is removed from the terminal's buffer. When the buffer has been emptied, the channel, for an interval of randomly determined length, is used for system overhead and/or to transmit data to the terminals. The system then continues with a poll of the next terminal. The stationary distributions of the length of the waiting line and the queueing delay are calculated for the case of identically distributed input processes.
TL;DR: It is shown that the multisalesmen problem can be solved by solving the standard traveling salesman problem on an expanded graph.
Abstract: It is shown that the multisalesmen problem can be solved by solving the standard traveling salesman problem on an expanded graph. The expanded graph has m - 1 more nodes than the original graph where m is the number of salesmen available at the base.
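One common way to realize such an expansion is sketched below, under the assumption that every salesman must visit at least one city, so travel directly between base copies is forbidden; the paper itself may define the expanded costs differently. Each of the m - 1 added copies of the base inherits the base's distances:

```python
import math

def expand_for_m_salesmen(cost, base, m, forbid=math.inf):
    """Build the expanded cost matrix: add m - 1 copies of the base node, each
    inheriting the base's distances; travel between base copies gets a prohibitive
    cost so that every salesman's subtour visits at least one city (an assumption
    of this sketch)."""
    n = len(cost)
    size = n + (m - 1)
    def orig(i):                      # map an expanded index back to an original node
        return base if i >= n else i
    new = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            if i == j:
                new[i][j] = 0.0
            elif orig(i) == base and orig(j) == base:
                new[i][j] = forbid    # no direct base-to-base legs
            else:
                new[i][j] = cost[orig(i)][orig(j)]
    return new

# 4 cities with base 0 and m = 2 salesmen become a 5-node standard TSP instance.
cost = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 8],
        [10, 4, 8, 0]]
print(len(expand_for_m_salesmen(cost, base=0, m=2)))  # 5
```

A single traveling salesman tour of the expanded graph visits each copy of the base exactly once, and the legs between consecutive base visits form the m salesmen's subtours.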
TL;DR: It is proved that, if S is an unsatisfiable Horn set, there exists a strictly-unit refutation of S employing binary resolution alone, eliminating the need for factoring; a similar theorem for paramodulation-based inference systems is proved in Theorem 3, but with factoring included as an inference rule.
Abstract: The key concepts for this automated theorem-proving paper are those of Horn set and strictly-unit refutation. A Horn set is a set of clauses such that none of its members contains more than one positive literal. A strictly-unit refutation is a proof by contradiction in which no step is justified by applying a rule of inference to a set of clauses all of which contain more than one literal. Horn sets occur in many fields of mathematics such as the theory of groups, rings, Moufang loops, and Henkin models. The usual translation into first-order predicate calculus of the axioms of these and many other fields yields a set of Horn clauses. The striking feature of the Horn property for finite sets of clauses is that its presence or absence can be determined by inspection. Thus, the determination of the applicability of the theorems and procedures of this paper is immediate. In Theorem 1 it is proved that, if S is an unsatisfiable Horn set, there exists a strictly-unit refutation of S employing binary resolution alone, thus eliminating the need for factoring; moreover, one of the immediate ancestors of each step of the refutation is in fact a positive unit clause. A theorem similar to Theorem 1 for paramodulation-based inference systems is proven in Theorem 3 but with the inclusion of factoring as an inference rule. In Section 3 two reduction procedures are discussed. For the first, Chang's splitting, a rule is provided to guide both the choice of clauses and the way in which to split. The second reduction procedure enables one to refute a Horn set by refuting but one of a corresponding family of simpler subproblems.
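Since the Horn property can be determined by inspection, the check itself is a one-liner; the sketch below uses an assumed clause encoding (sets of signed integers) purely for illustration:

```python
def is_horn(clauses):
    """A clause set is Horn iff no clause contains more than one positive literal.
    Clauses are encoded here as sets of nonzero integers: +p for a positive literal
    of atom p, -p for a negative one (an illustrative encoding)."""
    return all(sum(1 for lit in clause if lit > 0) <= 1 for clause in clauses)

# e.g. transitivity  ~P(x,y) | ~P(y,z) | P(x,z)  has one positive literal, so it is Horn.
print(is_horn([{-1, -2, 3}, {1}, {-3}]))  # True
print(is_horn([{1, 2}]))                  # False: two positive literals in one clause
```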
TL;DR: The authors consider the extension of the results contained herein to free-form curve and surface design using polynomial splines, which have several advantages over the techniques described in the present paper.
Abstract: The mth degree Bernstein polynomial approximation to a function ƒ defined over [0, 1] is ∑_{μ=0}^{m} ƒ(μ/m) φ_μ(s), where the weights φ_μ(s) are binomial density functions. The Bernstein approximations inherit many of the global characteristics of ƒ, like monotonicity and convexity, and they always are at least as “smooth” as ƒ, where “smooth” refers to the number of undulations, the total variation, and the differentiability class of ƒ. Historically, their relatively slow convergence in the L∞-norm has tended to discourage their use in practical applications. However, in a large class of problems the smoothness of an approximating function is of greater importance than closeness of fit. This is especially true in connection with problems of computer-aided geometric design of curves and surfaces where aesthetic criteria and the intrinsic properties of shape are major considerations. For this latter class of problems, P. Bezier of Renault has successfully exploited the properties of parametric Bernstein polynomials. The purpose of this paper is to analyze the Bezier techniques and to explore various extensions and generalizations. In a sequel, the authors consider the extension of the results contained herein to free-form curve and surface design using polynomial splines. These B-spline methods have several advantages over the techniques described in the present paper.
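A minimal sketch of the Bernstein approximation itself, with φ_μ(s) written out as the binomial density C(m, μ) s^μ (1 - s)^(m-μ); the test function and degrees below are illustrative:

```python
from math import comb

def bernstein_approx(f, m):
    """Degree-m Bernstein approximation of f on [0, 1]:
    (B_m f)(s) = sum over mu of f(mu/m) * C(m, mu) * s**mu * (1 - s)**(m - mu)."""
    def B(s):
        return sum(f(mu / m) * comb(m, mu) * s**mu * (1 - s)**(m - mu)
                   for mu in range(m + 1))
    return B

# The approximation preserves the shape of f but converges slowly in the sup norm:
f = lambda s: abs(s - 0.5)
for m in (4, 16, 64):
    B = bernstein_approx(f, m)
    print(m, round(abs(B(0.5) - f(0.5)), 4))  # error at the kink shrinks roughly like 1/sqrt(m)
```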
TL;DR: An improved algorithm requiring only 3m² “operations,” rather than the 4m² needed by currently available fast algorithms, to solve a set of linear equations with a non-Hermitian Toeplitz associated matrix.
Abstract: The solution of a set of m linear equations with a non-Hermitian Toeplitz associated matrix is considered. Presently available fast algorithms solve this set with 4m² “operations” (an “operation” is defined here as a set of one addition and one multiplication). An improved algorithm requiring only 3m² “operations” is presented.
TL;DR: A technique is introduced for analyzing simulations of stochastic systems in the steady state, exploiting the existence of a random grouping of observations which produces independent identically distributed blocks from the start of the simulation.
Abstract: A technique is introduced for analyzing simulations of stochastic systems in the steady state. From the viewpoint of classical statistics, questions of simulation run duration and of starting and stopping simulations are addressed. This is possible because of the existence of a random grouping of observations which produces independent identically distributed blocks from the start of the simulation. The analysis is presented in the context of the general multiserver queue, with arbitrarily distributed interarrival and service times. In this case, it is the busy period structure of the system which produces the grouping mentioned above. Numerical illustrations are given for the M/M/1 queue. Statistical methods are employed so as to obtain confidence intervals for a variety of parameters of interest, such as the expected value of the stationary customer waiting time, the expected value of a function of the stationary waiting time, the expected number of customers served and length of a busy cycle, the tail of the stationary waiting time distribution, and the standard deviation of the stationary waiting time. Consideration is also given to determining system sensitivity to errors and uncertainty in the input parameters.
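A minimal sketch of the busy-cycle (regenerative) idea, specialized to the M/M/1 queue via the Lindley recurrence; the cycle rule, parameter values, and the normal-approximation interval are assumptions of the sketch rather than the paper's exact procedure:

```python
import math, random

def mm1_regenerative_ci(lam, mu, n_customers=200_000, z=1.96, seed=1):
    """Regenerative (busy-cycle) estimate of the mean stationary waiting time in an
    M/M/1 queue, with a normal-approximation confidence interval.  A new cycle
    starts whenever a customer arrives to find the queue empty (waiting time 0)."""
    rng = random.Random(seed)
    w = 0.0                      # waiting time of the current customer
    cycles = []                  # (sum of waits, number of customers) per completed cycle
    y = n = 0.0
    for _ in range(n_customers):
        if w == 0.0 and n > 0:   # regeneration point: close the current cycle
            cycles.append((y, n))
            y = n = 0.0
        y += w
        n += 1
        # Lindley recurrence: next wait = max(0, wait + service - interarrival)
        w = max(0.0, w + rng.expovariate(mu) - rng.expovariate(lam))
    k = len(cycles)
    ybar = sum(yi for yi, _ in cycles) / k
    nbar = sum(ni for _, ni in cycles) / k
    r = ybar / nbar              # ratio estimator of the mean stationary wait
    s2 = sum((yi - r * ni) ** 2 for yi, ni in cycles) / (k - 1)
    half = z * math.sqrt(s2 / k) / nbar
    return r, (r - half, r + half)

print(mm1_regenerative_ci(lam=0.8, mu=1.0))
```

For λ = 0.8 and μ = 1 the theoretical mean stationary waiting time is λ/(μ(μ - λ)) = 4, so the reported interval should bracket 4.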
TL;DR: Branch-and-bound algorithms for permutation problems are characterized in terms of a sextuple (Bp, S, E, D, L, U); a general algorithm based on this characterization is presented and the dependence of the computational requirements on the choice of the algorithm parameters is investigated theoretically.
Abstract: Branch-and-bound implicit enumeration algorithms for permutation problems (discrete optimization problems where the set of feasible solutions is the permutation group Sn) are characterized in terms of a sextuple (Bp, S, E, D, L, U), where (1) Bp is the branching rule for permutation problems, (2) S is the next node selection rule, (3) E is the set of node elimination rules, (4) D is the node dominance function, (5) L is the node lower-bound cost function, and (6) U is an upper-bound solution cost. A general algorithm based on this characterization is presented and the dependence of the computational requirements on the choice of algorithm parameters S, E, D, L, and U is investigated theoretically. The results verify some intuitive notions but disprove others.
TL;DR: A transient solution is obtained for a cyclic queueing model using the technique of eigenfunction expansion, and the earlier results of Part I are applied to modeling and performance problems of a typical multiprogrammed computer system.
Abstract: Quite often explicit information about the behavior of a queue over a fairly short period is wanted. This requires the nonequilibrium (transient) solution for the queue-length distribution, which is usually quite difficult to obtain mathematically. The first half of Part II shows how the diffusion process approximation can be used to answer this question. A transient solution is obtained for a cyclic queueing model using the technique of eigenfunction expansion. The second half of Part II applies the earlier results of Part I to modeling and performance problems of a typical multiprogrammed computer system. Such performance measures as utilization, throughput, response time and its distribution, etc., are discussed in some detail.
TL;DR: A technique for simulating GI/G/s queues is shown to apply to simulations of discrete and continuous-time Markov chains and to address questions of simulation run duration and of starting and stopping simulations because of the existence of a random grouping of observations.
Abstract: A technique for simulating GI/G/s queues is shown to apply to simulations of discrete and continuous-time Markov chains. It is possible to address questions of simulation run duration and of starting and stopping simulations because of the existence of a random grouping of observations which produces independent identically distributed blocks from the start of the simulation. This grouping allows confidence intervals to be obtained for a general function of the steady-state distribution of the Markov chain. The technique is illustrated with simulation of an (s, S) inventory model in discrete time and the classical repairman problem in continuous time. Consideration is also given to determining system sensitivity to errors and uncertainty in the input parameters.
TL;DR: A replacement system is Church-Rosser if, starting with any object, a unique irreducible object is reached; in the generalized system (S, ⇒, ≡), it is Church-Rosser if, starting with objects equivalent under ≡, equivalent irreducible objects are reached.
Abstract: The central notion in a replacement system is one of a transformation on a set of objects. Starting with a given object, in one “move” it is possible to reach one of a set of objects. An object from which no move is possible is called irreducible. A replacement system is Church-Rosser if starting with any object a unique irreducible object is reached. A generalization of the above notion is a replacement system (S, ⇒, ≡), where S is a set of objects, ⇒ is a transformation, and ≡ is an equivalence relation on S. A replacement system is Church-Rosser if starting with objects equivalent under ≡, equivalent irreducible objects are reached. Necessary and sufficient conditions are determined that simplify the task of testing if a replacement system is Church-Rosser. Attention will be paid to showing that a replacement system (S, ⇒, ≡) is Church-Rosser using information about parts of the system, i.e. considering cases where ⇒ is ⇒1 ∪ ⇒2, or ≡ is (≡1 ∪ ≡2)*.
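A toy illustration of the property being tested is sketched below; the rules, the string encoding of objects, and the exhaustive search are all assumptions of the sketch, whereas the paper's contribution is conditions that avoid such brute force:

```python
from functools import lru_cache

RULES = [("ab", "c"), ("ba", "c"), ("cc", "")]   # a toy, length-reducing replacement system

def moves(s):
    """All objects reachable from s in one replacement move."""
    out = set()
    for lhs, rhs in RULES:
        i = s.find(lhs)
        while i != -1:
            out.add(s[:i] + rhs + s[i + len(lhs):])
            i = s.find(lhs, i + 1)
    return out

@lru_cache(maxsize=None)
def normal_forms(s):
    """The set of irreducible objects reachable from s (finite: every rule shortens s)."""
    nxt = moves(s)
    if not nxt:
        return frozenset({s})
    return frozenset().union(*(normal_forms(t) for t in nxt))

def church_rosser_on(objects):
    """Brute-force test on given objects: each must reach exactly one irreducible object."""
    return all(len(normal_forms(s)) == 1 for s in objects)

print(normal_forms("abba"))                          # frozenset({''})
print(church_rosser_on(["abba", "aabb", "abab"]))    # False: 'abab' reaches both '' and 'acb'
```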
TL;DR: It is shown that simple algorithms exist which yield fault probabilities close to optimal with only a modest increase in memory, and performance bounds are obtained which are independent of the page request probabilities.
Abstract: The topic of this paper is a probabilistic analysis of demand paging algorithms for storage hierarchies. Two aspects of algorithm performance are studied under the assumption that the sequence of page requests is statistically independent: the page fault probability for a fixed memory size and the variation of performance with memory. Performance bounds are obtained which are independent of the page request probabilities. It is shown that simple algorithms exist which yield fault probabilities close to optimal with only a modest increase in memory.
TL;DR: A search procedure is given which will determine whether Hamilton paths or circuits exist in a given graph, and will find one or all of them.
Abstract: A search procedure is given which will determine whether Hamilton paths or circuits exist in a given graph, and will find one or all of them. A combined procedure is given for both directed and undirected graphs. The search consists of creating partial paths and making deductions which determine whether each partial path is a section of any Hamilton path whatever, and which direct the extension of the partial paths.
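A bare-bones version of the partial-path search, without the deduction rules that do the real pruning in the paper; the graph encoding and the example graph are assumptions of this sketch:

```python
def hamilton_paths(adj, path=None):
    """Yield every Hamilton path of an undirected graph {node: set(neighbours)} by
    extending partial paths and backtracking (no deduction-based pruning)."""
    if path is None:
        for start in adj:                      # try every starting node
            yield from hamilton_paths(adj, [start])
        return
    if len(path) == len(adj):
        yield list(path)
        return
    for nxt in adj[path[-1]]:
        if nxt not in path:                    # extend the partial path
            path.append(nxt)
            yield from hamilton_paths(adj, path)
            path.pop()

# A 4-cycle 0-1-2-3 with chord 0-2.
g = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2}}
print(next(hamilton_paths(g)))   # e.g. [0, 1, 2, 3]
```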
TL;DR: The amount of store necessary to operate a dynamic storage allocation system, subject to certain constraints, with no risk of breakdown due to storage fragmentation, is considered and upper and lower bounds are given.
Abstract: The amount of store necessary to operate a dynamic storage allocation system, subject to certain constraints, with no risk of breakdown due to storage fragmentation, is considered. Upper and lower bounds are given for this amount of store, both of them stronger than those established earlier. The lower bound is the exact solution of a related problem concerning allocation of blocks whose size is always a power of 2.
TL;DR: A deductive system is described which combines aspects of resolution with that of natural deduction and whose performance compares favorably with the best predicate calculus theorem provers.
Abstract: A deductive system is described which combines aspects of resolution (e.g. unification and the use of Skolem functions) with that of natural deduction and whose performance compares favorably with the best predicate calculus theorem provers.
TL;DR: Rigorous proofs that the families of deterministic, LR(k), and bounded right context languages are coextensive are presented for the first time.
Abstract: A parsing method for strict deterministic grammars is presented and a technique for using it to parse any deterministic language is indicated. An important characterization of the trees of strict deterministic grammars is established. This is used to prove iteration theorems for (strict) deterministic languages, and hence proving that certain sets are not in these families becomes comparatively straightforward. It is shown that every strict deterministic grammar is LR(0) and that any strict deterministic grammar is equivalent to a bounded right context (1, 0) grammar. Thus rigorous proofs that the families of deterministic, LR(k), and bounded right context languages are coextensive are presented for the first time.
TL;DR: The major results of the paper are the definition of a new scheduling rule based on the known service time distributions, and the proof that expected total loss is always minimized by using this new rule.
Abstract: An analytic model of a single processor scheduling problem is investigated. The scheduling objective is to minimize the total loss incurred by a finite number of initially available requests when each request has an associated linear loss function. The assumptions of the model are that preemption is allowed with negligible loss of processor time, and that the distribution of actual service times is known for each class of requests. A request is associated with a class by any of its characteristics except its actual service time. A contrived example demonstrates that one reasonable scheduling rule does not always minimize expected total loss. The major results of the paper are the definition of a new scheduling rule based on the known service time distributions, and the proof that expected total loss is always minimized by using this new rule. Brief consideration is given to generalizations of the model in which new requests arrive randomly, and preemption requires a non-negligible amount of processor time.
TL;DR: Christofides' algorithm for finding the chromatic number of a graph is improved both in speed and memory space by using a depth-first search rule to search for a shortest path in a reduced subgraph tree.
Abstract: Christofides' algorithm for finding the chromatic number of a graph is improved both in speed and memory space by using a depth-first search rule to search for a shortest path in a reduced subgraph tree.
TL;DR: This paper examines the problem of allocating storage for extendible arrays in the light of the author's earlier work on data graphs and addressing schemes, and a formal analog of the assertion that simplicity of array extension precludes simplicity of traversal is proved.
Abstract: Arrays are among the best understood and most widely used data structures. Yet even now, there are no satisfactory techniques for handling algorithms involving extendible arrays (where, e.g., rows and/or columns can be appended dynamically). In this paper, the problem of allocating storage for extendible arrays is examined in the light of the author's earlier work on data graphs and addressing schemes. A formal analog of the assertion that simplicity of array extension precludes simplicity of traversal (marching along rows/columns) is proved. Two strategies for constructing extendible realizations of arrays are formulated, and certain inherent limitations of such realizations are established.
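One classical computed-access realization of an extendible two-dimensional array stores cells in diagonal shells, so that appending a row or a column never relocates existing elements; this is only an illustrative scheme, not necessarily either of the two strategies formulated in the paper:

```python
def shell_address(i, j):
    """Fixed address of cell (i, j) in an extendible 2-D array (0-indexed).  Cells are
    laid out shell by shell (shell k holds the cells with max(i, j) = k), so appending
    a row or a column only ever uses fresh addresses and never moves existing cells."""
    k = max(i, j)
    return k * k + j if i >= j else k * k + k + (k - i)

# The mapping is a bijection from pairs of naturals to naturals:
for i in range(3):
    print([shell_address(i, j) for j in range(3)])
# [0, 3, 8]
# [1, 2, 7]
# [4, 5, 6]
```

Note how a single row's addresses (0, 3, 8, ...) are scattered: extension is cheap, but marching along a row is no longer a unit-stride traversal, which is the kind of trade-off the paper makes precise.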