# Showing papers in "Communications of The ACM in 1975"

••

TL;DR: The multidimensional binary search tree (or k-d tree) as a data structure for storage of information to be retrieved by associative searches is developed and it is shown to be quite efficient in its storage requirements.

Abstract: This paper develops the multidimensional binary search tree (or k-d tree, where k is the dimensionality of the search space) as a data structure for storage of information to be retrieved by associative searches. The k-d tree is defined and examples are given. It is shown to be quite efficient in its storage requirements. A significant advantage of this structure is that a single data structure can handle many types of queries very efficiently. Various utility algorithms are developed; their proven average running times in an n record file are: insertion, O(log n); deletion of the root, O(n(k-1)/k); deletion of a random node, O(log n); and optimization (guarantees logarithmic performance of searches), O(n log n). Search algorithms are given for partial match queries with t keys specified [proven maximum running time of O(n(k-t)/k)] and for nearest neighbor queries [empirically observed average running time of O(log n).] These performances far surpass the best currently known algorithms for these tasks. An algorithm is presented to handle any general intersection query. The main focus of this paper is theoretical. It is felt, however, that k-d trees could be quite useful in many applications, and examples of potential uses are given.

7,159 citations

••

TL;DR: An approach based on space density computations is used to choose an optimum indexing vocabulary for a collection of documents, demonstating the usefulness of the model.

Abstract: In a document retrieval, or other pattern matching environment where stored entities (documents) are compared with each other or with incoming patterns (search requests), it appears that the best indexing (property) space is one where each entity lies as far away from the others as possible; in these circumstances the value of an indexing system may be expressible as a function of the density of the object space; in particular, retrieval performance may correlate inversely with space density. An approach based on space density computations is used to choose an optimum indexing vocabulary for a collection of documents. Typical evaluation results are shown, demonstating the usefulness of the model.

6,619 citations

••

TL;DR: Human visual perception and the fundamental laws of optics are considered in the development of a shading rule that provides better quality and increased realism in generated images.

Abstract: The quality of computer generated images of three-dimensional scenes depends on the shading technique used to paint the objects on the cathode-ray tube screen. The shading algorithm itself depends in part on the method for modeling the object, which also determines the hidden surface algorithm. The various methods of object modeling, shading, and hidden surface removal are thus strongly interconnected. Several shading techniques corresponding to different methods of object modeling and the related hidden surface algorithms are presented here. Human visual perception and the fundamental laws of optics are considered in the development of a shading rule that provides better quality and increased realism in generated images.

3,393 citations

••

Bell Labs

^{1}TL;DR: A simple, efficient algorithm to locate all occurrences of any of a finite number of keywords in a string of text that has been used to improve the speed of a library bibliographic search program by a factor of 5 to 10.

Abstract: This paper describes a simple, efficient algorithm to locate all occurrences of any of a finite number of keywords in a string of text. The algorithm consists of constructing a finite state pattern matching machine from the keywords and then using the pattern matching machine to process the text string in a single pass. Construction of the pattern matching machine takes time proportional to the sum of the lengths of the keywords. The number of state transitions made by the pattern matching machine in processing the text string is independent of the number of keywords. The algorithm has been used to improve the speed of a library bibliographic search program by a factor of 5 to 10.

3,270 citations

••

TL;DR: So-called “guarded commands” are introduced as a building block for alternative and repetitive constructs that allow nondeterministic program components for which at least the activity evoked, but possibly even the final state, is not necessarily uniquely determined by the initial state.

Abstract: So-called “guarded commands” are introduced as a building block for alternative and repetitive constructs that allow nondeterministic program components for which at least the activity evoked, but possibly even the final state, is not necessarily uniquely determined by the initial state. For the formal derivation of programs expressed in terms of these constructs, a calculus will be be shown.

2,048 citations

••

TL;DR: The problem of finding a longest common subsequence of two strings has been solved in quadratic time and space and an algorithm is presented which will solve this problem in QuadraticTime and in linear space.

Abstract: The problem of finding a longest common subsequence of two strings has been solved in quadratic time and space. An algorithm is presented which will solve this problem in quadratic time and in linear space.

1,164 citations

••

TL;DR: This procedure is an extension and improvement of the circle-finding concept sketched by Duda and Hart as an extension of the Hough straight-line finder.

Abstract: We describe an efficient procedure for detecting approximate circles and approximately circular arcs of varying gray levels in an edge-enhanced digitized picture. This procedure is an extension and improvement of the circle-finding concept sketched by Duda and Hart [2] as an extension of the Hough straight-line finder [6].

624 citations

••

SofTech, Inc.

^{1}TL;DR: The proposed language features serve to highlight exception handling issues by showing how deficiencies in current approaches can be remedied.

Abstract: This paper defines exception conditions, discusses the requirements exception handling language features must satisfy, and proposes some new language features for dealing with exceptions in an orderly and reliable way. The proposed language features serve to highlight exception handling issues by showing how deficiencies in current approaches can be remedied.

529 citations

••

Yale University

^{1}TL;DR: Correctness proofs of a parallel system can often be greatly simplified because the assumption that a statement is indivisible can be relaxed and still preserve properties such as halting.

Abstract: When proving that a parallel program has a given property it is often convenient to assume that a statement is indivisible, i.e. that the statement cannot be interleaved with the rest of the program. Here sufficient conditions are obtained to show that the assumption that a statement is indivisible can be relaxed and still preserve properties such as halting. Thus correctness proofs of a parallel system can often be greatly simplified.

485 citations

••

TL;DR: The method is of interest in certain packing and optimum layout problems because it consists of first determining the minimal-perimeter convex polygon that encloses the given curve and then selecting the rectangle of minimum area capable of containing this polygon.

Abstract: This paper describes a method for finding the rectangle of minimum area in which a given arbitrary plane curve can be contained. The method is of interest in certain packing and optimum layout problems. It consists of first determining the minimal-perimeter convex polygon that encloses the given curve and then selecting the rectangle of minimum area capable of containing this polygon. Three theorems are introduced to show that one side of the minimum-area rectangle must be collinear with an edge of the enclosed polygon and that the minimum-area encasing rectangle for the convex polygon is also the minimum-area rectangle for the curve.

441 citations

••

TL;DR: It is shown how the use of macros can considerably shorten the computation time in many cases and the general backtrack technique has allowed the solution of two previously open combinatorial problems, the computation of new terms in a well-known series, and the substantial reduction in computation time for the solution to another combinatorsial problem.

Abstract: The purpose of this paper is twofold. First, a brief exposition of the general backtrack technique and its history is given. Second, it is shown how the use of macros can considerably shorten the computation time in many cases. In particular, this technique has allowed the solution of two previously open combinatorial problems, the computation of new terms in a well-known series, and the substantial reduction in computation time for the solution to another combinatorial problem.

••

IBM

^{1}TL;DR: The need to envision and architecture data base systems in a hierarchical level by level framework is stressed, and formulations presented are necessary to be used in conjunction with any index selection criteria to determine the optimum set of index keys.

Abstract: The need to envision and architecture data base systems in a hierarchical level by level framework is stressed. The inverted data base (file) organization is then analyzed, considering implementation oriented aspects. The inverted directory is viewed realistically as another large data base which itself is subjected to inversion. Formulations are derived to estimate average access time (read only) and storage requirements, formalizing the interaction of data base content characteristics, logical complexity of queries, and machine timing and blocking specifications identified as having a first-order effect on performance. The formulations presented are necessary to be used in conjunction with any index selection criteria to determine the optimum set of index keys.

••

TL;DR: A new selection algorithm is presented which is shown to be very efficient on the average, both theoretically and practically.

Abstract: A new selection algorithm is presented which is shown to be very efficient on the average, both theoretically and practically. The number of comparisons used to select the ith smallest of n numbers is n + min(i,n-i) + o(n). A lower bound within 9 percent of the above formula is also derived.

••

PARC

^{1}TL;DR: The reasons for mechanizing program analysis are presented, a system, Metric, which is able to analyze simple Lisp programs and produce closed-form expressions for their running time expressed in terms of size of input is described.

Abstract: One means of analyzing program performance is by deriving closed-form expressions for their execution behavior. This paper discusses the mechanization of such analysis, and describes a system, Metric, which is able to analyze simple Lisp programs and produce, for example, closed-form expressions for their running time expressed in terms of size of input. This paper presents the reasons for mechanizing program analysis, describes the operation of Metric, explains its implementation, and discusses its limitations.

••

TL;DR: Algorithms for a multiprocessing compactifying garbage collector are presented and discussed and particular attention is given to the problems of marking and relocating list cells while another processor may be operating on them.

Abstract: Algorithms for a multiprocessing compactifying garbage collector are presented and discussed. The simple case of two processors, one performing LISP-like list operations and the other performing garbage collection continuously, is thoroughly examined. The necessary capabilities of each processor are defined, as well as interprocessor communication and interlocks. Complete procedures for garbage collection and for standard list processing primitives are presented and thoroughly explained. Particular attention is given to the problems of marking and relocating list cells while another processor may be operating on them. The primary aim throughout is to allow the list processor to run unimpeded while the other processor reclaims list storage The more complex case involving several list processors and one or more garbage collection processors are also briefly discussed.

••

TL;DR: An approach for implementing a “smart” interface to support a relational view of data is proposed to employ automatic programming techniques so that the interface analyzes and efficiently refines the high level query specification supplied by the user.

Abstract: An approach for implementing a “smart” interface to support a relational view of data is proposed. The basic idea is to employ automatic programming techniques so that the interface analyzes and efficiently refines the high level query specification supplied by the user. A relational algebra interface, called SQUIRAL, which was designed using this approach, is described in detail. SQUIRAL seeks to minimize query response time and space utilization by: (1) performing global query optimization, (2) exploiting disjoint and pipelined concurrency, (3) coordinating sort orders in temporary relations, (4) employing directory analysis, and (5) maintaining locality in page references. Algorithms for implementing the operators of E. F. Codd's relational algebra are presented, and a methodology for composing them to optimize the performance of a particular user query is described.

••

TL;DR: A working analysis and generation program for natural language, which handles paragraph length input, which is a system of preferential choice between deep semantic patterns, based on what the paper calls “semantic density.”

Abstract: The paper describes a working analysis and generation program for natural language, which handles paragraph length input. Its core is a system of preferential choice between deep semantic patterns, based on what we call “semantic density.” The system is contrasted: with syntax oriented linguistic approaches, andwith theorem proving approaches to the understanding problem.

••

TL;DR: A technique is presented which allows a class of solid objects to be synthesized and stored using a computer and has the advantage that operations are concise, readily composed, and are given in terms of easily imagined solids.

Abstract: A technique is presented which allows a class of solid objects to be synthesized and stored using a computer. Synthesis begins with primitive solids like a cube, wedge, or cylinder. Any solid can be moved, scaled, or rotated. Solids may also be added together or subtracted. Two algorithms to perform addition are described. For practical designers, the technique has the advantage that operations are concise, readily composed, and are given in terms of easily imagined solids. Quite short sequences of operations suffice to build up complex solids bounded by many faces.

••

IBM

^{1}TL;DR: The relational model of data, the XRM Relational Memory System, and the SEQUEL language, a relational data sublanguage intended for ad hoc interactive problem solving by non-computer specialists, are reviewed.

Abstract: The relational model of data, the XRM Relational Memory System, and the SEQUEL language have been covered in previous papers and are reviewed. SEQUEL is a relational data sublanguage intended for ad hoc interactive problem solving by non-computer specialists. A version of SEQUEL that has been implemented in a prototype interpreter is described. The interpreter is designed to minimize the data accessing operations required to respond to an arbitrary query. The optimization algorithms designed for this purpose are described.

••

IBM

^{1}TL;DR: Structured programming has proved to be an important methodology for systematic program design and development as mentioned in this paper, and is characterized in terms of the selection and solution of certain elementary equations defined in the algebra of functions.

Abstract: Structured programming has proved to be an important methodology for systematic program design and development. Structured programs are identified as compound function expressions in the algebra of functions. The algebraic properties of these function expressions permit the reformulation (expansion as well as reduction) of a nested subexpression independently of its environment, thus modeling what is known as stepwise program refinement as well as program execution. Finally, structured programming is characterized in terms of the selection and solution of certain elementary equations defined in the algebra of functions. These solutions can be given in general formulas, each involving a single parameter, which display the entire freedom available in creating correct structured programs.

••

TL;DR: A step-by-step approach to model the dynamic behavior and evaluate the performance of computing systems is proposed based on a technique of variable aggregation and the concept of nearly decomposable systems, both borrowed from Econometrics.

Abstract: A step-by-step approach to model the dynamic behavior and evaluate the performance of computing systems is proposed. It is based on a technique of variable aggregation and the concept of nearly decomposable systems, both borrowed from Econometrics. This approach is taken in order to identify in multiprogramming paging systems (i) unstable regimes of operations and (ii) critical computing loads which bring the system into states of saturation. This analysis leads to a more complete definition of the circumstances in which “thrashing” can set in.

••

IBM

^{1}TL;DR: An algorithm is given for computing the transitive closure of a binary relation that is represented by a Boolean matrix, similar to Warshall's although it executes faster for sparse matrices on most computers, particularly in a paging environment.

Abstract: An algorithm is given for computing the transitive closure of a binary relation that is represented by a Boolean matrix. The algorithm is similar to Warshall's although it executes faster for sparse matrices on most computers, particularly in a paging environment.

••

Bell Labs

^{1}TL;DR: The design and implementation of a system for typesetting mathematics, designed to be easy to learn and to use by people who know neither mathematics nor typesetting, is described.

Abstract: This paper describes the design and implementation of a system for typesetting mathematics. The language has been designed to be easy to learn and to use by people (for example, secretaries and mathematical typists) who know neither mathematics nor typesetting. Experience indicates that the language can be learned in an hour or so, for it has few rules and fewer exceptions. For typical expressions, the size and font changes, positioning, line drawing, and the like necessary to print according to mathematical conventions are all done automatically. For example, the input sum from i=0 to infinity x sub i = pi over 2 produces ∑∞i=0xi = p/2The syntax of the language is specified by a small context-free grammar; a compiler-compiler is used to make a compiler that translates this language into typesetting commands. Output may be produced on either a phototypesetter or on a terminal with forward and reverse half-line motions. The system interfaces directly with text formatting programs, so mixtures of text and mathematics may be handled simply.This paper was typeset by the authors using the system described.

••

TL;DR: A recursive algorithm for computing the inverse of a matrix from the LU factors based on relationships in Takahashi, et al., is examined.

Abstract: A recursive algorithm for computing the inverse of a matrix from the LU factors based on relationships in Takahashi, et al., is examined. The formulas for the algorithm are given; the dependency relationships are derived; the computational costs are developed; and some general comments on application and stability are made.

••

TL;DR: The end-order tree algorithm is shown to be an advantageous, immediate replacement for the algorithm in use with current simulation languages, and the most promising algorithm uses the indexed list concept.

Abstract: Four algorithms are considered which can be used to schedule events in a general purpose discrete simulation system. Two of the algorithms are new, one is based on an end-order tree structure for event notices, and another uses an indexed linear list. The algorithms are tested with a set of typical stochastic scheduling distributions especially chosen to show the advantages and limitations of the algorithms. The end-order tree algorithm is shown to be an advantageous, immediate replacement for the algorithm in use with current simulation languages. The most promising algorithm uses the indexed list concept. It will require an adaptive routine before it can be employed in general purpose simulators, but its performance is such that further study would be fruitful.

••

TL;DR: It is shown how efficient LR and LL parsers can be constructed directly from certain classes of these specifications.

Abstract: Methods of describing the syntax of programming languages in ways that are more flexible and natural than conventional BNF descriptions are considered. These methods involve the use of ambiguous context-free grammars together with rules to resolve syntactic ambiguities. It is shown how efficient LR and LL parsers can be constructed directly from certain classes of these specifications.

••

TL;DR: Peaks in a digitized waveform are detected by an algorithm incorporating piecewise linear approximation and tabular parsing techniques, of sufficient speed to allow on-line real-time processing.

Abstract: Peaks in a digitized waveform are detected by an algorithm incorporating piecewise linear approximation and tabular parsing techniques. Several parameters serve to identify the waveform context enabling accurate measurement of peak amplitude, duration, and shape. The algorithm is of sufficient speed to allow on-line real-time processing. An example of its application is demonstrated on an electrocardiogram.

••

TL;DR: A data sublanguage called SQUARE, intended for use in ad hoc, interactive problem solving by non-computer specialists, and is shown to be relationally complete; however, it avoids the quantifiers and bound variables required by languages based on the relational calculus.

Abstract: This paper presents a data sublanguage called SQUARE, intended for use in ad hoc, interactive problem solving by non-computer specialists. SQUARE is based on the relational model of data, and is shown to be relationally complete; however, it avoids the quantifiers and bound variables required by languages based on the relational calculus. Facilities for query, insertion, deletion, and update on tabular data bases are described. A syntax is given, and suggestions are made for alternative syntaxes, including a syntax based on English key words for users with limited mathematical background.

••

TL;DR: It is shown that any deterministic algorithm which solves the circularity problem for a grammar must for infinitely many cases use an exponential amount of time.

Abstract: Attribute grammars are an extension of context-free grammars devised by Knuth as a mechanism for including the semantics of a context-free language with the syntax of the language. The circularity problem for a grammar is to determine whether the semantics for all possible sentences (programs) in fact will be well defined. It is proved that this problem is, in general, computationally intractable. Specifically, it is shown that any deterministic algorithm which solves the problem must for infinitely many cases use an exponential amount of time. An improved version of Knuth's circularity testing algorithm is also given, which actually solves the problem within exponential time.

••

IBM

^{1}TL;DR: A high level and nonprocedural translation definition language, CONVERT, which provides very powerful and highly flexible data restructuring capabilities and is based on the simple underlying concept of a form which enables the users to visualize the translation processes, and thus makes data translation a much simpler task.

Abstract: This paper describes a high level and nonprocedural translation definition language, CONVERT, which provides very powerful and highly flexible data restructuring capabilities. Its design is based on the simple underlying concept of a form which enables the users to visualize the translation processes, and thus makes data translation a much simpler task.“CONVERT” has been chosen for conveying the purpose of the language and should not be confused with any other language or program bearing the same name.