
Showing papers on "Sequential algorithm" published in 1986


Journal Article
TL;DR: The focus of this work is on the theory of distributed discrete-event simulation, which may provide better performance by partitioning the simulation among the component processors.
Abstract: Traditional discrete-event simulations employ an inherently sequential algorithm. In practice, simulations of large systems are limited by this sequentiality, because only a modest number of events can be simulated. Distributed discrete-event simulation (carried out on a network of processors with asynchronous message-communicating capabilities) is proposed as an alternative; it may provide better performance by partitioning the simulation among the component processors. The basic distributed simulation scheme, which uses time encoding, is described. Its major shortcoming is a possibility of deadlock. Several techniques for deadlock avoidance and deadlock detection are suggested. The focus of this work is on the theory of distributed discrete-event simulation.
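For contrast with the distributed scheme, the "inherently sequential" baseline mentioned in the abstract can be sketched as a single event list processed in global timestamp order. This is a minimal illustration, not the paper's algorithm; the `simulate` function, the `handlers` mapping, and the arrive/depart model are all hypothetical names chosen for the example.

```python
import heapq

def simulate(initial_events, handlers, until):
    """Run a sequential event-list simulation up to time `until`.

    initial_events: iterable of (time, name) pairs.
    handlers: maps an event name to a function(time) -> list of new (time, name) events.
    Returns the (time, name) events in the order they were processed.
    """
    queue = list(initial_events)
    heapq.heapify(queue)
    log = []
    while queue and queue[0][0] <= until:
        # The single global queue forces one-at-a-time processing:
        # only the earliest pending event may be simulated next.
        time, name = heapq.heappop(queue)
        log.append((time, name))
        for new_event in handlers[name](time):
            heapq.heappush(queue, new_event)
    return log

# Hypothetical model: each arrival schedules a departure 3 time units later
# and the next arrival 5 time units later.
handlers = {
    "arrive": lambda t: [(t + 3, "depart"), (t + 5, "arrive")],
    "depart": lambda t: [],
}
trace = simulate([(0, "arrive")], handlers, until=10)
```

The single priority queue is the bottleneck: a distributed scheme has to replace it with per-processor queues and some rule for deciding when a local event is safe to process.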

968 citations


Journal Article
TL;DR: Two parallel formulations of the statistical cooling algorithm are proposed, i.e. a systolic algorithm and a clustered algorithm, based on the requirement that quasi-equilibrium is preserved throughout the optimization process.

119 citations



Book Chapter
01 Jan 1986
TL;DR: It is shown that the parallel algorithm based on the requirement that quasi-equilibrium is preserved throughout the optimization process can be executed in polynomial time.
Abstract: Statistical Cooling is a new optimization technique based on Monte-Carlo iterative improvement. Here we propose a parallel formulation of the statistical cooling algorithm based on the requirement that quasi-equilibrium is preserved throughout the optimization process. It is shown that the parallel algorithm can be executed in polynomial time. Performance of the algorithm is discussed by means of an implementation on an experimental multi-processor architecture. It is concluded that substantial reductions of computation time can be achieved by the parallel algorithm in comparison with the sequential algorithm.
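The sequential algorithm that the parallel formulation is compared against can be sketched roughly as follows. This is a generic statistical cooling (simulated annealing) loop under stated assumptions: the geometric cooling factor `alpha`, the fixed `chain_length` per temperature (a crude stand-in for reaching quasi-equilibrium), and the toy cost function are illustrative choices, not taken from the paper.

```python
import math
import random

def statistical_cooling(cost, neighbor, start, t0=10.0, alpha=0.9,
                        chain_length=50, t_min=1e-3, seed=0):
    """Sequential statistical cooling; returns the best state visited."""
    rng = random.Random(seed)
    current, best = start, start
    t = t0
    while t > t_min:
        # Inner loop: a chain of trial moves at a fixed temperature.
        for _ in range(chain_length):
            candidate = neighbor(current, rng)
            delta = cost(candidate) - cost(current)
            # Always accept improvements; accept deteriorations with
            # probability exp(-delta / t) (Metropolis criterion).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current = candidate
                if cost(current) < cost(best):
                    best = current
        t *= alpha  # assumed geometric cooling schedule
    return best

# Hypothetical toy problem: minimize (x - 7)^2 over the integers.
def cost(x):
    return (x - 7) ** 2

def neighbor(x, rng):
    return x + rng.choice((-1, 1))

start = 40
best = statistical_cooling(cost, neighbor, start)
```

Each trial move depends on the state left by the previous one, which is why parallelizing the chain without disturbing quasi-equilibrium is the nontrivial part.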

31 citations


Journal Article
Lodi, Pagli
TL;DR: A parallel algorithm to solve the visibility problem among n vertical segments in a plane, which can be implemented on a VLSI chip arranged as a mesh of trees, is presented.
Abstract: We present a parallel algorithm to solve the visibility problem among n vertical segments in a plane, which can be implemented on a VLSI chip arranged as a mesh of trees. Our algorithm determines all the pairs of segments that "see" each other in time O(log n), whereas the fastest sequential algorithm requires O(n log n) time. A lower bound of O(n^2 log^2 n) on the area-time complexity of this problem is also derived.

27 citations


Journal Article
TL;DR: This study explores the question of transforming a sequential algorithm into an efficient parallel algorithm by considering the problem of balancing binary search trees. A new iterative balancing algorithm is derived first, and from it a parallel algorithm that has time complexity O(1) on an N-processor configuration.
Abstract: A recent trend in program methodologies is to derive efficient parallel programs from sequential programs. This study explores the question of transforming a sequential algorithm into an efficient parallel algorithm by considering the problem of balancing binary search trees. The derivation of the parallel algorithm makes use of stepwise refinement. The authors first derive a new iterative balancing algorithm that exploits the similarity of point restructuring required at all the nodes at the same level. From this they derive a parallel algorithm that has time complexity O(1) on an N-processor configuration. This achieves the theoretical limit of speedup possible in a multiprocessor configuration.
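The idea of applying the same restructuring rule to every node on a level can be illustrated with a level-by-level balancing sketch. This is not the authors' derivation: it assumes the keys are already available in sorted order at positions 0..n-1 and uses a simple midpoint rule, and the name `balanced_links` is hypothetical.

```python
def balanced_links(n):
    """Compute child links of a balanced BST over sorted positions 0..n-1.

    Returns (root, left, right), where left[i] / right[i] is the position of
    node i's left / right child, or None if there is none.
    """
    left = [None] * n
    right = [None] * n
    root = (n - 1) // 2 if n > 0 else None
    level = [(0, n - 1)] if n > 0 else []
    while level:
        next_level = []
        # Every (lo, hi) range on this level is handled by the same rule and
        # touches disjoint entries, so the iterations are independent.
        for lo, hi in level:
            mid = (lo + hi) // 2
            if lo <= mid - 1:
                left[mid] = (lo + mid - 1) // 2
                next_level.append((lo, mid - 1))
            if mid + 1 <= hi:
                right[mid] = (mid + 1 + hi) // 2
                next_level.append((mid + 1, hi))
        level = next_level
    return root, left, right

root, left, right = balanced_links(7)
```

Because the iterations of the inner loop do not interact, each could in principle be given to a separate processor, which is the kind of structure a parallel derivation can exploit.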

20 citations


Journal Article
01 Feb 1986
TL;DR: In this paper, the authors describe an object recognition algorithm both on a sequential machine and on a single instruction multiple data (SIMD) parallel processor such as the MIT connection machine, which is shown to run three to four orders of magnitude faster than the sequential version.
Abstract: This paper describes an object recognition algorithm both on a sequential machine and on a single instruction multiple data (SIMD) parallel processor such as the MIT Connection Machine. The problem, in the way it is presently formulated on a sequential machine, is essentially a propagation of constraints through a tree of possibilities in an attempt to prune the tree to a small number of leaves. The tree can become excessively large, however, and so implementations on massively parallel machines are sought in order to speed up the problem. Two fast parallel algorithms are described here, a static algorithm and a dynamic algorithm. The static algorithm reformulates the problem by assigning every leaf in the completely expanded unpruned tree to a separate processor in the Connection Machine. Then pruning is done in nearly constant time by broadcasting constraints to the entire SIMD array. This parallel version is shown to run three to four orders of magnitude faster than the sequential version. For large recognition problems which would exceed the capacity of the machine, a dynamic algorithm is described which performs a series of loading and pruning steps, dynamically allocating and deallocating processors through the use of the Connection Machine's global router communications mechanism.
Résumé: This report describes an object recognition algorithm suited both to a sequential computer and to the MIT Connection Machine, a parallel multiprocessor of the SIMD type. On a conventional computer, the algorithm propagates constraints through a tree of probable solutions so as to eliminate as many branches as possible. Since the tree tends to become enormous, we considered transferring the algorithm to a multiprocessor system. Two parallel algorithms are proposed, one static and the other dynamic. The static algorithm assigns one Connection Machine processor to each of the options generated in the tree. The analysis is then done in nearly constant time by transmitting to each processor the constraints needed to find the solution. This speeds up the algorithm by three or four orders of magnitude relative to the sequential version. For more complex problems that would exceed the capacity of the multiprocessor, another solution assigns the processors dynamically during processing, using the Connection Machine's global intercommunication system. Keywords: parallel processing, object recognition, Connection Machine, content-addressable computer, search in tree-based structures.
Summary: A well-established algorithm for object recognition on a sequential machine was explained and fast Connection Machine algorithms were devised. First, a static parallel algorithm was discussed, in which the entire tree of possible interpretations is loaded into the Connection Machine before run time. The search is then done in one step and therefore in constant time. For problems that are too large to fit in the array, a dynamic algorithm was devised. In this algorithm, the Connection Machine is loaded with as many levels of the tree as can fit. Then a pruning is done in parallel as in the static algorithm. However, processors that survive the initial pruning step and still represent consistent pairings must find unused processors which can be assigned to the new branches of the tree. This is similar to the way the sequential algorithm first described prunes the search space. Processors that are pruned are deallocated and reused on the next iteration. The router mechanism of the Connection Machine, which is not used in the static algorithm, is necessary here to support dynamically allocating and deallocating processors.
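The static algorithm's structure (expand every leaf, then test each leaf against the constraints independently) can be mimicked on a sequential machine as a data-parallel filter. The sketch below is only an illustration under assumptions: the pairwise `consistent` predicate, the feature names, and the ordering constraint are hypothetical, and the list comprehension stands in for the broadcast over the SIMD array.

```python
from itertools import product

def static_prune(num_data, model_features, consistent):
    """Enumerate every leaf of the unpruned interpretation tree, then filter.

    A leaf assigns one model feature to each of the num_data data features.
    consistent(i, mi, j, mj) says whether pairing data feature i with model
    feature mi is compatible with pairing data feature j with model feature mj.
    """
    leaves = product(model_features, repeat=num_data)
    # Each leaf is checked without reference to any other leaf, so on a SIMD
    # machine every leaf could be held and tested by its own processor.
    return [
        leaf for leaf in leaves
        if all(consistent(i, leaf[i], j, leaf[j])
               for i in range(num_data)
               for j in range(i + 1, num_data))
    ]

# Hypothetical constraint: data features must map to model features in
# strictly increasing order.
def in_order(i, mi, j, mj):
    return mi < mj

survivors = static_prune(2, "ABC", in_order)
```

The number of leaves grows as `len(model_features) ** num_data`, which is the capacity problem that motivates the dynamic algorithm.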

15 citations


01 Jun 1986
TL;DR: The concept of a P-complete algorithm is introduced to capture what it means for an algorithm to be inherently sequential, and a number of sequential greedy algorithms are P-complete, including the greedy algorithm for finding a path in a graph.
Abstract: This thesis addresses a number of theoretical issues in parallel computation. There are many open questions relating to what can be done with parallel computers and what are the most effective techniques to use to develop parallel algorithms. We examine various problems in the hope of gaining insight into the general questions. One topic that is investigated is the relationship between sequential and parallel algorithms. We introduce the concept of a P-complete algorithm to capture what it means for an algorithm to be inherently sequential. We show that a number of sequential greedy algorithms are P-complete, including the greedy algorithm for finding a path in a graph. However, an algorithm being P-complete does not necessarily mean that the problem is difficult. In some cases, the natural sequential algorithm is P-complete but a different technique gives a fast parallel algorithm. This shows that it is necessary to use different techniques for parallel computation than are used for sequential computation. We give fast parallel algorithms for a number of simple graph theory problems. The algorithms illustrate a number of different techniques that are useful for parallel algorithms. The most important results are that the maximal path problem can be solved in RNC and that a depth-first search tree can be constructed in approximately √n parallel time, where n is the number of vertices. This shows that substantial speedup is possible on both of these problems using parallelism. The final topic that we address is parallel approximation of P-complete problems. P-complete problems probably cannot be solved by fast parallel algorithms. We give a number of results on approximating P-complete problems with parallel algorithms that are similar to results on approximating NP-complete problems with sequential algorithms. We give upper and lower bounds on the degree of approximation that is possible for some problems. We also investigate the role that numbers play in P-complete problems, showing that some P-complete problems remain difficult even if the numbers are small.
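The greedy path algorithm referred to in the abstract can be sketched as below: extend the path one vertex at a time until the current endpoint has no unvisited neighbor. This is a minimal sketch, not the thesis's formulation; choosing the lowest-numbered unvisited neighbor is an assumption made here to keep the result deterministic, and the example graph is hypothetical.

```python
def greedy_maximal_path(graph, start):
    """Greedily extend a path from `start` until it cannot be extended.

    graph: dict mapping each vertex to an iterable of its neighbors.
    """
    path = [start]
    visited = {start}
    current = start
    while True:
        # Each choice depends on all earlier choices via `visited`,
        # which is what makes this procedure look inherently sequential.
        nxt = next((v for v in sorted(graph[current]) if v not in visited), None)
        if nxt is None:
            return path
        path.append(nxt)
        visited.add(nxt)
        current = nxt

# Hypothetical undirected graph given as adjacency lists.
graph = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
path = greedy_maximal_path(graph, 1)
```

The path returned is maximal in the sense that its last vertex has no unvisited neighbor; it need not be a longest path.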

10 citations


Journal Article
TL;DR: A parallel algorithm for triangulating simplicial point sets in arbitrary dimensions based on the idea of the sequential algorithm presented in Ref. 5 is developed.
Abstract: Previous research on developing parallel triangulation algorithms concentrated on triangulating planar point sets. O(log^3 n) running time algorithms using O(n) processors have been developed in Refs. 1 and 2. Atallah and Goodrich (3) presented a data structure that can be viewed as a parallel analogue of the sequential plane-sweeping paradigm, which can be used to triangulate a planar point set in O(log n log log n) time using O(n) processors. Recently Merks (4) described an algorithm for triangulating point sets which runs in O(log n) time using O(n) processors, and is thus optimal. In this paper we develop a parallel algorithm for triangulating simplicial point sets in arbitrary dimensions based on the idea of the sequential algorithm presented in Ref. 5. The algorithm runs in O(log^2 n) time using O(n/log n) processors. The algorithm has O(n log n) as the product of the running time and the number of processors; i.e., an optimal speed-up.
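The optimality claim follows from multiplying the stated bounds, reading the running time as $O(\log^2 n)$. The symbols $T(n)$ for parallel time and $P(n)$ for processor count are notation introduced here, not the paper's.

```latex
T(n)\cdot P(n)
  \;=\; O(\log^{2} n)\cdot O\!\left(\frac{n}{\log n}\right)
  \;=\; O(n \log n)
```

The product matches the $O(n \log n)$ sequential bound the abstract compares against, which is the sense in which the speed-up is called optimal.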

9 citations


Journal Article
TL;DR: A systolic algorithm is described for solving a Toeplitz least-squares problem of special form, such as arises when Volterra convolution equations of the first kind are solved by regularization.

7 citations


Proceedings Article
04 Apr 1986
TL;DR: A surprising result is that the parallel algorithm for the tridiagonal case can be significantly faster than the previously best sequential algorithm on large problems, and is effective on moderate size problems when run in serial mode.
Abstract: In this paper we present a parallel algorithm for the symmetric algebraic eigenvalue problem. The algorithm is based upon a divide and conquer scheme suggested by Cuppen for computing the eigensystem of a symmetric tridiagonal matrix. We extend this idea to obtain a parallel algorithm that retains a number of active parallel processes that is greater than or equal to the initial number throughout the course of the computation. Computation of the eigensystem of the tridiagonal matrix is reviewed. Also, a brief analysis of the numerical properties and sensitivity to roundoff error is presented to indicate where numerical difficulties may occur. We show how to explicitly overlap the initial reduction to tridiagonal form with the parallel computation of the eigensystem of the tridiagonal matrix. The algorithm is therefore able to exploit parallelism at all levels of the computation and is well suited to a variety of architectures. Computational results have been presented in [4] for several machines. These results are very encouraging with respect to both accuracy and speedup. A surprising result is that the parallel algorithm for the tridiagonal case, even when run in serial mode, can be significantly faster than the previously best sequential algorithm on large problems, and is effective on moderate size problems.
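The divide step of Cuppen's scheme is usually written as a rank-one tearing of the tridiagonal matrix. The form below is the standard textbook statement, given as a sketch; the symbols ($\beta$ for the off-diagonal entry at the split, $\hat{T}_1$, $\hat{T}_2$, $\delta_j$, $\zeta_j$) are notation assumed here and may differ from the paper's, which may also use a scaled variant.

```latex
T =
\begin{pmatrix} T_1 & \beta\, e_k e_1^{T} \\ \beta\, e_1 e_k^{T} & T_2 \end{pmatrix}
=
\begin{pmatrix} \hat{T}_1 & 0 \\ 0 & \hat{T}_2 \end{pmatrix}
+ \beta
\begin{pmatrix} e_k \\ e_1 \end{pmatrix}
\begin{pmatrix} e_k^{T} & e_1^{T} \end{pmatrix}

% \hat{T}_1 is T_1 with its last diagonal entry reduced by \beta,
% \hat{T}_2 is T_2 with its first diagonal entry reduced by \beta.
% With \delta_j the eigenvalues of the two halves and \zeta_j the components
% of the update vector in their eigenbasis, the eigenvalues \lambda of T
% are the roots of the secular equation:

f(\lambda) = 1 + \beta \sum_{j} \frac{\zeta_j^{2}}{\delta_j - \lambda} = 0
```

The two halves $\hat{T}_1$ and $\hat{T}_2$ are independent subproblems, which is the source of parallelism; the roots of the secular equation can also be found independently of one another.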

Proceedings Article
10 Nov 1986
TL;DR: Statistical decision theory applied to the word synchronization process for transmission codes in digital transmission systems shows that the number of observations required to reach a decision with the constant probe algorithm is, on average, larger than that required with the sequential algorithm.
Abstract: The word synchronization process for transmission codes in digital transmission systems has been studied using statistical decision theory. The theory shows that the number of observations required to reach a decision with the constant probe algorithm is, on average, larger than the number of observations required with the sequential algorithm. This conclusion for the constant probe word synchronizer and the sequential word synchronizer was verified by experiments with a digital transmission system on optical fiber. © 1986 SPIE, The International Society for Optical Engineering.
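The abstract does not say which sequential test is used. As an assumption for illustration only, the sketch below uses Wald's sequential probability ratio test on binary observations, which stops as soon as the accumulated evidence crosses a threshold instead of waiting for a fixed number of observations. The function name, the hypotheses `p0` and `p1`, and the error rates are hypothetical.

```python
import math

def sprt(observations, p0, p1, alpha=0.01, beta=0.01):
    """Wald's sequential probability ratio test for Bernoulli observations.

    H0: P(x = 1) = p0, H1: P(x = 1) = p1.
    Returns (decision, number of observations used).
    """
    upper = math.log((1 - beta) / alpha)   # cross this: accept H1
    lower = math.log(beta / (1 - alpha))   # cross this: accept H0
    llr = 0.0                              # accumulated log-likelihood ratio
    for n, x in enumerate(observations, start=1):
        llr += math.log(p1 / p0) if x else math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "undecided", len(observations)

# Hypothetical streams in which every observation agrees with one hypothesis.
in_sync = sprt([1] * 10, p0=0.1, p1=0.9)
out_of_sync = sprt([0] * 10, p0=0.1, p1=0.9)
```

In both example streams the test stops after three observations even though ten are available, whereas a fixed-sample (constant probe) rule would consume its full sample size before deciding.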


Book Chapter
01 Jan 1986
TL;DR: The paper deals with the synthesis problem for tree automata and a sequential synthesis algorithm is presented based on the hypothesis formulation-verification-modification scheme.
Abstract: The paper deals with the synthesis problem for tree automata. A sequential synthesis algorithm is presented based on the hypothesis formulation-verification-modification scheme.

01 Jan 1986
TL;DR: This paper summarizes the results obtained using various sequential and parallel methods to solve partial differential equations on a shared main memory machine, the Sequent Symmetry.
Abstract: This paper summarizes the results obtained using various sequential and parallel methods to solve partial differential equations on a shared main memory machine, the Sequent Symmetry. Four numerical methods are used and compared: 1) sequential band Gauss elimination, 2) parallel band Gauss elimination, 3) sequential Tensor Product Generalized ADI, and 4) parallel TPGADI. We discuss the various issues involved in the parallelization of a sequential algorithm to make best use of the Sequent Symmetry.

Book Chapter
17 Sep 1986
TL;DR: It is proved that, using a fixed number p of processing elements (PEs), the time complexity of a parallel partitioned algorithm is minimal if either all p PEs or only one PE is used for executing one operation on datablocks.
Abstract: A general concept for the description of partitioned algorithms is presented. It is based on a partitioning of the occurring data into datablocks of equal size. For a class of partitioned algorithms including matrix multiplication, LU-decomposition of a matrix, and solving a linear system of equations, it is proved that, using a fixed number p of processing elements (PEs), the time complexity of a parallel partitioned algorithm is minimal if either all p PEs or only one PE is used for executing one operation on datablocks.
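Matrix multiplication, the first example in the class named above, can be written as a partitioned algorithm in which the unit of work is one operation on datablocks. The sketch below is sequential and purely illustrative; the function name and the `block` parameter are hypothetical, and it does not model how PEs are assigned to block operations.

```python
def blocked_matmul(a, b, block):
    """Multiply two n-by-n matrices (lists of lists) block by block."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for i0 in range(0, n, block):
        for j0 in range(0, n, block):
            for k0 in range(0, n, block):
                # One operation on datablocks:
                # C[i0.., j0..] += A[i0.., k0..] * B[k0.., j0..]
                for i in range(i0, min(i0 + block, n)):
                    for j in range(j0, min(j0 + block, n)):
                        s = c[i][j]
                        for k in range(k0, min(k0 + block, n)):
                            s += a[i][k] * b[k][j]
                        c[i][j] = s
    return c

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c_small_blocks = blocked_matmul(a, b, block=1)
c_one_block = blocked_matmul(a, b, block=2)
```

The block size changes only how the work is grouped, not the result; the scheduling question the paper addresses is how many PEs should cooperate on each such block operation.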