
Showing papers on "Sequential algorithm published in 1988"


ReportDOI
01 Jan 1988
TL;DR: This thesis describes and evaluates techniques for speeding up unification, including an extension of Chang's static data-dependency analysis (SDDA), and ways in which these techniques may be applied to the Berkeley PLM machine.
Abstract: Unification, the fundamental operation in the Prolog logic programming language, can take up to 50% of the execution time of a typical Prolog system. One approach to speeding up the unification operation is to perform it on parallel hardware. Although it has been shown that, in general, there is no parallel algorithm for unification that is better than the best sequential algorithm, there is a substantial subset of unification which may be done in parallel. Identifying these subsets involves gathering data using an extension of Chang's static data-dependency analysis (SDDA), then using that data to schedule the components of a unification for parallel execution. Improvements to the information gathered by SDDA may be achieved through procedure splitting, a source-level transformation of the program. This thesis describes and evaluates the above-mentioned techniques and their implementation. Results are compared to other techniques for speeding up unification. Ways in which these techniques may be applied to the Berkeley PLM machine are also described.
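The sequential operation the thesis targets can be sketched as follows. This is a minimal Robinson-style unification over a toy term representation (an illustrative assumption, not the PLM implementation or Chang's SDDA); the argument-pair loop is the part a parallel scheduler would distribute.

```python
# Minimal sketch of sequential first-order unification.
# Terms are ('var', name) or ('fn', functor, [args]); this encoding is
# an assumption for illustration, not the thesis's representation.

def walk(term, subst):
    """Follow variable bindings until a non-variable or unbound variable."""
    while term[0] == 'var' and term[1] in subst:
        term = subst[term[1]]
    return term

def unify(a, b, subst=None):
    """Return a substitution unifying a and b, or None on failure."""
    subst = dict(subst or {})
    stack = [(a, b)]
    while stack:
        x, y = stack.pop()
        x, y = walk(x, subst), walk(y, subst)
        if x == y:
            continue
        if x[0] == 'var':
            subst[x[1]] = y          # no occurs-check, as in most Prologs
        elif y[0] == 'var':
            subst[y[1]] = x
        elif x[1] == y[1] and len(x[2]) == len(y[2]):
            # Argument pairs are the units a parallel scheduler would
            # try to unify concurrently when SDDA shows independence.
            stack.extend(zip(x[2], y[2]))
        else:
            return None              # functor or arity mismatch
    return subst
```

Unifying f(X, b) with f(a, Y) binds X to a and Y to b; the two argument pairs are independent, which is exactly the kind of subset that can be done in parallel.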

21 citations


01 Jan 1988
TL;DR: An algorithm that runs in O(log m log n) time and uses mn processors on a CRCW PRAM, where m and n are the lengths of the strings, is presented; the problem of finding the largest common submatrix of two matrices is also considered and shown to be NP-hard.
Abstract: We consider the problem of determining in parallel the cost of converting a source string to a destination string by a sequence of insert, delete and transform operations. Each operation has an integer cost in some fixed range. We present an algorithm that runs in O(log m log n) time and uses mn processors on a CRCW PRAM, where m and n are the lengths of the strings. The best known sequential algorithm [MP83] runs in time O(n²/log n) for strings of length n, indicating that our parallel algorithm (with time-processor product equal to O(mn log m log n)) is nearly optimal. An instance of the edit distance problem is represented as a graph. The algorithm finds the shortest path in the graph using a path-doubling method with efficient pruning due to the structure of the problem. Extensions of the algorithm solve approximate string matching and local best-fit problems. The problem of finding the largest common submatrix of two matrices is considered and shown to be NP-hard. Finally we present an algorithm for exact two-dimensional pattern matching that runs in O(log n) time for an n×n search matrix.
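The sequential baseline the paper measures against is the classic dynamic-programming edit distance with insert, delete and transform (substitute) operations. A minimal sketch with configurable integer costs (the parameter names are illustrative):

```python
# Wagner-Fischer dynamic programming for edit distance: O(mn) time,
# the sequential point of comparison for the parallel algorithm.

def edit_distance(src, dst, ins=1, dele=1, sub=1):
    m, n = len(src), len(dst)
    # d[i][j] = minimum cost of converting src[:i] to dst[:j]
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * dele
    for j in range(1, n + 1):
        d[0][j] = j * ins
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if src[i - 1] == dst[j - 1] else sub
            d[i][j] = min(d[i - 1][j] + dele,      # delete src[i-1]
                          d[i][j - 1] + ins,       # insert dst[j-1]
                          d[i - 1][j - 1] + cost)  # transform (or match)
    return d[m][n]
```

The graph view in the abstract corresponds to shortest paths over this grid, which is what the path-doubling method parallelizes.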

20 citations


Journal ArticleDOI
TL;DR: In this paper, the authors presented an algorithm for convolving a k*k window of weighting coefficients with an n*n image matrix on a pyramid computer of O(n²) processors.
Abstract: An algorithm for convolving a k*k window of weighting coefficients with an n*n image matrix on a pyramid computer of O(n²) processors in time O(log n + k²), excluding the time to load the image matrix, is presented. If k = Ω(√(log n)), which is typical in practice, the algorithm has a processor-time product O(n²k²) which is optimal with respect to the usual sequential algorithm. A feature of the algorithm is that the mechanism for controlling the transmission and distribution of data in each processor is finite state, independent of the values of n and k. Thus, for convolving two (0, 1)-valued matrices using Boolean operations rather than the typical sum and product operations, the processors of the pyramid computer are finite-state.

18 citations


Journal ArticleDOI
TL;DR: A fast algorithm for preemptive scheduling of n independent jobs on m uniform machines is developed, along with a parallel version of this algorithm for a Concurrent Read Exclusive Write (CREW) shared-memory computer.

12 citations


Journal ArticleDOI
TL;DR: A normalization method is introduced to fix the positions of the broadcast sources so that the derived design can be further transformed by retimings into a systolic array.
Abstract: When a sequential algorithm is directly mapped into an array of processing elements, data broadcasts are quite likely required and their source locations vary during the computation. The authors introduce a normalization method to fix the positions of the broadcast sources so that the derived design can be further transformed by retimings into a systolic array. The method is fully illustrated in designing systolic arrays for enumeration sort, solving simultaneous linear equations, and transitive closure.
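Enumeration sort, the first of the paper's three illustrations, is a natural systolic candidate because every element's final position can be computed by independent comparisons. The sequential form being mapped looks roughly like this (a generic sketch, not the paper's array design):

```python
# Enumeration (rank) sort: each element's output position is the count
# of elements that must precede it, with ties broken by original index.
# Every rank computation is independent, which is what makes the
# algorithm attractive for a systolic array.

def enumeration_sort(a):
    n = len(a)
    out = [None] * n
    for i in range(n):
        rank = sum(1 for j in range(n)
                   if a[j] < a[i] or (a[j] == a[i] and j < i))
        out[rank] = a[i]
    return out
```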

12 citations


Journal ArticleDOI
TL;DR: This work defines and discusses an objective measure of the effect of parallelism on a sequential algorithm, known as the potential parallel factor (PPF), which is applied to parallel versions of the unification algorithms of Yasuura and Jaffar.
Abstract: Parallel unification algorithms are not nearly so numerous or well-developed as sequential ones. In order to estimate the improvement in efficiency which may be expected, we define and discuss an objective measure of the effect of parallelism on a sequential algorithm. This measure, known as the potential parallel factor (PPF), is applied to parallel versions of the unification algorithms of Yasuura and Jaffar. The PPFs for these algorithms are measured on a variety of running Prolog programs to estimate what increase in speed may be expected in a Prolog environment from the use of parallelism. Other potential uses of parallelism may be evaluated by different applications of our general methods and techniques.
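The abstract does not spell out how the PPF is computed. As an illustrative stand-in only (an assumption, not the paper's definition), a common way to bound the benefit of parallelism is the work/depth ratio of a task dependency graph: total sequential steps divided by the length of the longest dependency chain.

```python
# Hypothetical work/depth-style parallelism factor for a task DAG.
# This is a stand-in for the paper's PPF, whose exact definition is
# not given in the abstract above.

def parallel_factor(deps):
    """deps maps each task to the list of tasks it depends on (a DAG)."""
    depth = {}
    def longest(t):
        if t not in depth:
            depth[t] = 1 + max((longest(d) for d in deps[t]), default=0)
        return depth[t]
    critical_path = max(longest(t) for t in deps)
    return len(deps) / critical_path   # work divided by depth
```

A fully sequential chain of tasks gives a factor of 1 (no benefit), while fully independent tasks give a factor equal to the task count.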

11 citations


Proceedings ArticleDOI
25 Oct 1988
TL;DR: A scheme of extracting edge information from parallel spatial frequency bands using the formalism of a Gaussian pyramid to create an integrated image of most significant edges of different scales is presented.
Abstract: We present a scheme for extracting edge information from parallel spatial frequency bands. From these we create an integrated image of the most significant edges at different scales. The frequency bands are realized using the formalism of a Gaussian pyramid in which the levels represent a bank of spatial lowpass filters. The integrated edge image is created by a top-down algorithm, starting from the smallest version of the image. The sequential algorithm uses mutual edge information from two consecutive levels to control the processing in the lower one. This edge detection algorithm constitutes an image-dependent nonuniform processing scheme. Computational results show that only 20%-50% of the operations are needed to create an edge pyramid, compared to the number required in the regular scheme. The proposed generic scheme of image-dependent processing can also be implemented with operators other than edge detectors to exploit the advantages inherent in biological processing of images.
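The pyramid construction underlying the frequency bands can be sketched minimally: each level lowpass-filters and subsamples the one below. For brevity a 2x2 box average stands in for the Gaussian kernel of the usual Burt-Adelson construction (an assumption, not the paper's exact filter):

```python
# Build a simple image pyramid: each coarser level averages 2x2 blocks
# of the finer one. A box filter stands in for the Gaussian lowpass.

def pyramid(image, levels):
    """image: 2^k x 2^k grid (list of lists); returns levels, finest first."""
    result = [image]
    for _ in range(levels - 1):
        prev = result[-1]
        n = len(prev) // 2
        result.append([[(prev[2*r][2*c] + prev[2*r][2*c+1] +
                         prev[2*r+1][2*c] + prev[2*r+1][2*c+1]) / 4.0
                        for c in range(n)] for r in range(n)])
    return result
```

The top-down algorithm in the abstract then walks this list from the coarsest level back toward the finest, using edges found at one level to restrict where the next level is processed.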

11 citations


Book
01 Feb 1988
TL;DR: This work describes a simple interpreter-based fcp implementation of the algorithm, analyzes its performance under Logix, and includes initial measurements of its speedup on the parallel implementation of fcp.
Abstract: We describe a simple or-parallel execution algorithm for PROLOG that naturally collects all solutions to a goal. For a large class of programs the algorithm has O(log n) overhead and exhibits O(n/(log n)²) parallel speedup over the standard sequential algorithm. Its constituent parallel processes are independent, and hence the algorithm is suitable for implementation on non-shared-memory parallel computers. The algorithm can be implemented directly in Flat Concurrent PROLOG. We describe a simple interpreter-based fcp implementation of the algorithm, analyze its performance under Logix, and include initial measurements of its speedup on the parallel implementation of fcp. The implementation is easily extended. We show an extension that performs parallel demand-driven search. We define two parallel variants of cut, cut-clause and cut-goal, and describe their implementation. We discuss the execution of the algorithm on a parallel computer, and describe implementations of it that perform centralized and distributed dynamic load balancing. Since the fcp implementation of the algorithm relies on full test unification, the algorithm does not seem to have a similarly natural implementation in ghc or parlog.

7 citations


Proceedings ArticleDOI
20 Apr 1988
TL;DR: It is feasible to generalize techniques for mapping sequential algorithms onto a neural model of a parallel distributed processor and implement a neural compiler for sequential algorithms, according to this paper.
Abstract: Teledyne Brown Engineering has developed techniques for mapping sequential algorithms onto a neural model of a parallel distributed processor. Any sequential algorithm (including NP algorithms) can be mapped onto a neural network. This paper discusses some practical considerations for implementation of sequential-to-parallel mappings (SPM). It is feasible to generalize these techniques and implement a neural compiler for sequential algorithms. Neural networks (the interconnection matrix) generated by the neural compiler will be implemented in a fixed, holographic, optical computer.

7 citations


Journal ArticleDOI
TL;DR: An algorithm for simultaneous order identification and parameter estimation of a linear, discrete MIMO system with unknown observability indices is presented, considered as a multivariable extension of the conventional loss-function tests used to detect the order of SISO systems.

6 citations


Book ChapterDOI
03 Oct 1988
TL;DR: A mathematical model for analysing the speedup behaviour of a parallel k-processor backtracking algorithm compared with sequential backtracking is studied, and it is shown that in the case of sufficiently unbalanced distributions, superlinear speedups will occur on average.
Abstract: A mathematical model for analysing the speedup behaviour of a parallel k-processor backtracking algorithm compared with sequential backtracking is studied. The essential parameter of a problem class, which is incorporated in the model, is the distribution of solutions in the corresponding backtracking trees. Under the model assumptions it is shown that, in the case of sufficiently unbalanced distributions, superlinear speedups will occur on average. Further, a result is shown indicating that in the case of restricted classes of CNF-formulas, unbalanced distributions of solutions actually occur.
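The superlinear effect can be seen in a toy version of such a model (an illustrative assumption, not the paper's exact analysis): sequential backtracking scans leaves left to right until the first solution, while k processors each scan a contiguous block of leaves and the search stops as soon as any processor succeeds. When solutions cluster in a late block, the speedup exceeds k.

```python
# Toy speedup model for k-processor backtracking over a list of leaves
# (True = solution). Assumes at least one solution exists.

def speedup(leaves, k):
    """Ratio of sequential search time to k-processor parallel time."""
    seq = next(i for i, s in enumerate(leaves) if s) + 1
    size = len(leaves) // k          # contiguous block per processor
    par = min(
        next((i + 1 for i, s in enumerate(leaves[p*size:(p+1)*size]) if s),
             float('inf'))
        for p in range(k))
    return seq / par
```

With 16 leaves, 4 processors, and the only solution at position 13, the sequential search takes 13 steps while processor 3 finds it in 1 step: a speedup of 13 on 4 processors.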

Proceedings ArticleDOI
25 Sep 1988
TL;DR: In this article, a sequential decision rule is described to discriminate probability distributions of VF from ventricular tachycardia (VT) and supraventricular tachycardia (SVT).
Abstract: Ventricular fibrillation (VF) must be accurately detected by an automatic implantable cardioverter-defibrillator and must also be discriminated from ventricular tachycardia (VT) and supraventricular tachycardia (SVT). A sequential decision rule is described to discriminate the probability distributions of VF from those of VT and SVT. Intracardiac signals are first converted to binary sequences by comparison with a threshold. Probability distributions of threshold-crossing intervals are determined. The sequential test calculates a log-likelihood and compares that with preset detection thresholds. The thresholds are set so as to result in the desired test accuracy. Essentially, the sequential algorithm trades off the time to reach a decision (the number of sequential decision steps) against accuracy. In a study of 170 electrograms from humans, 95.3% of VF signals are classified in 3 s, 97.6% in 5 s, and 100% in 7 s. The sequential algorithm offers ease of implementation for implantable devices and excellent performance.
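The decision rule described is a Wald-style sequential test: accumulate a log-likelihood ratio over threshold-crossing intervals and stop as soon as it crosses a preset bound. A minimal sketch (the interval categories, distributions, and thresholds below are illustrative assumptions, not the paper's data):

```python
# Sequential log-likelihood ratio test over threshold-crossing
# intervals: decide 'VF' when the accumulated ratio exceeds log_a,
# 'VT' when it falls below log_b, otherwise keep observing.

import math

def sequential_test(intervals, p_vf, p_vt, log_a, log_b):
    """Return (decision, steps); decision is 'VF', 'VT' or 'undecided'."""
    llr = 0.0
    for step, x in enumerate(intervals, 1):
        llr += math.log(p_vf[x] / p_vt[x])
        if llr >= log_a:
            return 'VF', step
        if llr <= log_b:
            return 'VT', step
    return 'undecided', len(intervals)
```

Raising the magnitude of log_a and log_b increases accuracy but also the number of steps before a decision, which is the trade-off the abstract describes.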

Proceedings ArticleDOI
D.A. Field1, K. Yarnall
14 Nov 1988
TL;DR: A software package for generation of tetrahedral finite-element mesh, whose kernel is a robust three-dimensional Delaunay triangulation algorithm, was ported to a CRAY X-MP for vector processing, reducing the total execution time for the critical subroutines of the kernel by a factor of six.
Abstract: A software package for generation of tetrahedral finite-element mesh, whose kernel is a robust three-dimensional Delaunay triangulation algorithm, was ported to a CRAY X-MP for vector processing. The total execution time for the critical subroutines of the kernel decreased by a factor of six over scalar mode on the CRAY X-MP. The kernel is characterized by simple data structures and O(N²) arithmetic operation counts, N being the number of finite-element nodes. Although the kernel is essentially a sequential algorithm, its simple data structures allow for key uses of vector processing and for streamlining sequential processing.

Journal ArticleDOI
TL;DR: It is shown in the paper that regularization problems, such as the smoothest velocity field computation and the computation of the minimum dilatation velocity field, can be solved with a parallel algorithm or a fast sequential algorithm.
Abstract: The computation of the velocity field along image curves belongs to the class of ill-posed problems (in the sense of Hadamard). Local measurements of image pattern changes are usually insufficient to determine the velocity field uniquely. Therefore regularization techniques are applied, yielding solutions that are robust against noise and that are correct for a limited class of curve velocity fields. It is shown in the paper that regularization problems, such as the smoothest velocity field computation and the computation of the minimum-dilatation velocity field, can be solved with a parallel algorithm or a fast sequential algorithm. This follows from the block-tridiagonal structure to which these variational techniques give rise.
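The fast sequential algorithm follows from the block-tridiagonal structure: such systems can be solved in linear time by forward elimination and back substitution. As a scalar stand-in for the block case (blocks of size 1, an illustrative simplification), this is the Thomas algorithm:

```python
# Thomas algorithm: O(n) solver for a tridiagonal linear system.
# a = sub-diagonal (a[0] unused), b = main diagonal,
# c = super-diagonal (c[n-1] unused), d = right-hand side.

def thomas(a, b, c, d):
    n = len(b)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):                       # forward elimination
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):              # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x
```

The block version replaces each scalar division by a small matrix solve; the parallel alternative mentioned in the abstract typically uses cyclic reduction instead of this strictly sequential sweep.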

Dissertation
01 Jan 1988
TL;DR: Parallel solutions for two classes of linear programs are presented; it is shown that a linear improvement in performance is possible, and a variation of the decomposed simplex algorithm that runs 2 times faster than the original is discovered.
Abstract: Parallel solutions for two classes of linear programs are presented. First we parallelized the two-phase revised simplex algorithm and showed that it is possible to get a linear improvement in performance. The simplex algorithm is the best-known algorithm for solving linear programs, and we claim our result is the best that can be achieved. Next we study the parallelization of the decomposed simplex algorithm. One of our new parallel algorithms achieves a 2P-fold performance improvement over the decomposed simplex algorithm using P processors. Meanwhile, we discovered a particular variation of the decomposed simplex algorithm which can run 2 times faster than the original one. The new parallel algorithm linearly speeds up this fast sequential algorithm. As in any parallel program, unbalanced processor load causes the performance of the parallel decomposed simplex algorithm to drop significantly when the size of the input data is not a multiple of the number of available processors. To remove this limitation, we invented a load-balance technique called Loop Spreading that evenly distributes parallel tasks on multiple processors without a drop in performance even when the size of the input data is not a multiple of the number of processors. Loop Spreading is a general technique that can be used automatically by a compiler to balance processor load in any language that supports parallel loop constructs.
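The balancing idea behind Loop Spreading can be sketched simply (the name is the thesis's; this particular splitting is an assumption about the technique, not its published form): instead of giving every processor ⌈N/P⌉ iterations and leaving the remainder lopsided, spread N iterations so that chunk sizes differ by at most one.

```python
# Distribute n loop iterations over p processors so that per-processor
# chunk sizes differ by at most one, even when p does not divide n.

def spread(n, p):
    """Return per-processor (start, stop) half-open iteration ranges."""
    base, extra = divmod(n, p)
    ranges, start = [], 0
    for i in range(p):
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges
```

For 10 iterations on 4 processors this yields chunks of 3, 3, 2 and 2, so no single processor carries the whole remainder.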

Book ChapterDOI
01 Jan 1988
TL;DR: The appearance of multiprocessor computer systems and local computer networks gives wide latitude for constructing optimization techniques using parallel iterations, including simultaneous (due to many processors) computations (trials) of values of the function to be optimized at several points in a parameter space.

Abstract: The appearance of multiprocessor computer systems and local computer networks gives wide latitude for constructing optimization techniques using parallel iterations, including simultaneous (due to many processors) computations (trials) of values of the function to be optimized at several points in a parameter space. Each trial appearing in such a parallel iteration could be performed on a separate processing unit using the same (shared or copied) program.

Book ChapterDOI
01 Jan 1988
TL;DR: In this work, a parallel stratagem is detailed which gives about a fourfold speedup over a sequential scheme for a six-link robot-arm.
Abstract: The fast real-time control of a rigid open-link robot-arm necessitates that the forward kinematics and inverse dynamics problems be solved in as short a time as possible. The solution of the Newton-Euler equations sequentially, although fast compared with the Lagrange-Euler approach, may still not be fast enough for the real-time determination of applied joint torques and the efficient feedback control of the non-linear effects. In this work, a parallel stratagem is detailed which gives about a fourfold speedup over a sequential scheme for a six-link robot-arm. For an n-link robot-arm, the stratagem uses 2n processing elements, with 2 processing elements assigned to each link. The processors are arranged in two layers, with n processors in each layer. In general, the top layer computes the angular velocity terms and the bottom layer computes the angular acceleration terms.