Showing papers on "Parallel processing (DSP implementation)" published in 1991

Journal Article•DOI•

David H. Bailey, Eric Barszcz, John T. Barton, D. S. Browning, Russell Carter, Leonardo Dagum, Rod Fatoohi, Paul O. Frederickson, T. A. Lasinski, Robert Schreiber, Horst D. Simon, V. Venkatakrishnan, Sisira Weeratunga

01 Sep 1991

TL;DR: A new set of benchmarks that mimic the computation and data movement characteristics of large-scale computational fluid dynamics applications has been developed for the performance evaluation of highly parallel supercomputers.

Abstract: A new set of benchmarks has been developed for the performance evaluation of highly parallel supercomputers. These consist of five "parallel kernel" benchmarks and three "simulated application" benchmarks. Together they mimic the computation and data movement characteristics of large-scale computational fluid dynamics applications. The principal distinguishing feature of these benchmarks is their "pencil and paper" specification: all details of these benchmarks are specified only algorithmically. In this way many of the difficulties associated with conventional benchmarking approaches on highly parallel systems are avoided.

2,246 citations

Journal Article•DOI•

High-speed ultrasound volumetric imaging system. II. Parallel processing and image display

O.T. von Ramm¹, Stephen W. Smith², H.G. Pavy¹•Institutions (2)

Duke University¹, Center for Devices and Radiological Health²

01 Jan 1991-IEEE Transactions on Ultrasonics Ferroelectrics and Frequency Control

TL;DR: The design, application, and evaluation of parallel processing to the high-speed volumetric ultrasound imaging system, which uses pulse-echo phased array principles to steer a 2-D array transducer of 289 elements in a pyramidal scan format, are described.

Abstract: For pt.I see ibid., vol.38, no.2, p.100-8 (1991). The authors describe the design, application, and evaluation of parallel processing to the high-speed volumetric ultrasound imaging system. The scanner produces images analogous to an optical camera or the human eye and supplies more information than conventional sonograms. Potential medical applications include improved anatomic visualization, tumor localization, and better assessment of cardiac function. The system uses pulse-echo phased array principles to steer a 2-D array transducer of 289 elements in a pyramidal scan format. Parallel processing in the receive mode produces 4992 scan lines at a rate of approximately 8 frames/s. Echo data for the scanned volume is presented online as projection images with depth perspective, stereoscopic pairs, or multiple tomographic images. The authors also describe the techniques developed for the online display of volumetric images on a conventional CRT oscilloscope and show preliminary volumetric images for each display mode.

433 citations

Journal Article•DOI•

Lazy task creation: a technique for increasing the granularity of parallel programs

E. Mohr¹, David A. Kranz², Robert H. Halstead•Institutions (2)

Yale University¹, Massachusetts Institute of Technology²

01 Jul 1991-IEEE Transactions on Parallel and Distributed Systems

TL;DR: In this paper, the authors present lazy task creation, a method developed for Mul-T (a parallel implementation of Scheme) that combines parallel tasks dynamically at runtime and is favored over the simpler load-based inlining method.

Abstract: When a parallel algorithm is written naturally, the resulting program often produces tasks of a finer grain than an implementation can exploit efficiently. Two solutions to the granularity problem that combine parallel tasks dynamically at runtime are discussed. The simpler load-based inlining method, in which tasks are combined based on dynamic load level, is rejected in favor of the safer and more robust lazy task creation method, in which tasks are created only retroactively as processing results become available. The strategies grew out of work on Mul-T, an efficient parallel implementation of Scheme, but could be used with other languages as well. Mul-T implementations of lazy task creation are described for two contrasting machines, and performance statistics that show the method's effectiveness are presented. Lazy task creation is shown to allow efficient execution of naturally expressed algorithms of a substantially finer grain than possible with previous parallel Lisp systems.

300 citations

Journal Article•DOI•

Contributions of topography and parallel processing to odor coding in the vertebrate olfactory pathway.

John S. Kauer¹•Institutions (1)

Tufts Medical Center¹

01 Feb 1991-Trends in Neurosciences

TL;DR: Analysis of the olfactory system using a combination of physiological measurements and computational approaches might elucidate the principles by which odors are discriminated.

206 citations

Journal Article•DOI•

Parallel processing in the basal ganglia: up to a point.

G. Percheron¹, Michel Filion•Institutions (1)

French Institute of Health and Medical Research¹

01 Feb 1991-Trends in Neurosciences

173 citations

Journal Article•DOI•

Natural Language Processing With Modular PDP Networks And Distributed Lexicon

Risto Miikkulainen¹, Michael G. Dyer¹•Institutions (1)

University of California, Los Angeles¹

01 Jul 1991-Cognitive Science

TL;DR: An approach to connectionist natural language processing is proposed, which is based on hierarchically organized modular parallel distributed processing (PDP) networks and a central lexicon of distributed input/output representations.

159 citations

Splatting: a parallel, feed-forward volume rendering algorithm

Lee Westover¹•Institutions (1)

University of North Carolina at Chapel Hill¹

01 Jul 1991

TL;DR: This thesis presents a feed-forward algorithm, called splatting, that directly renders rectilinear volume meshes, a naturally parallel algorithm that adheres well to the requirements imposed by signal processing theory.

Abstract: Volume rendering is the generation of images from discrete samples of volume data. The volume data is sampled in at least three dimensions and comes in three basic classes: the rectilinear mesh (for example, a stack of computed tomography scans); the curvilinear mesh (for example, computational fluid dynamic data sets of the flow of air over an airplane wing); and the unstructured mesh (for example, a collection of ozone density readings at multiple elevations from a set of collection stations in the United States). Previous methods coerced the volumetric data into line and surface primitives that were viewed on conventional computer graphics displays. This coercion process has two fundamental flaws: viewers are never sure whether they are viewing a feature of the data or an artifact of the coercion process; and the insertion of a geometric modeling procedure into the middle of the display pipeline hampers interactive viewing. New direct rendering approaches that operate on the original data are replacing coercion approaches. These new methods, which avoid the artifacts introduced by conventional graphics primitives, fall into two basic categories: feed-backward methods that attempt to map the image plane onto the data, and feed-forward methods that attempt to map each volume element onto the image plane. This thesis presents a feed-forward algorithm, called splatting, that directly renders rectilinear volume meshes. The method achieves interactive speed through parallel execution, successive refinement, table-driven shading, and table-driven filtering. The method achieves high image quality by paying careful attention to signal processing principles during the process of reconstructing a continuous volume from the sampled input. This thesis' major contribution to computer graphics is the splatting algorithm. It is a naturally parallel algorithm that adheres well to the requirements imposed by signal processing theory. The algorithm has uncommon features. First, it can render volumes as either clouds or surfaces by changing the shading functions. Second, it can smoothly trade rendering time for image quality at several stages of the rendering pipeline. In addition, this thesis presents a theoretical framework for volume rendering.
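
To make the feed-forward idea concrete, the following is a minimal sketch, not the thesis' implementation: every voxel of a rectilinear volume is projected onto the image plane and spread over neighboring pixels through a precomputed footprint table. The orthographic view along the z axis, the Gaussian footprint, the purely additive compositing, and the names `footprint_table` and `splat` are all assumptions chosen to keep the example short; shading and successive refinement are omitted.

```python
import math


def footprint_table(radius=1, sigma=0.7):
    """Precompute a normalized 2-D Gaussian footprint (table-driven filtering)."""
    offsets = range(-radius, radius + 1)
    table = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
              for dx in offsets] for dy in offsets]
    total = sum(map(sum, table))
    return [[w / total for w in row] for row in table]


def splat(volume, radius=1, sigma=0.7):
    """Feed-forward rendering of volume[z][y][x] onto an image[y][x].

    Each voxel contributes independently, which is why the loop over
    voxels could be split across processors.
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    table = footprint_table(radius, sigma)
    image = [[0.0] * width for _ in range(height)]
    for z in range(depth):
        for y in range(height):
            for x in range(width):
                value = volume[z][y][x]
                if value == 0.0:
                    continue
                # Spread the voxel's value over the pixels under its footprint.
                for dy in range(-radius, radius + 1):
                    for dx in range(-radius, radius + 1):
                        py, px = y + dy, x + dx
                        if 0 <= py < height and 0 <= px < width:
                            weight = table[dy + radius][dx + radius]
                            image[py][px] += value * weight
    return image
```

Because the footprint is normalized, a single voxel whose footprint lies entirely inside the image deposits exactly its own value in total, spread over the covered pixels.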

140 citations

Proceedings Article•DOI•

Hardware-assisted replay of multiprocessor programs

David F. Bacon¹, Seth Copen Goldstein¹•Institutions (1)

University of California, Berkeley¹

01 Dec 1991

TL;DR: A hardware/software design is presented that logs the order of shared-memory references so that execution can be replayed under hardware and software control; it represents several orders of magnitude improvement in both performance and log size over purely software-based methods proposed previously.

Abstract: Shared-memory parallel programs can be highly non-deterministic due to the unpredictable order in which shared references are satisfied. However, deterministic execution is extremely important for debugging and can also be used for fault-tolerance and other replay-based algorithms. We present a hardware/software design that allows the order of memory references to be logged. This log can then be used along with hardware and software control to replay execution. Simulation of several parallel programs shows that our device records no more than 1.17 MB/second for an application exhibiting fine-grained sharing behavior on a 16-way multiprocessor consisting of 12 MIPS CPUs. In addition, no probe effect or performance degradation is introduced. This represents several orders of magnitude improvement in both performance and log size over purely software-based methods proposed previously.

115 citations

Patent•

Parallel data processing system combining a SIMD unit with a MIMD unit and sharing a common bus, memory, and system controller

Takashi Kan¹•Institutions (1)

Mitsubishi¹

07 May 1991

TL;DR: In this paper, a SIMD type parallel processing unit (50) and a MIMD type Parallel Data Processing Unit (51) are connected to each other by a common bus (41) and memory (42), and a system controller (43) is provided to allow each of the parallel data processing units to perform its suitable processings.

Abstract: There are SIMD type parallel data processing systems having a single instruction stream and multiple data streams and MIMD type parallel data processing systems having multiple instruction and data streams in the parallel data processing field for performing high-speed data processing. They have both merits and demerits and each have their suitable application fields. Because of this, it is extremely difficult to cover a wide range of application fields with either one of the systems. Then, a SIMD type parallel processing unit (50) and a MIMD type parallel data processing unit (51) are connected to each other by a common bus (41) and a memory (42), and a system controller (43) is provided to allow each of the parallel data processing units to perform its suitable processings, thus making it possible to apply the optimum parallel processing system to a wide range of application fields. That is, simple processings of a large volume of data are allocated to the SIMD type parallel data processing unit, while complex processings of a small volume of data are allocated to the MIMD type parallel data processing unit, whereby processings which have been difficult for a conventional computer to accomplish within an effective time, such as large-scale and complex processings of images, can be performed within a practical time at a high speed.

113 citations

Journal Article•DOI•

Parallel simulated annealing using speculative computation

E.E. Witte¹, Roger D. Chamberlain¹, Mark A. Franklin¹•Institutions (1)

Washington University in St. Louis¹

01 Oct 1991-IEEE Transactions on Parallel and Distributed Systems

TL;DR: In this article, a parallel simulated annealing algorithm that is problem-independent, maintains the serial decision sequence, and obtains speedup which can exceed log₂ P on P processors is discussed.

Abstract: A parallel simulated annealing algorithm that is problem-independent, maintains the serial decision sequence, and obtains speedup which can exceed log₂ P on P processors is discussed. The algorithm achieves parallelism by using the concurrency technique of speculative computation. Implementation of the parallel algorithm on a hypercube multiprocessor and application to a task assignment problem are described. The simulated annealing solutions are shown to be, on average, 28% better than the solutions produced by a random task assignment algorithm and 2% better than the solutions produced by a heuristic.
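
One way to picture speculative computation in annealing is a binary tree of accept/reject outcomes: while the root evaluates move t, other nodes evaluate move t+1 under both possible outcomes, and so on. The sketch below simulates a balanced tree of depth `depth` on a toy one-dimensional problem and checks nothing about the paper's actual design; the energy function, the fixed temperature, and every name here are assumptions. A balanced tree with P = 2^depth - 1 nodes advances `depth` moves per round; how the paper shapes its tree to exceed log₂ P is not stated in the abstract and is not modeled.

```python
import math
import random


def energy(x):
    """Toy objective (hypothetical): minimum at x = 7."""
    return (x - 7) ** 2


def decide(state, delta, u, temp):
    """Metropolis accept/reject for one proposed move."""
    d_e = energy(state + delta) - energy(state)
    return d_e <= 0 or u < math.exp(-d_e / temp)


def make_moves(n, seed=0):
    """Fix proposals and random numbers in advance so the decision
    sequence does not depend on how the work is scheduled."""
    rng = random.Random(seed)
    return [(rng.choice((-1, 1)), rng.random()) for _ in range(n)]


def serial(x0, moves, temp):
    x = x0
    for delta, u in moves:
        if decide(x, delta, u, temp):
            x += delta
    return x


def speculative(x0, moves, temp, depth):
    """Advance up to `depth` moves per round using a speculation tree."""
    x, t, rounds = x0, 0, 0
    while t < len(moves):
        d = min(depth, len(moves) - t)
        # Phase 1: every tree node evaluates its move from its speculated
        # state. The speculated states need no energy evaluations, so all
        # calls to decide() are independent and could run concurrently.
        decisions = {}
        frontier = [((), x)]
        for k in range(d):
            delta, u = moves[t + k]
            next_frontier = []
            for path, state in frontier:
                decisions[path] = decide(state, delta, u, temp)
                next_frontier.append((path + (True,), state + delta))
                next_frontier.append((path + (False,), state))
            frontier = next_frontier
        # Phase 2: follow the path of decisions that actually occurred.
        path = ()
        for k in range(d):
            accepted = decisions[path]
            if accepted:
                x += moves[t + k][0]
            path += (accepted,)
        t += d
        rounds += 1
    return x, rounds
```

Because both versions consume the same pre-drawn moves, `speculative` reaches the same final state as `serial`, which is the sense in which the serial decision sequence is maintained in this toy setting.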

112 citations

Journal Article•DOI•

Maximum likelihood SPECT in clinical computation times using mesh-connected parallel computers

A.W. McCarthy¹, Michael I. Miller¹•Institutions (1)

Washington University in St. Louis¹

01 Jan 1991-IEEE Transactions on Medical Imaging

TL;DR: The authors show that for SPECT imaging on 64×64 image grids, the single-instruction, multiple data (SIMD) distributed array processor containing 64² processors performs the expectation-maximization (EM) algorithm with Good's smoothing at a rate of 1 iteration/1.5 s, which promises fully Bayesian reconstructions for emission tomography, including regularization, in clinical computation times on the order of 1 min/slice.

Abstract: Extending the work of A.W. McCarthy et al. (1988) and M.I. Miller and B. Roysam (1991), the authors demonstrate that a fully parallel implementation of the maximum-likelihood method for single-photon emission computed tomography (SPECT) can be accomplished in clinical time frames on massively parallel systolic array processors. The authors show that for SPECT imaging on 64×64 image grids, with 96 view angles, the single-instruction, multiple data (SIMD) distributed array processor containing 64² processors performs the expectation-maximization (EM) algorithm with Good's smoothing at a rate of 1 iteration/1.5 s. This promises fully Bayesian reconstructions for emission tomography, including regularization, in clinical computation times, which are on the order of 1 min/slice. The most important result of the implementations is that the scaling rules for computation times are roughly linear in the number of processors.
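
For readers unfamiliar with the EM iteration mentioned here, the sketch below shows one step of the standard (unregularized) maximum-likelihood EM update for emission tomography in plain Python. It is only an assumption that the paper's iteration has this general form, and Good's smoothing is deliberately left out. The names `A` (system matrix, detector bins by pixels), `y` (measured counts), and `lam` (current image estimate) are hypothetical.

```python
def mlem_step(A, y, lam):
    """One ML-EM update: lam_j <- lam_j / s_j * sum_i A[i][j] * y_i / (A lam)_i.

    Each pixel j is updated independently of the others, which is what
    makes a one-pixel-per-processor layout a natural fit.
    """
    n_det, n_pix = len(A), len(lam)
    # Forward projection of the current estimate.
    proj = [sum(A[i][j] * lam[j] for j in range(n_pix)) for i in range(n_det)]
    # Ratio of measured to predicted counts in every detector bin.
    ratio = [y[i] / proj[i] if proj[i] > 0 else 0.0 for i in range(n_det)]
    # Sensitivity of each pixel: s_j = sum_i A[i][j].
    sens = [sum(A[i][j] for i in range(n_det)) for j in range(n_pix)]
    new_lam = []
    for j in range(n_pix):
        back = sum(A[i][j] * ratio[i] for i in range(n_det))
        new_lam.append(lam[j] * back / sens[j] if sens[j] > 0 else 0.0)
    return new_lam
```

A useful sanity check on any such implementation is that an image which reproduces the data exactly is a fixed point of the update, and that one step makes the sensitivity-weighted total of the image equal to the total measured counts.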

Journal Article•DOI•

Experimental analysis of a mixed-mode parallel architecture using bitonic sequence sorting

Samuel A. Fineberg¹, Thomas L. Casavant¹, Howard Jay Siegel²•Institutions (2)

University of Iowa¹, Purdue University²

01 Feb 1991-Journal of Parallel and Distributed Computing

TL;DR: Experimentation aimed at determining the potential benefit of mixed-mode SIMD/MIMD parallel architectures is reported, based on timing measurements made on the PASM system prototype at Purdue utilizing carefully coded synthetic variations of a well-known algorithm.

Journal Article•DOI•

The medical archival system: an information retrieval system based on distributed parallel processing

Russell J. Yount¹, John K. Vries¹, Carolyn D. Councill¹•Institutions (1)

University of Pittsburgh¹

11 Jun 1991-Information Processing and Management

TL;DR: The software design of MARS and its implementation as a practical system for large-scale information management are described.

Abstract: The Medical ARchival System (MARS) is an information retrieval system utilizing distributed parallel processing. It features a modular design, machine independence, and a Boolean query interface, based in a UNIX environment. Developed at the University of Pittsburgh in response to the information needs of a large academic health center, MARS integrates textual data from a wide variety of sources to create a single, comprehensive medical records information system. It currently contains 850,000 medical reports, 2,500,000 medical references, and 500,000,000 indexed words. This paper describes the software design of MARS and its implementation as a practical system for large-scale information management.

Patent•

Retrofitting digital video surveillance system

Jeffrey D. Blum, Mark J. Sandford

02 Aug 1991

TL;DR: In this paper, a system consisting of a plurality of processing boards having a substantially similar architecture is presented, which includes several frame grabber/frame storage processing boards, each of which digitizes the analog video signals from the video cameras and stores the digital data in a solid state buffer memory.

Abstract: A system which retrofits to an existing surveillance system and cooperates with sensors, video cameras and video monitors of the existing surveillance system. The system comprises a plurality of processing boards having a substantially similar architecture. Several frame grabber/frame storage processing boards are provided, each of which digitizes the analog video signals from the video cameras and stores the digital data in a solid state buffer memory. Several display boards are provided to display the digitized video data on display monitors. A controller board controls the exchange of video data and command or control messages over a video link and between processing boards, respectively. Additional expansion boards may be added to support additional buffering and system options. Each processing board is built around a parallel processing computer chip.

Journal Article•DOI•

Random access protocols for high-speed interprocessor communication based on an optical passive star topology

Patrick W. Dowd¹•Institutions (1)

State University of New York System¹

01 Jun 1991-Journal of Lightwave Technology

TL;DR: Three examples of star-coupled structures are introduced, one of which exhibits optical self-routing, and the complexity of the communication subsystem is reduced since intermediate buffering and routing of packets are eliminated.

Abstract: A multiple-instruction multiple-data (MIMD) distributed memory parallel computer system environment is considered. Media access control protocols that maintain good performance with high capacity optical channels are investigated. Three examples of star-coupled structures are introduced, one of which exhibits optical self-routing. Self-routing single-step optically interconnected communication structures can be designed through the incorporation of agile laser diode sources and wavelength tunable optical filters in a wavelength-division multiple-access environment. Intermediary latencies typical of MIMD distributed memory systems are eliminated. The degree and diameter of the resulting structures are dramatically reduced, and the complexity of the communication subsystem is reduced since intermediate buffering and routing of packets are eliminated.

Journal Article•DOI•

The DINO parallel programming language

Matthew Rosing¹, Robert B. Schnabel¹, Robert P. Weaver¹•Institutions (1)

University of Colorado Boulder¹

01 Sep 1991-Journal of Parallel and Distributed Computing

TL;DR: The syntax and semantics of the DINO language are described, examples of DINO programs are given, a critique of the DINO language features is presented, and the performance of code generated by the DINO compiler is discussed.

Journal Article•DOI•

Multigrid methods on parallel computers—a survey of recent developments

Oliver A. McBryan¹, Paul O. Frederickson², Johannes Linden, Anton Schüller, Karl Solchenbach, Klaus Stüben, Clemens-August Thole, Ulrich Trottenberg•Institutions (2)

University of Colorado Boulder¹, Ames Research Center²

01 Apr 1991-Impact of Computing in Science and Engineering

TL;DR: It is shown that high performance efficiencies are attainable for standard multigrid on moderately parallel machines but not on massively parallel computers, as indicated by an example of poor efficiency on 65,536 processors, and that parallel machines open the possibility of finding really new approaches to solving standard problems.

Abstract: Multigrid methods have been established as being among the most efficient techniques for solving complex elliptic equations. We sketch the multigrid idea, emphasizing that a multigrid solution is generally obtainable in a time directly proportional to the number of unknown variables on serial computers. Despite this, even the most powerful serial computers are not adequate for solving the very large systems generated, for instance, by discretization of fluid flow in three dimensions. A breakthrough can be achieved here only by highly parallel supercomputers. On the other hand, parallel computers are having a profound impact on computational science. Recently, highly parallel machines have taken the lead as the fastest supercomputers, a trend that is likely to accelerate in the future. We describe some of these new computers, and issues involved in using them. We describe standard parallel multigrid algorithms and discuss the question of how to implement them efficiently on parallel machines. The natural approach is to use grid partitioning. One intrinsic feature of a parallel machine is the need to perform interprocessor communication. It is important to ensure that time spent on such communication is maintained at a small fraction of computation time. We analyze standard parallel multigrid algorithms in two and three dimensions from this point of view, indicating that high performance efficiencies are attainable under suitable conditions on moderately parallel machines. We also demonstrate that such performance is not attainable for multigrid on massively parallel computers, as indicated by an example of poor efficiency on 65,536 processors. The fundamental difficulty is the inability to keep 65,536 processors busy when operating on very coarse grids. This example indicates that the straightforward parallelization of multigrid (and other) algorithms may not always be optimal. 
However, parallel machines open the possibility of finding really new approaches to solving standard problems. In particular, we present an intrinsically parallel variant of standard multigrid. This “PSMG” (parallel superconvergent multigrid) method allows all processors to be used at all times, even when processing on the coarsest grid levels. The sequential version of this method is not a sensible algorithm
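
The coarse-grid difficulty described in this abstract can be illustrated with simple arithmetic. The sketch below assumes a 2-D grid that is coarsened by a factor of two per dimension at each level, with grid points spread evenly over the processors; the 256×256 finest grid and the function name are assumptions for illustration only, and the model ignores communication cost entirely.

```python
def busy_fraction_per_level(finest_n, processors, dims=2):
    """Return (grid size per dimension, fraction of processors with work)
    for every level of a standard coarsening hierarchy."""
    levels = []
    n = finest_n
    while n >= 1:
        points = n ** dims
        # A processor is busy only if at least one grid point maps to it.
        levels.append((n, min(1.0, points / processors)))
        n //= 2
    return levels


if __name__ == "__main__":
    for n, busy in busy_fraction_per_level(256, 65536):
        print(f"{n:4d} x {n:<4d} grid: {busy:.6f} of processors busy")
```

Under these assumptions every processor has work only on the finest level; one level down, three quarters of them are idle, and the idle share keeps growing toward the coarsest grid.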

Studying parallel program behavior with upshot

Virginia Herrarte, Ewing Lusk

01 Aug 1991

TL;DR: This is a description of and a user's manual for upshot, an X-based graphics tool for viewing log files produced by parallel programs.

Abstract: This is a description of and a user's manual for upshot, an X-based graphics tool for viewing log files produced by parallel programs.

Journal Article•DOI•

Parallel versus serial processing in visual search: further evidence from subadditive effects of visual quality.

Howard E. Egeth¹, Dale Dagenbach¹•Institutions (1)

Johns Hopkins University¹

01 May 1991-Journal of Experimental Psychology: Human Perception and Performance

TL;DR: In this paper, a diagnostic for distinguishing between serial and parallel processing in visual search is proposed, which is based on testing for subadditive effects of a within-trial visual quality manipulation on target-absent trials.

Abstract: The authors propose a diagnostic for distinguishing between serial and parallel processing in visual search; it is based on testing for subadditive effects of a within-trial visual quality manipulation on target-absent trials. It was evaluated in 2 experiments wherein parallel and serial processing might be expected on the basis of previous work and was then applied to a more uncertain situation in a third experiment. The diagnostic indicates parallel processing of stimuli that differ from each other on a featural basis (Xs and Os) and canonical letters that differ in line arrangement (Ts and Ls) but serial processing when Ts and Ls are randomly rotated. These results form a coherent pattern that is understandable in terms of the literature on visual search, and thus they suggest that the diagnostic may be a useful addition to the methodology used to distinguish between serial and parallel processes.

Patent•

Page-description language interpreter for a parallel-processing system

Fumio Nagasaka¹•Institutions (1)

Epson¹

05 Apr 1991

TL;DR: The rasterize processing for obtaining printing picture element information from a source file described in a page-description language is distributed among a plurality of information processing units (6a, 6b, 6c) loosely connected via a network.

Abstract: The rasterize processing for obtaining printing picture element information from a source file described in a page-description language is distributed among a plurality of information processing units (6a, 6b, 6c) loosely connected via a network (7). In the information processing unit (6a) which generates a printing request, a client process (210) converts a source file (19) into an intermediate code file (10) and further divides the intermediate code file into a plurality of partial files that can be rasterized independently. A part of these plural partial files is given to a rasterizer (212) of the information processing unit (6a) which generates the printing request, so as to be rasterized into picture element information. The remaining part of the plural partial files is distributed to the other information processing units (6b, 6c) via the network. In each of these other information processing units (6b, 6c), the distributed partial file is received by a server process (211) and transmitted to the rasterizer (212) to form partial picture element information. The partial picture element information formed by these other information processing units (6b, 6c) is returned to the information processing unit (6a) which generates the printing request. In this information processing unit (6a), the client process (210) combines the picture element information returned from the other information processing units (6b, 6c) with the picture element information formed by the rasterizer (212) of its own unit, to form the entire picture element information. The entire picture element information is transmitted to a printing unit (21).

Patent•

Input/output system for parallel processing arrays

John R. Nickolls, Won S. Kim, John Zapisek, William T. Blank

06 Dec 1991

TL;DR: A massively parallel processor includes an array of processor elements (20), or PEs, and a multi-stage router interconnection network (30), which is used both for I/O communications and for simultaneous PE to PE communications.

Abstract: A massively parallel processor includes an array of processor elements (20), or PEs, and a multi-stage router interconnection network (30), which is used both for I/O communications and for simultaneous PE to PE communications. The I/O system (10) for the massively parallel processor is based on a globally shared addressable I/O RAM buffer memory (50) that has parallel address and data buses (52) to the I/O devices (80, 82) and other parallel address and data buses (42) which are coupled to a router I/O element array (40). The router I/O element array is in turn coupled to the bit-serial router ports (e.g. S2 ^_O ^_XO) of the second stage (430) of the router interconnection network. The router I/O array provides the corner turn conversion between the massive array of bit-serial router lines (32) and the relatively few parallel buses (52) to the I/O devices.

Book•

Advances in languages and compilers for parallel processing

Alexandru Nicolau, David Gelernter, Thomas Gross, David Padua

02 Jan 1991

TL;DR: "Advances in Languages and Compilers for Parallel Processing" discusses languages and language extensions, presents two innovative environments for parallel programming, describes techniques for debugging parallel programs, and takes up the important issue of data organization and management during parallel processing.

Abstract: These twenty-three contributions represent some of the best research on software for parallel computers being done in universities and industry today. "Advances in Languages and Compilers for Parallel Processing" discusses languages and language extensions, presents two innovative environments for parallel programming, describes techniques for debugging parallel programs, and takes up the important issue of data organization and management during parallel processing. New compiler techniques for parallelizing loops are covered, as are new results in code scheduling and new approaches to dependency analysis and representation. The book concludes with an interesting insight into the measurement of parallelism implicit in ordinary programs and methods for dealing with programming and compiling for distributed and shared memory multiprocessors.

Patent•

Character recognition system using the generalized hough transformation and method

Fumihiko Saitoh¹•Institutions (1)

IBM¹

31 Jul 1991

TL;DR: A character recognition system and method using the generalized Hough transform (GHT) are disclosed, in which a template table which stores edge point parameters to be used for the GHT is compressed so as to include only predetermined parameters, and is then divided into a plurality of template tables which are respectively loaded in the memories of a plurality of subprocessors operating in parallel under the control of a main processor.

Abstract: A character recognition system and method using the generalized Hough transform are disclosed. A template table which stores edge point parameters to be used for the generalized Hough transform is compressed so as to include only predetermined parameters, and is then divided into a plurality of template tables which are respectively loaded in the memories of a plurality of subprocessors operating in parallel under the control of a main processor. In performing recognition processing, these subprocessors operate in parallel according to their related partial template tables. Character recognition using the generalized Hough transform provides a high rate of character recognition. Also, parallel processing using the compressed template tables and partial template tables helps shorten table search time and computation time, thereby increasing processing efficiency.
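
The division of a template table among subprocessors can be sketched as follows. This is a simplified illustration, not the patented method: the table is a plain list of (direction bin, dx, dy) entries, it is split round-robin, each part votes into its own accumulator, and a main routine merges the votes. The table compression step is not modeled, the parts are processed one after another rather than on real subprocessors, and all function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_template_table(template_edges, reference):
    """template_edges: list of (x, y, direction_bin) for one character.
    Each entry stores the offset from an edge point to the reference point."""
    rx, ry = reference
    return [(d, rx - x, ry - y) for x, y, d in template_edges]


def partition(table, parts):
    """Split the table round-robin into `parts` partial tables."""
    return [table[i::parts] for i in range(parts)]


def vote(partial_table, image_edges):
    """Work done by one subprocessor: vote using only its partial table."""
    by_direction = defaultdict(list)
    for d, dx, dy in partial_table:
        by_direction[d].append((dx, dy))
    accumulator = Counter()
    for x, y, d in image_edges:
        for dx, dy in by_direction[d]:
            accumulator[(x + dx, y + dy)] += 1
    return accumulator


def recognize(table, image_edges, parts=4):
    """Main processor: merge partial accumulators and return the peak."""
    total = Counter()
    for partial in partition(table, parts):  # stands in for one subprocessor each
        total.update(vote(partial, image_edges))
    return total.most_common(1)[0]
```

Since every table entry lands in exactly one partial table, the merged accumulator equals the one a single processor would build from the full table, while each worker searches only a fraction of the entries.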

Journal Article•DOI•

Parallel Newton type methods for power system stability analysis using local and shared memory multiprocessors

J.S. Chai¹, N. Zhu¹, Anjan Bose¹, Daniel Tylavsky¹•Institutions (1)

Arizona State University¹

01 Nov 1991-IEEE Transactions on Power Systems

TL;DR: The main thrust is to explore the match between the algorithms, their implementation, and the machine architectures, and to present various considerations together with the results.

Abstract: Both the very dishonest Newton (VDHN) and the successive over relaxed (SOR) Newton algorithms have been implemented on the iPSC/2 and Alliant FX/8 computers for power system dynamic simulation using complex generator and nonlinear load models. The main thrust is to explore the match between the algorithms, their implementation, and the machine architectures. For example, the less parallel but sequentially faster VDHN runs faster on the hypercube (iPSC/2) whereas the more parallel SOR-Newton requires data sharing more often because of the extra iterations and does better on the Alliant. The implementation on the hypercube requires significant manual programming to schedule the processors and their communication whereas the compiler in the Alliant recognizes parallel steps but only if the software is properly coded. The authors present these various considerations together with the results.

Journal Article•DOI•

Algorithms for unboundedly parallel simulations

Albert G. Greenberg¹, Boris Dmitrievich Lubachevsky², Isi Mitrani•Institutions (2)

Bell Labs¹, University of Newcastle²

01 Aug 1991-ACM Transactions on Computer Systems

TL;DR: Efficient parallel simulations are given for a variety of queueing networks having a global first come first served structure, and the problem of simulating the arrival and departure times for the first N jobs to a single G/G/1 queue is solved in time proportional to N/P + log P using P processors.

Abstract: New methods are presented for parallel simulation of discrete event systems that, when applicable, can usefully employ a number of processors much larger than the number of objects in the system being simulated. Abandoning the distributed event list approach, the simulation problem is posed using recurrence relations. We bring three algorithmic ideas to bear on parallel simulation: parallel prefix computation, parallel merging, and iterative folding. Efficient parallel simulations are given for (in turn) the G/G/1 queue, a variety of queueing networks having a global first come first served structure (e.g., a series of queues with finite buffers), acyclic networks of queues, and networks of queues with feedbacks and cycles. In particular, the problem of simulating the arrival and departure times for the first N jobs to a single G/G/1 queue is solved in time proportional to N/P + log P using P processors.
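
Illustrative sketch (not taken from the paper): the idea of posing a single-server FIFO queue as a recurrence and solving it by prefix computation can be shown with the standard departure-time recurrence D_i = max(A_i, D_{i-1}) + S_i, where A_i is the arrival time and S_i the service time of job i. Each step is a map x -> max(x + a, b), and such maps compose associatively, so a prefix scan applies. The function names below are hypothetical, and the doubling scan is simulated sequentially; it is a simple stand-in and makes no claim to reproduce the authors' N/P + log P algorithm.

```python
import math
from typing import List, Tuple

# A step of the recurrence is the map x -> max(x + a, b), stored as (a, b).
Step = Tuple[float, float]


def departures_sequential(arrivals: List[float], services: List[float]) -> List[float]:
    """Reference: evaluate D_i = max(A_i, D_{i-1}) + S_i one job at a time."""
    out = []
    prev = -math.inf  # no job ahead of the first one
    for a, s in zip(arrivals, services):
        prev = max(a, prev) + s
        out.append(prev)
    return out


def combine(f: Step, g: Step) -> Step:
    """Compose two steps (f first, then g); the result has the same form."""
    a1, b1 = f
    a2, b2 = g
    return (a1 + a2, max(b1 + a2, b2))


def departures_prefix(arrivals: List[float], services: List[float]) -> List[float]:
    """Same result via an inclusive prefix scan over the associative `combine`.

    Within each round every position is updated independently of the others,
    so a round could be spread across processors; here it is a list
    comprehension for clarity.
    """
    # Job i alone is the map x -> max(x + S_i, A_i + S_i).
    elems: List[Step] = [(s, a + s) for a, s in zip(arrivals, services)]
    n = len(elems)
    step = 1
    while step < n:
        elems = [
            elems[i] if i < step else combine(elems[i - step], elems[i])
            for i in range(n)
        ]
        step *= 2
    # Applying the composed map to D_0 = -inf leaves only the b component.
    return [b for _, b in elems]


if __name__ == "__main__":
    arrivals = [0, 1, 2, 10]
    services = [3, 3, 1, 2]
    print(departures_sequential(arrivals, services))
    print(departures_prefix(arrivals, services))
```

The scan takes about log2(N) rounds. A work-efficient variant would first reduce blocks of N/P jobs on each processor and then scan the P block summaries, which is the general shape behind bounds of the form N/P + log P.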

Proceedings Article•DOI•

Practical prefetching techniques for parallel file systems

David Kotz¹, Carla Schlatter Ellis•Institutions (1)

Dartmouth College¹

01 Dec 1991

TL;DR: Results show that prefetching can be implemented efficiently even for the more complex parallel file access patterns, and the ability of these policies across a range of architectural parameters is tested.

Abstract: Improvements in the processing speed of multiprocessors are outpacing improvements in the speed of disk hardware. Parallel disk I/O subsystems have been proposed as one way to close the gap between processor and disk speeds. In a previous paper (see IEEE Trans. on Parallel and Distributed Systems, vol.1, no.2, p.218-30, 1990) the authors showed that prefetching and caching have the potential to deliver the performance benefits of parallel file systems to parallel applications. They describe experiments with practical prefetching policies, and show that prefetching can be implemented efficiently even for the more complex parallel file access patterns. They also test the ability of these policies across a range of architectural parameters.

Journal Article•DOI•

Modeling red pine tree survival with an artificial neural network

Biing T. Guan¹, George Z. Gertner¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

01 Nov 1991-Forest Science

Patent•

Adaptive fast fuzzy clustering system

Michael A. Bickel

29 Apr 1991

TL;DR: A parallel processing computer system is presented for clustering data points in continuous feature space by adaptively separating classes of patterns; it is based upon the gaps between successive data values within single features.

Abstract: A parallel processing computer system for clustering data points in continuous feature space by adaptively separating classes of patterns. The preferred embodiment for this massively parallel system includes preferably one computer processor per feature and requires a single a priori assumption of central tendency in the distributions defining the pattern classes. It advantageously exploits the presence of noise inherent in the data gathering to not only classify data points into clusters, but also measure the certainty of the classification for each data point, thereby identifying outliers and spurious data points. The system taught by the present invention is based upon the gaps between successive data values within single features. This single feature discrimination aspect is achieved by applying a minimax comparison involving gap lengths and locations of the largest and smallest gaps. Clustering may be performed in near-real-time on huge data spaces having unlimited numbers of features.

Journal Article•DOI•

Cognitive architectures from the standpoint of an experimental psychologist

W. K. Estes

01 Jan 1991-Annual Review of Psychology

Patent•

Architecture for integrated concurrent vector signal processor

Alexander Genusov¹, Ram B. Friedlander¹, Peter Feldman¹, Vlad Fruchter¹, Ricardo Jaliff¹, Asaf Mohr¹, Rafi Retter¹•Institutions (1)

Zoran Corporation¹

31 May 1991

TL;DR: A vector signal processor for concurrent, parallel processing of complex vectors is described. The principal processing units are an execution unit, a data movement unit, a control/register unit, a vector buffer unit, an instruction fetch unit, and a bus interface unit.

Abstract: Multiple special purpose processing units are provided in a vector signal processor for concurrent, parallel processing, particularly of complex vectors. The principal processing units are an execution unit, a data movement unit, a control/register unit, a vector buffer unit, an instruction fetch unit, and a bus interface unit.
