Book

Computer Architecture and Parallel Processing

TL;DR: The authors have divided the use of computers into the following four levels of sophistication: data processing, information processing, knowledge processing, and intelligence processing.
Abstract: The book is intended as a text to support two semesters of courses in computer architecture at the college senior and graduate levels. There are excellent problems for students at the end of each chapter. The authors have divided the use of computers into the following four levels of sophistication: data processing, information processing, knowledge processing, and intelligence processing.
Citations
Book
01 Oct 1992
TL;DR: This book provides an introduction to the design and analysis of parallel algorithms, with the emphasis on the application of the PRAM model of parallel computation, with all its variants, to algorithm analysis.
Abstract: Written by an authority in the field, this book provides an introduction to the design and analysis of parallel algorithms. The emphasis is on the application of the PRAM (parallel random access machine) model of parallel computation, with all its variants, to algorithm analysis. Special attention is given to the selection of relevant data structures and to algorithm design principles that have proved to be useful. Features: uses the PRAM (parallel random access machine) as the model for parallel computation; covers all essential classes of parallel algorithms; includes rich exercise sets; written by a highly respected author within the field.

1,577 citations


Cites background from "Computer Architecture and Parallel ..."

  • ...Parallel architectures have been described in several books (see, for example, [18, 29])....

Journal ArticleDOI
TL;DR: The PVM system is a programming environment for the development and execution of large concurrent or parallel applications that consist of many interacting, but relatively independent, components that operate on a collection of heterogeneous computing elements interconnected by one or more networks.
Abstract: The PVM system is a programming environment for the development and execution of large concurrent or parallel applications that consist of many interacting, but relatively independent, components. It is intended to operate on a collection of heterogeneous computing elements interconnected by one or more networks. The participating processors may be scalar machines, multiprocessors, or special-purpose computers, enabling application components to execute on the architecture most appropriate to the algorithm. PVM provides a straightforward and general interface that permits the description of various types of algorithms (and their interactions), while the underlying infrastructure permits the execution of applications on a virtual computing environment that supports multiple parallel computation models. PVM contains facilities for concurrent, sequential, or conditional execution of application components, is portable to a variety of architectures, and supports certain forms of error detection and recovery.

1,324 citations


Cites background from "Computer Architecture and Parallel ..."

  • ...However, most of the research efforts have concentrated either upon computational models [1], parallel versions of algorithms, or machine architectures; relatively little attention has been given to software development environments or program construction techniques that are required in order to translate algorithms into operational programs....

Book
25 Oct 1989
TL;DR: This book introduces a new approach to the design and implementation of software systems which will help users of large scale parallel systems coordinate many concurrent activities toward a single goal, and proposes a selection of independent algorithmic skeletons, each of which describes the structure of a particular style of algorithm.
Abstract: This book introduces a new approach to the design and implementation of software systems which will help users of large scale parallel systems coordinate many concurrent activities toward a single goal. It assesses the strengths and weaknesses of this approach against existing alternatives. Cole's system proposes a selection of independent algorithmic skeletons, each of which describes the structure of a particular style of algorithm. The user must describe a solution to a problem as an instance of the appropriate skeleton. The implementation task is simplified by the fact that each skeleton may be considered independently, in contrast to the monolithic programming interfaces of existing systems at a similar level of abstraction. The book describes four skeletons based on the notions of fixed degree divide and conquer, task queues, iterative combination, and clustering. Each is introduced in terms of the abstraction it presents to the user. Implementation on a square grid of autonomous processor-memory pairs is considered, and examples of problems which could be solved in terms of the skeleton are presented. Murray I. Cole is a Lecturer in the Computing Science Department of the University of Glasgow. "Algorithmic Skeletons" is included in the series Research Monographs in Parallel and Distributed Computing, copublished with Pitman Publishing.

1,001 citations


Cites background from "Computer Architecture and Parallel ..."

  • ...It is well known [17] that, in general, the significant factors which act to reduce pipeline efficiency are gaps in the flow of data and uneven length (in time) of stages....

Proceedings ArticleDOI
01 May 1996
TL;DR: Experiments on a 12-node SGI Challenge multiprocessor indicate that the new non-blocking queue consistently outperforms the best known alternatives; it is the clear algorithm of choice for machines that provide a universal atomic primitive (e.g., compare_and_swap or load_linked/store_conditional).
Abstract: Drawing ideas from previous authors, we present a new non-blocking concurrent queue algorithm and a new two-lock queue algorithm in which one enqueue and one dequeue can proceed concurrently. Both algorithms are simple, fast, and practical; we were surprised not to find them in the literature. Experiments on a 12-node SGI Challenge multiprocessor indicate that the new non-blocking queue consistently outperforms the best known alternatives; it is the clear algorithm of choice for machines that provide a universal atomic primitive (e.g., compare_and_swap or load_linked/store_conditional). The two-lock concurrent queue outperforms a single lock when several processes are competing simultaneously for access; it appears to be the algorithm of choice for busy queues on machines with non-universal atomic primitives (e.g., test_and_set). Since much of the motivation for non-blocking algorithms is rooted in their immunity to large, unpredictable delays in process execution, we report experimental results both for systems with dedicated processors and for systems with several processes multiprogrammed on each processor.
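The two-lock idea in this abstract (one enqueue and one dequeue proceeding concurrently) can be illustrated with a minimal sketch. This is not the authors' code: it is a Python illustration that assumes a singly linked list with a dummy head node, so that enqueuers touch only the tail and dequeuers touch only the head. The names `TwoLockQueue`, `_Node`, `enqueue`, and `dequeue` are hypothetical, and the sketch relies on CPython's atomic reference assignment for a safe read of `next`.

```python
import threading


class _Node:
    """Linked-list cell; a hypothetical helper for this sketch."""
    __slots__ = ("value", "next")

    def __init__(self, value=None):
        self.value = value
        self.next = None


class TwoLockQueue:
    """FIFO queue guarded by separate head and tail locks (illustrative only)."""

    def __init__(self):
        dummy = _Node()            # the dummy node keeps head and tail apart
        self._head = dummy
        self._tail = dummy
        self._head_lock = threading.Lock()
        self._tail_lock = threading.Lock()

    def enqueue(self, value):
        node = _Node(value)
        with self._tail_lock:      # only enqueuers contend for this lock
            self._tail.next = node
            self._tail = node

    def dequeue(self):
        """Return (True, value), or (False, None) if the queue is empty."""
        with self._head_lock:      # only dequeuers contend for this lock
            first = self._head.next
            if first is None:
                return False, None
            value = first.value
            first.value = None     # 'first' becomes the new dummy node
            self._head = first
            return True, value
```

Because the dummy node means the head and tail pointers are never updated by the same operation, an enqueue and a dequeue need no common lock, which is the property the abstract highlights.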

939 citations


Cites background from "Computer Architecture and Parallel ..."

  • ...Hwang and Briggs [7], Sites [17], and Stone [20] present lock-free algorithms based on compare_and_swap....

Patent
02 May 2003
TL;DR: A mass storage system made of flash electrically erasable and programmable read only memory (EEPROM) cells organized into blocks, the blocks in turn being grouped into memory banks, is managed to even out the numbers of erase and rewrite cycles experienced by the memory banks in order to extend the service lifetime of the memory.
Abstract: A mass storage system made of flash electrically erasable and programmable read only memory ("EEPROM") cells organized into blocks, the blocks in turn being grouped into memory banks, is managed to even out the numbers of erase and rewrite cycles experienced by the memory banks in order to extend the service lifetime of the memory system. Since this type of memory cell becomes unusable after a finite number of erase and rewrite cycles, although in the tens of thousands of cycles, uneven use of the memory banks is avoided so that the entire memory does not become inoperative because one of its banks has reached its end of life while others of the banks are little used. Relative use of the memory banks is monitored and, in response to detection of uneven use, the banks have their physical addresses periodically swapped for each other in order to even out their use over the lifetime of the memory.

822 citations