
Showing papers on "Scalability published in 1987"


Journal ArticleDOI
TL;DR: A work-distribution algorithm is presented that guarantees close to optimal performance on a shared-memory/Ω-network-with-message-combining architecture (e.g. RP3), and the analysis is applicable to other parallel algorithms in which work is dynamically shared between different processors.
Abstract: This paper presents the analysis of a parallel formulation of depth-first search. At the heart of this parallel formulation is a dynamic work-distribution scheme that divides the work between different processors. The effectiveness of the parallel formulation is strongly influenced by the work-distribution scheme and the target architecture. We introduce the concept of isoefficiency function to characterize the effectiveness of different architectures and work-distribution schemes. Many researchers considered the ring architecture to be quite suitable for parallel depth-first search. Our analytical and experimental results show that hypercube and shared-memory schemes are significantly better. The analysis of previously known work-distribution schemes motivated the design of substantially improved schemes for ring and shared-memory architectures. In particular, we present a work-distribution algorithm which guarantees close to optimal performance on a shared-memory/Ω-network-with-message-combining architecture (e.g. RP3). Much of the analysis presented in this paper is applicable to other parallel algorithms in which work is dynamically shared between different processors (e.g., parallel divide-and-conquer algorithms). The concept of isoefficiency is useful in characterizing the scalability of a variety of parallel algorithms.
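For orientation, the isoefficiency idea admits a compact statement (standard formulation; W for serial work, p for processor count, and T_o for total overhead are conventional symbols, not necessarily the paper's own notation):

```latex
E = \frac{W}{W + T_o(W,p)}
\qquad\Longrightarrow\qquad
W = \frac{E}{1-E}\, T_o(W,p) = K \, T_o(W,p)
```

The slower W must grow with p to keep this relation satisfied, the more scalable the algorithm-architecture combination.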

176 citations


Journal ArticleDOI
TL;DR: In this article, a static random access memory (SRAM) cell is proposed in which resistors serve two roles: conventionally, to delay ion-induced transients, and, as a new concept, to divide down voltage transients at the information node.
Abstract: A novel static random access memory (SRAM) cell, the LRAM, is proposed, in which resistors are used both conventionally, to delay ion-induced transients, and to divide down voltage transients at the information node. The voltage divider is a new concept in SEU hardening and has practical value for technologies where the voltage transient duration differs significantly between ion strikes at p- and n-channel drains. In combination, the two pairs of resistors allow much reduced resistor values, with the advantages of faster access times, better temperature stability, and better scalability. Advanced simulations, in which transport and circuit effects are modeled simultaneously, are used to project the viability of the LRAM concept, and data from single-cell test structures support the analysis.
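To make the two resistor roles concrete, here is a minimal back-of-the-envelope sketch; every component value is a hypothetical placeholder chosen for illustration, not a value from the paper.

```python
# Hypothetical illustration of the two resistor roles in the proposed cell.
# All values are made up for illustration; they are not from the paper.

R_DELAY = 50e3    # ohms, series resistor delaying the ion-induced transient
R_DIVIDE = 10e3   # ohms, second resistor forming the divider at the node
C_NODE = 5e-15    # farads, capacitance at the information node

def divided_transient(v_transient: float) -> float:
    """Voltage-divider attenuation of a transient at the information node."""
    return v_transient * R_DIVIDE / (R_DELAY + R_DIVIDE)

def delay_time_constant() -> float:
    """RC time constant governing how much the transient is slowed."""
    return R_DELAY * C_NODE

if __name__ == "__main__":
    print(f"2.0 V strike seen at node: {divided_transient(2.0):.2f} V")
    print(f"delay time constant: {delay_time_constant() * 1e12:.1f} ps")
```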

27 citations


12 Dec 1987
TL;DR: This thesis introduces a new paradigm for name service called decentralized naming, and the multicast name mapping technique is shown to have optimum resiliency, in the sense that whenever an object is accessible at all, it is accessible by name.
Abstract: Designing a global character-string naming facility is an important and difficult problem in distributed systems. Providing global names--names that have the same meaning on any participating machine--is a vital step in welding a collection of individual computers into a single, coherent system. But the nature of large distributed systems makes it difficult to implement global naming with acceptable efficiency, fault tolerance, and security: network communication is costly, system components can fail independently, and parts of the system may belong to many autonomous and mutually-suspicious groups. Existing name service designs do not solve the problem in full; even the best current designs do not have the efficiency or capacity to name every object in a large system--they generally name only hosts or mailboxes, not files. This thesis introduces a new paradigm for name service called decentralized naming. Directories at different levels of the global naming hierarchy are implemented using different techniques. The uppermost (global) level employs conventional distributed name servers for scalability, while at lower (regional and local) levels, naming is handled directly by the managers of the named objects. The name mapping protocol uses multicast for fault tolerance and a specialized caching technique for efficiency. A capability system provides security against counterfeit replies to name lookup requests. The multicast name mapping technique is shown to have optimum resiliency, in the sense that whenever an object is accessible at all, it is accessible by name. An analytical model of cache performance is presented, is validated by comparison with measurements on a prototype implementation, and is used to set a limit on how large directories can grow before they must be treated as global rather than regional. The capability scheme is also analyzed: although it reduces both the efficiency and resiliency of name lookup, its impact can be made as small as desired by limiting the frequency with which security policy is allowed to change.
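A minimal sketch of the lookup flow described above, assuming an invented wire format, group address, and API (the capability-based reply authentication is omitted). The resiliency argument corresponds to step 2: any reachable manager of the object can answer the multicast.

```python
import socket
import time

# Sketch of decentralized name lookup: consult a local cache first, then
# multicast the query so every object manager in the region hears it.
# The group address, port, and wire format are invented for illustration.

MCAST_GROUP, MCAST_PORT = "239.1.2.3", 5007
_cache: dict[str, tuple[str, float]] = {}   # name -> (address, expiry time)
CACHE_TTL = 60.0                            # seconds

def lookup(name: str, timeout: float = 1.0) -> str | None:
    # 1. Cache hit: answer locally, no network traffic.
    entry = _cache.get(name)
    if entry and entry[1] > time.time():
        return entry[0]

    # 2. Cache miss: multicast the query; the manager of the named object
    #    replies directly. (Authentication of replies is omitted here.)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    sock.sendto(name.encode(), (MCAST_GROUP, MCAST_PORT))
    try:
        reply, _ = sock.recvfrom(512)
    except socket.timeout:
        return None            # object (or its manager) unreachable
    finally:
        sock.close()

    address = reply.decode()
    _cache[name] = (address, time.time() + CACHE_TTL)
    return address
```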

8 citations


01 May 1987
TL;DR: This thesis focuses on how to design and implement replication and reconfiguration for the distributed mail repository, considering these questions in the context of the programming language Argus, which was designed to support distributed programming.
Abstract: Conventional approaches to programming produce centralized programs that run on a single computer. However, an unconventional approach can take advantage of low-cost communication and small, inexpensive computers. A distributed program provides service through programs executing at several nodes of a distributed system. Distributed programs can offer two important advantages over centralized programs: high availability and scalability. In a highly-available system, it is very likely that a randomly-chosen transaction will complete successfully. A scalable system's capacity can be increased or decreased to match changes in the demands placed on the system. When a node is unavailable because of maintenance or a crash, transactions may fail unless copies of the node's information are stored at other nodes. Thus, high availability requires replication of data. Both the maintenance of a highly-available system and scalability require the ability to modify and extend a system while it is running, called dynamic reconfiguration or simply reconfiguration. This thesis considers the problem of building scalable and highly-available distributed programs without using special processors with redundant hardware and software. It describes a design and implementation of an example distributed program, an electronic mail repository. The thesis focuses on how to design and implement replication and reconfiguration for the distributed mail repository, considering these questions in the context of the programming language Argus, which was designed to support distributed programming. The thesis makes three distinct contributions. First, it presents the replication techniques chosen for the distributed repository and a discussion of their implementation in Argus. Second, it describes a new method for designing and implementing reconfigurable distributed systems. The new method allows replacement of software components while preserving their state, but requires no changes to the underlying system or language. This contrasts with previous work on guardian replacement in Argus. Third, the thesis evaluates the utility of Argus for applications involving replication and reconfiguration.
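The state-preserving replacement idea reduces to a quiesce-and-handoff pattern; a schematic sketch follows. The interface is invented for illustration and is not the Argus guardian-replacement mechanism itself.

```python
# Schematic sketch of state-preserving component replacement, the idea behind
# the dynamic reconfiguration described above. Names and interfaces are
# invented for illustration; this is not the Argus mechanism.

class MailRepositoryV1:
    def __init__(self) -> None:
        self.mailboxes: dict[str, list[str]] = {}

    def get_state(self) -> dict:
        """Export internal state so a successor can take over."""
        return {"mailboxes": self.mailboxes}

class MailRepositoryV2:
    """New software version; must accept its predecessor's state."""
    def __init__(self, state: dict) -> None:
        self.mailboxes = state["mailboxes"]
        self.audit_log: list[str] = []   # capability added by the new version

def replace(old: MailRepositoryV1) -> MailRepositoryV2:
    # Quiesce the old component, hand its state to the replacement, and
    # redirect clients; the underlying system and language stay unchanged.
    return MailRepositoryV2(old.get_state())
```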

6 citations


Proceedings ArticleDOI
M. Stumm
01 Aug 1987
TL;DR: This paper presents several example solutions for managing resources in a decentralized fashion, using multicasting facilities, and concludes that decentralized solutions compare favorably to centralized solutions with respect to all three criteria.
Abstract: Decentralized resource management in distributed systems has become more practical with the availability of communication facilities that support multicasting. In this paper we present several example solutions for managing resources in a decentralized fashion, using multicasting facilities. We review the properties of these solutions in terms of scalability, fault tolerance and efficiency. We conclude that decentralized solutions compare favorably to centralized solutions with respect to all three criteria.
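One decentralized pattern of this kind is multicast bidding for load sharing; the sketch below is illustrative and not necessarily one of the paper's exact examples.

```python
import random

# A node needing a resource multicasts a request, every peer answers with its
# current load, and the requester picks the least-loaded offer. There is no
# central allocator, hence no single point of failure, and the scheme scales
# with the multicast facility. Names and values are invented for illustration.

class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        self.load = random.random()     # fraction of capacity in use

    def bid(self) -> tuple[float, str]:
        return (self.load, self.name)

def allocate(peers: list[Node]) -> str:
    bids = [p.bid() for p in peers]     # stands in for multicast + replies
    return min(bids)[1]                 # choose the least-loaded node

if __name__ == "__main__":
    cluster = [Node(f"node{i}") for i in range(8)]
    print("run job on:", allocate(cluster))
```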

6 citations


Proceedings ArticleDOI
Eiji Takeda, Kan Takeuchi, Toru Toyabe, K. Ohshima, K. Itoh
01 Apr 1987
TL;DR: In this article, the authors investigated the funneling phenomena in α-particle-induced soft errors using a 3-D device simulator and a new experimental method, including descriptions of the scalability of funneling length (size effects and proximity effects), the effects of reduced supply voltages, the barrier effect in the n+-p structure, and the influence of the source/drain n+ diffusion layer in a switching MOS device.
Abstract: The funneling phenomena in α-particle-induced soft errors are investigated using a 3-D device simulator and a new experimental method. The scope of this paper includes descriptions of: 1) the scalability of funneling length (size effects and proximity effects), 2) the effects of reduced supply voltages, 3) the barrier effect in the n+-p structure, and 4) the influence of the source/drain n+ diffusion layer in a switching MOS device. A new funneling-length model, different from Hu's model, is proposed. It was also found that a "scaling law" for soft errors exists, which sets a scaling limit for planar and trench cells.
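For orientation, the underlying soft-error criterion can be sketched numerically; the 3.6 eV-per-pair constant for silicon is standard, while the deposited energy, funnel-collection fraction, and critical charge below are hypothetical placeholders.

```python
# Back-of-the-envelope soft-error criterion. The 3.6 eV per electron-hole
# pair in silicon is a standard constant; every other number here is a
# hypothetical placeholder, not data from the paper.

E_PAIR_EV = 3.6          # eV needed to create one e-h pair in silicon
Q_E = 1.602e-19          # electron charge, coulombs

def collected_charge(e_dep_mev: float, collection_fraction: float) -> float:
    """Charge collected at the struck node from an alpha hit (coulombs)."""
    pairs = e_dep_mev * 1e6 / E_PAIR_EV
    return pairs * Q_E * collection_fraction

if __name__ == "__main__":
    e_dep = 4.0          # MeV deposited along the track (hypothetical)
    frac = 0.3           # fraction funneled onto the node (hypothetical)
    q_crit = 50e-15      # critical charge of the cell (hypothetical)
    q_coll = collected_charge(e_dep, frac)
    print(f"collected {q_coll*1e15:.0f} fC vs critical {q_crit*1e15:.0f} fC")
    print("upset!" if q_coll > q_crit else "no upset")
```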

3 citations


01 Jan 1987
TL;DR: This work analyzes the performance scalability issue at three different levels: (1) the algorithm restructuring level, (2) the compiler optimization level, and (3) the machine organization level.
Abstract: The peak performance of a multiprocessor system is rarely attainable. Algorithm penalty, interprocessor communication cost, and synchronization overhead are the major factors limiting performance scalability. The MPPP is a trace-driven simulation facility developed for the performance prediction of large-scale multiple vector-processor systems. This facility, when used with the Parafrase program restructurer, provides a powerful tool for studying the performance of parallel Fortran programs under variations of architecture, system, and workload parameters. Using the MPPP facility, we analyze the performance scalability issue at three different levels: (1) the algorithm restructuring level, (2) the compiler optimization level, and (3) the machine organization level. Johnsson's narrow-banded linear-systems solver, the fast Fourier transform, the preconditioned conjugate gradient method, and the Gaussian elimination with pivoting algorithm were used as sample workloads in our analysis.
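The three limiting factors named above can be folded into a toy scalability model; the functional forms and constants below are illustrative assumptions, not the MPPP's actual models.

```python
# Toy speedup model combining the three limiting factors named above:
# algorithm penalty (serial fraction), communication cost, and
# synchronization overhead. Constants are illustrative assumptions only.

def speedup(p: int, t1: float = 1.0, serial_frac: float = 0.05,
            t_comm: float = 0.002, t_sync: float = 0.001) -> float:
    parallel_time = (t1 * serial_frac               # unparallelized work
                     + t1 * (1 - serial_frac) / p   # perfectly split work
                     + t_comm * p                   # communication grows with p
                     + t_sync * p)                  # so does synchronization
    return t1 / parallel_time

if __name__ == "__main__":
    # Speedup rises, peaks, then falls as overheads dominate: scalability
    # depends on all three levels, not on processor count alone.
    for p in (1, 4, 16, 64, 256):
        print(f"p={p:4d}  speedup={speedup(p):6.2f}")
```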

1 citation


01 Jan 1987
TL;DR: In distributed memory multiprocessor systems, the processing elements can be considered to be nodes connected via an interconnection network, and the requirement of network symmetry ensures that each node in the network is identical to any other, thereby greatly reducing the architecture and algorithm design effort.

Abstract: Inherent limitations on the computational power of sequential uniprocessor systems have led to the development of parallel multiprocessor systems. The two major issues in the formulation and design of parallel multiprocessor systems are algorithm design and architecture design. A parallel multiprocessor system should be designed so as to facilitate the design and implementation of efficient parallel algorithms that optimally exploit the capabilities of the system. From an architectural point of view, the system should have low hardware complexity, be built from components that can be easily replicated, exhibit desirable cost-performance characteristics, and scale well in hardware complexity and cost with increasing problem size. In distributed memory multiprocessor systems, the processing elements can be considered to be nodes connected via an interconnection network. To facilitate algorithm and architecture design, we require that the interconnection network have a low diameter, that the system be symmetric, and that each node have a low degree of connectivity. Further, it is desirable that the system configuration and behavior be amenable to a suitable and tractable mathematical description. The requirement of network symmetry ensures that each node in the network is identical to any other, greatly reducing the architecture and algorithm design effort. For most symmetric network topologies, however, the requirements of a low degree of connectivity per node and a low network diameter conflict: a low network diameter often entails that each node have a high degree of connectivity, drastically increasing the number of inter-processor connection links.
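The degree/diameter conflict is easy to quantify with standard topology facts (not results from this paper): a ring has constant degree 2 but diameter ⌊n/2⌋, while a hypercube on n = 2^d nodes keeps the diameter at d = log2 n at the price of a degree that also grows as d.

```python
import math

# Degree vs. diameter for two standard symmetric topologies, illustrating
# the conflict described above. (Standard facts, not from this paper.)

def ring(n: int) -> tuple[int, int]:
    return 2, n // 2                       # (degree, diameter)

def hypercube(n: int) -> tuple[int, int]:
    d = int(math.log2(n))                  # n must be a power of two
    return d, d                            # (degree, diameter)

if __name__ == "__main__":
    print(f"{'nodes':>6} {'ring deg/diam':>14} {'cube deg/diam':>14}")
    for n in (16, 64, 256, 1024):
        print(f"{n:>6} {str(ring(n)):>14} {str(hypercube(n)):>14}")
```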

1 citation