
Showing papers on "Scalability published in 1987"


Journal ArticleDOI
TL;DR: A work-distribution algorithm is presented that guarantees close to optimal performance on a shared-memory/Ω-network-with-message-combining architecture (e.g. RP3), and the analysis is applicable to other parallel algorithms in which work is dynamically shared between different processors.
Abstract: This paper presents the analysis of a parallel formulation of depth-first search. At the heart of this parallel formulation is a dynamic work-distribution scheme that divides the work between different processors. The effectiveness of the parallel formulation is strongly influenced by the work-distribution scheme and the target architecture. We introduce the concept of isoefficiency function to characterize the effectiveness of different architectures and work-distribution schemes. Many researchers considered the ring architecture to be quite suitable for parallel depth-first search. Our analytical and experimental results show that hypercube and shared-memory schemes are significantly better. The analysis of previously known work-distribution schemes motivated the design of substantially improved schemes for ring and shared-memory architectures. In particular, we present a work-distribution algorithm which guarantees close to optimal performance on a shared-memory/Ω-network-with-message-combining architecture (e.g. RP3). Much of the analysis presented in this paper is applicable to other parallel algorithms in which work is dynamically shared between different processors (e.g., parallel divide-and-conquer algorithms). The concept of isoefficiency is useful in characterizing the scalability of a variety of parallel algorithms.
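For orientation, the isoefficiency idea admits a compact statement (standard formulation; W for serial work, p for processor count, and T_o for total overhead are conventional symbols, not necessarily the paper's own notation):

```latex
E = \frac{W}{W + T_o(W,p)}
\qquad\Longrightarrow\qquad
W = \frac{E}{1-E}\, T_o(W,p) = K \, T_o(W,p)
```

The slower W must grow with p to keep this relation satisfied, the more scalable the algorithm-architecture combination.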

176 citations


Journal ArticleDOI
TL;DR: In this article, a static random access memory (SRAM) cell is proposed in which resistors serve two roles: conventionally, to delay ion-induced transients, and, as a new concept, to divide down voltage transients at the information node.
Abstract: A novel static random access memory (SRAM) cell, the LRAM, is proposed, in which resistors are used both conventionally, to delay ion-induced transients, and to divide down voltage transients at the information node. The voltage divider is a new concept in SEU hardening and has practical value for technologies where the voltage transient duration differs significantly between ion strikes at p- and n-channel drains. In combination, the two pairs of resistors allow much reduced resistor values, with the advantages of faster access times, better temperature stability, and better scalability. Advanced simulations, in which transport and circuit effects are modeled simultaneously, are used to project the viability of the LRAM concept, and data from single-cell test structures support the analysis.
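To make the two resistor roles concrete, here is a minimal back-of-the-envelope sketch; every component value is a hypothetical placeholder chosen for illustration, not a value from the paper.

```python
# Hypothetical illustration of the two resistor roles in the proposed cell.
# All values are made up for illustration; they are not from the paper.

R_DELAY = 50e3    # ohms, series resistor delaying the ion-induced transient
R_DIVIDE = 10e3   # ohms, second resistor forming the divider at the node
C_NODE = 5e-15    # farads, capacitance at the information node

def divided_transient(v_transient: float) -> float:
    """Voltage-divider attenuation of a transient at the information node."""
    return v_transient * R_DIVIDE / (R_DELAY + R_DIVIDE)

def delay_time_constant() -> float:
    """RC time constant governing how much the transient is slowed."""
    return R_DELAY * C_NODE

if __name__ == "__main__":
    print(f"2.0 V strike seen at node: {divided_transient(2.0):.2f} V")
    print(f"delay time constant: {delay_time_constant() * 1e12:.1f} ps")
```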

27 citations


12 Dec 1987
TL;DR: This thesis introduces a new paradigm for name service called decentralized naming, and the multicast name mapping technique is shown to have optimum resiliency, in the sense that whenever an object is accessible at all, it is accessible by name.
Abstract: Designing a global character-string naming facility is an important and difficult problem in distributed systems. Providing global names--names that have the same meaning on any participating machine--is a vital step in welding a collection of individual computers into a single, coherent system. But the nature of large distributed systems makes it difficult to implement global naming with acceptable efficiency, fault tolerance, and security: network communication is costly, system components can fail independently, and parts of the system may belong to many autonomous and mutually-suspicious groups. Existing name service designs do not solve the problem in full; even the best current designs do not have the efficiency or capacity to name every object in a large system--they generally name only hosts or mailboxes, not files. This thesis introduces a new paradigm for name service called decentralized naming. Directories at different levels of the global naming hierarchy are implemented using different techniques. The uppermost (global) level employs conventional distributed name servers for scalability, while at lower (regional and local) levels, naming is handled directly by the managers of the named objects. The name mapping protocol uses multicast for fault tolerance and a specialized caching technique for efficiency. A capability system provides security against counterfeit replies to name lookup requests. The multicast name mapping technique is shown to have optimum resiliency, in the sense that whenever an object is accessible at all, it is accessible by name. An analytical model of cache performance is presented, is validated by comparison with measurements on a prototype implementation, and is used to set a limit on how large directories can grow before they must be treated as global rather than regional. The capability scheme is also analyzed: although it reduces both the efficiency and resiliency of name lookup, its impact can be made as small as desired by limiting the frequency with which security policy is allowed to change.
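A minimal sketch of the lookup flow described above, assuming an invented wire format, group address, and API (the capability-based reply authentication is omitted). The resiliency argument corresponds to step 2: any reachable manager of the object can answer the multicast.

```python
import socket
import time

# Sketch of decentralized name lookup: consult a local cache first, then
# multicast the query so every object manager in the region hears it.
# The group address, port, and wire format are invented for illustration.

MCAST_GROUP, MCAST_PORT = "239.1.2.3", 5007
_cache: dict[str, tuple[str, float]] = {}   # name -> (address, expiry time)
CACHE_TTL = 60.0                            # seconds

def lookup(name: str, timeout: float = 1.0) -> str | None:
    # 1. Cache hit: answer locally, no network traffic.
    entry = _cache.get(name)
    if entry and entry[1] > time.time():
        return entry[0]

    # 2. Cache miss: multicast the query; the manager of the named object
    #    replies directly. (Authentication of replies is omitted here.)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    sock.sendto(name.encode(), (MCAST_GROUP, MCAST_PORT))
    try:
        reply, _ = sock.recvfrom(512)
    except socket.timeout:
        return None            # object (or its manager) unreachable
    finally:
        sock.close()

    address = reply.decode()
    _cache[name] = (address, time.time() + CACHE_TTL)
    return address
```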

8 citations


01 May 1987
TL;DR: This thesis focuses on how to design and implement replication and reconfiguration for the distributed mail repository, considering these questions in the context of the programming language Argus, which was designed to support distributed programming.
Abstract: Conventional approaches to programming produce centralized programs that run on a single computer. However, an unconventional approach can take advantage of low-cost communication and small, inexpensive computers. A distributed program provides service through programs executing at several nodes of a distributed system. Distributed programs can offer two important advantages over centralized programs: high availability and scalability. In a highly-available system, it is very likely that a randomly-chosen transaction will complete successfully. A scalable system's capacity can be increased or decreased to match changes in the demands placed on the system. When a node is unavailable because of maintenance or a crash, transactions may fail unless copies of the node's information are stored at other nodes. Thus, high availability requires replication of data. Both the maintenance of a highly-available system and scalability require the ability to modify and extend a system while it is running, called dynamic reconfiguration or simply reconfiguration. This thesis considers the problem of building scalable and highly-available distributed programs without using special processors with redundant hardware and software. It describes a design and implementation of an example distributed program, an electronic mail repository. The thesis focuses on how to design and implement replication and reconfiguration for the distributed mail repository, considering these questions in the context of the programming language Argus, which was designed to support distributed programming. The thesis makes three distinct contributions. First, it presents the replication techniques chosen for the distributed repository and a discussion of their implementation in Argus. Second, it describes a new method for designing and implementing reconfigurable distributed systems. The new method allows replacement of software components while preserving their state, but requires no changes to the underlying system or language. This contrasts with previous work on guardian replacement in Argus. Third, the thesis evaluates the utility of Argus for applications involving replication and reconfiguration.
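The state-preserving replacement idea reduces to a quiesce-and-handoff pattern; a schematic sketch follows. The interface is invented for illustration and is not the Argus guardian-replacement mechanism itself.

```python
# Schematic sketch of state-preserving component replacement, the idea behind
# the dynamic reconfiguration described above. Names and interfaces are
# invented for illustration; this is not the Argus mechanism.

class MailRepositoryV1:
    def __init__(self) -> None:
        self.mailboxes: dict[str, list[str]] = {}

    def get_state(self) -> dict:
        """Export internal state so a successor can take over."""
        return {"mailboxes": self.mailboxes}

class MailRepositoryV2:
    """New software version; must accept its predecessor's state."""
    def __init__(self, state: dict) -> None:
        self.mailboxes = state["mailboxes"]
        self.audit_log: list[str] = []   # capability added by the new version

def replace(old: MailRepositoryV1) -> MailRepositoryV2:
    # Quiesce the old component, hand its state to the replacement, and
    # redirect clients; the underlying system and language stay unchanged.
    return MailRepositoryV2(old.get_state())
```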

6 citations


Proceedings ArticleDOI
M. Stumm
01 Aug 1987
TL;DR: This paper presents several example solutions for managing resources in a decentralized fashion, using multicasting facilities, and concludes that decentralized solutions compare favorably to centralized solutions with respect to all three criteria.
Abstract: Decentralized resource management in distributed systems has become more practical with the availability of communication facilities that support multicasting. In this paper we present several example solutions for managing resources in a decentralized fashion, using multicasting facilities. We review the properties of these solutions in terms of scalability, fault tolerance and efficiency. We conclude that decentralized solutions compare favorably to centralized solutions with respect to all three criteria.
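One decentralized pattern of this kind is multicast bidding for load sharing; the sketch below is illustrative and not necessarily one of the paper's exact examples.

```python
import random

# A node needing a resource multicasts a request, every peer answers with its
# current load, and the requester picks the least-loaded offer. There is no
# central allocator, hence no single point of failure, and the scheme scales
# with the multicast facility. Names and values are invented for illustration.

class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        self.load = random.random()     # fraction of capacity in use

    def bid(self) -> tuple[float, str]:
        return (self.load, self.name)

def allocate(peers: list[Node]) -> str:
    bids = [p.bid() for p in peers]     # stands in for multicast + replies
    return min(bids)[1]                 # choose the least-loaded node

if __name__ == "__main__":
    cluster = [Node(f"node{i}") for i in range(8)]
    print("run job on:", allocate(cluster))
```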

6 citations


Proceedings ArticleDOI
Eiji Takeda, Kan Takeuchi, Toru Toyabe, K. Ohshima, K. Itoh
01 Apr 1987
TL;DR: In this article, the authors investigated the funneling phenomena in α-particle-induced soft errors using a 3-D device simulator and a new experimental method, including descriptions of the scalability of funneling length (size effects and proximity effects), the effects of reduced supply voltages, the barrier effect in the n+-p structure, and the influence of the source/drain n+ diffusion layer in a switching MOS device.
Abstract: The funneling phenomena in α-particle-induced soft errors are investigated using a 3-D device simulator and a new experimental method. The scope of this paper includes descriptions of: 1) the scalability of funneling length (size effects and proximity effects), 2) the effects of reduced supply voltages, 3) the barrier effect in the n+-p structure, and 4) the influence of the source/drain n+ diffusion layer in a switching MOS device. A new funneling-length model, different from Hu's model, is proposed. It was also found that a "scaling law" for soft errors exists, which sets a scaling limit for planar and trench cells.
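For orientation, the underlying soft-error criterion can be sketched numerically; the 3.6 eV-per-pair constant for silicon is standard, while the deposited energy, funnel-collection fraction, and critical charge below are hypothetical placeholders.

```python
# Back-of-the-envelope soft-error criterion. The 3.6 eV per electron-hole
# pair in silicon is a standard constant; every other number here is a
# hypothetical placeholder, not data from the paper.

E_PAIR_EV = 3.6          # eV needed to create one e-h pair in silicon
Q_E = 1.602e-19          # electron charge, coulombs

def collected_charge(e_dep_mev: float, collection_fraction: float) -> float:
    """Charge collected at the struck node from an alpha hit (coulombs)."""
    pairs = e_dep_mev * 1e6 / E_PAIR_EV
    return pairs * Q_E * collection_fraction

if __name__ == "__main__":
    e_dep = 4.0          # MeV deposited along the track (hypothetical)
    frac = 0.3           # fraction funneled onto the node (hypothetical)
    q_crit = 50e-15      # critical charge of the cell (hypothetical)
    q_coll = collected_charge(e_dep, frac)
    print(f"collected {q_coll*1e15:.0f} fC vs critical {q_crit*1e15:.0f} fC")
    print("upset!" if q_coll > q_crit else "no upset")
```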

3 citations


01 Jan 1987
TL;DR: This work analyzes the performance scalability issue at three different levels: (1) the algorithm restructuring level, (2) the compiler optimization level, and (3) the machine organization level.
Abstract: The peak performance of a multiprocessor system is rarely attainable. Algorithm penalty, interprocessor communication cost, and synchronization overhead are the major factors limiting performance scalability. The MPPP is a trace-driven simulation facility developed for the performance prediction of large-scale multiple vector-processor systems. This facility, when used with the Parafrase program restructurer, provides a powerful tool for studying the performance of parallel Fortran programs under variations of architecture, system, and workload parameters. Using the MPPP facility, we analyze the performance scalability issue at three different levels: (1) the algorithm restructuring level, (2) the compiler optimization level, and (3) the machine organization level. Johnsson's narrow-banded linear-systems solver, the fast Fourier transform, the preconditioned conjugate gradient method, and the Gaussian elimination with pivoting algorithm were used as sample workloads in our analysis.
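The three limiting factors named above can be folded into a toy scalability model; the functional forms and constants below are illustrative assumptions, not the MPPP's actual models.

```python
# Toy speedup model combining the three limiting factors named above:
# algorithm penalty (serial fraction), communication cost, and
# synchronization overhead. Constants are illustrative assumptions only.

def speedup(p: int, t1: float = 1.0, serial_frac: float = 0.05,
            t_comm: float = 0.002, t_sync: float = 0.001) -> float:
    parallel_time = (t1 * serial_frac               # unparallelized work
                     + t1 * (1 - serial_frac) / p   # perfectly split work
                     + t_comm * p                   # communication grows with p
                     + t_sync * p)                  # so does synchronization
    return t1 / parallel_time

if __name__ == "__main__":
    # Speedup rises, peaks, then falls as overheads dominate: scalability
    # depends on all three levels, not on processor count alone.
    for p in (1, 4, 16, 64, 256):
        print(f"p={p:4d}  speedup={speedup(p):6.2f}")
```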

1 citation


01 Jan 1987
TL;DR: In distributed memory multiprocessor systems, the processing elements can be considered to be nodes connected via an interconnection network, and the requirement of network symmetry ensures that each node in the network is identical to any other, thereby greatly reducing the architecture and algorithm design effort.

Abstract: Inherent limitations on the computational power of sequential uniprocessor systems have led to the development of parallel multiprocessor systems. The two major issues in the formulation and design of parallel multiprocessor systems are algorithm design and architecture design. A parallel multiprocessor system should be designed so as to facilitate the design and implementation of efficient parallel algorithms that optimally exploit the capabilities of the system. From an architectural point of view, the system should have low hardware complexity, be built from components that can be easily replicated, exhibit desirable cost-performance characteristics, and scale well in hardware complexity and cost with increasing problem size. In distributed memory multiprocessor systems, the processing elements can be considered to be nodes connected via an interconnection network. To facilitate algorithm and architecture design, we require that the interconnection network have a low diameter, that the system be symmetric, and that each node have a low degree of connectivity. Further, it is desirable that the system configuration and behavior be amenable to a suitable and tractable mathematical description. The requirement of network symmetry ensures that each node in the network is identical to any other, greatly reducing the architecture and algorithm design effort. For most symmetric network topologies, however, the requirements of a low degree of connectivity per node and a low network diameter conflict: a low network diameter often entails that each node have a high degree of connectivity, drastically increasing the number of inter-processor connection links.
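The degree/diameter conflict is easy to quantify with standard topology facts (not results from this paper): a ring has constant degree 2 but diameter ⌊n/2⌋, while a hypercube on n = 2^d nodes keeps the diameter at d = log2 n at the price of a degree that also grows as d.

```python
import math

# Degree vs. diameter for two standard symmetric topologies, illustrating
# the conflict described above. (Standard facts, not from this paper.)

def ring(n: int) -> tuple[int, int]:
    return 2, n // 2                       # (degree, diameter)

def hypercube(n: int) -> tuple[int, int]:
    d = int(math.log2(n))                  # n must be a power of two
    return d, d                            # (degree, diameter)

if __name__ == "__main__":
    print(f"{'nodes':>6} {'ring deg/diam':>14} {'cube deg/diam':>14}")
    for n in (16, 64, 256, 1024):
        print(f"{n:>6} {str(ring(n)):>14} {str(hypercube(n)):>14}")
```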

1 citation