
Showing papers on "Distributed algorithm published in 1985"


Journal ArticleDOI
TL;DR: A new simulation technique, referred to as a synchronizer, is proposed as a simple methodology for designing efficient distributed algorithms in asynchronous networks; its communication-time trade-off is proved to be within a constant factor of the lower bound.
Abstract: The problem of simulating a synchronous network by an asynchronous network is investigated. A new simulation technique, referred to as a synchronizer, is proposed as a simple methodology for designing efficient distributed algorithms in asynchronous networks. The synchronizer exhibits a trade-off between its communication and time complexities, which is proved to be within a constant factor of the lower bound.
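The core rule of the simplest ("alpha") synchronizer variant can be sketched concretely: a node may execute its next synchronous round only after learning that all of its neighbors have safely completed the current one. Below is a minimal single-process simulation of that rule, assuming the network is given as an adjacency dict; the function name and the sweep-based scheduler are illustrative, not from the paper.

```python
def run_alpha_synchronizer(adjacency, num_rounds):
    """Simulate nodes advancing through synchronous rounds under the
    alpha-synchronizer rule; returns rounds completed per node."""
    completed = {v: 0 for v in adjacency}   # rounds completed by each node
    progressed = True
    while progressed:
        progressed = False
        for v in adjacency:
            # v may execute its next round only once every neighbor has
            # completed at least as many rounds as v (i.e., is "safe")
            if completed[v] < num_rounds and all(
                completed[u] >= completed[v] for u in adjacency[v]
            ):
                completed[v] += 1
                progressed = True
                # synchronizer invariant: neighbors never drift more than
                # one round apart
                assert all(abs(completed[v] - completed[u]) <= 1
                           for u in adjacency[v])
    return completed
```

The assertion checks the property a synchronizer must provide: no node runs a round before its neighbors' previous-round messages could have arrived.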

762 citations


Journal ArticleDOI
TL;DR: A distributed algorithm is presented that realizes mutual exclusion among N nodes in a computer network that requires at most N message exchanges for one mutual exclusion invocation.
Abstract: A distributed algorithm is presented that realizes mutual exclusion among N nodes in a computer network. The algorithm requires at most N message exchanges for one mutual exclusion invocation. Accordingly, the delay to invoke mutual exclusion is smaller than in an algorithm of Ricart and Agrawala, which requires 2*(N - 1) message exchanges per invocation. A drawback of the algorithm is that the sequence numbers contained in the messages are unbounded. It is shown that this problem can be overcome by slightly increasing the number of message exchanges.
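The N-message bound comes from a token-based design: a requester broadcasts a numbered request to the other N - 1 nodes, and the current token holder forwards the token, for at most N messages per invocation. The following is a minimal single-process sketch of that accounting, assuming requests are issued one at a time and the holder is idle; the paper's full protocol also carries a request queue on the token, which is omitted here.

```python
class TokenMutex:
    def __init__(self, n):
        self.n = n
        self.rn = [[0] * n for _ in range(n)]  # rn[i][j]: i's view of j's requests
        self.token_ln = [0] * n                # last request number served per node
        self.token_at = 0                      # node 0 holds the token initially
        self.messages = 0

    def request_cs(self, i):
        if self.token_at == i:
            return                             # already holds the token: 0 messages
        self.rn[i][i] += 1
        seq = self.rn[i][i]
        for j in range(self.n):                # broadcast to the other N-1 nodes
            if j != i:
                self.messages += 1
                self.rn[j][i] = max(self.rn[j][i], seq)
        holder = self.token_at
        # an idle holder that sees an unserved request forwards the token:
        # the N-th and final message of this invocation
        if self.rn[holder][i] == self.token_ln[i] + 1:
            self.messages += 1
            self.token_at = i

    def release_cs(self, i):
        self.token_ln[i] = self.rn[i][i]       # mark i's request as served
```

Note how the unbounded sequence numbers mentioned in the abstract appear here as the ever-growing `rn` counters.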

333 citations


Journal ArticleDOI
01 Aug 1985
TL;DR: A new technique to handle complex Markov models, based on a description using stochastic automata and dedicated to the modelling of distributed algorithms, is presented.
Abstract: In this paper a new technique to handle complex Markov models is presented. This method is based on a description using stochastic automata and is dedicated to the modelling of distributed algorithms. One example of a mutual exclusion algorithm in a distributed environment is extensively analysed. The mathematical analysis is based on tensor algebra for matrices.

319 citations


Journal ArticleDOI
TL;DR: A distributed algorithm for load balancing which is network-topology independent is proposed in this paper; the main objective is to describe the dynamic process migration protocol based on the proposed drafting algorithm.
Abstract: It is desirable for the load in a distributed system to be balanced evenly. A dynamic process migration protocol is needed in order to achieve load balancing in a user-transparent manner. A distributed algorithm for load balancing which is network topology independent is proposed in this paper. Different network topologies and low-level communications protocols affect the choice of only some system design parameters. The "drafting" algorithm attempts to compromise two contradictory goals: maximize the processor utilization and minimize the communication overhead. The main objective of this paper is to describe the dynamic process migration protocol based on the proposed drafting algorithm. A sample distributed system is used to further illustrate the drafting algorithm and to show how to define system design parameters. The system performance is measured by simulation experiments based on the sample system.

227 citations


Book ChapterDOI
17 Jun 1985
TL;DR: This work establishes a very natural connection between distributed processes and the logic of knowledge, which promises to shed light on both areas.
Abstract: We establish a very natural connection between distributed processes and the logic of knowledge which promises to shed light on both areas.

184 citations


Journal ArticleDOI
Baruch Awerbuch
TL;DR: A new distributed Depth-First-Search (DFS) algorithm for an asynchronous communication network is presented, whose communication and time complexities are O(|E|) and O(|V|), respectively.

168 citations


Journal ArticleDOI
01 Dec 1985
TL;DR: A new software architecture for fault-tolerant distributed programs is presented that allows replication to be added transparently and flexibly to existing programs, and integration of the replication mechanisms into current programming languages is accomplished by means of stub compilers.
Abstract: This dissertation presents a new software architecture for fault-tolerant distributed programs. This new architecture allows replication to be added transparently and flexibly to existing programs. Tuning the availability of a replicated program becomes a programming-in-the-large problem that a programmer need address only after the individual modules have been written and verified. The increasing reliance that people place on computer systems makes it essential that those systems remain available. The low cost of computer hardware and the high cost of computer software make replicated distributed programs an attractive solution to the problem of providing fault-tolerant operation. A troupe is a set of replicas of a module, executing on machines that have independent failure modes. Troupes are the building blocks of replicated distributed programs and the key to achieving high availability. Individual members of a troupe do not communicate among themselves, and are unaware of one another's existence; this property is what distinguishes troupes from other software architectures for fault tolerance. Replicated procedure call is introduced to handle the many-to-many pattern of communication between troupes. Replicated procedure call is an elegant and powerful way of expressing many distributed algorithms. The semantics of replicated procedure call can be summarized as exactly-once execution at all replicas. An implementation of troupes and replicated procedure call is described. Experiments were conducted to measure the performance of this implementation; an analysis of the results of these experiments is presented. The problem of concurrency control for troupes is examined, and algorithms for replicated atomic transactions are presented as a solution. Binding and reconfiguration mechanisms for replicated distributed programs are described, and the problem of when to replace failed troupe members is analyzed.
Several issues relating to programming languages and environments for reliable distributed applications are discussed. Integration of the replication mechanisms into current programming languages is accomplished by means of stub compilers. Four stub compilers are examined, and some lessons learned from them are presented. A language for specifying troupe configurations is described, and the design of a configuration manager, a programming-in-the-large tool for configuring replicated distributed programs, is presented.

160 citations


01 Jan 1985
TL;DR: Two methods for distributed simulation are studied; both are applicable to discrete-time simulation models and are fully distributed in the sense that they require no central control.
Abstract: Simulation is one example of an application that shows great potential benefits from distributed processing. The conventional approach to simulation, that of sequentially processing the events, does not exploit the natural parallelism existing in some simulation models. This is particularly true in large models, where submodels often interact weakly and can be simulated in parallel. The decreasing cost of multiprocessor systems also suggests that a distributed approach to simulation can be workable. Moreover, such an approach can be very attractive, since time and memory limitations, often major constraints with simulation programs, may be alleviated by distributing the load among several processors. Distributed simulation requires a set of processors that can communicate by sending messages along the links of a communication network or via a shared memory. The processors each simulate a submodel of the overall model and interact when necessary. Submodel interactions produce the interprocessor communication in the simulator. Two methods for distributed simulation are studied in this thesis. Both methods are applicable to discrete-time simulation models and are fully distributed in the sense that they require no central control. In one method, each processor can simulate independently as long as it is certain that no events will arrive that belong to the past of the simulation process. In the second method, processors are not concerned about future arriving events. They simulate independently and roll back if an event arrives that belongs to the past. The thesis consists of two parts. The first presents some centralized and distributed algorithms for efficient utilization of the second method. The issue of load balancing is also discussed in this part and some heuristic algorithms are presented. The second part of the work consists of mathematical modeling and analysis of models of both methods. The analysis gives some insight into the effects of different system parameters on the performance. The performance of each method is compared with the other and also with single-processor simulation. The mathematical models are then confirmed and complemented with the simulation results. Finally, results of the implementation of the second method are presented.
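The roll-back mechanism of the second method can be sketched for a single logical process: execute events eagerly, checkpoint after each one, and when a "straggler" arrives from the simulated past, restore the last checkpoint preceding it and re-execute the undone events in order. This is a minimal sketch assuming distinct timestamps; the class and method names are illustrative.

```python
import bisect

class LogicalProcess:
    def __init__(self):
        self.clock = 0
        self.state = 0                    # stand-in for real simulation state
        self.processed = []               # (timestamp, value) in timestamp order
        self.checkpoints = [(0, 0, 0)]    # (timestamp, state, len(processed))

    def handle(self, ts, value):
        if ts < self.clock:               # straggler from the simulated past
            redo = self.rollback(ts)
            for event in sorted(redo + [(ts, value)]):
                self.execute(*event)      # re-execute in correct order
        else:
            self.execute(ts, value)

    def execute(self, ts, value):
        self.clock = ts
        self.state += value               # the "event": here, just accumulate
        self.processed.append((ts, value))
        self.checkpoints.append((ts, self.state, len(self.processed)))

    def rollback(self, ts):
        # restore the last checkpoint taken strictly before the straggler
        idx = bisect.bisect_left(self.checkpoints, (ts,)) - 1
        cp_ts, cp_state, cp_len = self.checkpoints[idx]
        redo = self.processed[cp_len:]    # events that must be re-executed
        self.clock, self.state = cp_ts, cp_state
        self.processed = self.processed[:cp_len]
        self.checkpoints = self.checkpoints[:idx + 1]
        return redo
```

In a multi-process simulator the rollback would also have to cancel messages already sent to other processes (anti-messages), which this single-process sketch omits.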

95 citations


Journal ArticleDOI
TL;DR: Two very different distributed scheduling algorithms which contain explicit mechanisms for stability are presented and evaluated; the results indicate how specific the treatment of stability is to the algorithm and environment under consideration.
Abstract: Many distributed scheduling algorithms have been developed and reported in the current literature. However, very few of them explicitly treat stability issues. This paper first discusses stability issues for distributed scheduling algorithms in general terms. Two very different distributed scheduling algorithms which contain explicit mechanisms for stability are then presented and evaluated with respect to individual specific stability issues. One of the algorithms is based on stochastic learning automata and the other on bidding. The results indicate how very specific the treatment of stability is to the algorithm and environment under consideration.

85 citations


Journal ArticleDOI
TL;DR: Growth of distributed systems has attained unstoppable momentum and if the authors better understood how to think about, analyze, and design distributed systems, they could direct their implementation with more confidence.
Abstract: Growth of distributed systems has attained unstoppable momentum. If we better understood how to think about, analyze, and design distributed systems, we could direct their implementation with more confidence.

78 citations


Proceedings ArticleDOI
01 Dec 1985
TL;DR: A distributed algorithm for solving the classical linear cost assignment problem that employs exclusively pure relaxation steps whereby the prices of sources and sinks are changed individually on the basis of only local node price information.
Abstract: Relaxation methods for optimal network flow problems resemble classical coordinate descent, Jacobi, and Gauss-Seidel methods for solving unconstrained nonlinear optimization problems or systems of nonlinear equations. In their pure form they modify the dual variables (node prices) one at a time using only local node information while aiming to improve the dual cost. They are particularly well suited for distributed implementation on massively parallel machines. For problems with strictly convex arc costs they can be shown to converge even if relaxation at each node is carried out asynchronously with out-of-date price information from neighboring nodes [1]. For problems with linear arc costs relaxation methods have outperformed by a substantial margin the classical primal simplex and primal-dual methods on standard benchmark problems [2], [3]. However, in these particular methods it is sometimes necessary to change the prices of several nodes as a group in addition to carrying out pure relaxation steps. As a result, global node price information is needed occasionally, and distributed implementation becomes somewhat complicated. In this paper we describe a distributed algorithm for solving the classical linear cost assignment problem. It employs exclusively pure relaxation steps whereby the prices of sources and sinks are changed individually on the basis of only local (neighbor) node price information. The algorithm can be implemented in an asynchronous (chaotic) manner, and seems quite efficient for problems with a small arc cost range. It has an interesting interpretation as an auction where economic agents compete for resources by making successively higher bids.
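The auction interpretation can be sketched directly: each unassigned source bids for its most valuable sink at the current prices, raising that sink's price by the gap between its best and second-best net values plus a constant epsilon > 0, which guarantees termination. This is a minimal sequential sketch of the auction idea, not the paper's exact algorithm.

```python
def auction_assignment(value, epsilon=0.1):
    """value[i][j]: benefit of assigning person i to object j.
    Returns (assignment dict person -> object, final prices)."""
    n = len(value)
    prices = [0.0] * n
    owner = [None] * n            # owner[j]: person currently holding object j
    assigned = {}
    unassigned = list(range(n))
    while unassigned:
        i = unassigned.pop()
        # best and second-best net value for bidder i at current prices
        net = [value[i][j] - prices[j] for j in range(n)]
        best = max(range(n), key=lambda j: net[j])
        second = max(net[j] for j in range(n) if j != best) if n > 1 else net[best]
        prices[best] += net[best] - second + epsilon   # raise the winner's price
        if owner[best] is not None:                    # outbid the previous owner
            unassigned.append(owner[best])
            del assigned[owner[best]]
        owner[best] = i
        assigned[i] = best
    return assigned, prices
```

With integer values and epsilon < 1/n, the final assignment is optimal; larger epsilon trades accuracy for fewer bidding rounds.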

Journal ArticleDOI
TL;DR: Lower bounds of n^2/2 and 3n^2/4 messages are proved for a worst-case execution of any algorithm to solve the ranking and sorting problems, respectively.
Abstract: We study the problems of sorting and ranking n processors that have initial values (not necessarily distinct) in a distributed system. Sorting means that the initial values have to move around in the network and be assigned to the processors according to their distinct identities, while ranking means that the numbers 1, 2,..., n have to be assigned to the processors according to their initial values; ties between initial values can be broken in any chosen way. Assuming a tree network, and assuming that a message can contain an initial value, an identity, or a rank, we present an algorithm for the ranking problem that uses, in the worst case, at most n^2/2 + O(n) such messages. The algorithm is then extended to perform sorting, using in the worst case at most 3n^2/4 + O(n) messages. Both algorithms use a total of O(n) space. The algorithms are extended to general networks. The expected behavior of these algorithms for three classes of trees is discussed. Assuming that the initial values, identities, and ranks can be compared only within themselves, lower bounds of n^2/2 and 3n^2/4 messages are proved for a worst-case execution of any algorithm to solve the ranking and sorting problems, respectively.

Journal ArticleDOI
TL;DR: The problem of sorting a file distributed over a number of sites of a communication network is examined, and distributed solution algorithms are presented and their communication complexity analyzed both in the worst and in the average case.
Abstract: The problem of sorting a file distributed over a number of sites of a communication network is examined. Two versions of this problem are investigated; distributed solution algorithms are presented; and their communication complexity analyzed both in the worst and in the average case. The worst case bounds are shown to be sharp, with respect to order of magnitude, for large files.

Proceedings ArticleDOI
21 Oct 1985
TL;DR: Two new BFS algorithms with improved communication complexity are presented; the second uses the technique of the first recursively and achieves O(E·2^√(log V log log V)) in communication and O(V·2^√(log V log log V)) in time.
Abstract: This paper develops new distributed BFS algorithms for an asynchronous communication network, presenting two algorithms with improved communication complexity. The first algorithm has complexity O((E + V^1.5)·log V) in communication and O(V^1.5·log V) in time. The second algorithm uses the technique of the first recursively and achieves O(E·2^√(log V log log V)) in communication and O(V·2^√(log V log log V)) in time.


Journal ArticleDOI
TL;DR: It is shown that in some cases global time can be assumed while designing an algorithm, but need not be implemented—in these cases it can be replaced with Lamport's logical time in a routine manner, which clearly preserves the correctness of the algorithm.

Journal ArticleDOI
Sakata, Ueda
TL;DR: This article presents an experimental mail system in which computer-based integration of office procedures is achieved in terms of the documentprocessing functions and the information media supported.
Abstract: The availability of electronic interchange of multimedia documents containing text, graphics, images, and voice is playing an important role in streamlining office procedures and in developing integrated office systems. A wide variety of systems, such as electronic mail and computer-based message systems, is now available, enabling document processing in distributed environments. However, for these systems to realize their full potential as effective office facilities, they must provide a variety of unified document-processing functions, including composing, editing, filing, retrieving, and transmitting, as well as multimedia document handling. This article presents an experimental mail system in which computer-based integration of office procedures is achieved in terms of the document-processing functions and the information media supported. Interoffice document exchange is based on an office communication architecture with a two-layered structure: the Information Interchange Architecture, or IIA, and the Information Content Architecture, or ICA. The IIA defines an office system model of distributed environments.

Journal ArticleDOI
TL;DR: The author investigates the problem of distributing data structures by developing a distributed version of an extensible hash file, which is a dynamic indexing structure which could be useful in a distributed database.
Abstract: In spite of the amount of work recently devoted to distributed systems, distributed applications are relatively rare. One hypothesis to explain this scarcity of examples is a lack of experience with algorithm design techniques tailored to an environment in which out-of-date and incomplete information is the rule. Since the design of data structures is an important aspect of traditional algorithm design, the author feels that it is important to consider the problem of distributing data structures. She investigates these issues by developing a distributed version of an extensible hash file, which is a dynamic indexing structure which could be useful in a distributed database.
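The "extensible hash file" referred to above is extendible hashing: a directory of 2^d pointers into buckets, where splitting an overfull bucket at most doubles the directory, so the index grows gracefully with the data. Below is a single-site sketch of the underlying structure (the paper's contribution, distributing it, is omitted); keys are assumed to have a stable hash, e.g. integers.

```python
class ExtendibleHash:
    def __init__(self, bucket_size=2):
        self.gd = 0                            # global depth: directory has 2**gd slots
        self.bucket_size = bucket_size
        self.dir = [{"depth": 0, "items": {}}]

    def _bucket(self, key):
        # the low gd bits of the hash index the directory
        return self.dir[hash(key) & ((1 << self.gd) - 1)]

    def get(self, key):
        return self._bucket(key)["items"].get(key)

    def put(self, key, value):
        self._bucket(key)["items"][key] = value
        while len(self._bucket(key)["items"]) > self.bucket_size:
            self._split(self._bucket(key))     # may cascade if keys collide

    def _split(self, b):
        if b["depth"] == self.gd:              # bucket as deep as the directory:
            self.dir = self.dir + self.dir     # double the directory first
            self.gd += 1
        d = b["depth"] + 1
        new = [{"depth": d, "items": {}}, {"depth": d, "items": {}}]
        for k, v in b["items"].items():        # redistribute on bit d-1 of the hash
            new[(hash(k) >> (d - 1)) & 1]["items"][k] = v
        for i in range(len(self.dir)):         # repoint directory slots at the halves
            if self.dir[i] is b:
                self.dir[i] = new[(i >> (d - 1)) & 1]
```

The distribution problem the paper tackles is then which sites hold which buckets, and how clients with out-of-date directory copies find them.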

Journal ArticleDOI
TL;DR: An efficient decentralized algorithm for synchronized termination of a distributed computation is presented, it is assumed that distributed processes are connected via unidirectional channels into a strongly connected network, in which no central controller exists.
Abstract: An efficient decentralized algorithm for synchronized termination of a distributed computation is presented. It is assumed that distributed processes are connected via unidirectional channels into a strongly connected network, in which no central controller exists. The number of processes and the network configuration are not known a priori. The number of steps required to terminate the distributed computation after all processes have met their local termination conditions is proportional to the diameter D of the network (D + 1 steps).
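The paper's own D + 1-step algorithm is not reproduced here, but the problem it solves can be illustrated with a different, classic scheme: the ring-token algorithm of Dijkstra, Feijen, and van Gasteren, in which a wave started by one process proves termination only if it finds every process passive and no process has sent work since the previous wave. A simplified sketch, with function names of my own choosing:

```python
def probe(active, black):
    """One wave: process 0 sends a white token around a unidirectional ring.
    active[i] is True while process i still computes; black[i] is True if
    process i sent work to another process since the previous wave."""
    token_white = True
    for i in range(len(active)):
        if active[i]:
            return False              # an active process: certainly not terminated
        if black[i]:
            token_white = False       # recent sends may have reactivated someone
        black[i] = False              # repaint for the next wave
    return token_white

def detect(active, black, max_waves=10):
    """Repeat probe waves; return the wave number that proved termination,
    or None if no wave succeeded within max_waves."""
    for wave in range(1, max_waves + 1):
        if probe(active, black):
            return wave
    return None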

DOI
01 Jul 1985
TL;DR: A clustering algorithm is described for the decomposition of networks using a distributed array processor, and the way in which the clusters relate to power-system load-flow analysis is shown.
Abstract: A clustering algorithm is described for the decomposition of networks using a distributed array processor. The algorithm is applied to the analysis of medium-sized networks, and the way in which the clusters relate to power-system load-flow analysis is shown. A sparsity oriented development of the method suitable for the analysis of large networks is derived.

Book ChapterDOI
16 Dec 1985
TL;DR: This work presents a class of efficient algorithms for termination detection in a distributed system that do not require the FIFO property for the communication channels and make only simple assumptions about the connectivity of the processes.
Abstract: We present a class of efficient algorithms for termination detection in a distributed system. These algorithms do not require the FIFO property for the communication channels. Assumptions regarding the connectivity of the processes are simple. Messages for termination detection are processed and sent out from a process only when it is idle. Thus it is expected that these messages would not interfere much with the underlying computation, i.e., the computation not related to termination detection. The messages have a fixed, short length. After termination has occurred, it is detected within a small number of message communications.

Journal ArticleDOI
F. Parr, J. Auerbach, B. Goldstein
TL;DR: The point of view is presented that the PC user should be provided with a unified view of the heterogeneous distributed system to which he is connected and the proposed method is to formalize the notion of a service request and provide distributed services by function shipping service requests to remote nodes able to provide the service.
Abstract: This paper surveys some of the issues involved in building useful distributed systems involving PC's and hosts. Alternative communications techniques for micro-mainframe communication are compared. The point of view is presented that the PC user should be provided with a unified view of the heterogeneous distributed system to which he is connected. The proposed method is to formalize the notion of a service request and provide distributed services by function shipping service requests to remote nodes able to provide the service, e.g., personal computers will ship requests which they cannot satisfy locally to hosts on the network. Providing a unified view of data which allows PC application programs to access files on mainframes is an example of a service which can be built by intercepting and shipping service requests. Examples from current IBM products are used to illustrate approaches. The views presented are the authors' own, based on systems research in progress at the IBM Thomas J. Watson Research Center.

Proceedings ArticleDOI
01 Mar 1985
TL;DR: Two very different distributed scheduling algorithms which contain explicit mechanisms for stability are presented and evaluated; the results indicate how specific the treatment of stability is to the algorithm and environment under consideration.
Abstract: Many distributed scheduling algorithms have been developed and reported in the current literature. However, very few of them explicitly treat stability issues. This paper first discusses stability issues for distributed scheduling algorithms in general terms. Two very different distributed scheduling algorithms which contain explicit mechanisms for stability are then presented and evaluated with respect to individual specific stability issues. One of the algorithms is based on stochastic learning automata and the other on bidding. The results indicate how very specific the treatment of stability is to the algorithm and environment under consideration.

Proceedings ArticleDOI
01 Aug 1985
TL;DR: Algorithms for distributed match-making are developed, their theoretical limitations are established, and the techniques are applied to several network topologies.
Abstract: In the very large multiprocessor systems and, on a grander scale, computer networks now emerging, processes are not tied to fixed processors but run on processors taken from a pool of processors. Processors are released when a process dies, migrates, or crashes. In distributed operating systems using the service concept, processes can be clients asking for a service, servers giving a service, or both. Establishing communication between a process asking for a service and a process giving that service, without centralized control in a distributed environment with mobile processes, constitutes the problem of distributed match-making. Logically, such a match-making phase precedes routing in store-and-forward computer networks of this type. Algorithms for distributed match-making are developed and their complexity is investigated in terms of message passes and in terms of storage needed. The theoretical limitations of distributed match-making are established, and the techniques are applied to several network topologies.

Journal ArticleDOI
TL;DR: The pretranslator being developed and a number of issues which have arisen with regard to the distributed execution of a single Ada program, including language semantics, objects of distribution and their mutual access, network timing, and execution environments are described.
Abstract: The Ada Research Group of the Robotics Research Laboratory at The University of Michigan is currently developing a real-time distributed computing capability based upon the premises that real-time distributed languages provide the best approach to real-time distributed computing and, given the focus on the language level, that Ada offers an excellent candidate language. The first phase of the group's work was on analysis of real-time distributed computing. The second, and current, phase is the development of a pretranslator which translates an Ada program into n Ada programs, each being targeted for one of a group of processors and each having required communication support software automatically created and attached by the pre-translator. This paper describes the pretranslator being developed and a number of issues which have arisen with regard to the distributed execution of a single Ada program, including language semantics, objects of distribution and their mutual access, network timing, and execution environments.


Proceedings Article
01 Jan 1985
TL;DR: DIB uses a distributed algorithm that divides the problem into subproblems and dynamically allocates them to any number of (potentially nonhomogeneous) machines and requires only minimal support from the distributed operating system.
Abstract: DIB is a general-purpose package that allows a wide range of applications such as recursive backtrack, branch and bound, and alpha-beta search to be implemented on a multicomputer. It is very easy to use. The application program needs to specify only the root of the recursion tree, the computation to be performed at each node, and how to generate children at each node. In addition, the application program may optionally specify how to synthesize values of tree nodes from their children's values and how to disseminate information (such as bounds) either globally or locally in the tree. DIB uses a distributed algorithm, transparent to the application programmer, that divides the problem into subproblems and dynamically allocates them to any number of (potentially nonhomogeneous) machines. This algorithm requires only minimal support from the distributed operating system. DIB can recover from failures of machines even if they are not detected. DIB currently runs on the Crystal multicomputer at the University of Wisconsin-Madison. Many applications have been implemented quite easily, including exhaustive traversal (N queens, knight's tour, negamax tree evaluation), branch and bound (traveling salesman) and alpha-beta search (the game of NIM). Speedup is excellent for exhaustive traversal and quite good for branch and bound.
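DIB's division of a recursion tree into dynamically allocated subproblems can be sketched with a small example: an N-queens count whose tree is cut at depth one, with the resulting subproblems drained from a pool by round-robin "workers". In DIB proper the division is dynamic and the workers are separate, possibly failing machines; this sequential sketch (with names of my own choosing) only illustrates the decomposition.

```python
def safe(partial, col):
    """May a queen go in column col of the next row, given the columns
    already chosen in partial (one per earlier row)?"""
    row = len(partial)
    return all(c != col and abs(c - col) != row - r
               for r, c in enumerate(partial))

def count_completions(n, partial):
    """Sequential backtrack: count full n-queens placements extending partial."""
    if len(partial) == n:
        return 1
    return sum(count_completions(n, partial + [c])
               for c in range(n) if safe(partial, c))

def solve_distributed(n, workers=4):
    # divide: one subproblem per feasible first-row placement
    pool = [[c] for c in range(n)]
    totals = [0] * workers
    w = 0
    while pool:                    # conquer: idle workers pull from the pool
        sub = pool.pop()
        totals[w] += count_completions(n, sub)
        w = (w + 1) % workers      # round-robin stands in for "any idle machine"
    return sum(totals)
```

Because subproblems are independent and their results are only summed, a lost subproblem can simply be re-issued, which is the essence of DIB's recovery from undetected machine failures.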

Journal ArticleDOI
TL;DR: A distributed computer system based on a task-level dataflow architecture can reduce traffic, speed communication between processors, and tolerate hardware faults by automatically reassigning computations to a healthy processor; when and how to perform node reassignment should be decided as the dataflow architecture and processors are designed.

Journal ArticleDOI
TL;DR: The notion of a conflicts relation is introduced, using which a designer can construct either an optimistic or a pessimistic concurrency control scheme; the design also incorporates primitives for constructing nested atomic actions.
Abstract: A distributed program is a collection of several processes which execute concurrently, possibly in different nodes of a distributed system, and which cooperate with each other to realize a common goal. In this paper, we present a design of communication and synchronization primitives for distributed programs. The primitives are designed such that they can be provided by a kernel of a distributed operating system. An important feature of the design is that the configuration of a process, i.e., identities of processes with which the process communicates, is specified separately from the computation performed by the process. This permits easy configuration and reconfiguration of processes. We identify different kinds of communication failures, and provide distinct mechanisms for handling them. The communication primitives are not atomic actions. To enable the construction of atomic actions, two new program components, atomic agent and manager are introduced. These are devoid of policy decisions regarding concurrency control and atomic commitment. We introduce the notion of conflicts relation using which a designer can construct either an optimistic or a pessimistic concurrency control scheme. The design also incorporates primitives for constructing nested atomic actions.

Journal ArticleDOI
TL;DR: An interprocess communication structure for a distributed language is described which provides message level communication, multicast, and a generalized naming facility for low level algorithms which, for example, might be used in a distributed operating system to support resource allocation or enhance reliability.
Abstract: An interprocess communication structure for a distributed language is described which provides message level communication, multicast, and a generalized naming facility. The design is oriented to the needs of low level algorithms which, for example, might be used in a distributed operating system to support resource allocation or enhance reliability. The proposal is illustrated by programming several distributed algorithms from the literature. An implementation is described that takes advantage of physical multicast technology, and reduces to more conventional schemes for common communication paradigms.