
Showing papers on "Distributed algorithm published in 1990"


Book
01 Aug 1990
TL;DR: This third edition of a classic textbook can be used to teach at the senior undergraduate and graduate levels and concentrates on fundamental theories as well as techniques and algorithms in distributed data management.
Abstract: This third edition of a classic textbook can be used to teach at the senior undergraduate and graduate levels. The material concentrates on fundamental theories as well as techniques and algorithms. The advent of the Internet and the World Wide Web, and, more recently, the emergence of cloud computing and streaming data applications, has forced a renewal of interest in distributed and parallel data management, while, at the same time, requiring a rethinking of some of the traditional techniques. This book covers the breadth and depth of this re-emerging field. The coverage consists of two parts. The first part discusses the fundamental principles of distributed data management and includes distribution design, data integration, distributed query processing and optimization, distributed transaction management, and replication. The second part focuses on more advanced topics and includes discussion of parallel database systems, distributed object management, peer-to-peer data management, web data management, data stream systems, and cloud computing. New in this edition: new chapters covering database replication, database integration, multidatabase query processing, peer-to-peer data management, and web data management; coverage of emerging topics such as data streams and cloud computing; extensive revisions and updates based on years of class testing and feedback. Ancillary teaching materials are available.

2,395 citations


Journal ArticleDOI
TL;DR: In this article, a comprehensive study of the problem of scheduling broadcast transmissions in a multihop, mobile packet radio network is provided that is based on throughput optimization subject to freedom from interference.
Abstract: A comprehensive study of the problem of scheduling broadcast transmissions in a multihop, mobile packet radio network is provided that is based on throughput optimization subject to freedom from interference. It is shown that the problem is NP-complete. A centralized algorithm that runs in polynomial time and results in efficient (maximal) schedules is proposed. A distributed algorithm that achieves the same schedules is then proposed. The algorithm results in a maximal broadcasting zone in every slot.
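The paper's centralized algorithm is not reproduced in the abstract; purely to fix ideas, one maximal interference-free slot can be built greedily, assuming the usual packet-radio model in which two stations conflict when they are adjacent or share a common neighbor. All names and the graph encoding below are invented for this sketch:

```python
def two_hop_conflict(u, v, adj):
    """Two stations conflict if they are adjacent or share a neighbor
    (a broadcast by one would interfere at some receiver of the other)."""
    return v in adj[u] or bool(adj[u] & adj[v])

def greedy_maximal_slot(nodes, adj):
    """Greedily build a maximal set of stations that can all broadcast
    in the same slot without interference."""
    slot = []
    for u in nodes:
        if all(not two_hop_conflict(u, v, adj) for v in slot):
            slot.append(u)
    return slot
```

The returned set is maximal (no further station can be added), though not necessarily maximum; the paper's contribution is doing this efficiently and distributively with throughput guarantees.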

498 citations


Journal ArticleDOI
TL;DR: A distributed version of GENITOR which uses many smaller distributed populations in place of a single large population is introduced, and is able to optimize a broad range of sample problems more accurately and more consistently than GENITOR with a single population.
Abstract: GENITOR is a genetic algorithm which employs one-at-a-time reproduction and allocates reproductive opportunities according to rank to achieve selective pressure. Theoretical arguments and empirical evidence suggest that GENITOR is less vulnerable to some of the biases that degrade performance in standard genetic algorithms. A distributed version of GENITOR which uses many smaller distributed populations in place of a single large population is introduced. GENITOR II is able to optimize a broad range of sample problems more accurately and more consistently than GENITOR with a single population. GENITOR II also appears to be more robust than a single population genetic algorithm, yielding better performance without parameter tuning. We present some preliminary analyses to explain the performance advantage of the distributed algorithm. A distributed search is shown to yield improved search on several classes of problems, including binary encoded feedforward neural networks and the Traveling Salesman Problem.
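GENITOR II's exact operators are not given in the abstract. The following toy sketch (invented objective, parameters, and names) shows the two ideas it combines: steady-state, rank-based one-at-a-time reproduction, and several small populations ("islands") with occasional migration of good individuals:

```python
import random

def fitness(bits):          # toy objective: maximize the number of 1s
    return sum(bits)

def one_offspring(pop, rng):
    """GENITOR-style step: rank-based parent selection, one child per
    step, and the child replaces the worst individual if it is better."""
    pop.sort(key=fitness, reverse=True)
    ranks = range(len(pop), 0, -1)            # higher weight -> better rank
    p1, p2 = rng.choices(pop, weights=ranks, k=2)
    cut = rng.randrange(1, len(p1))           # one-point crossover
    child = p1[:cut] + p2[cut:]
    if rng.random() < 0.1:                    # light mutation
        i = rng.randrange(len(child))
        child[i] ^= 1
    if fitness(child) > fitness(pop[-1]):
        pop[-1] = child

def island_ga(n_islands=4, pop_size=10, length=16, steps=200, seed=1):
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(length)]
                for _ in range(pop_size)] for _ in range(n_islands)]
    for step in range(steps):
        for pop in islands:
            one_offspring(pop, rng)
        if step % 50 == 49:                   # occasional migration: the
            for i, pop in enumerate(islands): # best moves to the next island
                best = max(pop, key=fitness)
                nxt = islands[(i + 1) % n_islands]
                worst = min(range(len(nxt)), key=lambda k: fitness(nxt[k]))
                nxt[worst] = best[:]
    return max((max(p, key=fitness) for p in islands), key=fitness)
```

Because replacement is elitist, the best individual never gets worse; the distributed populations are what the paper credits for more consistent optimization.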

294 citations


08 Aug 1990
TL;DR: An important problem in distributed computing is to provide a user with a non-distributed view of a distributed system, for example by implementing a distributed file system that allows the client programmer to ignore the physical location of his data.
Abstract: Jan van Leeuwen asked me to write a chapter on distributed systems for this handbook. I realized that I wasn’t familiar enough with the literature on distributed algorithms to write it by myself, so I asked Nancy Lynch to help. I also observed that there was no chapter on assertional verification of concurrent algorithms. (That was probably due to the handbook’s geographical origins, since process algebra rules in Europe.) So I included a major section on proof methods. As I recall, I wrote most of the first three sections and Lynch wrote the fourth section on algorithms pretty much by herself.

160 citations


Journal ArticleDOI
TL;DR: Two new translation mechanisms for synchronous systems are described that can be used to translate any protocol tolerant of the most benign failures into a protocol tolerant of the most severe failures.

133 citations


Journal ArticleDOI
David Peleg1
TL;DR: This note presents a simple time-optimal distributed algorithm for electing a leader in a general network that is also message-optimal and thus performs better than previous algorithms for the problem.
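Peleg's algorithm is not spelled out in the abstract; as background only, here is the naive synchronous max-id flooding scheme that such algorithms refine. It is time-optimal in spirit (it finishes within a number of rounds bounded by the network size) but far from message-optimal. All names below are invented:

```python
def elect_leader(adj, ids):
    """Synchronous max-id flooding: each round, every node forwards the
    largest id it has heard to its neighbors; after enough rounds all
    nodes agree, and the node owning that id is the leader."""
    known = dict(ids)                     # node -> largest id seen so far
    for _ in range(len(adj)):             # n rounds >= network diameter
        nxt = dict(known)
        for u, nbrs in adj.items():
            for v in nbrs:
                nxt[v] = max(nxt[v], known[u])
        known = nxt
    leader_id = max(ids.values())
    leader = next(u for u, i in ids.items() if i == leader_id)
    return leader, known
```

This sends a message on every edge in every round, which is exactly the waste a message-optimal election algorithm avoids.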

113 citations



Journal ArticleDOI
TL;DR: A column-oriented distributed algorithm for factoring a large sparse symmetric positive definite matrix on a local-memory parallel processor is presented; the method achieves good speedups on an Intel iPSC/2 hypercube.
Abstract: This paper presents a column-oriented distributed algorithm for factoring a large sparse symmetric positive definite matrix on a local-memory parallel processor. Processors cooperate in computing each column of the Cholesky factor by calculating independent updates to the corresponding column of the original matrix. These updates are sent in a fan-in manner to the processor assigned to the column, which then completes the computation. Experimental results on an Intel iPSC/2 hypercube demonstrate that the method is effective and achieves good speedups.
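As a rough serial simulation of the fan-in idea (dense rather than sparse, with an invented round-robin column-to-processor mapping), each "processor" aggregates the updates it owes a column into one message, and the column's owner combines the messages and completes the column:

```python
import math

def fanin_cholesky(A, n_procs=2):
    """Column-oriented fan-in Cholesky sketch (dense, serially simulated).
    Columns are assigned round-robin; for column j, each processor sums
    the updates from the prior columns it owns and 'sends' that aggregate
    to owner(j), who finishes the column."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    owner = lambda j: j % n_procs
    for j in range(n):
        inbox = []                        # one fan-in message per processor
        for p in range(n_procs):
            upd = [0.0] * (n - j)
            for k in range(j):
                if owner(k) == p:
                    for i in range(j, n):
                        upd[i - j] += L[i][k] * L[j][k]
            inbox.append(upd)
        # owner(j) combines the aggregates and completes column j
        col = [A[i][j] - sum(u[i - j] for u in inbox) for i in range(j, n)]
        L[j][j] = math.sqrt(col[0])
        for i in range(j + 1, n):
            L[i][j] = col[i - j] / L[j][j]
    return L
```

The point of fanning in is that each processor sends one aggregated update per column rather than one message per prior column it owns.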

83 citations


Proceedings ArticleDOI
01 Aug 1990
TL;DR: The control protocols of the PARIS experimental network are described, which is currently operational as a laboratory prototype and will also be deployed within the AURORA Testbed that is part of the NSF/DARPA Gigabit Networking program.
Abstract: We describe the control protocols of the PARIS experimental network. This high-bandwidth network for integrated communication (data, voice, video) is currently operational as a laboratory prototype. It will also be deployed within the AURORA Testbed that is part of the NSF/DARPA Gigabit Networking program. The high bandwidth dictates the need for specialized hardware to support faster packet handling and control protocols. A new network control architecture is presented which exploits the specialized hardware in order to support the expected real-time needs of future traffic. In particular, since control information can be distributed quickly, decisions can be made based upon more complete and accurate information. In some respects, this has the effect of having the benefits of centralized control (e.g., easier bandwidth resource allocation to connections), while retaining the fault-tolerance and scalability of a distributed architecture. Packet switching networks have changed considerably in recent years. One factor has been the dramatic increase in the capacity of the communication links. The advent of fiber optic media has pushed the transmission speed of communication links to more than a Gigabit/sec, representing an increase of several orders of magnitude over typical links in most packet switching networks ([KMS87]) that are still in use today. Increases in link speeds have not been matched by proportionate increases in the processing speeds of communication nodes. Another factor is the changed nature of traffic carried by these networks. As opposed to solely data networks, or solely voice networks, it is now accepted that packet switching networks (or variants of packet switching networks like ATM) will form the basis for multimedia high speed networks that will carry voice, data and video through a common set of nodes and links.
The disparity between communication and processing speeds suggests that processing may become the main bottleneck in future networks. To avoid this possibility, these networks will be built with high-speed switching hardware to off-load the routine packet handling and routing functions from the processor ([CGK88]). In addition, real-time traffic (e.g., voice) requires that the route selection function be capable of guaranteeing the availability of bandwidth on the links along the chosen path for a particular traffic stream. …

79 citations


Proceedings ArticleDOI
26 Jun 1990
TL;DR: A DSD project that consists of the implementation of a distributed self-diagnosis algorithm and its application to distributed computer networks is presented and the EVENT-SELF algorithm presented combines the rigor associated with theoretical results with the resource limitations associated with actual systems.
Abstract: A DSD (distributed self-diagnosing) project that consists of the implementation of a distributed self-diagnosis algorithm and its application to distributed computer networks is presented. The EVENT-SELF algorithm combines the rigor associated with theoretical results with the resource limitations associated with actual systems. Resource limitations identified in real systems include available message capacity for the communication network and limited processor execution speed. The EVENT-SELF algorithm differs from previously published algorithms by adopting an event-driven approach to self-diagnosability: algorithm messages are reduced to those required to indicate changes in system state. Practical issues regarding the CMU-ECE DSD implementation are considered, including the reconfiguration of the testing subnetwork for environments in which processors can be added and removed. One of the goals of this work is to utilize the developed CMU-ECE DSD system as an experimental test-bed environment for distributed applications.

67 citations


Journal ArticleDOI
TL;DR: Project Athena, established in 1983 to improve the quality of education at MIT by providing campuswide, high-quality computing based on a large network of workstations, is discussed, focusing on the design of Athena's distributed workstation system.
Abstract: Project Athena, established in 1983 to improve the quality of education at MIT (Massachusetts Institute of Technology) by providing campuswide, high-quality computing based on a large network of workstations, is discussed, focusing on the design of Athena's distributed workstation system. The requirements of the system are outlined, distributed-system models are reviewed, other distributed operating systems are described, and issues in distributed systems are examined. The distributed-system model for Athena is discussed. Athena has three major components: workstations, a network, and servers. The approach taken by the Athena developers was to implement a set of network services to replace equivalent time-sharing services, in essence converting the time-sharing Unix model into a distributed operating system.

Proceedings ArticleDOI
09 Oct 1990
TL;DR: There is a tradeoff between efficiency and reliability, and a system can be designed to balance these two criteria properly and achieve a higher degree of fault tolerance at the expense of increased message traffic.
Abstract: A fault-tolerant mutual exclusion algorithm for distributed systems is presented. The algorithm uses a distributed queue strategy and maintains alternative paths at each site to provide a high degree of fault tolerance. However, owing to these alternative paths, the algorithm must use reverse messages to avoid the occurrence of directed cycles, which may form when the direction of edges is reversed after the token passes through. If there is no alternative path, the total number of messages exchanged is O(2 log N) in light traffic and two messages in heavy traffic; however, in this case the system cannot tolerate even a single communication link or site failure. If there are alternative paths between sites, the system can achieve a higher degree of fault tolerance at the expense of increased message traffic (owing to reverse messages). Thus, there is a tradeoff between efficiency and reliability, and a system can be designed to balance these two criteria properly. A recovery procedure for restoring a recovering site consistently into the system is also presented.

Journal ArticleDOI
M. Ahuja1
TL;DR: Three channel primitives for sending messages are presented: two-way-flush, forward-flush, and backward-flush, collectively termed Flush, which can permit as much concurrency as non-FIFO channels and yet retain the properties of FIFO channels.

Journal ArticleDOI
TL;DR: A novel neural network parallel algorithm for sorting problems is presented that requires only two steps and does not depend on the size of the problem, while the conventional parallel sorting algorithm using O(n) processors by F.T. Leighton (1984) needs O(log^2 n) computation time.
Abstract: A novel neural network parallel algorithm for sorting problems is presented. The proposed algorithm using O(n^2) processors requires only two steps, and does not depend on the size of the problem, while the conventional parallel sorting algorithm using O(n) processors by F.T. Leighton (1984) needs O(log^2 n) computation time. A set of simulation results substantiates the proposed algorithm. The hardware system based on the proposed parallel algorithm is also presented.
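The paper's neural hardware cannot be reconstructed from the abstract, but the constant-step idea can be mimicked: step 1 evaluates all O(n^2) pairwise comparisons (conceptually in parallel), and step 2 places each element at its rank. A toy serial simulation, with invented names:

```python
def rank_sort(xs):
    """Two-'step' rank sort: step 1 computes every pairwise comparison
    (in hardware these are all evaluated at once); step 2 places each
    element at its rank. Ties are broken by index so ranks form a
    permutation."""
    n = len(xs)
    comp = [[(xs[j] < xs[i]) or (xs[j] == xs[i] and j < i)
             for j in range(n)] for i in range(n)]   # step 1
    out = [None] * n
    for i in range(n):                               # step 2
        out[sum(comp[i])] = xs[i]
    return out
```

Serially this is O(n^2) work; the paper's point is that with O(n^2) processing elements both steps take constant time regardless of n.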

Journal ArticleDOI
TL;DR: Simple distributed algorithms for successfully embedding a ring of size at least 2^n − 2f in an n-cube with f ≤ ⌊(n + 1)/2⌋ faults are contributed.
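For the fault-free case (f = 0), the classical construction behind such embeddings is the reflected Gray code, which lists all 2^n hypercube node labels so that consecutive labels (including the wraparound) differ in exactly one bit, i.e., are hypercube neighbors; the fault-tolerant algorithms route around faulty nodes at the cost of a shorter ring:

```python
def gray_ring(n):
    """Reflected Gray code: a Hamiltonian ring on the fault-free n-cube.
    Node i of the ring gets hypercube label i XOR (i >> 1)."""
    return [i ^ (i >> 1) for i in range(2 ** n)]
```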

Journal ArticleDOI
TL;DR: In this article, a distributed algorithm is simplified by ignoring the time needed to send and deliver messages and instead pretending that a process sends a collection of messages as a single atomic action, with the messages delivered instantaneously as part of the action.
Abstract: Reasoning about a distributed algorithm is simplified if we can ignore the time needed to send and deliver messages and can instead pretend that a process sends a collection of messages as a single atomic action, with the messages delivered instantaneously as part of the action. A theorem is derived that proves the validity of such reasoning for a large class of algorithms. It generalizes and corrects a well-known folk theorem about when an operation in a multiprocess program can be considered atomic.

Proceedings ArticleDOI
01 Jan 1990
TL;DR: The problem of finding an optimal assignment of task modules with a precedence relationship in a distributed computing system is considered, and a well-known state-space reduction technique, branch-and-bound-with-underestimates, is applied, and two underestimate functions are defined.
Abstract: The problem of finding an optimal assignment of task modules with a precedence relationship in a distributed computing system is considered. The objective of task assignment is to minimize the task turnaround time. The problem is known to be NP-complete for more than three processors. To solve the problem, a well-known state-space reduction technique, branch-and-bound-with-underestimates, is applied, and two underestimate functions are defined. Through experiments, their effectiveness is shown by comparing the proposed algorithm with both Wang and Tsai's (1988) algorithm and the A* algorithm with h(x)=0.

01 Jan 1990
TL;DR: In this paper, a large class of problems that can be solved using logical clocks as if they were perfectly synchronized clocks is formally characterized, and a broadcast primitive is also proposed to simplify the task of designing and verifying distributed algorithms.
Abstract: Time and knowledge are studied in synchronous and asynchronous distributed systems. A large class of problems that can be solved using logical clocks as if they were perfectly synchronized clocks is formally characterized. For the same class of problems, a broadcast primitive that can be used as if it achieves common knowledge is also proposed. Thus, logical clocks and the broadcast primitive simplify the task of designing and verifying distributed algorithms: The designer can assume that processors have access to perfectly synchronized clocks and the ability to achieve common knowledge.
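The logical clocks the abstract refers to are the standard Lamport clocks; a minimal sketch of the mechanism (class and method names are mine):

```python
class LamportClock:
    """Minimal logical clock: local events and sends tick the clock; a
    receive jumps past the sender's timestamp. Causally ordered events
    therefore always carry strictly increasing timestamps, which is what
    lets algorithms treat them 'as if' clocks were synchronized."""
    def __init__(self):
        self.t = 0

    def tick(self):                 # local event or message send
        self.t += 1
        return self.t

    def recv(self, sender_ts):      # message receipt
        self.t = max(self.t, sender_ts) + 1
        return self.t
```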

Proceedings ArticleDOI
28 May 1990
TL;DR: A straightforward and efficient algorithm for optimal load balancing of multiclass jobs is derived and it is shown that for obtaining the optimal solution the authors' algorithm and the Dafermos algorithm require comparable computation times that are far less than that of the FD algorithm.
Abstract: The model considered is an extension of the Tantawi and Towsley (1985) single-job-class model to a multiple-job-class model. Some properties of the optimal solution are shown. On the basis of these properties, a straightforward and efficient algorithm for optimal load balancing of multiclass jobs is derived. The performance of this algorithm is compared with that of two other well-known algorithms for multiclass jobs, the flow deviation (FD) algorithm and the Dafermos algorithm. The authors' algorithm and the FD algorithm both require a comparable amount of storage that is far less than that required by the Dafermos algorithm. Numerical experiments show that for obtaining the optimal solution the authors' algorithm and the Dafermos algorithm require comparable computation times that are far less than that of the FD algorithm.

Proceedings ArticleDOI
05 Dec 1990
TL;DR: An algorithm is presented by which nonfaulty processors of a group of fixed size will be able to maintain a consistent and timely knowledge of the group membership.
Abstract: An algorithm is presented by which nonfaulty processors of a group of fixed size will be able to maintain a consistent and timely knowledge of the group membership. The authors assume an architecture in which the broadcast network is accessed by some time domain multiplexing technique where the exclusive right to transmit messages is granted to each processor once in every 'cycle'. In an execution of the proposed algorithm, every nonfaulty processor knows of any processor failure within at most two cycles following the cycle in which the failure occurred, and a restarted processor can join the group in two cycles. Fewer than half of the processors are assumed to fail in any three consecutive cycles.

Book
01 Jan 1990
TL;DR: This book presents three case studies of Gaussian elimination on vector and parallel architectures, a task-system model for Gaussian elimination, and design methodologies for systolic arrays: dependence mapping method, complexity results, and folding.
Abstract: Introduction: background - Gaussian elimination, speedup and efficiency; vector and parallel architectures: pipeline computers, vector computers, parallel computers; three case studies. Part 1, Parallel algorithm design - vector multiprocessor computing: vectorization of vector-vector operations, Gaussian elimination in terms of vector-vector kernels, vector register re-use, Gaussian elimination in terms of matrix-vector kernels, cache re-use, Gaussian elimination in terms of matrix-matrix kernels, vectorization epilogue, fine-grain parallelism, parallel Gaussian elimination; hypercube computing: topological properties of hypercubes, broadcasting, centralized Gaussian elimination, local pipelined algorithms, a word on speedup evaluation, matrices over finite fields; systolic computing: 2D arrays, solving the triangular system on the fly, 1D arrays, matrices over finite fields. Part 2, Models and tools: task graph scheduling - task system for Gaussian elimination, bounds for parallel execution, an optimal schedule with an arbitrary number of processors; analysis of distributed algorithms - data allocation strategies, speedup evaluation on distributed memory machines; design methodologies for systolic arrays - dependence mapping method, complexity results, folding.

Proceedings ArticleDOI
03 Jul 1990
TL;DR: The relationship between CRS and distributed computing is discussed and solutions to two problems encountered in designing pattern generation protocols for CRS, related to distributed mutual exclusion problem and distributed deadlock detection problem, are presented.
Abstract: Cellular robotic systems (CRS) employ a large number of robots operating in cellular spaces under distributed control. In this paper, the relationship between CRS and distributed computing is discussed. Two problems encountered in designing pattern generation protocols for CRS, the n-way intersection problem and the knot detection problem, are related to the distributed mutual exclusion problem and the distributed deadlock detection problem, respectively. Solutions to these two problems, derived from their counterparts in distributed computing, are presented in the CRS context.

Journal ArticleDOI
TL;DR: Two algorithms developed utilizing a priority-based event-ordering which manage mutual exclusion in distributed systems—computer networks—are proposed, which are fully distributed and are insensitive to the relative speeds of node computers and communication links.

Journal ArticleDOI
TL;DR: This paper proposes that a resource management system for large distributed systems should have two levels --- a lower one, responsible for export and allocation of resources in local distributed systems, and an upper one, which manages special resources/services that are not provided locally.
Abstract: In this paper, we propose that a resource management system for large distributed systems should have two levels --- a lower one, responsible for export and allocation of resources in local distributed systems, and an upper one, which manages special resources/services that are not provided locally. For a local environment, load balancing (implementing export and allocation of computational resources) is realized in a distributed way, and management of peripheral resources is developed based on a name server, which can be centralized, or distributed and replicated. The upper level has a centralized resource management center, which is responsible for export and allocation of both peripheral and computational resources. It contains two parts: a name server, which stores attributed names of all shareable resources, and a resource manager, which allocates resources to requesting users of a large distributed system. Communication between the resource management center and the local systems is facilitated through integrating modules. This system is now designed based on the RHODOS distributed operating system.

Proceedings ArticleDOI
09 Oct 1990
TL;DR: Replicated execution of distributed programs, which provides a means of masking hardware (processor) failures in a distributed system, is discussed and a generic mechanism for ensuring that nonfaulty replicas process messages in identical order, thereby preventing state divergence among such replicate entities, is presented.
Abstract: Replicated execution of distributed programs, which provides a means of masking hardware (processor) failures in a distributed system, is discussed. Application-level entities (processes, objects) are replicated to execute on distinct processors. Such replica entities communicate by message passing. Nondeterminism within the replicas could cause messages to be processed in nonidentical order, producing a divergence of state. Possible sources of nondeterminism are identified, and a generic mechanism for ensuring that nonfaulty replicas process messages in identical order, thereby preventing state divergence among such replicate entities, is presented.

Book ChapterDOI
01 Mar 1990
TL;DR: A distributed algorithm for searching game trees is presented, using a general strategy for distributed computing that can also be applied to other search algorithms; two new concepts, the "Young Brothers Wait Concept" and the "Helpful Master Concept", are introduced to reduce search overhead and communication overhead.
Abstract: We present a distributed algorithm for searching game trees. A general strategy for distributed computing is used that can be applied also to other search algorithms. Two new concepts are introduced in order to reduce search overhead and communication overhead: the “Young Brothers Wait Concept” and the “Helpful Master Concept”. We describe some properties of our distributed algorithm including optimal speedup on best ordered game trees.

Book ChapterDOI
01 Nov 1990
TL;DR: It is shown that O(kn) messages are sufficient for rolling back all of the processors to the maximum consistent states when there are k failures, and that for recovery in general networks O(n^2) messages are sufficient while in ring networks Θ(n) messages are necessary and sufficient when an arbitrary number of processors fail.
Abstract: We consider the problem of recovering from processor failures efficiently in distributed systems. Each message received is logged in volatile storage when it is processed. At irregular intervals, each processor independently saves the contents of its volatile storage in stable storage. By appending only O(1) extra information to each message, we show that for recovery in general networks O(n^2) messages are sufficient and in ring networks Θ(n) messages are necessary and sufficient when an arbitrary number of processors fail. By appending O(n) extra information to each message that is sent, we show that O(kn) messages are sufficient for rolling back all of the processors to the maximum consistent states when there are k failures.
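The paper's message-optimal protocols are not given in the abstract, but the target notion of "maximum consistent states" can be sketched. Assuming each checkpoint records per-partner send/receive counts (an invented encoding, with checkpoint 0 taken as the always-consistent initial state), a cut is inconsistent when some processor has received a message that, on the cut, was never sent; rolling the receiver back and iterating reaches the unique maximum consistent cut:

```python
def max_consistent_cut(sent, recv, latest):
    """Roll processors back to their maximum consistent checkpoints.
    sent[p][c][q] / recv[p][c][q]: messages p has sent to / received from
    q as of its checkpoint c; latest[p]: index of p's newest surviving
    checkpoint. Each rollback only removes 'orphan' receives, so the
    result is the componentwise-maximum consistent cut."""
    cut = dict(latest)                    # p -> chosen checkpoint index
    changed = True
    while changed:
        changed = False
        for p in cut:
            for q in cut:
                if p == q:
                    continue
                # orphan: p received more from q than q (on the cut) sent
                while recv[p][cut[p]][q] > sent[q][cut[q]][p]:
                    cut[p] -= 1           # roll back the receiver
                    changed = True
    return cut
```

This iterative rollback is the classic "domino" computation; the paper's contribution is bounding the messages needed to realize it distributively.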

Proceedings ArticleDOI
26 Jun 1990
TL;DR: By utilizing the structure of objects and operation invocations, the authors have derived efficient algorithms that involve fewer participants than when invocations are treated as messages and existing algorithms for message-based systems are used.
Abstract: Checkpointing and rollback-recovery algorithms in distributed object-based systems are presented. By utilizing the structure of objects and operation invocations, the authors have derived efficient algorithms that involve fewer participants than when invocations are treated as messages and existing algorithms for message-based systems are used. It is planned to implement these algorithms and evaluate their performance in the context of the Clouds project at Georgia Tech.