Showing papers on "Consensus" published in 1993

Open Access

Journal Article•DOI•

The consensus problem in fault-tolerant computing

[...]

Michael Barborak¹, A.T. Dahbura², Miroslaw Malek¹•Institutions (2)

University of Texas at Austin¹, Motorola²

01 Jun 1993-ACM Computing Surveys

TL;DR: Research on the consensus problem is surveyed, approaches are compared, applications are outlined, and directions for future work are suggested.

Abstract: The consensus problem is concerned with the agreement on a system status by the fault-free segment of a processor population in spite of the possible inadvertent or even malicious spread of disinformation by the faulty segment of that population. The resulting protocols are useful throughout fault-tolerant distributed systems and will impact the design of other decision systems to come. This paper surveys research on the consensus problem, compares approaches, outlines applications, and suggests directions for future work.

428 citations

Journal Article•DOI•

More choices allow more faults : set consensus problems in totally asynchronous systems

[...]

Soma Chaudhuri¹•Institutions (1)

Iowa State University¹

01 Jul 1993-Information & Computation

TL;DR: It is proved using a combinatorial argument that, in a totally asynchronous system, any k-resilient protocol for the k-set agreement problem would satisfy the uncertainty condition, while this is not true for any (k−1)-resilient protocol.

Abstract: We define the k-SET CONSENSUS PROBLEM as an extension of the CONSENSUS problem, where each processor decides on a single value such that the set of decided values in any run is of size at most k. We require the agreement condition that all values decided upon are initial values of some processor. We show that the problem has a simple (k−1)-resilient protocol in a totally asynchronous system. In an attempt to come up with a matching lower bound on the number of failures, we study the uncertainty condition, which requires that there must be some initial configuration from which all possible input values can be decided. We prove using a combinatorial argument that any k-resilient protocol for the k-set agreement problem would satisfy the uncertainty condition, while this is not true for any (k−1)-resilient protocol. This result seems to strengthen the conjecture that there is no k-resilient protocol for this problem. We prove this result for a restricted class of protocols. Our motivation for studying this problem is to test whether the number of choices allowed to the processors is related to the number of faults. We hope that this will provide intuition towards achieving better bounds for more practical problems that arise in distributed computing, e.g., the renaming problem. The larger goal is to characterize the boundary between possibility and impossibility in asynchronous systems given multiple faults.

376 citations
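The two conditions stated in the abstract above (at most k distinct decided values, and every decided value being some processor's initial value) can be checked mechanically on a finished run. The following is a minimal sketch, not taken from the paper; the function name `satisfies_k_set_consensus` and the list-based representation of a run are assumptions made for illustration.

```python
from typing import Hashable, Sequence


def satisfies_k_set_consensus(
    inputs: Sequence[Hashable],
    decisions: Sequence[Hashable],
    k: int,
) -> bool:
    """Check one finished run against the k-set consensus conditions.

    inputs    -- initial value of every processor (hypothetical representation)
    decisions -- values decided by the processors that decided in this run
    k         -- maximum number of distinct decided values allowed
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    decided = set(decisions)
    # Every decided value must be the initial value of some processor.
    if not decided <= set(inputs):
        return False
    # The set of decided values may contain at most k distinct values.
    return len(decided) <= k
```

With k = 1 the check reduces to the ordinary consensus requirement that all deciding processors choose the same input value.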

Journal Article•DOI•

Time- and space-efficient randomized consensus

[...]

James Aspnes¹•Institutions (1)

Carnegie Mellon University¹

01 May 1993-Journal of Algorithms

TL;DR: A modified version of the protocol yields a weak shared coin whose bias is guaranteed to be in the range 1/2 ± ϵ regardless of scheduler behavior, and which is the first such protocol for the shared-memory model to guarantee that all processors agree on the outcome of the coin.

59 citations

Journal Article•DOI•

Cloture Votes: n/4-resilient Distributed Consensus in t + 1 rounds

[...]

Piotr Berman¹, Juan A. Garay²•Institutions (2)

Pennsylvania State University¹, IBM²

01 Mar 1993-Theory of Computing Systems / Mathematical Systems Theory

TL;DR: Cloture Votes, the protocol presented in this paper, takes further steps in this direction by making consensus possible with n = 4t + 1, r = t + 1, and polynomial message size, where n is the total number of processors and r is the number of rounds of message exchange.

Abstract: The Distributed Consensus problem involves n processors, each of which holds an initial binary value. At most t processors may be faulty and ignore any protocol (even behaving maliciously), yet it is required that the nonfaulty processors eventually agree on a value that was initially held by one of them. We measure the quality of a consensus protocol using the following parameters: total number of processors n, number of rounds of message exchange r, and maximal message size m. The known lower bounds are respectively 3t + 1, t + 1, and 1.

49 citations
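As a worked illustration of the parameters in this entry, the sketch below compares how many faulty processors a system of n processors can tolerate under the n = 4t + 1 requirement quoted for Cloture Votes and under the 3t + 1 lower bound quoted in the abstract. The function names are hypothetical and the code only restates the arithmetic; it is not the protocol itself.

```python
def max_tolerable_faults(n: int, processors_per_fault: int) -> int:
    """Largest t such that n >= processors_per_fault * t + 1.

    processors_per_fault is 4 for the n = 4t + 1 requirement and
    3 for the 3t + 1 lower bound (both figures come from the entry above).
    """
    if n < 1 or processors_per_fault < 1:
        raise ValueError("n and processors_per_fault must be positive")
    return (n - 1) // processors_per_fault


def rounds_needed(t: int) -> int:
    """Round count t + 1 named in the paper title for t faulty processors."""
    if t < 0:
        raise ValueError("t must be non-negative")
    return t + 1


if __name__ == "__main__":
    n = 13
    t_cloture = max_tolerable_faults(n, 4)   # n = 4t + 1 requirement
    t_bound = max_tolerable_faults(n, 3)     # 3t + 1 lower bound
    print(n, t_cloture, t_bound, rounds_needed(t_cloture))
```

For n = 13 this gives t = 3 under the 4t + 1 requirement, against t = 4 permitted by the 3t + 1 lower bound, with t + 1 = 4 rounds in the first case.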

Proceedings Article•DOI•

Wait-free k-set agreement is impossible: the topology of public knowledge

[...]

Michael Saks, Fotios Zaharoglou¹•Institutions (1)

University of California, San Diego¹

01 Jun 1993

TL;DR: It is shown that for any k

Abstract: In the classical consensus problem, each of n processors receives a private input value and produces a decision value which is one of the original input values, with the requirement that all processors decide the same value. A central result in distributed computing is that, in several standard models including the asynchronous shared-memory model, this problem has no deterministic solution. The k-set agreement problem is a generalization of the classical consensus proposed by Chaudhuri (Inform. and Comput., 105 (1993), pp. 132-158), where the agreement condition is weakened so that the decision values produced may be different, as long as the number of distinct values is at most k. For n > k ≥ 2 it was not known whether this problem is solvable deterministically in the asynchronous shared memory model. In this paper, we resolve this question by showing that for any k

46 citations

Journal Article•DOI•

Common knowledge and consistent simultaneous coordination

[...]

Gil Neiger¹, Mark R. Tuttle•Institutions (1)

Georgia Institute of Technology¹

01 Apr 1993-Distributed Computing

TL;DR: This work considers problems requiring consistent, simultaneous coordination in synchronous distributed systems and analyses these problems in terms of common knowledge in several failure models, showing that such problems cannot be solved, even in failure-free executions.

Abstract: There is a very close relationship between common knowledge and simultaneity in synchronous distributed systems. The analysis of several well-known problems in terms of common knowledge has led to round-optimal protocols for these problems, including Reliable Broadcast, Distributed Consensus, and the Distributed Firing Squad problem. These problems require that the correct processors coordinate their actions in some way but place no restrictions on the behaviour of the faulty processors. In systems with benign processor failures, however, it is reasonable to require that the actions of a faulty processor be consistent with those of the correct processors, assuming it performs any action at all. We consider problems requiring consistent, simultaneous coordination. We then analyze these problems in terms of common knowledge in several failure models. The analysis of these stronger problems requires a stronger definition of common knowledge, and we study the relationship between these two definitions. In many cases, the two definitions are actually equivalent, and simple modifications of previous solutions yield round-optimal solutions to these problems. When the definitions differ, however, we show that such problems cannot be solved, even in failure-free executions.

31 citations

Journal Article•DOI•

Fast consensus in networks of bounded degree

[...]

Piotr Berman¹, Juan A. Garay²•Institutions (2)

Pennsylvania State University¹, IBM²

01 Dec 1993-Distributed Computing

TL;DR: This paper shows how to achieve consensus in the butterfly network using O(t + log n loglog n) one-bit parallel transmission steps, while tolerating the asymptotically optimal number of faulty processors (O(n/log n)), and decreases the number of exceptions to O(t) by using additional links, while maintaining the same running time.

Abstract: The Distributed Consensus problem involves n processors each of which holds an initial binary value. At most t of the processors may be faulty and ignore any protocol (even behaving maliciously), yet it is required that the non-faulty processors eventually agree on a value that was initially held by one of them. In this paper we focus on consensus in networks whose degree is bounded, following the work of Dwork, Peleg, Pippenger and Upfal [8]. In such a context, complete consensus among all the correct processors is not possible and some exceptions must be allowed. We first show how to achieve consensus in the butterfly network using O(t + log n loglog n) one-bit parallel transmission steps, while tolerating the asymptotically optimal number of faulty processors (O(n/log n)) and having the asymptotically minimal number of exceptions (O(t log t)). This result considerably improves on the running time of existing butterfly consensus protocols [2, 8]. In particular, it replaces the running time of O(n log n loglog n) of [2] with an asymptotically optimal one. As in [8], we can then decrease the number of exceptions to O(t) by using additional links, while maintaining the same running time. The protocol is derived from a consensus protocol for completely connected networks that is interesting in its own right: it achieves Distributed Consensus with optimal number of processors, asymptotically optimal total bit transfer and nearly optimal number of rounds.

26 citations

Journal Article•DOI•

A partial equivalence between shared-memory and message-passing in an asynchronous fail-stop distributed environment

[...]

Amotz Bar-Noy¹, Danny Dolev¹,²•Institutions (2)

IBM¹, Hebrew University of Jerusalem²

01 Mar 1993-Theory of Computing Systems / Mathematical Systems Theory

TL;DR: A randomized algorithm for the consensus problem in the message-passing model, based on the algorithm of Aspnes and Herlihy [AH] in the shared-memory model, is presented; it is the fastest known randomized algorithm that solves the consensus problem against a strong fail-stop adversary with one-half resiliency.

Abstract: This paper presents a schematic algorithm for distributed systems. This schematic algorithm uses a "black-box" procedure for communication, the output of which must meet two requirements: a global-order requirement and a deadlock-free requirement. This algorithm is valid in any distributed system model that can provide such a communication procedure that complies with these requirements. Two such models exist in an asynchronous fail-stop environment: one in the shared-memory model and one in the message-passing model. The implementation of the black-box procedure in these models enables us to translate existing algorithms between the two models whenever these algorithms are based on the schematic algorithm. We demonstrate this idea in two ways. First, we present a randomized algorithm for the consensus problem in the message-passing model based on the algorithm of Aspnes and Herlihy (AH) in the shared-memory model. This solution is the fastest known randomized algorithm that solves the consensus problem against a strong fail-stop adversary with one-half resiliency. Second, we solve the processor renaming problem in the shared-memory model based on the solution of Attiya et al. (ABD+) in the message-passing model. The existence of the solution to the renaming problem should be contrasted with the impossibility result for the consensus problem in the shared-memory model (CIL), (DDS), (LA).

20 citations

Journal Article•DOI•

Space-efficient asynchronous consensus without shared memory initialization

[...]

Michael J. Fischer¹, Shlomo Moran², Gadi Taubenfeld³•Institutions (3)

Yale University¹, Technion – Israel Institute of Technology², Bell Labs³

26 Feb 1993-Information Processing Letters

TL;DR: A consensus protocol for n processes which can tolerate up to ⌈n/2⌉ − 1 failures and which uses a single (2⌈1.5n − 1⌉)-valued shared register is presented.

12 citations

A Distributed Consensus Protocol with a Coordinator

[...]

Francisco Guerra Santana, Sergio Arévalo, Angel Alvarez, Francisco Javier Miranda González

13 Sep 1993

11 citations

Understanding the Power of the Virtually-Synchronous Model

[...]

André Schiper, Alain Sandoz

01 Aug 1993

TL;DR: This paper defines a clear semantics of the virtually-synchronous model, and shows that distributed commit can be solved by the model, providing an interesting broader picture of the problem of building fault-tolerant applications.

Abstract: The purpose of this paper is to define a clear semantics of the virtually-synchronous model, and to show that distributed commit can be solved by the model. This is in a sense not surprising, as it has been shown that distributed consensus can be solved in the asynchronous model with a very weak failure detector. Considering this result, the virtually-synchronous model becomes extremely powerful, and more basic than the transaction model, providing an interesting broader picture of the problem of building fault-tolerant applications.

Journal Article•DOI•

A quick distributed consensus protocol

[...]

F. Guerra¹, Sergio Arévalo², A. Alvarez², J. Miranda¹•Institutions (2)

University of Las Palmas de Gran Canaria¹, Technical University of Madrid²

01 Dec 1993-Microprocessing and Microprogramming

TL;DR: This paper presents an already known consensus protocol which has a cost of O(n²) in the number of exchanged messages and O(n) in terms of the time needed to arrive at an agreement, and presents several refinements to this protocol which make it linear in the absence of failures.

Journal Article•DOI•

Necessary and sufficient conditions for broadcast consensus protocols

[...]

Louise E. Moser¹, Peter M. Melliar-Smith¹, Vivek Agrawala¹•Institutions (1)

University of California, Santa Barbara¹

01 Dec 1993-Distributed Computing

TL;DR: It is shown that a necessary and sufficient condition for the existence of a deterministic consensus protocol is delivery of each broadcast message to at least ⌈(n+k+1)/2⌉ processes in an n-process system subject to k crash failures with either eventual fair broadcasting or eventual full broadcasting.

Abstract: We consider consensus protocols in asynchronous distributed systems that are based on broadcast communication. We show that a necessary and sufficient condition for the existence of a deterministic consensus protocol is delivery of each broadcast message to at least ⌈(n + k + 1)/2⌉ processes in an n-process system subject to k crash failures with either eventual fair broadcasting or eventual full broadcasting. The broadcast model captures the idea of a broadcast communication medium, such as the Ethernet, in which messages, if delivered, are delivered immediately and in order but not necessarily to all processes.
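The delivery threshold ⌈(n + k + 1)/2⌉ from the abstract above is easy to evaluate for concrete system sizes. The sketch below is only a worked instance of that formula; the function name `min_delivery_count` and the input check 0 ≤ k < n are assumptions added for illustration and are not stated in the source.

```python
def min_delivery_count(n: int, k: int) -> int:
    """Return ceil((n + k + 1) / 2) for n processes and k crash failures.

    Integer arithmetic avoids floating-point rounding:
    ceil(a / 2) == (a + 1) // 2 for any non-negative integer a.
    """
    # Assumed sanity check (not from the source): fewer crashes than processes.
    if not 0 <= k < n:
        raise ValueError("expected 0 <= k < n")
    return (n + k + 2) // 2


if __name__ == "__main__":
    for n, k in [(4, 1), (5, 1), (7, 2)]:
        print(n, k, min_delivery_count(n, k))
```

For example, with n = 5 processes and k = 1 crash failure, each broadcast message must reach at least ⌈7/2⌉ = 4 processes.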

Journal Article•DOI•

A fault-tolerant server on MACH

[...]

S. Arévalo, Jesus Carretero, J. L. Castellanos, F. Barco¹•Institutions (1)

Technical University of Madrid¹

01 Sep 1993-Microprocessing and Microprogramming

TL;DR: A fault-tolerant server implemented on top of a distributed operating system, the MACH microkernel, which provides user applications with a client-server communication mechanism in which replication is transparent.

Tolerance aux Fautes et Systemes Repartis: Concepts et Mecanismes

[...]

Gerard Le Lann, Pascale Minet, David Powell

01 Dec 1993

TL;DR: This report presents the concepts and mechanisms used in fault-tolerant distributed systems, and introduces the concepts of fault-tolerant group communication and distributed consensus.

Abstract: Fault-tolerance is an unavoidable requirement in distributed systems. First, multiple resources imply multiple potential causes of failure, so much research on distributed systems has aimed to ensure that dependability is not degraded by distribution. Second, fault tolerance can in itself be a motivating factor for distribution. Indeed, fault tolerance cannot be ensured without redundancy, and the distribution of processing and data on different processors provides an approach for structuring and managing this redundancy. In this report, we present the concepts and mechanisms used in fault-tolerant distributed systems. A distributed application can be considered either as a set of processes exchanging messages or as a set of transactions acting on distributed data items. The known techniques of fault-tolerance are given for both computational models. We then introduce the concepts of fault-tolerant group communication and distributed consensus.

Book•

Decentralized and distributed systems : proceedings of the IFIP WG10.3 International Conference on Decentralized and Distributed Systems, Palma de Mallorca, Spain, 13-17 September 1993

[...]

Michel Cosnard, Ramon Puigjaner

01 Jan 1993

TL;DR: High speed interconnection of workstations: concepts, problems and experiences (B. Heinrichs, T. Meuser, O. Spaniol).

Abstract: High speed interconnection of workstations: concepts, problems and experiences (B. Heinrichs, T. Meuser, O. Spaniol). High performance architecture issues (D.A. Nicole). Issues in object-oriented distributed systems (S. Krakowiak). Distributed Algorithms. What is a deadlock? (Y.C. Tay). A distributed algorithm for resource management (J. Ezpeleta, S. Haddad). Getting cooperative environment by coordinating services through a network (P. Bergougnoux, F. Barrere, P. Vidal). A distributed consensus protocol with a coordinator (F. Guerra, S. Arevalo, A. Alvarez, J. Miranda). Using global state properties to attain mutual exclusion in distributed systems (J. Vila-Carbo). Programming communicating distributed reactive automata: the weak synchronous paradigm (E. Boniol, M. Adelantado). Parallel implementations of two algorithms for solving linear programming problems (G.L. Reijns, R.M. Wiegers, G.-J. Boesschen Hospers). Performance Evaluation. Analysis of the quality of service in a MAN environment (M. Conti). An approximate analysis of DQDB networks with the bandwidth balancing mechanism (Y. Matsumoto). LAN distributed fault-tolerance (J. Miro-Julia). A statistical study of the factors that affect the performance of a class of parallel programs on a MIMD computer (R. Candlin, J. Phillips). A multiprocessor parallel disk system evaluation (J. Carretero, F. Perez, P. de Miguel, F. García, L. Alonso). A decomposition approximation method for closed queueing networks with fork/join subnetworks (B. Baynat, Y. Dallery). Petri Nets. Modelling and analysis of deterministic concurrent systems with bulk services and arrivals (E. Teruel, J.M. Colom, M. Silva). A protocol specification language with a high-level Petri net semantics (B. Zouari, S. Haddad, M. Taghelit). Interconnect. A performance comparison between the Fieldbus protocol standards PROFIBUS and FIP (M. Ettl, U. Klehmet). Analysis of a class of polling protocols for Fieldbus networks (P. Raja, G. Noubir, L. Ruiz, J. Hernandez, M. Riese, J.D. Decotignie). A theory to increase the effective redundancy in Wormhole networks (J. Duato). Distributed Operating Systems. A systematic approach to load distribution strategies for distributed systems (C. Jacqmot, E. Milgrom). A cooperative algorithm for load balancing in interconnected transputer network (H. Guyennet, F. Spies). Uniform co-scheduling using object-oriented design techniques (N. Islam, R. Campbell). Distributed access to persistent objects (S.B. Lim, L. Xiao, R. Campbell). Parallel Simulation. Distributed simulation: a simulation system for the discrete event systems (B. Dado, P. Menhart, J. Safarik). Towards the distributed implementation of discrete event simulation languages (J. Miguel, M. Graña). Design Methods. Enhancing structured analysis by timed statecharts for real-time and concurrency specification (M. von der Beeck). Heuristics driven real time software design. (Part contents).

Proceedings Article•DOI•

Distributed consensus with general omission failures and timing uncertainty

[...]

A.A. Bharali¹, P. Berman¹•Institutions (1)

Pennsylvania State University¹

23 Mar 1993

TL;DR: The authors provide the first asymptotically optimal distributed consensus protocol for semi-synchronous systems that tolerates general omission failures and terminates faster than the best known protocols for these failure classes.

Abstract: In a distributed consensus protocol, a number of processors communicating by message passing start with some initial values. The protocol terminates with all nonfaulty processors agreeing on one of these values. The authors investigate the time needed to reach consensus in partially synchronous systems under various classes of processor failures. They provide the first asymptotically optimal distributed consensus protocol for semi-synchronous systems that tolerates general omission failures. When the failures occurring are restricted to omission and crash failures, the protocol terminates faster, matching the best known protocols for these failure classes.
