Showing papers by "Miguel Castro" published in 2001

Open Access

Book Chapter•DOI•

SCRIBE: The Design of a Large-Scale Event Notification Infrastructure

Antony Rowstron¹, Anne-Marie Kermarrec¹, Miguel Castro¹, Peter Druschel²•Institutions (2)

Microsoft¹, Rice University²

07 Nov 2001-Lecture Notes in Computer Science

TL;DR: Scribe is built on top of Pastry, a generic peer-to-peer object location and routing substrate overlayed on the Internet, and leverages Pastry's reliability, self-organization and locality properties.

Abstract: This paper presents Scribe, a large-scale event notification infrastructure for topic-based publish-subscribe applications. Scribe supports large numbers of topics, with a potentially large number of subscribers per topic. Scribe is built on top of Pastry, a generic peer-to-peer object location and routing substrate overlayed on the Internet, and leverages Pastry's reliability, self-organization and locality properties. Pastry is used to create a topic (group) and to build an efficient multicast tree for the dissemination of events to the topic's subscribers (members). Scribe provides weak reliability guarantees, but we outline how an application can extend Scribe to provide stronger ones.
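As a rough illustration of the multicast tree the abstract mentions, the sketch below attaches subscribers along routes toward a topic's root node and then pushes an event down the resulting tree. It is a toy model under stated assumptions, not Scribe's implementation: the routes are hard-coded stand-ins for Pastry routing (which is not modeled here), and the names `build_tree` and `disseminate` are hypothetical.

```python
from collections import defaultdict

def build_tree(routes, root):
    """Build a parent -> children table from subscriber-to-root routes.

    Each route is a list of node names that starts at a subscriber and ends
    at `root`. The routes are hypothetical; a real system would obtain them
    from the overlay's routing layer.
    """
    children = defaultdict(list)
    in_tree = {root}
    for route in routes:
        # Walk from the subscriber toward the root, one hop at a time.
        for child, parent in zip(route, route[1:]):
            if child in in_tree:
                # This node is already attached, so the rest of the
                # path to the root is already part of the tree.
                break
            children[parent].append(child)
            in_tree.add(child)
    return children

def disseminate(children, root, event):
    """Push `event` from the root down the tree; return the nodes reached."""
    reached = []
    stack = [root]
    while stack:
        node = stack.pop()
        reached.append(node)  # the node would handle or forward `event` here
        stack.extend(children.get(node, []))
    return reached

# Three subscribers; s1 and s2 share the intermediate node "a".
routes = [["s1", "a", "root"], ["s2", "a", "root"], ["s3", "b", "root"]]
tree = build_tree(routes, "root")
reached = disseminate(tree, "root", "topic-update")
```

In this toy run the root sends the event to only two children, `a` and `b`, and the intermediate nodes relay it to the subscribers, which is the load-spreading effect a multicast tree is meant to provide.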

637 citations

Proceedings Article•DOI•

BASE: using abstraction to improve fault tolerance

Rodrigo Rodrigues¹, Miguel Castro², Barbara Liskov¹•Institutions (2)

Massachusetts Institute of Technology¹, Microsoft²

21 Oct 2001

TL;DR: A replication technique, BASE, is described, which uses abstraction to reduce the cost of Byzantine fault tolerance and to improve its ability to mask software errors.

Abstract: Software errors are a major cause of outages and they are increasingly exploited in malicious attacks. Byzantine fault tolerance allows replicated systems to mask some software errors but it is expensive to deploy. This paper describes a replication technique, BASE, which uses abstraction to reduce the cost of Byzantine fault tolerance and to improve its ability to mask software errors. BASE reduces cost because it enables reuse of off-the-shelf service implementations. It improves availability because each replica can be repaired periodically using an abstract view of the state stored by correct replicas, and because each replica can run distinct or non-deterministic service implementations, which reduces the probability of common mode failures. We built an NFS service where each replica can run a different off-the-shelf file system implementation, and an object-oriented database where the replicas ran the same, non-deterministic implementation. These examples suggest that our technique can be used in practice: in both cases, the implementation required only a modest amount of new code, and our performance results indicate that the replicated services perform comparably to the implementations that they reuse.
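The role of abstraction can be illustrated with a toy example: two replicas keep the same logical data in different concrete representations, and each exposes an abstraction function that maps its concrete state to a common abstract state that can be compared across replicas. This is a minimal sketch under assumptions, not BASE's actual interface; the classes `DictStore` and `ListStore`, the method `abstract_state`, and the helper `replicas_agree` are all hypothetical.

```python
class DictStore:
    """Hypothetical implementation that keeps entries in a dict."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def abstract_state(self):
        # Abstraction function: concrete dict -> sorted tuple of pairs.
        return tuple(sorted(self._data.items()))


class ListStore:
    """Hypothetical implementation that keeps an append-only write log."""

    def __init__(self):
        self._log = []

    def put(self, key, value):
        self._log.append((key, value))

    def abstract_state(self):
        # Abstraction function: replay the log so later writes win,
        # then produce the same sorted-tuple form as DictStore.
        latest = {}
        for key, value in self._log:
            latest[key] = value
        return tuple(sorted(latest.items()))


def replicas_agree(replicas):
    """True when every replica maps to the same abstract state."""
    return len({r.abstract_state() for r in replicas}) == 1


# Apply the same operations to two different implementations.
a, b = DictStore(), ListStore()
for key, value in [("x", 1), ("y", 2), ("x", 3)]:
    a.put(key, value)
    b.put(key, value)
```

Although the two stores differ internally, their abstract states match, so they can be compared directly; a replica whose abstract state diverges from the others is a candidate for repair.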

254 citations

Proceedings Article•DOI•

Using abstraction to improve fault tolerance

Miguel Castro¹, Rodrigo Rodrigues², Barbara Liskov²•Institutions (2)

Microsoft¹, Massachusetts Institute of Technology²

20 May 2001

TL;DR: BFTA is a replication technique that uses abstraction to reduce the cost of Byzantine fault tolerance and to improve its ability to mask software errors; letting each replica run distinct or non-deterministic service implementations reduces the probability of common mode failures.

Abstract: Software errors are a major cause of outages and they are increasingly exploited in malicious attacks. Byzantine fault tolerance allows replicated systems to mask some software errors but it is expensive to deploy. The paper describes a replication technique, BFTA, which uses abstraction to reduce the cost of Byzantine fault tolerance and to improve its ability to mask software errors. BFTA reduces cost because it enables reuse of off-the-shelf service implementations. It improves availability because each replica can be repaired periodically using an abstract view of the state stored by correct replicas, and because each replica can run distinct or non-deterministic service implementations, which reduces the probability of common mode failures. We built an NFS service that allows each replica to run a different operating system. This example suggests that BFTA can be used in practice; the replicated file system required only a modest amount of new code, and preliminary performance results indicate that it performs comparably to the off-the-shelf implementations that it wraps.

159 citations

Proceedings Article•DOI•

Byzantine fault tolerance can be fast

Miguel Castro¹, Barbara Liskov²•Institutions (2)

Microsoft¹, Massachusetts Institute of Technology²

01 Jul 2001

TL;DR: A replicated NFS file system implemented using BFT, a state-machine replication algorithm that tolerates Byzantine faults in asynchronous systems, performs 2% faster to 24% slower than production implementations of the NFS protocol that are not fault-tolerant.

Abstract: Byzantine fault tolerance is important because it can be used to implement highly-available systems that tolerate arbitrary behavior from faulty components. We present a detailed performance evaluation of BFT, a state-machine replication algorithm that tolerates Byzantine faults in asynchronous systems. Our results contradict the common belief that Byzantine fault tolerance is too slow to be used in practice: BFT performs well enough to be used to implement real systems. We implemented a replicated NFS file system using BFT that performs 2% faster to 24% slower than production implementations of the NFS protocol that are not fault-tolerant.

24 citations