scispace - formally typeset
Search or ask a question

Showing papers by "Thomas Anderson published in 2015"


Proceedings ArticleDOI
03 Jun 2015
TL;DR: Verdi, a framework for implementing and formally verifying distributed systems in Coq, formalizes various network semantics with different faults, and enables the developer to first verify their system under an idealized fault model then transfer the resulting correctness guarantees to a more realistic fault model without any additional proof burden.
Abstract: Distributed systems are difficult to implement correctly because they must handle both concurrency and failures: machines may crash at arbitrary points and networks may reorder, drop, or duplicate packets. Further, their behavior is often too complex to permit exhaustive testing. Bugs in these systems have led to the loss of critical data and unacceptable service outages. We present Verdi, a framework for implementing and formally verifying distributed systems in Coq. Verdi formalizes various network semantics with different faults, and the developer chooses the most appropriate fault model when verifying their implementation. Furthermore, Verdi eases the verification burden by enabling the developer to first verify their system under an idealized fault model, then transfer the resulting correctness guarantees to a more realistic fault model without any additional proof burden. To demonstrate Verdi's utility, we present the first mechanically checked proof of linearizability of the Raft state machine replication algorithm, as well as verified implementations of a primary-backup replication system and a key-value store. These verified systems provide similar performance to unverified equivalents.

275 citations


Journal ArticleDOI
TL;DR: Arrakis as discussed by the authors splits the traditional role of the kernel in two, allowing most I/O operations to skip the kernel entirely, while the kernel is re-engineered to provide network and disk protection without kernel mediation of every operation.
Abstract: Recent device hardware trends enable a new approach to the design of network server operating systems. In a traditional operating system, the kernel mediates access to device hardware by server applications to enforce process isolation as well as network and disk security. We have designed and implemented a new operating system, Arrakis, that splits the traditional role of the kernel in two. Applications have direct access to virtualized I/O devices, allowing most I/O operations to skip the kernel entirely, while the kernel is re-engineered to provide network and disk protection without kernel mediation of every operation. We describe the hardware and software changes needed to take advantage of this new abstraction, and we illustrate its power by showing improvements of 2 to 5 × in latency and 9 × throughput for a popular persistent NoSQL store relative to a well-tuned Linux implementation.

63 citations


Proceedings Article
08 Jul 2015
TL;DR: This work design, implement, and evaluate MetaSync, a secure and reliable file synchronization service that uses multiple cloud synchronization services as untrusted storage providers that provides efficient and consistent updates on top of the unmodified APIs exported by existing services.
Abstract: Cloud-based file synchronization services, such as Drop-box, are a worldwide resource for many millions of users. However, individual services often have tight resource limits, suffer from temporary outages or even shutdowns, and sometimes silently corrupt or leak user data. We design, implement, and evaluate MetaSync, a secure and reliable file synchronization service that uses multiple cloud synchronization services as untrusted storage providers. To make MetaSync work correctly, we devise a novel variant of Paxos that provides efficient and consistent updates on top of the unmodified APIs exported by existing services. Our system automatically redistributes files upon reconfiguration of providers. Our evaluation shows that MetaSync provides low update latency and high update throughput while being more trustworthy and available. MetaSync outperforms its underlying cloud services by 1.2-10×on three realistic workloads.

37 citations


Proceedings ArticleDOI
01 Dec 2015
TL;DR: This paper proposes and evaluates Subways, a new approach to wiring servers and Top-of-Rack (ToR) switches that provides an inexpensive incremental upgrade path as well as decreased network congestion, better load balancing, and improved fault tolerance.
Abstract: As network demand increases, data center network operators face a number of challenges including the need to add capacity to the network. Unfortunately, network upgrades can be an expensive proposition, particularly at the edge of the network where most of the network's cost lies.This paper presents a quantitative study of alternative ways of wiring multiple server links into a data center network. In it, we propose and evaluate Subways, a new approach to wiring servers and Top-of-Rack (ToR) switches that provides an inexpensive incremental upgrade path as well as decreased network congestion, better load balancing, and improved fault tolerance. Our simulation-based results show that Subways significantly improves performance compared to alternative ways of wiring the same number of links and switches together. For example, we show that Subways offers up to 3.1× better performance on a MapReduce shuffle workload compared to an equivalent capacity network.

23 citations


Proceedings Article
18 May 2015
TL;DR: FlexNIC, a flexible network DMA interface that can be used by operating systems and applications alike to reduce packet processing overheads, is proposed and shown how it can benefit widely used data center server applications, such as key-value stores.
Abstract: We propose FlexNIC, a flexible network DMA interface that can be used by operating systems and applications alike to reduce packet processing overheads. The recent surge of network I/O performance has put enormous pressure on memory and software I/O processing subsystems. Yet even at high speeds, flexibility in packet handling is still important for security, performance isolation, and virtualization. Thus, our proposal moves some of the packet processing traditionally done in software to the NIC DMA controller, where it can be done flexibly and at high speed. We show how FlexNIC can benefit widely used data center server applications, such as key-value stores.

21 citations