Author
David P. Reed
Other affiliations: Wellesley College
Bio: David P. Reed is an academic researcher from Massachusetts Institute of Technology. The author has contributed to research in topics: Synchronization (computer science) & Encryption. The author has an hindex of 15, co-authored 22 publications receiving 9001 citations. Previous affiliations of David P. Reed include Wellesley College.
Papers
More filters
••
TL;DR: A novel scheme that first selects the best relay from a set of M available relays and then uses this "best" relay for cooperation between the source and the destination and achieves the same diversity-multiplexing tradeoff as achieved by more complex protocols.
Abstract: Cooperative diversity has been recently proposed as a way to form virtual antenna arrays that provide dramatic gains in slow fading wireless environments. However, most of the proposed solutions require distributed space-time coding algorithms, the careful design of which is left for future investigation if there is more than one cooperative relay. We propose a novel scheme that alleviates these problems and provides diversity gains on the order of the number of relays in the network. Our scheme first selects the best relay from a set of M available relays and then uses this "best" relay for cooperation between the source and the destination. We develop and analyze a distributed method to select the best relay that requires no topology information and is based on local measurements of the instantaneous channel conditions. This method also requires no explicit communication among the relays. The success (or failure) to select the best available path depends on the statistics of the wireless channel, and a methodology to evaluate performance for any kind of wireless channel statistics, is provided. Information theoretic analysis of outage probability shows that our scheme achieves the same diversity-multiplexing tradeoff as achieved by more complex protocols, where coordination and distributed space-time coding for M relay nodes is required, such as those proposed by Laneman and Wornell (2003). The simplicity of the technique allows for immediate implementation in existing radio hardware and its adoption could provide for improved flexibility, reliability, and efficiency in future 4G wireless systems.
3,153 citations
•
01 Jan 1981TL;DR: A design principle is presented that helps guide placement of functions among the modules of a distributed computer system and suggests that functions placed at low levels of a system may be redundant or of little value when compared with the cost of providing them at that low level.
Abstract: This paper presents a design principle that helps guide placement of functions among the modules of a distributed computer system. The principle, called the end-to-end argument, suggests that functions placed at low levels of a system may be redundant or of little value when compared with the cost of providing them at that low level. Examples discussed in the paper include bit error recovery, security using encryption, duplicate message suppression, recovery from system crashes, and delivery acknowledgement. Low level mechanisms to support these functions are justified only as performance enhancements.
2,091 citations
••
TL;DR: The end-to-end argument as discussed by the authors suggests that functions placed at low levels of a distributed computer system may be redundant or of little value when compared with the cost of providing them at that low level.
Abstract: This paper presents a design principle that helps guide placement of functions among the modules of a distributed computer system. The principle, called the end-to-end argument, suggests that functions placed at low levels of a system may be redundant or of little value when compared with the cost of providing them at that low level. Examples discussed in the paper include bit error recovery, security using encryption, duplicate message suppression, recovery from system crashes, and delivery acknowledgement. Low level mechanisms to support these functions are justified only as performance enhancements.
1,892 citations
•
01 Oct 1978
TL;DR: A new approach to the synchronization of accesses to shared data objects is developed, called NAMOS, which provides a useful tool for restoring a consistent state of the system after a failure resulting in irrecoverable loss of information or a user mistake resulting in an inconsistent state.
Abstract: In this dissertation a new approach to the synchronization of accesses to shared data objects is developed. Traditional approaches to the synchronization problems of shared data accessed by concurrently running computations have relied on mutual exclusion -- the ability of one computation to stop the execution of other computations that might access or change shared data accessed by that computation. Our approach is quite different. We regard an object that is modifiable as a sequence of immutable versions; each version is the state of the object after an update is made to the object. Synchronization can then be treated as a mechanism for naming versions to be read and for defining where in the sequence of versions the version resulting from some update should be placed. In systems based on mutual exclusion, the timing of accesses selects the versions accessed. In the system developed here, called NAMOS, versions have two component names consisting of the name of an object and a pseudo-time, the name of the system state to which the version belongs. By giving programs control over the pseudo-time in which an access is made, synchronization of access to multiple objects is simplified. NAMOS is intended to be used in an environment where unreliable components, such as communication lines and processors, and autonomous control of resources occasionally cause certain objects to become inaccessible, perhaps in the middle of an atomic transaction. Computations may also suddenly halt (perhaps as the result of a system crash) never to be restarted. NAMOS provides facilities for recovering from such sudden failures, grouping updates into sets called possibilities, such that failure of any update belonging to a possibility prevents all of the other updates in the possibility. The naming mechanism of NAMOS also provides a useful tool for restoring a consistent state of the system after a failure resulting in irrecoverable loss of information or a user mistake resulting in an inconsistent state. An important motivation for the development of NAMOS is the need to support decentralized development of application systems by combining existing application systems that deal with shared data. NAMOS supports the construction of modules that locally ensure their own correct synchronization and recovery from inaccessibility. Larger modules that use several separately designed modules can then be constructed, perhaps with additional synchronization constraints, without modifying the modules used. In most systems based on mutual exclusion, such post hoc integration of modules is difficult or impossible.
395 citations
•
01 Jan 1979
TL;DR: In this paper, the authors describe a mechanism that solves synchronization of accesses to shared data and recovering the state of such data in the case of failures in a decentralized system.
Abstract: Synchronization of accesses to shared data and recovering the state of such data in the case of failures are really two aspects of the same problem--implementing atomic actions on a related set of data items. In this paper a mechanism that solves both problems simultaneously in a way that is compatible with requirements of decentralized systems is described. In particular, the correct construction and execution of a new atomic action can be accomplished without knowledge of all other atomic actions in the system that might execute concurrently. Further, the mechanisms degrade gracefully if parts of the system fail: only those atomic actions that require resources in failed parts of the system are prevented from executing, and there is no single coordinator that can fail and bring down the whole system.
284 citations
Cited by
More filters
••
01 Aug 2000TL;DR: Greedy Perimeter Stateless Routing is presented, a novel routing protocol for wireless datagram networks that uses the positions of routers and a packet's destination to make packet forwarding decisions and its scalability on densely deployed wireless networks is demonstrated.
Abstract: We present Greedy Perimeter Stateless Routing (GPSR), a novel routing protocol for wireless datagram networks that uses the positions of routers and a packet's destination to make packet forwarding decisions. GPSR makes greedy forwarding decisions using only information about a router's immediate neighbors in the network topology. When a packet reaches a region where greedy forwarding is impossible, the algorithm recovers by routing around the perimeter of the region. By keeping state only about the local topology, GPSR scales better in per-router state than shortest-path and ad-hoc routing protocols as the number of network destinations increases. Under mobility's frequent topology changes, GPSR can use local topology information to find correct new routes quickly. We describe the GPSR protocol, and use extensive simulation of mobile wireless networks to compare its performance with that of Dynamic Source Routing. Our simulations demonstrate GPSR's scalability on densely deployed wireless networks.
7,384 citations
••
TL;DR: In this paper, it is shown that every protocol for this problem has the possibility of nontermination, even with only one faulty process.
Abstract: The consensus problem involves an asynchronous system of processes, some of which may be unreliable The problem is for the reliable processes to agree on a binary value In this paper, it is shown that every protocol for this problem has the possibility of nontermination, even with only one faulty process By way of contrast, solutions are known for the synchronous case, the “Byzantine Generals” problem
4,389 citations
••
25 Aug 2003TL;DR: This work proposes a network architecture and application interface structured around optionally-reliable asynchronous message forwarding, with limited expectations of end-to-end connectivity and node resources.
Abstract: The highly successful architecture and protocols of today's Internet may operate poorly in environments characterized by very long delay paths and frequent network partitions. These problems are exacerbated by end nodes with limited power or memory resources. Often deployed in mobile and extreme environments lacking continuous connectivity, many such networks have their own specialized protocols, and do not utilize IP. To achieve interoperability between them, we propose a network architecture and application interface structured around optionally-reliable asynchronous message forwarding, with limited expectations of end-to-end connectivity and node resources. The architecture operates as an overlay above the transport layers of the networks it interconnects, and provides key services such as in-network data storage and retransmission, interoperable naming, authenticated forwarding and a coarse-grained class of service.
3,511 citations
••
TL;DR: This paper defines linearizability, compares it to other correctness conditions, presents and demonstrates a method for proving the correctness of implementations, and shows how to reason about concurrent objects, given they are linearizable.
Abstract: A concurrent object is a data object shared by concurrent processes. Linearizability is a correctness condition for concurrent objects that exploits the semantics of abstract data types. It permits a high degree of concurrency, yet it permits programmers to specify and reason about concurrent objects using known techniques from the sequential domain. Linearizability provides the illusion that each operation applied by concurrent processes takes effect instantaneously at some point between its invocation and its response, implying that the meaning of a concurrent object's operations can be given by pre- and post-conditions. This paper defines linearizability, compares it to other correctness conditions, presents and demonstrates a method for proving the correctness of implementations, and shows how to reason about concurrent objects, given they are linearizable.
3,396 citations
•
01 Jan 2004
TL;DR: This book offers a detailed and comprehensive presentation of the basic principles of interconnection network design, clearly illustrating them with numerous examples, chapter exercises, and case studies, allowing a designer to see all the steps of the process from abstract design to concrete implementation.
Abstract: One of the greatest challenges faced by designers of digital systems is optimizing the communication and interconnection between system components. Interconnection networks offer an attractive and economical solution to this communication crisis and are fast becoming pervasive in digital systems. Current trends suggest that this communication bottleneck will be even more problematic when designing future generations of machines. Consequently, the anatomy of an interconnection network router and science of interconnection network design will only grow in importance in the coming years.
This book offers a detailed and comprehensive presentation of the basic principles of interconnection network design, clearly illustrating them with numerous examples, chapter exercises, and case studies. It incorporates hardware-level descriptions of concepts, allowing a designer to see all the steps of the process from abstract design to concrete implementation.
·Case studies throughout the book draw on extensive author experience in designing interconnection networks over a period of more than twenty years, providing real world examples of what works, and what doesn't.
·Tightly couples concepts with implementation costs to facilitate a deeper understanding of the tradeoffs in the design of a practical network.
·A set of examples and exercises in every chapter help the reader to fully understand all the implications of every design decision.
Table of Contents
Chapter 1 Introduction to Interconnection Networks
1.1 Three Questions About Interconnection Networks
1.2 Uses of Interconnection Networks
1.3 Network Basics
1.4 History
1.5 Organization of this Book
Chapter 2 A Simple Interconnection Network
2.1 Network Specifications and Constraints
2.2 Topology
2.3 Routing
2.4 Flow Control
2.5 Router Design
2.6 Performance Analysis
2.7 Exercises
Chapter 3 Topology Basics
3.1 Nomenclature
3.2 Traffic Patterns
3.3 Performance
3.4 Packaging Cost
3.5 Case Study: The SGI Origin 2000
3.6 Bibliographic Notes
3.7 Exercises
Chapter 4 Butterfly Networks
4.1 The Structure of Butterfly Networks
4.2 Isomorphic Butterflies
4.3 Performance and Packaging Cost
4.4 Path Diversity and Extra Stages
4.5 Case Study: The BBN Butterfly
4.6 Bibliographic Notes
4.7 Exercises
Chapter 5 Torus Networks
5.1 The Structure of Torus Networks
5.2 Performance
5.3 Building Mesh and Torus Networks
5.4 Express Cubes
5.5 Case Study: The MIT J-Machine
5.6 Bibliographic Notes
5.7 Exercises
Chapter 6 Non-Blocking Networks
6.1 Non-Blocking vs. Non-Interfering Networks
6.2 Crossbar Networks
6.3 Clos Networks
6.4 Benes Networks
6.5 Sorting Networks
6.6 Case Study: The Velio VC2002 (Zeus) Grooming Switch
6.7 Bibliographic Notes
6.8 Exercises
Chapter 7 Slicing and Dicing
7.1 Concentrators and Distributors
7.2 Slicing and Dicing
7.3 Slicing Multistage Networks
7.4 Case Study: Bit Slicing in the Tiny Tera
7.5 Bibliographic Notes
7.6 Exercises
Chapter 8 Routing Basics
8.1 A Routing Example
8.2 Taxonomy of Routing Algorithms
8.3 The Routing Relation
8.4 Deterministic Routing
8.5 Case Study: Dimension-Order Routing in the Cray T3D
8.6 Bibliographic Notes
8.7 Exercises
Chapter 9 Oblivious Routing
9.1 Valiant's Randomized Routing Algorithm
9.2 Minimal Oblivious Routing
9.3 Load-Balanced Oblivious Routing
9.4 Analysis of Oblivious Routing
9.5 Case Study: Oblivious Routing in the
Avici Terabit Switch Router(TSR)
9.6 Bibliographic Notes
9.7 Exercises
Chapter 10 Adaptive Routing
10.1 Adaptive Routing Basics
10.2 Minimal Adaptive Routing
10.3 Fully Adaptive Routing
10.4 Load-Balanced Adaptive Routing
10.5 Search-Based Routing
10.6 Case Study: Adaptive Routing in the
Thinking Machines CM-5
10.7 Bibliographic Notes
10.8 Exercises
Chapter 11 Routing Mechanics
11.1 Table-Based Routing
11.2 Algorithmic Routing
11.3 Case Study: Oblivious Source Routing in the
IBM Vulcan Network
11.4 Bibliographic Notes
11.5 Exercises
Chapter 12 Flow Control Basics
12.1 Resources and Allocation Units
12.2 Bufferless Flow Control
12.3 Circuit Switching
12.4 Bibliographic Notes
12.5 Exercises
Chapter 13 Buffered Flow Control
13.1 Packet-Buffer Flow Control
13.2 Flit-Buffer Flow Control
13.3 Buffer Management and Backpressure
13.4 Flit-Reservation Flow Control
13.5 Bibliographic Notes
13.6 Exercises
Chapter 14 Deadlock and Livelock
14.1 Deadlock
14.2 Deadlock Avoidance
14.3 Adaptive Routing
14.4 Deadlock Recovery
14.5 Livelock
14.6 Case Study: Deadlock Avoidance in the Cray T3E
14.7 Bibliographic Notes
14.8 Exercises
Chapter 15 Quality of Service
15.1 Service Classes and Service Contracts
15.2 Burstiness and Network Delays
15.3 Implementation of Guaranteed Services
15.4 Implementation of Best-Effort Services
15.5 Separation of Resources
15.6 Case Study: ATM Service Classes
15.7 Case Study: Virtual Networks in the Avici TSR
15.8 Bibliographic Notes
15.9 Exercises
Chapter 16 Router Architecture
16.1 Basic Router Architecture
16.2 Stalls
16.3 Closing the Loop with Credits
16.4 Reallocating a Channel
16.5 Speculation and Lookahead
16.6 Flit and Credit Encoding
16.7 Case Study: The Alpha 21364 Router
16.8 Bibliographic Notes
16.9 Exercises
Chapter 17 Router Datapath Components
17.1 Input Buffer Organization
17.2 Switches
17.3 Output Organization
17.4 Case Study: The Datapath of the IBM Colony
Router
17.5 Bibliographic Notes
17.6 Exercises
Chapter 18 Arbitration
18.1 Arbitration Timing
18.2 Fairness
18.3 Fixed Priority Arbiter
18.4 Variable Priority Iterative Arbiters
18.5 Matrix Arbiter
18.6 Queuing Arbiter
18.7 Exercises
Chapter 19 Allocation
19.1 Representations
19.2 Exact Algorithms
19.3 Separable Allocators
19.4 Wavefront Allocator
19.5 Incremental vs. Batch Allocation
19.6 Multistage Allocation
19.7 Performance of Allocators
19.8 Case Study: The Tiny Tera Allocator
19.9 Bibliographic Notes
19.10 Exercises
Chapter 20 Network Interfaces
20.1 Processor-Network Interface
20.2 Shared-Memory Interface
20.3 Line-Fabric Interface
20.4 Case Study: The MIT M-Machine Network Interface
20.5 Bibliographic Notes
20.6 Exercises
Chapter 21 Error Control 411
21.1 Know Thy Enemy: Failure Modes and Fault Models
21.2 The Error Control Process: Detection, Containment,
and Recovery
21.3 Link Level Error Control
21.4 Router Error Control
21.5 Network-Level Error Control
21.6 End-to-end Error Control
21.7 Bibliographic Notes
21.8 Exercises
Chapter 22 Buses
22.1 Bus Basics
22.2 Bus Arbitration
22.3 High Performance Bus Protocol
22.4 From Buses to Networks
22.5 Case Study: The PCI Bus
22.6 Bibliographic Notes
22.7 Exercises
Chapter 23 Performance Analysis
23.1 Measures of Interconnection Network Performance
23.2 Analysis
23.3 Validation
23.4 Case Study: Efficiency and Loss in the
BBN Monarch Network
23.5 Bibliographic Notes
23.6 Exercises
Chapter 24 Simulation
24.1 Levels of Detail
24.2 Network Workloads
24.3 Simulation Measurements
24.4 Simulator Design
24.5 Bibliographic Notes
24.6 Exercises
Chapter 25 Simulation Examples 495
25.1 Routing
25.2 Flow Control Performance
25.3 Fault Tolerance
Appendix A Nomenclature
Appendix B Glossary
Appendix C Network Simulator
3,233 citations