Author

David A. Nichols

Bio: David A. Nichols is an academic researcher from Carnegie Mellon University. The author has contributed to research in topics: Network File System & Self-certifying File System. The author has an h-index of 5 and has co-authored 7 publications receiving 2,716 citations.

Papers

Journal Article

Scale and performance in a distributed file system

John H. Howard¹, Michael Kazar¹, Sherri G. Menees¹, David A. Nichols¹, Mahadev Satyanarayanan¹, Robert N. Sidebotham¹, Michael J. West¹

Carnegie Mellon University¹

01 Feb 1988-ACM Transactions on Computer Systems

TL;DR: Observations of a prototype implementation are presented, changes in the areas of cache validation, server process structure, name translation, and low-level storage representation are motivated, and Andrew's ability to scale gracefully is quantitatively demonstrated.

Abstract: The Andrew File System is a location-transparent distributed file system that will eventually span more than 5000 workstations at Carnegie Mellon University. Large scale affects performance and complicates system operation. In this paper we present observations of a prototype implementation, motivate changes in the areas of cache validation, server process structure, name translation, and low-level storage representation, and quantitatively demonstrate Andrew's ability to scale gracefully. We establish the importance of whole-file transfer and caching in Andrew by comparing its performance with that of Sun Microsystems' NFS file system. We also show how the aggregation of files into volumes improves the operability of the system.

1,604 citations

Journal Article

Scale and performance in a distributed file system

John H. Howard¹, Michael Kazar¹, Sherri G. Menees¹, David A. Nichols¹, Mahadev Satyanarayanan¹, Robert N. Sidebotham¹, Michael J. West¹

Carnegie Mellon University¹

01 Nov 1987

TL;DR: This paper examines the consequences of the design decision to transfer whole files between servers and workstations rather than some smaller unit such as records or blocks, as almost all other distributed file systems do, and compares the whole file transfer strategy with that of a block-oriented file system, Sun Microsystems' NFS.

Abstract: Andrew is a distributed computing environment being developed in a joint project by Carnegie Mellon University and IBM. One of the major components of Andrew is a distributed file system which constitutes the underlying mechanism for sharing information. The goals of the Andrew file system are to support growth up to at least 7000 workstations (one for each student, faculty member, and staff at Carnegie Mellon) while providing users, application programs, and system administrators with the amenities of a shared file system. A fundamental result of our concern with scale is the design decision to transfer whole files between servers and workstations rather than some smaller unit such as records or blocks, as almost all other distributed file systems do. This paper examines the consequences of this and other design decisions and features that bear on the scalability of Andrew. Large scale affects a distributed system in two ways: it degrades performance and it complicates administration and day-to-day operation. This paper addresses both concerns and shows that the mechanisms we have incorporated cope with them successfully. We start with the initial prototype of the system, what we learned from it, and how we changed the system to improve performance. We compare its performance with that of a block-oriented file system, Sun Microsystems' NFS, in order to evaluate the whole file transfer strategy. We then turn to operability, and finish with issues related peripherally to scale and with the ways the present design could be enhanced.

663 citations

Journal Article

The ITC distributed file system: principles and design

Mahadev Satyanarayanan¹, John H. Howard¹, David A. Nichols¹, Robert N. Sidebotham¹, Alfred Z. Spector¹, Michael J. West¹

Carnegie Mellon University¹

01 Dec 1985

TL;DR: This paper presents the design and rationale of a distributed file system for a network of more than 5000 personal computer workstations, with careful attention paid to the goals of location transparency, user mobility and compatibility with existing operating system interfaces.

Abstract: This paper presents the design and rationale of a distributed file system for a network of more than 5000 personal computer workstations. While scale has been the dominant design influence, careful attention has also been paid to the goals of location transparency, user mobility and compatibility with existing operating system interfaces. Security is an important design consideration, and the mechanisms for it do not assume that the workstations or the network are secure. Caching of entire files at workstations is a key element in this design. A prototype of this system has been built and is in use by a user community of about 400 individuals. A refined implementation that will scale more gracefully and provide better performance is close to completion.

298 citations

Journal Article

Using idle workstations in a shared computing environment

David A. Nichols¹

Carnegie Mellon University¹

01 Nov 1987

TL;DR: An application of the Butler system known as gypsy servers, which allow network server programs to be run on idle workstations instead of using dedicated server machines, is described.

Abstract: The Butler system is a set of programs running on Andrew workstations at CMU that give users access to idle workstations. Current Andrew users use the system over 300 times per day. This paper describes the implementation of the Butler system and tells of our experience in using it. In addition, it describes an application of the system known as gypsy servers, which allow network server programs to be run on idle workstations instead of using dedicated server machines.

151 citations

Multiprocessing in a network of workstations

James Morris, David A. Nichols

01 Jan 1989

TL;DR: The second part of the thesis examines the performance of a particular file system, the Andrew File System (AFS), developed at CMU and examines the effects of proposed changes to the system, such as the use of encryption during transmission of file data.

Abstract: The recent move to workstation-based computing environments has introduced a new point in the design space of multiprocessors: a loosely-coupled collection of workstations using a network file system for shared memory. One problem with such a system is managing the available workstations and making them available to clients on demand. The Butler system has been running at CMU for three years and is used hundreds of times daily to allow students and faculty to use idle workstations. I discovered that the system is used far more for interactive programs than expected. Surprisingly, security attacks involving the Butler system have been quite rare, despite the large student population among its users. A natural class of UNIX applications that can take advantage of idle workstations includes programs consisting of multiple processes communicating via a shared file system. With such applications, the file system becomes a bottleneck for performance. The second part of the thesis examines the performance of a particular file system, the Andrew File System (AFS), developed at CMU. The major tool for the AFS performance analysis is a discrete-event simulation of the file server and its client workstations. The simulation's accuracy is verified by comparison with experiments run on the file system. Experiments show that the model's parameters can be used to construct a simple linear equation model of the server. While this model is not accurate under conditions when resources are nearing exhaustion, it is useful for a wide range of normal operation. Using the simulation, I estimate the effects of various parameters on AFS performance, such as network latency, CPU speed, and disk seek time. In addition, I examine the effects of proposed changes to the system, such as the use of encryption during transmission of file data. The simulation provides a number of insights about the operation of AFS. These include the fact that AFS is very CPU-limited, that it achieves respectable performance while using relatively slow communications primitives, and that it can handle a wide range of workloads without thrashing. The conclusions give more general observations about AFS and the process of constructing its simulator.

12 citations

Cited by

Journal Article

The Google file system

Sanjay Ghemawat¹, Howard Gobioff¹, Shun-Tak Albert Leung¹

Google¹

19 Oct 2003

TL;DR: This paper presents file system interface extensions designed to support distributed applications, discusses many aspects of the design, and reports measurements from both micro-benchmarks and real world use.

Abstract: We have designed and implemented the Google File System, a scalable distributed file system for large distributed data-intensive applications. It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients. While sharing many of the same goals as previous distributed file systems, our design has been driven by observations of our application workloads and technological environment, both current and anticipated, that reflect a marked departure from some earlier file system assumptions. This has led us to reexamine traditional choices and explore radically different design points. The file system has successfully met our storage needs. It is widely deployed within Google as the storage platform for the generation and processing of data used by our service as well as research and development efforts that require large data sets. The largest cluster to date provides hundreds of terabytes of storage across thousands of disks on over a thousand machines, and it is concurrently accessed by hundreds of clients. In this paper, we present file system interface extensions designed to support distributed applications, discuss many aspects of our design, and report measurements from both micro-benchmarks and real world use.

5,429 citations

Proceedings Article

Practical Byzantine fault tolerance

Miguel Castro¹, Barbara Liskov¹

Massachusetts Institute of Technology¹

22 Feb 1999

TL;DR: A new replication algorithm that is able to tolerate Byzantine faults that works in asynchronous environments like the Internet and incorporates several important optimizations that improve the response time of previous algorithms by more than an order of magnitude.

Abstract: This paper describes a new replication algorithm that is able to tolerate Byzantine faults. We believe that Byzantine-fault-tolerant algorithms will be increasingly important in the future because malicious attacks and software errors are increasingly common and can cause faulty nodes to exhibit arbitrary behavior. Whereas previous algorithms assumed a synchronous system or were too slow to be used in practice, the algorithm described in this paper is practical: it works in asynchronous environments like the Internet and incorporates several important optimizations that improve the response time of previous algorithms by more than an order of magnitude. We implemented a Byzantine-fault-tolerant NFS service using our algorithm and measured its performance. The results show that our service is only 3% slower than a standard unreplicated NFS.

3,562 citations

Journal Article

OceanStore: an architecture for global-scale persistent storage

John Kubiatowicz¹, David Bindel¹, Yan Chen¹, Steven E. Czerwinski¹, Patrick Eaton¹, Dennis Geels¹, Ramakrishna Gummadi¹, Sean Rhea¹, Hakim Weatherspoon¹, Westley Weimer¹, Chris Wells¹, Ben Y. Zhao¹

University of California, Berkeley¹

12 Nov 2000

TL;DR: OceanStore monitoring of usage patterns allows adaptation to regional outages and denial of service attacks; monitoring also enhances performance through pro-active movement of data.

Abstract: OceanStore is a utility infrastructure designed to span the globe and provide continuous access to persistent information. Since this infrastructure is comprised of untrusted servers, data is protected through redundancy and cryptographic techniques. To improve performance, data is allowed to be cached anywhere, anytime. Additionally, monitoring of usage patterns allows adaptation to regional outages and denial of service attacks; monitoring also enhances performance through pro-active movement of data. A prototype implementation is currently under development.

3,376 citations

Proceedings Article

Condor - a hunter of idle workstations

M. Litzkow¹, Miron Livny¹, Matt W. Mutka¹

University of Wisconsin-Madison¹

13 Jun 1988

TL;DR: The design, implementation, and performance of the Condor scheduling system, which operates in a workstation environment, are presented and a performance profile of the system is presented that is based on data accumulated from 23 stations during one month.

Abstract: The design, implementation, and performance of the Condor scheduling system, which operates in a workstation environment, are presented. The system aims to maximize the utilization of workstations with as little interference as possible between the jobs it schedules and the activities of the people who own workstations. It identifies idle workstations and schedules background jobs on them. When the owner of a workstation resumes activity at a station, Condor checkpoints the remote job running on the station and transfers it to another workstation. The system guarantees that the job will eventually complete, and that very little, if any, work will be performed more than once. A performance profile of the system is presented that is based on data accumulated from 23 stations during one month.

2,570 citations

Journal Article

Big Data: A Survey

Min Chen¹, Shiwen Mao², Yunhao Liu³

Huazhong University of Science and Technology¹, Auburn University², Tsinghua University³

01 Apr 2014-Mobile Networks and Applications

TL;DR: The background and state-of-the-art of big data are reviewed, including enterprise management, Internet of Things, online social networks, medical applications, collective intelligence, and smart grid, as well as related technologies.

Abstract: In this paper, we review the background and state-of-the-art of big data. We first introduce the general background of big data and review related technologies, such as cloud computing, Internet of Things, data centers, and Hadoop. We then focus on the four phases of the value chain of big data, i.e., data generation, data acquisition, data storage, and data analysis. For each phase, we introduce the general background, discuss the technical challenges, and review the latest advances. We finally examine several representative applications of big data, including enterprise management, Internet of Things, online social networks, medical applications, collective intelligence, and smart grid. These discussions aim to provide a comprehensive overview and big-picture to readers of this exciting area. This survey is concluded with a discussion of open problems and future directions.

2,303 citations
