
Showing papers on "Distributed File System published in 1991"


Proceedings ArticleDOI
01 Sep 1991
TL;DR: This work analyzed the user-level file access patterns and caching behavior of the Sprite distributed file system and found that client cache consistency is needed to prevent stale data errors, but that it is not invoked often enough to degrade overall system performance.
Abstract: We analyzed the user-level file access patterns and caching behavior of the Sprite distributed file system. The first part of our analysis repeated a study done in 1985 of the BSD UNIX file system. We found that file throughput has increased by a factor of 20 to an average of 8 Kbytes per second per active user over 10-minute intervals, and that the use of process migration for load sharing increased burst rates by another factor of six. Also, many more very large (multi-megabyte) files are in use today than in 1985. The second part of our analysis measured the behavior of Sprite's main-memory file caches. Client-level caches average about 7 Mbytes in size (about one-quarter to one-third of main memory) and filter out about 50% of the traffic between clients and servers. 35% of the remaining server traffic is caused by paging, even on workstations with large memories. We found that client cache consistency is needed to prevent stale data errors, but that it is not invoked often enough to degrade overall system performance.
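As an illustration of the kind of client caching the abstract measures, here is a minimal sketch of an LRU block cache that absorbs repeated reads before they reach the file server; the capacity, block granularity, and access trace are assumptions for the example, not Sprite's actual parameters or code.

```python
from collections import OrderedDict

class ClientBlockCache:
    """Illustrative LRU block cache; counts how much read traffic it absorbs."""
    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()          # (file_id, block_no) -> data
        self.hits = 0
        self.misses = 0

    def read(self, file_id, block_no, fetch_from_server):
        key = (file_id, block_no)
        if key in self.blocks:
            self.blocks.move_to_end(key)     # refresh LRU position
            self.hits += 1
            return self.blocks[key]
        self.misses += 1                     # this read goes to the server
        data = fetch_from_server(file_id, block_no)
        self.blocks[key] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return data

    def filtered_fraction(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

# Usage: replay a synthetic trace and report the fraction of reads absorbed.
cache = ClientBlockCache(capacity_blocks=1024)
for i in range(10000):
    cache.read(file_id=1, block_no=i % 2048, fetch_from_server=lambda f, b: b"...")
print(f"traffic filtered by client cache: {cache.filtered_fraction():.0%}")
```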

593 citations


Patent
Kazutoshi Yamada
12 Sep 1991
TL;DR: In this article, a data processing system with data transmission failure recovery measures includes a plurality of first and second POS terminals, and each POS terminal is connected to two transmission lines.
Abstract: A data processing system with data transmission failure recovery measures includes a plurality of first POS terminals and a plurality of second POS terminals. Each POS terminal is connected to two transmission lines. A first file server is connected to one of the transmission lines. A second file server, whose stored contents are identical to those of the first file server, is connected to the other transmission line. During normal transmission, the first POS terminals access the first file server and the second POS terminals access the second file server. This permits incorporation of a plurality of files and allows a large number of POS terminals to be installed while alleviating access concentration on the files. Because the transmission lines and files are multiplexed, data processing including data transmission can be continued by switching away from the normal transmission line or file when a transmission failure occurs, thereby avoiding system downtime. The two file servers are connected to each other via a third transmission line, over which the stored contents of one file are updated to match updates made to the stored contents of the other. Thus, after updating, the stored contents of the two files are always identical and the same data can be obtained regardless of which file is accessed.
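A hedged sketch of the failover behaviour the patent describes: each POS terminal normally uses the file server on its own transmission line and switches to the mirrored server on the other line when transmission fails. The class and method names below are hypothetical.

```python
class TransmissionError(Exception):
    """Raised when a transmission line or file server is unreachable."""

class PosTerminal:
    def __init__(self, primary_server, backup_server):
        self.primary = primary_server    # file server on this terminal's normal line
        self.backup = backup_server      # mirrored server on the other transmission line

    def read_record(self, key):
        # Normal operation: use the primary server so load stays spread
        # across the two mirrored file servers.
        try:
            return self.primary.read(key)
        except TransmissionError:
            # Transmission failure: switch to the backup line/server, whose
            # stored contents are kept identical to the primary's.
            return self.backup.read(key)
```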

71 citations


Journal ArticleDOI
TL;DR: A new model for replication in distributed systems that combines the advantages of modular redundancy and primary‐stand‐by approaches to give more flexibility with respect to system configuration is introduced.
Abstract: SUMMARY We introduce a new model for replication in distributed systems. The primary motivation for replication lies in fault tolerance. Although there are different kinds of replication approaches, our model combines the advantages of modular redundancy and primary-stand-by approaches to give more flexibility with respect to system configuration. To implement such a model, we select the IBM PC-net with MS-DOS environment as our base. Transparency as well as fault-tolerant file access are the highlights of our system design. To fulfil these requirements, we incorporate the idea of directory-oriented replication and extended prefix tables in the system design. The implementation consists of a command shell, a DOS manager, and a recovery manager. Through this design, we can simulate a UNIX-like distributed file system whose function is compatible with MS-DOS.
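The abstract's "extended prefix tables" suggest a table that maps path prefixes onto the servers replicating a directory. The following is a minimal sketch under that assumption; the class, server names, and lookup policy are illustrative, not the paper's implementation.

```python
class PrefixTable:
    """Maps the longest matching path prefix to the servers replicating it."""
    def __init__(self):
        self.entries = {}                     # prefix -> list of server names

    def add(self, prefix, servers):
        self.entries[prefix.rstrip("/")] = list(servers)

    def lookup(self, path):
        best = ""
        for prefix in self.entries:
            if path == prefix or path.startswith(prefix + "/"):
                if len(prefix) > len(best):
                    best = prefix             # keep the most specific match
        if not best:
            raise KeyError(f"no server known for {path}")
        return self.entries[best]             # replicas to try, e.g. primary first

table = PrefixTable()
table.add("/shared/docs", ["serverA", "serverB"])   # directory-oriented replication
table.add("/shared", ["serverA"])
print(table.lookup("/shared/docs/report.txt"))      # -> ['serverA', 'serverB']
```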

42 citations


Book ChapterDOI
02 Jan 1991
TL;DR: In this article, two basic complexity measures for distributed algorithms, time and message complexity, are defined: time complexity measures the time needed both for message transmission and for computation within the processes; the most common measure of message complexity is the total number of messages transmitted.
Abstract: Publisher Summary Distributed computing is an activity that is performed on a spatially distributed system. An important problem in distributed computing is to provide a user with a non-distributed view of a distributed system, for example by implementing a distributed file system that allows the client programmer to ignore the physical location of his data. The models of computation generally considered to be distributed are process models, in which the computational activity is represented as the concurrent execution of sequential processes. Different process models are distinguished by the mechanism employed for inter-process communication. The process models that are most distributed are the ones in which processes communicate by message passing. A process sends a message by adding it to a message queue, and another process receives the message by removing it from the queue. There are two basic complexity measures for distributed algorithms: time and message complexity. The time complexity of an algorithm measures the time needed both for message transmission and for computation within the processes. The most common measure of message complexity is the total number of messages transmitted. If messages contain on the order of a few hundred bits or more, then the total number of bits sent might be a better measure of the cost than the number of messages.
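The two measures can be made concrete on a toy algorithm. The sketch below floods a message through a process graph, counting every transmitted message (message complexity) and every synchronous round (a crude proxy for time complexity); it is a generic illustration, not taken from the chapter.

```python
from collections import deque

def flood(adjacency, start):
    """Broadcast by flooding; returns (rounds, messages_sent)."""
    informed = {start}
    frontier = deque([start])
    messages = 0
    rounds = 0
    while frontier:
        rounds += 1                      # one synchronous round of the algorithm
        next_frontier = deque()
        for p in frontier:
            for q in adjacency[p]:
                messages += 1            # every send counts toward message complexity
                if q not in informed:
                    informed.add(q)
                    next_frontier.append(q)
        frontier = next_frontier
    return rounds, messages

# A 4-process ring: rounds grow with the network diameter,
# messages with the number of edges traversed.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(flood(ring, start=0))   # -> (3, 8)
```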

39 citations


Journal ArticleDOI
TL;DR: This work aims to reduce file server recovery times to less than 90 seconds by taking advantage of the distributed state already present in the file system and of a high-performance log-structured file system currently under implementation.
Abstract: In the Sprite environment, tolerating faults means recovering from them quickly. Our position is that performance and availability are the desired features of the typical locally-distributed office/engineering environment, and that very fast server recovery is the most cost-effective way of providing such availability. Mechanisms used for reliability can be inappropriate in systems with the primary goal of performance, and some availability-oriented methods using replicated hardware or processes cost too much for these systems. In contrast, availability via fast recovery need not slow down a system, and our experience in Sprite shows that in some cases the same techniques that provide high performance also provide fast recovery. In our first attempt to reduce file server recovery times to less than 90 seconds, we take advantage of the distributed state already present in our file system, and a high-performance log-structured file system currently under implementation. As a long-term goal, we hope to reduce recovery to 10 seconds or less.
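One way to exploit state that clients already hold, in the spirit of the abstract, is for a rebooted server to rebuild its open-file and caching tables by polling clients instead of reconstructing everything from disk. The sketch below assumes a hypothetical per-client reopen_state RPC and is not Sprite's actual recovery code.

```python
def recover_server_state(clients, timeout_seconds=90):
    """Rebuild open-file and caching state by polling clients after a reboot."""
    open_files = {}      # file_id -> set of client ids with the file open
    dirty_blocks = {}    # file_id -> client holding the newest cached data
    for client in clients:
        try:
            # Hypothetical RPC: each client reports the per-file state it holds.
            report = client.reopen_state(timeout=timeout_seconds)
        except TimeoutError:
            continue                         # a silent client simply loses its handles
        for entry in report:
            open_files.setdefault(entry["file_id"], set()).add(client.id)
            if entry["has_dirty_blocks"]:
                dirty_blocks[entry["file_id"]] = client.id
    return open_files, dirty_blocks
```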

31 citations


Proceedings ArticleDOI
20 May 1991
TL;DR: The comparison shows that replicated servers are more flexible and tolerant of a wider variety of faults and the dual-ported disks approach is more efficient and simpler to implement.
Abstract: Several existing distributed file systems provide reliability by server replication. An alternative approach is to use dual-ported disks accessible to a server and a backup. The two approaches are compared by examining an example of each. Deceit is a replicated file server that emphasizes flexibility. HA-NFS is an example of the second approach that emphasizes efficiency and simplicity. The two file servers run on the same hardware and implement Sun's NFS protocol. The comparison shows that replicated servers are more flexible and tolerant of a wider variety of faults. On the other hand, the dual-ported disks approach is more efficient and simpler to implement. When tolerating a single failure, dual-ported disks also give somewhat better availability.
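A rough sketch of the dual-ported-disk style of recovery that HA-NFS represents: a backup server watches the primary's heartbeat and, on failure, claims the second port of the shared disk and serves the same file system. All object names and the timeout are assumptions for illustration, not HA-NFS code.

```python
def fail_over(primary_heartbeat_age, shared_disk, backup_server, limit_seconds=5.0):
    """Backup takes over the dual-ported disk when the primary stops responding."""
    if primary_heartbeat_age > limit_seconds:
        shared_disk.fence(owner="primary")         # keep the failed server off the disk
        shared_disk.reserve(owner="backup")        # claim the second port of the same disk
        backup_server.replay_journal(shared_disk)  # bring the file system to a consistent state
        backup_server.serve_nfs(shared_disk)       # clients see the same files as before
```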

29 citations


01 Jan 1991
TL;DR: The Ficus mechanism permits optimistic management of the volume location data by exploiting the existing directory reconciliation algorithms which merge directory updates made during network partition.
Abstract: Existing techniques to provide name transparency in distributed file systems have been designed for modest scale systems, and do not readily extend to very large configurations. This paper details the approach which is now operational in the Ficus replicated Unix filing environment, and shows how it differs from other methods currently in use. The Ficus mechanism permits optimistic management of the volume location data by exploiting the existing directory reconciliation algorithms which merge directory updates made during network partition.
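A simplified sketch of the kind of directory reconciliation the abstract leans on: merging the entries two replicas accumulated during a partition while flagging name conflicts rather than discarding either side's updates. This illustrates the general technique only, not Ficus's algorithms.

```python
def reconcile_directories(replica_a, replica_b):
    """Merge two replicas' directory contents; each maps name -> file identifier."""
    merged, conflicts = {}, []
    for name in set(replica_a) | set(replica_b):
        in_a, in_b = name in replica_a, name in replica_b
        if in_a and in_b and replica_a[name] != replica_b[name]:
            conflicts.append(name)           # same name bound to different files
            merged[name] = replica_a[name]   # keep one binding; flag it for the user
        else:
            merged[name] = replica_a.get(name, replica_b.get(name))
    return merged, conflicts

# Updates made on either side of a partition survive the merge.
a = {"notes.txt": "id-1", "plan.txt": "id-2"}
b = {"notes.txt": "id-1", "todo.txt": "id-3"}
print(reconcile_directories(a, b))   # plan.txt and todo.txt both kept, no conflicts
```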

25 citations



Journal ArticleDOI
TL;DR: An implementation and performance evaluation of file replication in a locally distributed system, in which the copies of a replicated file are updated using both sequential and concurrent update methods.
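To make the two update methods concrete, here is a small sketch contrasting sequential propagation (one replica at a time) with concurrent propagation (all replicas in parallel); the replica objects and their write method are assumed for the example, not taken from the paper.

```python
import threading

def update_sequential(replicas, block, data):
    """Sequential update: write each copy in turn; latency grows with the replica count."""
    for replica in replicas:
        replica.write(block, data)

def update_concurrent(replicas, block, data):
    """Concurrent update: issue all writes in parallel and wait for them to finish."""
    threads = [threading.Thread(target=r.write, args=(block, data)) for r in replicas]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```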

14 citations


Proceedings ArticleDOI
18 Apr 1991
TL;DR: OS/2 and SNA demonstrated their viability as platforms for distributed applications by providing competent support for this performance-critical software, but substantial modifications to the AFS structure were necessary to achieve an efficient OS/2 implementation that preserved the AFS file system interface and semantics.
Abstract: Porting a distributed application from one environment to another can be a significant undertaking, particularly when performance is an important consideration. The Andrew File System (AFS) is a distributed file system designed to be heterogeneous and scalable, and it runs efficiently on variations of Unix. A port of AFS to Operating System/2 (OS/2) encountered an assortment of problems at various levels. The port, performed as a sequence of two ports, first investigated feasibility and performance issues and then integrated AFS into the OS/2 environment. Additionally, the migration from the original AFS network protocol, the User Datagram Protocol over the Internet Protocol (UDP/IP), to Systems Network Architecture (SNA) posed challenges, not the least of which was the change from a connectionless to a connection-oriented protocol. OS/2 and SNA demonstrated their viability as platforms for distributed applications by providing competent support for this performance-critical software, but substantial modifications to the AFS structure were necessary to achieve an efficient OS/2 implementation that preserved the AFS file system interface and semantics.

6 citations


Proceedings ArticleDOI
30 Apr 1991
TL;DR: The authors study the improvement in performance obtained in distributed memory machines through the use of a separate network that serves multiple I/O nodes operating under a distributed file system, and find that wormhole routing is efficient only for low network loads, and that its performance degrades rapidly even with moderate channel utilisation.
Abstract: The authors study the improvement in performance obtained in distributed memory machines through the use of a separate network that serves multiple I/O nodes operating under a distributed file system. For a hypercube architecture augmented by an independent I/O network that scales as N/log N, a significant improvement in performance is observed, especially when data locality is low. Moreover, performance becomes relatively insensitive to data locality. From the hardware aspect, wormhole routing is efficient only for low network loads, and its performance degrades rapidly even with moderate channel utilisation. Simulation results are presented for a 128-node hypercube.
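A quick back-of-the-envelope reading of the N/log N scaling rule for the simulated configuration; the rounding choice is ours, not the paper's.

```python
import math

def io_nodes(n_compute):
    """Number of I/O nodes when the I/O network scales as N / log2(N)."""
    return math.ceil(n_compute / math.log2(n_compute))

print(io_nodes(128))   # 128 / log2(128) = 128 / 7 ~= 18.3, rounded up to 19
```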

Patent
10 May 1991
TL;DR: In this article, a virtual document in which link information is arranged as pointers is constructed in the work area of main memory, and the document is registered and stored as a single file in a link information area.
Abstract: PURPOSE: To avoid unnecessary duplicate storage and to make document construction easy by virtually constructing a new document using only link information that indicates the file attributes of document files. CONSTITUTION: A document constituting device 10 (workstation) constructs a desired document (a manual or the like) from the document files distributed and stored on file servers 11 and 12, while querying a resource information managing device 17. A work area control part 106 performs the prescribed control in accordance with commands from a document constitution control part 102, and the document constructed in the work area of a main memory 105 (a virtual document in which link information is arranged as pointers) is registered and stored as a single file in the link information area 103a of a file part 103. The document is thus easily constructed from the individual distributed document files, and unnecessary duplication of those files is avoided, so that documents are stored efficiently.
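A hedged sketch of the "virtual document" idea: rather than copying the distributed document files, the new document stores only link records pointing at files on their servers and resolves them when the text is actually needed. All class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Link:
    server: str      # file server holding the original document file
    file_id: str     # identifier of the file on that server

class VirtualDocument:
    """A document built only from links; the source files are never duplicated."""
    def __init__(self):
        self.links = []                     # ordered pointers into distributed files

    def append(self, server, file_id):
        self.links.append(Link(server, file_id))

    def materialize(self, fetch):
        # fetch(server, file_id) -> str is supplied by the file-server client code.
        return "\n".join(fetch(link.server, link.file_id) for link in self.links)

manual = VirtualDocument()
manual.append("file-server-11", "intro.doc")
manual.append("file-server-12", "chapter1.doc")
# Only the link records would be saved as the registered file.
```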

Proceedings ArticleDOI
J.D. Shiers
07 Oct 1991
TL;DR: To cope with the large quantities of data produced in high energy physics, CERN has developed a system for the management of, and access to, data in a fully distributed environment, known as FATMEN, which provides a worldwide distributed file catalog and offers system- and medium-independent access to data.
Abstract: To cope with the large quantities of data produced in high energy physics, CERN has developed a system for the management of, and access to, data in a fully distributed environment. The principal user interface is via a package known as FATMEN (file and tape management: experimental needs), which provides a worldwide distributed file catalog and offers system- and medium-independent access to data. The software runs on a large variety of platforms, including VM/CMS, MVS, VAX/VMS, and UNIX systems. TCP/IP, DECnet, and Bitnet networks are currently supported for the transfer of catalog updates. Particular attention is given to the FATMEN catalogs, the FATMEN naming scheme, access to data, migration, and security and reliability.
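A minimal sketch of what a system- and medium-independent catalog lookup can look like: a generic dataset name maps to candidate physical copies on different hosts and media, and the caller picks one it can reach. The entries, name format, and field names below are invented for illustration and are not the FATMEN interface or naming scheme.

```python
catalog = {
    # generic name -> candidate physical copies on different systems and media
    "//cern/expt/run1042/raw": [
        {"host": "vxcrna", "medium": "tape", "location": "VOL=XY1234 FSEQ=7"},
        {"host": "cernvm", "medium": "disk", "location": "RUN1042 RAW A"},
    ],
}

def locate(generic_name, reachable_hosts):
    """Return the first copy of a dataset on a host we can currently reach."""
    for copy in catalog.get(generic_name, []):
        if copy["host"] in reachable_hosts:
            return copy
    raise LookupError(f"no reachable copy of {generic_name}")

print(locate("//cern/expt/run1042/raw", reachable_hosts={"cernvm"}))
```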

Journal ArticleDOI
TL;DR: The use of a hybrid between the available copies method and voting with witnesses to maintain consistency in a replicated file system is investigated, using a stochastic Petri net model.
Abstract: The use of a hybrid between the available copies method and voting with witnesses to maintain consistency in a replicated file system is investigated. In such a system, the available copies method is augmented with witnesses, and a simple static voting algorithm is used. High levels of availability are possible with only two copies, with the added advantage that the consistency of the file system is maintained even if the network is partitioned. The transformation of copies into witnesses and vice versa, as needed, is discussed. The system is analysed via a stochastic Petri net model.
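A sketch of static voting in which witnesses carry a version number and a vote but no data: an operation needs a majority of votes, and a read additionally needs an up-to-date true copy. This illustrates the general hybrid idea, not the paper's exact protocol or its Petri net model.

```python
from dataclasses import dataclass

@dataclass
class Replica:
    is_witness: bool      # witnesses vote and track the version, but hold no data
    version: int
    available: bool

def can_read(replicas):
    """Read quorum: a majority of votes plus a current, non-witness copy."""
    reachable = [r for r in replicas if r.available]
    if len(reachable) * 2 <= len(replicas):
        return False                                  # no majority of votes
    newest = max(r.version for r in reachable)
    return any(not r.is_witness and r.version == newest for r in reachable)

# Two copies plus one witness: one copy may be down and reads still succeed.
group = [Replica(False, 4, True), Replica(False, 4, False), Replica(True, 4, True)]
print(can_read(group))   # True: majority (2 of 3 votes) and an up-to-date real copy
```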
