
Showing papers on "Distributed File System published in 2002"


Journal ArticleDOI
09 Dec 2002
TL;DR: The design of Farsite and the lessons learned by implementing much of that design are reported, including locally caching file data, lazily propagating file updates, and varying the duration and granularity of content leases.
Abstract: Farsite is a secure, scalable file system that logically functions as a centralized file server but is physically distributed among a set of untrusted computers. Farsite provides file availability and reliability through randomized replicated storage; it ensures the secrecy of file contents with cryptographic techniques; it maintains the integrity of file and directory data with a Byzantine-fault-tolerant protocol; it is designed to be scalable by using a distributed hint mechanism and delegation certificates for pathname translations; and it achieves good performance by locally caching file data, lazily propagating file updates, and varying the duration and granularity of content leases. We report on the design of Farsite and the lessons we have learned by implementing much of that design.
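
As a rough illustration of the cached-data-plus-lease idea mentioned above, the Python sketch below shows a client that serves reads from its local cache only while a content lease is valid; the class names, lease durations, and the policy for varying them are invented for illustration and are not Farsite's actual design.

    import time

    class ContentLease:
        def __init__(self, path, granularity="file", duration_s=30.0):
            self.path = path                 # object the lease covers
            self.granularity = granularity   # e.g. "file" or "directory" (invented levels)
            self.expires = time.time() + duration_s

        def valid(self):
            return time.time() < self.expires

    class CachingClient:
        def __init__(self):
            self.cache = {}                  # path -> (data, lease)

        def read(self, path, fetch_from_server):
            entry = self.cache.get(path)
            if entry and entry[1].valid():
                return entry[0]              # serve from the local cache while the lease holds
            data = fetch_from_server(path)   # miss or expired lease: go to the replicated store
            # invented policy: re-fetched files get a longer lease than first-time reads
            duration = 60.0 if entry else 10.0
            self.cache[path] = (data, ContentLease(path, duration_s=duration))
            return data

    client = CachingClient()
    client.read("/shared/notes.txt", fetch_from_server=lambda p: b"contents of " + p.encode())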

1,037 citations


Proceedings ArticleDOI
John R. Douceur, Atul Adya, William J. Bolosky, P. Simon, Marvin M. Theimer
02 Jul 2002
TL;DR: This work presents a mechanism to reclaim space from this incidental duplication to make it available for controlled file replication, and includes convergent encryption, which enables duplicate files to be coalesced into the space of a single file, even if the files are encrypted with different users' keys.
Abstract: The Farsite distributed file system provides availability by replicating each file onto multiple desktop computers. Since this replication consumes significant storage space, it is important to reclaim used space where possible. Measurement of over 500 desktop file systems shows that nearly half of all consumed space is occupied by duplicate files. We present a mechanism to reclaim space from this incidental duplication to make it available for controlled file replication. Our mechanism includes: (1) convergent encryption, which enables duplicate files to be coalesced into the space of a single file, even if the files are encrypted with different users' keys; and (2) SALAD, a Self-Arranging Lossy Associative Database for aggregating file content and location information in a decentralized, scalable, fault-tolerant manner. Large-scale simulation experiments show that the duplicate-file coalescing system is scalable, highly effective, and fault-tolerant.
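
The convergent encryption idea can be illustrated in a few lines: the encryption key is derived from the file contents, so identical plaintexts yield identical ciphertexts even when encrypted by different users. The sketch below uses a toy hash-based XOR keystream purely as a stand-in for a real cipher; it is not the paper's implementation.

    import hashlib

    def convergent_key(plaintext: bytes) -> bytes:
        return hashlib.sha256(plaintext).digest()       # key derived from the content itself

    def keystream(key: bytes, length: int) -> bytes:
        out = bytearray()
        counter = 0
        while len(out) < length:
            out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
            counter += 1
        return bytes(out[:length])

    def encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
        key = convergent_key(plaintext)
        ct = bytes(p ^ k for p, k in zip(plaintext, keystream(key, len(plaintext))))
        return ct, key      # each user then protects `key` with their own user key

    file_contents = b"identical report shared by two users"
    ct1, _ = encrypt(file_contents)   # user 1
    ct2, _ = encrypt(file_contents)   # user 2
    assert ct1 == ct2                 # duplicate ciphertexts can be coalesced into one copy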

690 citations


Journal ArticleDOI
09 Dec 2002
TL;DR: The evaluation demonstrates that Pangaea outperforms existing distributed file systems in large heterogeneous environments, typical of the Internet and of large corporate intranets.
Abstract: Pangaea is a wide-area file system that supports data sharing among a community of widely distributed users. It is built on a symmetrically decentralized infrastructure that consists of commodity computers provided by the end users. Computers act autonomously to serve data to their local users. When possible, they exchange data with nearby peers to improve the system's overall performance, availability, and network economy. This approach is realized by aggressively creating a replica of a file whenever and wherever it is accessed.This paper presents the design, implementation, and evaluation of the Pangaea file system. Pangaea offers efficient, randomized algorithms to manage highly dynamic and potentially large groups of file replicas. It applies optimistic consistency semantics to replica contents, but it also offers stronger guarantees when required by the users. The evaluation demonstrates that Pangaea outperforms existing distributed file systems in large heterogeneous environments, typical of the Internet and of large corporate intranets.
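
A minimal sketch of the replicate-on-access behavior described above: a node that reads a file it does not hold fetches it from a peer and keeps its own copy. The node model and peer-selection policy here are invented for illustration.

    import random

    class Node:
        def __init__(self, name):
            self.name = name
            self.replicas = {}                 # path -> contents held locally

        def read(self, path, peers):
            if path in self.replicas:          # already a replica: serve locally
                return self.replicas[path]
            holders = [p for p in peers if path in p.replicas]
            source = random.choice(holders)    # stand-in for "nearby peer" selection
            self.replicas[path] = source.replicas[path]   # aggressively create a new replica
            return self.replicas[path]

    a, b, c = Node("a"), Node("b"), Node("c")
    a.replicas["/doc"] = b"v1"
    c.read("/doc", peers=[a, b])               # c now holds its own replica of /doc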

182 citations


Patent
Amjad Soomro, Sunghyun Choi
07 Mar 2002
TL;DR: In this paper, dynamic frequency selection for an IEEE 802.11 basic service set network is enabled by a dynamic frequency selection (DFS) element within beacon and probe response frames defining a DFS owner.
Abstract: Dynamic frequency selection for an IEEE 802.11 basic service set network is enabled by a dynamic frequency selection (DFS) element within beacon and probe response frames defining a DFS owner, a DFS interval specifying the time until channel switch in beacon intervals, a DFS count specifying a time in beacon intervals until the DFS owner initiates selection of the next channel frequency from the supported channel set, and a DFS recovery interval specifying a time after the end of the DFS interval when recovery procedures are initiated if no channel switch information was received during that DFS interval. Channel switch information is presented in beacons following the end of the channel selection process, and within beacons during the DFS recovery interval.
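
For readability, the fields that the abstract enumerates can be restated as a simple record; the class below is only a plain-language summary, not the 802.11 information-element wire format.

    from dataclasses import dataclass

    @dataclass
    class DFSElement:
        dfs_owner: str              # station responsible for initiating channel selection
        dfs_interval: int           # beacon intervals until the channel switch
        dfs_count: int              # beacon intervals until the owner starts selecting the next channel
        dfs_recovery_interval: int  # beacon intervals after dfs_interval before recovery procedures start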

129 citations


Proceedings Article
01 Jan 2002

71 citations


Patent
Ravi Kashyap
09 May 2002
TL;DR: In this paper, a distributed file system including a plurality of remote systems is described, and an attempt counter indicates how many attempts have been made to install the transferred file on the receiver site.
Abstract: Disclosed are novel methods and apparatus for persistent queuing in distributed file systems. In an embodiment, an apparatus is disclosed. The apparatus includes a distributed file system including a plurality of remote systems. The plurality of remote systems includes a sender site and a receiver site. The apparatus further includes a local queue accessible by the sender site; a remote queue accessible by the receiver site; a next attempt time indicator; and an attempt counter. The next attempt time indicator may specify a next time to install a transferred file on the receiver site. The attempt counter indicates how many attempts have been made to install the transferred file on the receiver site.
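
A hedged sketch of the queue entry the abstract describes, combining a next-attempt-time indicator with an attempt counter to drive retries of the file installation on the receiver site; the retry policy and names are invented.

    import time

    class QueuedTransfer:
        def __init__(self, filename, max_attempts=5, retry_delay_s=60.0):
            self.filename = filename
            self.attempts = 0                     # attempt counter
            self.next_attempt = time.time()       # next attempt time indicator
            self.max_attempts = max_attempts
            self.retry_delay_s = retry_delay_s

        def try_install(self, install):
            if time.time() < self.next_attempt or self.attempts >= self.max_attempts:
                return False                      # not due yet, or given up
            self.attempts += 1
            try:
                install(self.filename)            # attempt to install on the receiver site
                return True
            except OSError:
                self.next_attempt = time.time() + self.retry_delay_s
                return False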

71 citations


Patent
26 Mar 2002
TL;DR: In this article, a method for responding to a request for a file is proposed, in which a selection server receives the request and selects one of a plurality of content distribution networks based upon predetermined selection criteria.
Abstract: A method for responding to a request for a file, comprising: receiving a request for a file at a selection server, for the selection server to select one of a plurality of content distribution networks based upon predetermined selection criteria, the request being made by a client system to a file server, with the selection server, file server, client system, and content distribution networks all connected to the Internet; and responding to the request by providing the file from the selected content distribution network to the client system.

64 citations


Patent
14 Feb 2002
TL;DR: In this article, a network system capable of preventing the leakage of a confidential file by an inadvertent act of a transmitting party and capable of meeting the requirement for an arbitrary file format is disclosed.
Abstract: A network system capable of preventing the leakage of a confidential file by an inadvertent act of a transmitting party and capable of meeting the requirement for an arbitrary file format is disclosed. A label indicating a security level (“confidential” or “unclassified”) is attached to the file in a client terminal, which transmits the labeled file outside. A transmission management program on a gateway server checks the label of the file, and in the case where the security level is “unclassified”, transmits the file to an external network. Also, a label management program manages the labeled file in the client terminal.
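
The gateway check described above amounts to forwarding a file only when its label is "unclassified"; a toy version, with an invented label format, is shown below.

    def may_transmit(label: str) -> bool:
        # forward only files whose security level is "unclassified"
        return label.lower() == "unclassified"

    outgoing = [("report.doc", "confidential"), ("press_release.txt", "unclassified")]
    for name, label in outgoing:
        print(name, "forwarded" if may_transmit(label) else "blocked at gateway")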

50 citations


Proceedings ArticleDOI
02 Jul 2002
TL;DR: A way to manage distributed file system caches based upon groups of files that are accessed together, using file access patterns to automatically construct dynamic groupings of files and then managing the cache by fetching groups, rather than single files.
Abstract: We describe a way to manage distributed file system caches based upon groups of files that are accessed together. We use file access patterns to automatically construct dynamic groupings of files and then manage our cache by fetching groups, rather than single files. We present experimental results, based on trace-driven workloads, demonstrating that grouping improves cache performance. At the file system client, grouping can reduce LRU demand fetches by 50 to 60%. At the server, cache hit rate improvements are much more pronounced, but vary widely (20 to over 1200%) depending upon the capacity of intervening caches. Our treatment includes information theoretic results that justify our approach to file grouping.
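
A rough sketch of group-based fetching: a miss on any file brings its whole group into the cache. The trivial static grouping below stands in for the paper's access-pattern-driven grouping.

    class GroupingCache:
        def __init__(self, groups):
            self.groups = groups            # file -> set of files observed to be accessed together
            self.cache = {}                 # file -> data

        def read(self, path, fetch):
            if path not in self.cache:                          # demand miss
                for member in self.groups.get(path, {path}):
                    self.cache[member] = fetch(member)          # fetch the whole group, not one file
            return self.cache[path]

    group = frozenset({"a.h", "a.c", "a.o"})
    cache = GroupingCache({f: group for f in group})
    cache.read("a.c", fetch=lambda p: b"data:" + p.encode())    # also brings in a.h and a.o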

50 citations


Patent
29 Apr 2002
TL;DR: In this article, a multiple-application-server architecture for thin-client/server (denoted MAS TC/S) is provided to allow users with thin client devices to roam around a wide area network while experiencing transparent working environment.
Abstract: A multiple-application-server architecture model for thin-client/server (denoted MAS TC/S) is provided to allow users with thin-client devices to roam around a wide area network while experiencing transparent working environment. The MAS TC/S system includes major components of a display protocol, a multiple-application-server network, an application-server discovery protocol and a distributed file system. The application-server discovery protocol identifies the most appropriate application server for a thin-client device to connect to. The distributed file system includes a data-mining-based intelligent prefetching mechanism allowing achieving a working environment with access, location, and mobility transparencies in an efficient way for prompt service.

49 citations


Proceedings Article
28 Jan 2002
TL;DR: Measurements of a Fluid Replication prototype show that update performance is completely independent of wide-area networking costs, at the expense of increased sharing costs, putting the costs of sharing on those who require it, preserving common case performance.
Abstract: As mobile clients travel, their costs to reach home filing services change, with serious performance implications. Current file systems mask these performance problems by reducing the safety of updates, their visibility, or both. This is the result of combining the propagation and notification of updates from clients to servers. Fluid Replication separates these mechanisms. Client updates are shipped to nearby replicas, called WayStations, rather than remote servers, providing inexpensive safety. WayStations and servers periodically exchange knowledge of updates through reconciliation, providing a tight bound on the time until updates are visible. Reconciliation is non-blocking, and update contents are not propagated immediately; propagation is deferred to take advantage of the low incidence of sharing in file systems. Our measurements of a Fluid Replication prototype show that update performance is completely independent of wide-area networking costs, at the expense of increased sharing costs. This places the costs of sharing on those who require it, preserving common case performance. Furthermore, the benefits of independent update outweigh the costs of sharing for a workload with substantial sharing. A trace-based simulation shows that a modest reconciliation interval of 15 seconds can eliminate 98% of all stale accesses. Furthermore, our traced clients could collectively expect availability of five nines, even with deferred propagation of updates.
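
A very small sketch of the separation described above: client updates are accepted immediately by a nearby WayStation, while the WayStation and the server only exchange knowledge of updates at each reconciliation, deferring propagation of contents. All names and structures are illustrative.

    class WayStation:
        def __init__(self):
            self.known_updates = {}                # path -> latest version known here

        def client_update(self, path, version):
            self.known_updates[path] = version     # safe once it reaches the nearby WayStation

        def reconcile(self, server_knowledge):
            # exchange knowledge of updates only; file contents propagate lazily later
            for path, version in self.known_updates.items():
                server_knowledge[path] = max(server_knowledge.get(path, 0), version)

    server_knowledge = {}
    ws = WayStation()
    ws.client_update("/doc", 3)
    ws.reconcile(server_knowledge)                 # run every reconciliation interval (e.g. ~15 s)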

Patent
27 Feb 2002
TL;DR: The widelink directive as mentioned in this paper is similar to the distributed file system (DFS) facility that allows DFS-enabled common internet file system clients to resolve uniform naming convention paths to locations that may or may not be on an original storage system, such as a multi-protocol filer or original protocol server of the filer.
Abstract: A widelink directive provides an enhanced level of indirection with respect to a resource, such as a unit of storage, attached to a destination, such as a storage system. The widelink has a scope of indirection that is “wider” than a conventional symbolic link (“symlink”). The novel widelink directive is similar to the distributed file system (DFS) facility that allows DFS-enabled common internet file system clients to resolve uniform naming convention paths to locations that may or may not be on an original storage system, such as a multi-protocol filer, or original protocol server of the filer. By taking advantage of clients that support the DFS facility, the widelink directive is quite flexible in that it can be used to resolve symlinks that “leave” a share.

Patent
01 Oct 2002
TL;DR: In this article, a distributed file system consisting of a plurality of compute nodes and a plurality of input/output (I/O) nodes connected by an interconnection network is presented. The system is adapted to use a common data representation for both physical and logical partitions of a file stored in the system, and the partitions are linearly addressable.
Abstract: The present invention provides a distributed file system comprising a plurality of compute nodes and a plurality of input/output (I/O) nodes connected by an interconnection network, wherein the system is adapted to use a common data representation for both physical and logical partitions of a file stored in the system and wherein the partitions are linearly addressable. The invention also provides a method of operating a distributed file system comprising a plurality of input/output (I/O) nodes and a plurality of compute nodes, the method comprising the steps of: partitioning a file into a plurality of subfiles distributed across a plurality of I/O nodes; logically partitioning a file by setting a view on it; computing the mapping between the linear space of a file and the linear space of a subfile; computing the intersection between a view and a subfile; and performing data operations.
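
One possible mapping between the linear space of a file and the linear space of a subfile, assuming a simple round-robin striping layout (the patent does not fix this particular layout), is sketched below.

    def file_to_subfile(offset: int, stripe_size: int, n_subfiles: int):
        """Map a byte offset in the file's linear space to (subfile index, subfile offset)."""
        stripe = offset // stripe_size
        subfile = stripe % n_subfiles
        sub_offset = (stripe // n_subfiles) * stripe_size + offset % stripe_size
        return subfile, sub_offset

    # byte 70000 of a file striped in 64 KiB units over 4 subfiles
    print(file_to_subfile(70000, 65536, 4))    # -> (1, 4464)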

Patent
09 Dec 2002
TL;DR: In this paper, the authors present a method and apparatus for a wide area file system, including: creating a peer-to-peer wide-area file system that allows read and write sharing of data.
Abstract: In one embodiment, the invention provides a method and apparatus for a wide area file system, including: creating a peer-to-peer wide-area file system that allows read and write sharing of data. In another embodiment, the invention provides a method and apparatus for a wide area file system, including using per-file replication to achieve high availability and performance in the wide-area distributed file system.

Patent
Laxmikant Gunda, Balaji Narasimhan, Sara Abraham, Shie-Rei Huang, Nagaraj Shyam
17 Dec 2002
TL;DR: In this paper, the authors propose a system and method for distributed file system I/O recovery in storage networks, which can detect loss of access to a server in the storage network and recover application I /O requests in real-time once access to the server is restored.
Abstract: Embodiments of a system and method for distributed file system I/O recovery in storage networks. Embodiments may detect loss of access to a server in the storage network and recover application I/O requests in real-time once access to the server is restored. Embodiments may detect server and/or network failures and store failed and new I/O requests. Recovery from the failure (e.g. network reconnect, server node reboot, or failover, if this is a clustered environment) may be detected and, after recovery is detected, any stored failed and new I/O requests may be sent to the server. In one embodiment, to detect recovery from the failure, a failed I/O request may be repeatedly re-issued until the I/O request succeeds. Embodiments may be implemented in a variety of storage environments, including environments where clients issue direct I/O over a storage network to storage and control I/O over a network to a server.
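
A sketch of the retry-until-success idea mentioned in the abstract: a failed I/O request is re-issued periodically, and its eventual success doubles as detection that access to the server has been restored. Error handling and timing are simplified assumptions.

    import time

    def issue_with_recovery(request, send, poll_interval_s=1.0, max_wait_s=30.0):
        deadline = time.time() + max_wait_s
        while True:
            try:
                return send(request)           # succeeds once server/network access is restored
            except ConnectionError:
                if time.time() >= deadline:
                    raise                      # give up after the (invented) bound
                time.sleep(poll_interval_s)    # keep re-issuing the failed I/O request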

Proceedings ArticleDOI
23 Sep 2002
TL;DR: The software support for sharing of disks in clusters, where the disks are distributed across the nodes in the cluster, thereby allowing them to be combined into a high-performance storage system, is described.
Abstract: In many clusters today, the local disks of a node are only used sporadically. This paper describes the software support for sharing of disks in clusters, where the disks are distributed across the nodes in the cluster, thereby allowing them to be combined into a high-performance storage system. Compared to centralized storage servers, such an architecture allows the total I/O capacity of the cluster to scale up with the number of nodes and disks. Additionally, our software allows customizing the functionality of the remote disk access using a library of code modules. A prototype has been implemented on a cluster connected by a Scalable Coherent Interface (SCI) network and performance measurements using both raw device access and a distributed file system show that the performance is comparable to dedicated storage systems and that the overhead of the framework is moderate even during high load. Thus, the prospects are that clusters sharing disks distributed among the nodes will allow both the application processing power and total I/O capacity of the cluster to scale up with the number of nodes.

01 Jan 2002
TL;DR: A novel mechanism for scaling queries as the network grows large is developed and tested, confirming the intuition that keeping some state about the contents of the rest of the system will aid in searching, as long as acquiring this state is not overly costly and it does not expire too quickly.
Abstract: We have examined the tradeoffs in applying regular and Compressed Bloom filters to the name query problem in distributed file systems and have developed and tested a novel mechanism for scaling queries as the network grows large. Filters greatly reduced query messages when used with Fan's "Summary Cache" in web cache hierarchies [6], a similar, albeit smaller, searching problem. We have implemented a testbed that models a distributed file system and run experiments that test various configurations of the system to see if Bloom filters could provide the same kind of improvements. In a realistic system, where the chance that a randomly queried node holds the file being searched for is low, we show that filters always provide lower bandwidth/search and faster time/search, as long as the rates of change of the files stored at the nodes are not extremely high relative to the number of searches. In other words, we confirm the intuition that keeping some state about the contents of the rest of the system will aid in searching as long as acquiring this state is not overly costly and it does not expire too quickly. The grouping topology we have developed divides nodes into groups, each of which has a representative node that aggregates a composite filter for the group. All nodes not in that group use this low-precision filter to weed out whole collections of nodes, only sending a search to be proxied by a member of the group if the probe of the group filter returns positively. Proxied searches are then carried out within a group, where more precise (more bits per file) filters are kept and exchanged between the nodes in a group. Experimental results show that both bandwidth/search and time/search are improved with this novel grouping topology.
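
The two-level probe described above can be sketched with a small Bloom filter: a coarse per-group filter is checked first, and a search is proxied into the group only on a positive probe. Filter sizes and hash counts below are illustrative.

    import hashlib

    class BloomFilter:
        def __init__(self, n_bits=1024, n_hashes=3):
            self.n_bits, self.n_hashes = n_bits, n_hashes
            self.bits = 0

        def _positions(self, item: str):
            for i in range(self.n_hashes):
                digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
                yield int.from_bytes(digest[:4], "big") % self.n_bits

        def add(self, item: str):
            for pos in self._positions(item):
                self.bits |= 1 << pos

        def probably_contains(self, item: str) -> bool:
            return all(self.bits >> pos & 1 for pos in self._positions(item))

    # a group representative aggregates a coarse (low-precision) filter for its group
    group_filter = BloomFilter(n_bits=256)
    for filename in ("paper.tex", "results.dat"):
        group_filter.add(filename)

    if group_filter.probably_contains("results.dat"):
        print("positive probe: proxy the search to a member of this group")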

Proceedings ArticleDOI
23 Sep 2002
TL;DR: The Trellis File System (Trellis FS) is developed to allow programs to access data files on any file system and on any host on a network that can be named by a Secure Copy Locator (SCL) or a Uniform Resource Locator (URL).
Abstract: A practical problem faced by users of metacomputers and computational grids is: If my computation can move from one system to another, how can I ensure that my data will still be available to my computation? Depending on the level of software, technical, and administrative support available, a data grid or a distributed file system would be reasonable solutions. However, it is not always possible (or practical) to have a diverse group of systems administrators agree to adopt a common infrastructure to support remote data access. Yet, having transparent access to any remote data is an important, practical capability. We have developed the Trellis File System (Trellis FS) to allow programs to access data files on any file system and on any host on a network that can be named by a Secure Copy Locator (SCL) or a Uniform Resource Locator (URL). Without requiring any new protocols or infrastructure, Trellis can be used on practically any POSIX-based system on the Internet. Read access, write access, sparse access, local caching of data, prefetching, and authentication are supported. Trellis is implemented as a user-level C library, which mimics the standard stream I/O functions, and is highly portable. Trellis is not a replacement for traditional file systems or data grids; it provides new capabilities by overlaying on top of other file systems, including grid-based file systems. And, by building upon an already-existing infrastructure (i.e., Secure Shell and Secure Copy), Trellis can be used in situations where a suitable data grid or distributed file system does not yet exist.

Patent
Rod D. Waltermann
24 Apr 2002
TL;DR: In this paper, the storage capability of workstations connected to a LAN is surveyed for storage capability potentially available for sharing, a weighting function is derived for each system which is indicative of shared system storage capability, and data files to be stored are scattered among and gathered from the connected systems.
Abstract: Storage capability otherwise going underutilized in a LAN is made available for sharing among workstations connected to the LAN. Systems connected to a LAN are surveyed for storage capability potentially available for sharing, a weighting function is derived for each system which is indicative of shared system storage capability, and data files to be stored are scattered among and gathered from the connected systems.

Journal Article
TL;DR: In this article, the authors present a high-level model to access via the Web the resources of a distributed file system, based on the Resource Description Framework (RDF) recommendation of the World Wide Web Consortium, a standardized foundation for processing metadata.
Abstract: Modern operating systems must integrate various Internet services, especially World-Wide Web facilities, to access different Web resources using file system mechanisms. In this paper we present a high-level model to access via the Web the resources of a distributed file system. The proposed description is based on the Resource Description Framework (RDF) recommendation of the World-Wide Web Consortium, a standardized foundation for processing metadata.

Patent
Lance W. Russell, Lu Xu
12 Apr 2002
TL;DR: In this article, a distributed file system interface (104a) is coupled to the one or more client application (108a), and a storage server (106) and a meta-data server (102) are coupled to
Abstract: A distributed file system interface (104a) is coupled to the one or more client applications (108a), and a storage server (106) and a meta-data server (102) are coupled to the distributed file system interface The meta-data server receives open-file requests from the distributed file system interface and in response creates a security object (124) The meta-data server also generates an partial encryption key and stores the partial encryption key in the security object The block storage server completes the encryption key, and the meta-data server encrypts the list of blocks that are in the file and stores the encrypted block list in the security object The security object is then returned to the distributed file interface and used in subsequent file access requests

Patent
28 Jun 2002
TL;DR: In this article, dynamic frequency selection for an IEEE 802.11 basic service set network (100, 101, 102, 103) is enabled by a dynamic frequency selection (DFS) element (200) within beacon and probe response frames defining a DFS owner.
Abstract: Dynamic frequency selection for an IEEE 802.11 basic service set network (100, 101, 102, 103) is enabled by a dynamic frequency selection (DFS) element (200) within beacon and probe response frames defining a DFS owner, a DFS interval specifying the time until channel switch in beacon intervals, a DFS count specifying a time in beacon intervals until the DFS owner initiates selection of the next channel frequency from the supported channel set, and a DFS recovery interval specifying a time after the end of the DFS interval when recovery procedures are initiated if no channel switch information was received during that DFS interval. Channel switch information is presented in beacons following the end of the channel selection process, and within beacons during the DFS recovery interval.

Proceedings ArticleDOI
21 May 2002
TL;DR: This paper reports on experiences in designing a portable parallel file system for clusters; the discussion of the file system design and early implementation is completed with basic performance measures confirming the potential of the approach.
Abstract: In this paper, we report on the experiences in designing a portable parallel file system for clusters. The file system offers to the applications an interface compliant with MPI-IO, the I/O interface of the MPI-2 standard. The file system implementation relies upon MPI for internal coordination and communication. This guarantees high performance and portability over a wide range of hardware and software cluster platforms. The internal architecture of the file system has been designed to allow rapid prototyping and experimentation of novel strategies for managing parallel I/O in a cluster environment. The discussion of the file system design and early implementation is completed with basic performance measures confirming the potential of the approach.
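
To show the interface the file system targets, the fragment below uses mpi4py (assumed here purely for illustration; the paper's system is its own MPI-IO implementation) to open a shared file and have each rank write its own region.

    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    fh = MPI.File.Open(comm, "shared.dat", MPI.MODE_CREATE | MPI.MODE_WRONLY)
    record = bytearray(f"hello from rank {rank:04d}\n".encode())
    fh.Write_at(rank * len(record), record)    # each rank writes its own fixed-size region
    fh.Close()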

Journal ArticleDOI
TL;DR: This paper discusses the essential issues in designing cluster file systems and presents solutions to them, with emphasis on COSMOS's novelty; it also points out the bottlenecks of the existing system and proposes methods for improvement.

Journal ArticleDOI
TL;DR: Experimental results from the Slice prototype show that the protocol has low cost in the common case, allowing the system to deliver client file access bandwidths approaching gigabit-per-second network speeds.
Abstract: This paper presents a recovery protocol for block I/O operations in Slice, a storage system architecture for high-speed LANs incorporating network-attached block storage. The goal of the Slice architecture is to provide a network file service with scalable bandwidth and capacity while preserving compatibility with off-the-shelf clients and file server appliances. The Slice prototype virtualizes the Network File System (NFS) protocol by interposing a request switching filter at the client's interface to the network storage system. The distributed Slice architecture separates functions typically combined in central file servers, introducing new challenges for failure atomicity. This paper presents a protocol for atomic file operations and recovery in the Slice architecture, and related support for reliable file storage using mirrored striping. Experimental results from the Slice prototype show that the protocol has low cost in the common case, allowing the system to deliver client file access bandwidths approaching gigabit-per-second network speeds.

Proceedings ArticleDOI
02 Sep 2002
TL;DR: This paper discusses the twin-transaction model (TTM) developed for disconnected operation and proposes a novel way to maintain consistency of the TTM mechanism using a component-based mobility-enabled distributed file system platform Glomar.
Abstract: Efficient support for computational activities in mobile computing environments remains very much a research challenge. This paper discusses the twin-transaction model (TTM) developed for disconnected operation and proposes a novel way to maintain consistency of the TTM mechanism using a component-based mobility-enabled distributed file system platform Glomar. TTM consistency support with Glomar is implemented and demonstrated.

Proceedings ArticleDOI
03 Apr 2002
TL;DR: The assessment based on RAID reliability theory confirmed the feasibility of deploying CoStore on an existing desktop computing infrastructure and preliminary results indicate that CoStore performance is comparable to that of commonly used distributed file systems, such as NFS, Samba, and Windows 2000 Server.
Abstract: CoStore is a serverless distributed file system designed to provide cost-effective storage service utilizing idle disk space on workstation clusters. With all the system responsibilities evenly distributed across a group of collaborating workstations, the proposed architecture provides improved performance, reliability, and scalability. We have collected workstation uptime data from production systems. The assessment based on RAID reliability theory confirmed the feasibility of deploying CoStore on an existing desktop computing infrastructure. We have implemented a CoStore prototype and measured its performance. Preliminary results indicate that CoStore performance is comparable to that of commonly used distributed file systems, such as NFS, Samba, and Windows 2000 Server.

Proceedings ArticleDOI
07 Jan 2002
TL;DR: A new web-based Distributed File System server management tool is presented that performs load balancing across multiple servers, based on rule-based data mining techniques and graph analysis algorithms.
Abstract: In this paper we present a new web-based Distributed File System server management tool to perform load balancing across multiple servers. The Distributed File System from the Distributed Computing Environment (DCE DFS) is a collection of many file systems mounted onto a single virtual file system space with a single namespace. The tool is based on rule-based data mining techniques and graph analysis algorithms. The data mining procedures identify DFS file access patterns, and the graph analysis and statistical information relocate the filesets between different file servers. We demonstrate our tool on data collected over five months on DFS servers in a production environment. Experiments with this data show that our load balancing tool is useful to file system administrators for monitoring and evaluating DFS state and for making intelligent decisions about file system transfers in order to balance the access request load on "read-write" filesets across DFS servers.

Journal ArticleDOI
TL;DR: This paper describes the design and implementation of the various components of the PODOS system, which addresses the growing demand for performance in the distributed computing environment.

Patent
13 Nov 2002
TL;DR: In this paper, a serverless distributed file system manages the storage of files and directories using one or more directory groups, where the directories may be managed using Byzantine-fault-tolerant groups, whereas files are managed without using Byzantine fault tolerant groups.
Abstract: A serverless distributed file system manages the storage of files and directories using one or more directory groups. The directories may be managed using Byzantine-fault-tolerant groups, whereas files are managed without using Byzantine-fault-tolerant groups. Additionally, the file system may employ a hierarchical namespace to store files. Furthermore, the directory group may employ a plurality of locks to control access to objects (e.g., files and directories) in each directory.