Proceedings Article

BFO: Batch-File Operations on Massive Files for Consistent Performance Improvement

TLDR
A novel batch-file access approach is proposed, referred to as BFO for its set of optimized Batch-File Operations, by developing novel BFOr and BFOw operations for fundamental read and write processes respectively, using a two-phase access for metadata and data jointly.
Abstract
Existing local file systems are designed around a typical single-file access pattern, which can lead to poor performance when accessing a batch of files, especially small files. This single-file pattern essentially serializes accesses to batched files one by one, resulting in a large number of non-sequential, random, and often dependent I/Os between file data and metadata at the storage end. We first experimentally analyze the root cause of this inefficiency in batch-file accesses. We then propose a novel batch-file access approach, referred to as BFO for its set of optimized Batch-File Operations, which introduces BFOr and BFOw operations for the fundamental read and write processes respectively, using a two-phase access for metadata and data jointly. BFO offers dedicated interfaces for batch-file accesses, along with additional processes that are integrated into existing file systems without modifying their structures and procedures. We implement a BFO prototype on ext4, one of the most popular file systems. Our evaluation results show that the batch-file read and write performance of BFO is consistently higher than that of traditional approaches regardless of access patterns, data layouts, and storage media, with both synthetic and real-world file sets. BFO improves read performance by up to 22.4× and 1.8× on HDD and SSD respectively, and boosts write performance by up to 111.4× and 2.9× on HDD and SSD respectively. BFO also demonstrates consistent performance advantages when applied to four representative applications: Linux cp, Tar, GridFTP, and Hadoop.
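As a rough illustration of the two-phase idea mentioned in the abstract, the sketch below separates a batch read into a metadata pass and a data pass. It is a user-space approximation only, not the BFOr operation or the ext4 prototype described in the paper. The function name `batch_read` and the use of inode order as a stand-in for on-disk placement are assumptions made for this example.

```python
import os


def batch_read(paths):
    """Two-phase batch read (hypothetical user-space approximation).

    Phase 1 touches only metadata: stat every file in the batch.
    Phase 2 touches only data: read the files in ascending inode order,
    which is an assumed, rough proxy for their on-disk placement.
    """
    # Phase 1: collect metadata for the whole batch before reading any data.
    meta = [(os.stat(p), p) for p in paths]

    # Plan the data phase: group by device, then order by inode number.
    meta.sort(key=lambda item: (item[0].st_dev, item[0].st_ino))

    # Phase 2: read file contents in the planned order.
    contents = {}
    for _stat_result, path in meta:
        with open(path, "rb") as f:
            contents[path] = f.read()
    return contents
```

Calling `batch_read(list_of_paths)` returns a dictionary that maps each path to its bytes. The point of the sketch is the ordering: all metadata lookups are issued together, and data reads follow a single planned order instead of interleaving metadata and data accesses file by file. Whether inode order actually matches physical layout depends on the file system and its allocation history.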

