GPUs as Storage System Accelerators
TLDR
The design of a storage system prototype that uses GPU offloading to accelerate a number of computationally intensive primitives based on hashing, and techniques to efficiently leverage the processing power of GPUs are presented.
Abstract:
Massively multicore processors, such as graphics processing units (GPUs), provide, at a comparable price, an order of magnitude higher peak performance than traditional CPUs. This drop in the cost of computation, like any order-of-magnitude drop in the cost per unit of performance for a class of system components, creates an opportunity to redesign systems and to explore new ways of engineering them that recalibrate the cost-to-performance relation. This project explores the feasibility of harnessing GPUs' computational power to improve the performance, reliability, or security of distributed storage systems. In this context, we present the design of a storage system prototype that uses GPU offloading to accelerate a number of computationally intensive primitives based on hashing, and introduce techniques to efficiently leverage the processing power of GPUs. We evaluate the performance of this prototype under two configurations: as a content-addressable storage system that facilitates online similarity detection between successive versions of the same file, and as a traditional system that uses hashing to preserve data integrity. Further, we evaluate the impact of offloading to the GPU on the performance of competing applications. Our results show that this technique can bring tangible performance gains without negatively impacting the performance of concurrently running applications.
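The two hashing-based configurations named in the abstract can be illustrated with a minimal CPU-only sketch. It is not the prototype's implementation and performs no GPU offloading; it only shows the kind of per-block hashing work that such a system might offload. The fixed-size blocking, the choice of SHA-1, the tiny block size, and the function names `block_hashes`, `similarity`, and `verify` are all assumptions made for illustration.

```python
import hashlib
from typing import List

# Hypothetical block size, kept tiny so the example is easy to follow.
BLOCK_SIZE = 4


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Split data into fixed-size blocks and hash each block (assumed scheme)."""
    return [
        hashlib.sha1(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def similarity(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> float:
    """Fraction of blocks in the new version whose hash already exists in the old one."""
    old_set = set(block_hashes(old, block_size))
    new_hashes = block_hashes(new, block_size)
    if not new_hashes:
        return 1.0  # an empty new version has nothing new to store
    shared = sum(1 for h in new_hashes if h in old_set)
    return shared / len(new_hashes)


def verify(data: bytes, expected: List[str], block_size: int = BLOCK_SIZE) -> bool:
    """Integrity check: recompute the block hashes and compare with the stored ones."""
    return block_hashes(data, block_size) == expected


if __name__ == "__main__":
    v1 = b"aaaabbbbcccc"
    v2 = b"aaaaXXXXcccc"  # same file with the middle block modified
    print(similarity(v1, v2))             # share of v2 blocks already stored
    print(verify(v1, block_hashes(v1)))   # unmodified data passes
    print(verify(v2, block_hashes(v1)))   # modified data fails
```

In a content-addressable store, blocks whose hashes are already known need not be stored or transferred again, which is why the hashing throughput matters; each block is hashed independently of the others, which is what makes this workload a plausible candidate for a massively parallel processor.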