Open Access Proceedings Article

Applicability of GPU Computing for Efficient Merge in In-Memory Databases

TL;DR
It is found that the maximum potential merge speedup is limited, since only two of its four stages are likely to benefit from parallelization. The paper therefore presents a parallel dictionary slice merge algorithm as well as an alternative parallel merge algorithm for GPUs that achieves up to 40% more throughput than its CPU implementation.
Abstract
Column-oriented in-memory databases typically use dictionary compression to reduce overall storage space and to allow fast lookup and comparison. However, updates carry a high performance cost, since the dictionary used for compression has to be recreated each time records are created, updated, or deleted. This matters for TPC-C-like workloads, in which around 45% of all queries are transactional modifications. A technique called differential updates can be used to allow faster modifications: in addition to the main storage, the database maintains a delta storage that accommodates modifying queries. During the merge process, the modifications in the delta are merged into the main storage in parallel to the normal operation of the database. Current hardware and software trends suggest that this problem can be tackled by massively parallelizing the merge process. One approach to massive parallelism is the GPU, which offers orders of magnitude more cores than modern CPUs. We therefore analyze the feasibility of a parallel GPU merge implementation and its potential speedup. We found that the maximum potential merge speedup is limited, since only two of its four stages are likely to benefit from parallelization. We present a parallel dictionary slice merge algorithm as well as an alternative parallel merge algorithm for GPUs that achieves up to 40% more throughput than its CPU implementation. In addition, we propose a parallel duplicate removal algorithm that achieves up to 27 times the throughput of the CPU implementation.
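The differential-update scheme described in the abstract can be illustrated with a minimal, sequential sketch. This is not the paper's GPU implementation; the class and method names (`Column`, `insert`, `merge`) are illustrative assumptions. It shows the essential structure: a sorted, duplicate-free dictionary plus an integer attribute vector as the main storage, an uncompressed delta for modifications, and a merge step that rebuilds the dictionary, removes duplicates, and remaps all value IDs.

```python
class Column:
    """Dictionary-compressed column with a write-optimized delta storage.

    Main storage: a sorted, duplicate-free dictionary plus a vector of
    integer value IDs pointing into it. Delta storage: raw appended values.
    (Illustrative sketch only; names and layout are assumptions, not the
    paper's actual data structures.)
    """

    def __init__(self, values):
        self.dictionary = sorted(set(values))            # sorted, no duplicates
        index = {v: i for i, v in enumerate(self.dictionary)}
        self.value_ids = [index[v] for v in values]      # compressed attribute vector
        self.delta = []                                  # uncompressed delta storage

    def insert(self, value):
        # Modifications go to the delta; the main dictionary stays untouched,
        # so inserts avoid the cost of rebuilding the dictionary each time.
        self.delta.append(value)

    def merge(self):
        # Merge process: build the new dictionary from main + delta values
        # (duplicate removal happens here), then remap every value ID
        # against the new dictionary. In the paper, stages like this
        # remapping are the candidates for massive parallelization.
        merged = sorted(set(self.dictionary) | set(self.delta))
        index = {v: i for i, v in enumerate(merged)}
        old = self.dictionary
        self.value_ids = ([index[old[i]] for i in self.value_ids]
                          + [index[v] for v in self.delta])
        self.dictionary, self.delta = merged, []
```

Reading a value is then just `column.dictionary[column.value_ids[row]]`; the merge leaves the logical contents unchanged while restoring a single compressed main storage.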


Citations
Proceedings Article

A Memory Bandwidth-Efficient Hybrid Radix Sort on GPUs

TL;DR: In this paper, a pipelined heterogeneous sorting algorithm for GPUs was proposed, achieving a 2.32-fold improvement over the state-of-the-art CPU-based radix sort running 16 threads.
Journal Article

Efficient co-processor utilization in database query processing

TL;DR: This paper presents a framework that automatically learns and adapts execution models for arbitrary algorithms on any (co-)processor, and uses these execution models to distribute a workload of database operators across the available (co-)processing devices.
Proceedings Article

A GPU-based index to support interactive spatio-temporal queries over historical data

TL;DR: The results show that the GPU-based index obtains interactive, sub-second response times for queries over large data sets and leads to at least two orders of magnitude speedup over spatial indexes implemented in existing open-source and commercial database systems.
Book Chapter

Automatic selection of processing units for coprocessing in databases

TL;DR: This paper presents a framework that automatically learns and adapts execution models for arbitrary algorithms on any (co)processor to find break-even points and support scheduling decisions and demonstrates its applicability for three common use cases in modern database systems.
Patent

Hash Table and Radix Sort Based Aggregation

TL;DR: Aggregation in an in-memory database includes: receiving, by at least one processor having a plurality of threads, input having records stored in random access memory; distributing, by the at least one processor, the input into portions, one of the plurality of threads having an assigned portion; aggregating the records in the assigned portion based on locality of keys in the records; and outputting, by the processor, the aggregated records into a global hash table.
References
Book

An introduction to parallel algorithms

TL;DR: This book provides an introduction to the design and analysis of parallel algorithms, emphasizing the application of the PRAM model of parallel computation, with all its variants, to algorithm analysis.

GPU Computing

TL;DR: The background, hardware, and programming model for GPU computing is described, the state of the art in tools and techniques are summarized, and four GPU computing successes in game physics and computational biophysics that deliver order-of-magnitude performance gains over optimized CPU applications are presented.
Journal Article

The log-structured merge-tree (LSM-tree)

TL;DR: The log-structured merge-tree (LSM-tree) is a disk-based data structure designed to provide low-cost indexing for a file experiencing a high rate of record inserts (and deletes) over an extended period.
Book Chapter

C-store: a column-oriented DBMS

TL;DR: Preliminary performance data on a subset of TPC-H is presented and it is shown that the system the team is building, C-Store, is substantially faster than popular commercial products.