Sort vs. Hash revisited: fast join implementation on modern multi-core CPUs

doi:10.14778/1687553.1687564

Journal Article

Changkyu Kim, +8 more

Vol. 2, Iss. 2, pp. 1378-1389

TLDR

This paper re-examines two popular join algorithms to determine if the latest computer architecture trends shift the tide that has favored hash join for many years and offers multicore implementations of hash join and sort-merge join which consistently outperform all previously reported results.

Abstract:

Join is an important database operation. As computer architectures evolve, the best join algorithm may change hands. This paper re-examines two popular join algorithms -- hash join and sort-merge join -- to determine if the latest computer architecture trends shift the tide that has favored hash join for many years. For a fair comparison, we implemented the most optimized parallel version of both algorithms on the latest Intel Core i7 platform. Both implementations scale well with the number of cores in the system and take advantage of the latest processor features for performance. Our hash-based implementation achieves more than 100M tuples per second, which is 17X faster than the best reported performance on CPUs and 8X faster than that reported for GPUs. Moreover, the performance of our hash join implementation is consistent over a wide range of input data sizes, from 64K to 128M tuples, and is not affected by data skew. We compare this implementation to our highly optimized sort-based implementation, which achieves 47M to 80M tuples per second. We developed analytical models to study how both algorithms would scale with upcoming processor architecture trends. Our analysis projects that current architectural trends of wider SIMD, more cores, and smaller memory bandwidth per core imply better scalability potential for sort-merge join. Consequently, sort-merge join is likely to outperform hash join on upcoming chip multiprocessors. In summary, we offer multicore implementations of hash join and sort-merge join which consistently outperform all previously reported results. We further conclude that the tide that favors the hash join algorithm has not changed yet, but the change is just around the corner.
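
For readers unfamiliar with the two algorithms being compared, the following is a minimal single-threaded sketch of a textbook hash join and a textbook sort-merge join on an equality key. It is not the paper's implementation, which is parallel, SIMD-aware, and tuned to the memory hierarchy; the function names `hash_join` and `sort_merge_join` and the `(key, payload)` tuple layout are assumptions made here for illustration only.

```python
from collections import defaultdict


def hash_join(build, probe):
    """Equi-join two lists of (key, payload) tuples with a hash table.

    Build phase: index every tuple of `build` by key.
    Probe phase: look up each tuple of `probe` and emit the matches.
    """
    table = defaultdict(list)
    for key, payload in build:
        table[key].append(payload)

    out = []
    for key, payload in probe:
        # .get avoids inserting empty buckets for keys with no match.
        for match in table.get(key, ()):
            out.append((key, match, payload))
    return out


def sort_merge_join(left, right):
    """Equi-join two lists of (key, payload) tuples by sorting and merging.

    Sort phase: order both inputs by key.
    Merge phase: advance two cursors, emitting the cross product of each
    pair of runs that share the same key.
    """
    left = sorted(left, key=lambda t: t[0])
    right = sorted(right, key=lambda t: t[0])

    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the end of the run of equal keys on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            for a in range(i, i_end):
                for b in range(j, j_end):
                    out.append((lk, left[a][1], right[b][1]))
            i, j = i_end, j_end
    return out


if __name__ == "__main__":
    r = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
    s = [(2, "x"), (3, "y"), (4, "z"), (2, "w")]
    # Both algorithms produce the same set of joined rows.
    print(sorted(hash_join(r, s)) == sorted(sort_merge_join(r, s)))
```

The sketch shows the structural difference the abstract alludes to: the hash join's probe phase performs scattered lookups into a table, while the sort-merge join's phases scan data sequentially, which is one plausible reading of why wider SIMD and lower memory bandwidth per core are projected to favor sort-merge join.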

Citations

Sensitive protein alignments at tree-of-life scale using DIAMOND.

FAST: fast architecture sensitive tree search on modern CPUs and GPUs

Design and evaluation of main memory hash join algorithms for multi-core CPUs

Main-memory hash joins on multi-core CPUs: Tuning to the underlying hardware

Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort

Related Papers (5)

Design and evaluation of main memory hash join algorithms for multi-core CPUs

Main-memory hash joins on multi-core CPUs: Tuning to the underlying hardware

Multi-core, main-memory joins: sort vs. hash revisited

Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort

Relational joins on graphics processors