Journal Article

Cache-Oblivious Algorithms

TLDR
It is proved that an optimal cache-oblivious algorithm designed for two levels of memory is also optimal for multiple levels and that the assumption of optimal replacement in the ideal-cache model can be simulated efficiently by LRU replacement.
Abstract
This article presents asymptotically optimal algorithms for rectangular matrix transpose, fast Fourier transform (FFT), and sorting on computers with multiple levels of caching. Unlike previous optimal algorithms, these algorithms are cache oblivious: no variables dependent on hardware parameters, such as cache size and cache-line length, need to be tuned to achieve optimality. Nevertheless, these algorithms use an optimal amount of work and move data optimally among multiple levels of cache. For a cache with size $M$ and cache-line length $B$ where $M = \Omega(B^2)$, the number of cache misses for an $m \times n$ matrix transpose is $\Theta(1 + mn/B)$. The number of cache misses for either an $n$-point FFT or the sorting of $n$ numbers is $\Theta(1 + (n/B)(1 + \log_M n))$. We also give a $\Theta(mnp)$-work algorithm to multiply an $m \times n$ matrix by an $n \times p$ matrix that incurs $\Theta(1 + (mn + np + mp)/B + mnp/(B\sqrt{M}))$ cache faults.

We introduce an “ideal-cache” model to analyze our algorithms. We prove that an optimal cache-oblivious algorithm designed for two levels of memory is also optimal for multiple levels and that the assumption of optimal replacement in the ideal-cache model can be simulated efficiently by LRU replacement. We offer empirical evidence that cache-oblivious algorithms perform well in practice.
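To illustrate what "cache oblivious" means for the matrix transpose mentioned in the abstract, here is a minimal sketch of a divide-and-conquer transpose that always halves the larger dimension and never refers to a cache size or line length. The function names `co_transpose` and `transpose` are hypothetical, and the code is an assumption about the general recursive structure rather than the article's exact algorithm. Python lists of lists are not laid out contiguously in memory, so this shows the recursion pattern only and says nothing about real cache behaviour.

```python
def co_transpose(src, dst, r0, r1, c0, c1):
    """Write the transpose of src[r0:r1][c0:c1] into dst (hypothetical helper).

    The recursion splits the longer side in half until a single element
    remains. No parameter depends on cache size or cache-line length.
    """
    rows, cols = r1 - r0, c1 - c0
    if rows == 0 or cols == 0:
        return  # empty block, nothing to copy
    if rows == 1 and cols == 1:
        dst[c0][r0] = src[r0][c0]  # base case: one element
    elif rows >= cols:
        mid = r0 + rows // 2  # split the row range
        co_transpose(src, dst, r0, mid, c0, c1)
        co_transpose(src, dst, mid, r1, c0, c1)
    else:
        mid = c0 + cols // 2  # split the column range
        co_transpose(src, dst, r0, r1, c0, mid)
        co_transpose(src, dst, r0, r1, mid, c1)


def transpose(src):
    """Return the transpose of a rectangular list-of-lists matrix."""
    m = len(src)
    n = len(src[0]) if m else 0
    dst = [[None] * m for _ in range(n)]
    co_transpose(src, dst, 0, m, 0, n)
    return dst
```

For example, `transpose([[1, 2, 3], [4, 5, 6]])` yields a 3 × 2 matrix. A practical implementation would typically operate on a contiguous array and stop the recursion at a small fixed block size to reduce call overhead.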


Citations
Journal Article

B-Trees and Cache-Oblivious B-Trees with Different-Sized Atomic Keys

TL;DR: B-tree-like performance guarantees are provided on dictionaries that contain keys of different sizes in a model in which keys must be stored and compared as opaque objects, and a cache-oblivious static atomic-key B-tree is given, which achieves the same asymptotic performance as the static B-tree.
Journal Article

CPI-model-based analysis of sparse k-means clustering algorithms

TL;DR: This work designs sparse k-means clustering algorithms that utilize distinct representations, each of which is a pair of a data structure and an expression, and clarifies that the best algorithm among them suppresses the performance degradation factors of the number of cache misses, the branch mispredictions, and the completed instructions.
Proceedings Article

Optimal hierarchical layouts for cache-oblivious search trees

TL;DR: Hierarchical layouts generalize many commonly used layouts for trees, such as in-order, pre-order, and breadth-first, and this paper investigates the relative effect of each of these decisions in the construction of cache-oblivious layouts.
Dissertation

Toward Better Computation Models for Modern Machines

TL;DR: This dissertation addresses the computational cost of address translation in virtual memory and the difficulties in designing parallel algorithms on modern many-core machines, and presents a case study of the design of an efficient 2D convex hull algorithm for GPUs.
References
Book

Matrix computations

Gene H. Golub
Book

Introduction to Algorithms

TL;DR: The updated new edition of the classic Introduction to Algorithms is intended primarily for use in undergraduate or graduate courses in algorithms or data structures and presents a rich variety of algorithms and covers them in considerable depth while making their design and analysis accessible to all levels of readers.
Journal Article

An algorithm for the machine calculation of complex Fourier series

TL;DR: Good generalized these methods and gave elegant algorithms for which one class of applications is the calculation of Fourier series, applicable to certain problems in which one must multiply an N-vector by an N × N matrix which can be factored into m sparse matrices.
Book

Computer Architecture: A Quantitative Approach

TL;DR: This best-selling title, considered for over a decade to be essential reading for every serious student and practitioner of computer design, has been updated throughout to address the most important trends facing computer designers today.