Low depth cache-oblivious algorithms
Citations
6 citations
Cites background from "Low depth cache-oblivious algorithm..."
...It would be particularly interesting to see if an I/O-optimal low-depth cache-oblivious distribution sweeping paradigm can be designed, along the lines of [14]....
...In [9]–[14] several different multicore models were considered and cache- and processor-oblivious algorithms were presented for fundamental combinatorial, graph, and matrix-based problems....
6 citations
Cites background from "Low depth cache-oblivious algorithm..."
...see [9, 10, 12]), the PEM model offers the simplest way to study parallelism and cache-efficiency required for efficient computations on modern multicores....
5 citations
Cites methods from "Low depth cache-oblivious algorithm..."
...For a cache with size Z and cacheline length L in elements, a cache-oblivious algorithm [13] for multiplying a sparse H × H matrix with h non-zeros by a vector establishes an upper bound on cache misses in the SpMV as O( L + H...
5 citations
5 citations
References
3,885 citations
Additional excerpts
...7] and distributed memory machines [48, 33, 12]....
2,378 citations
"Low depth cache-oblivious algorithm..." refers to background in this paper
...It follows from [47] that the number of cache misses at each level under the multi-level LRU policy is within a factor of two of the number of misses for a cache half the size running the optimal replacement policy....
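The comparison in this excerpt can be tried out on small access traces. The following is a minimal single-level sketch, not code from either paper: `lru_misses` and `opt_misses` are hypothetical helper names, caches hold whole blocks, and the optimal policy is assumed to be Belady's rule of evicting the block whose next use lies farthest in the future.

```python
from collections import OrderedDict


def lru_misses(trace, size):
    """Count misses of an LRU cache holding `size` blocks (illustrative helper)."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    for block in trace:
        if block in cache:
            cache.move_to_end(block)  # mark as most recently used
        else:
            misses += 1
            if len(cache) >= size:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = True
    return misses


def opt_misses(trace, size):
    """Count misses of the offline optimal policy (assumed: Belady's rule)."""
    n = len(trace)
    # next_use[i] = position of the next access to trace[i], or infinity.
    next_use = [0] * n
    seen_later = {}
    for i in range(n - 1, -1, -1):
        next_use[i] = seen_later.get(trace[i], float("inf"))
        seen_later[trace[i]] = i
    cache = {}  # block -> position of its next use
    misses = 0
    for i, block in enumerate(trace):
        if block not in cache:
            misses += 1
            if len(cache) >= size:
                # Evict the block whose next use is farthest in the future.
                victim = max(cache, key=cache.get)
                del cache[victim]
        cache[block] = next_use[i]
    return misses


if __name__ == "__main__":
    trace = [0, 1, 2, 3, 4] * 4
    # LRU with 4 blocks versus the optimal policy with half that size.
    print(lru_misses(trace, 4), opt_misses(trace, 2))
```

On any trace, the LRU count for a cache of size 2h should stay within roughly twice the optimal count for size h, up to a small additive term for the cold start.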
1,688 citations
"Low depth cache-oblivious algorithm..." refers to background in this paper
...A common form of programming in this model is based on nested parallelism—consisting of nested parallel loops and/or fork-join constructs [13, 26, 20, 35, 44]....
1,577 citations
Additional excerpts
...A basic strategy for list ranking [40] is the following: (i) shrink the list to size O(n/ log n), and (ii) apply pointer jumping on this shorter list....
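Step (ii) of this strategy, pointer jumping, can be sketched as follows. This is an illustrative sequential simulation of the synchronous parallel rounds, not the paper's algorithm: the function name `list_rank` and the successor-array representation (the tail points to itself) are assumptions, and the shrinking step (i) is omitted.

```python
def list_rank(succ):
    """Return each node's distance to the tail of a linked list.

    succ[i] is the index of node i's successor; the tail points to itself.
    (Hypothetical representation chosen for this sketch.)
    """
    n = len(succ)
    nxt = list(succ)
    # Distance spanned by the current pointer: 0 for the tail, 1 otherwise.
    rank = [0 if nxt[i] == i else 1 for i in range(n)]
    # Each round doubles the distance a pointer spans, so about log2(n)
    # rounds suffice; extra rounds leave the result unchanged.
    for _ in range(max(1, n.bit_length())):
        # Build new arrays from the old ones to mimic one synchronous step.
        new_rank = [rank[i] + rank[nxt[i]] for i in range(n)]
        new_nxt = [nxt[nxt[i]] for i in range(n)]
        rank, nxt = new_rank, new_nxt
    return rank


if __name__ == "__main__":
    # List 0 -> 1 -> 2 -> 3, with node 3 as the tail.
    print(list_rank([1, 2, 3, 3]))
```

Run on the full list, pointer jumping performs on the order of n log n operations in total, which is why the quoted strategy first shrinks the list to size O(n/ log n) before applying it.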
1,515 citations