An Adaptive Packed-Memory Array
Michael A. Bender
Stony Brook University
and
Haodong Hu
Stony Brook University
The packed-memory array (PMA) is a data structure that maintains a dynamic set of N elements in sorted order
in a Θ(N)-sized array. The idea is to intersperse Θ(N) empty spaces or gaps among the elements so that only
a small number of elements need to be shifted around on an insert or delete. Because the elements are stored
physically in sorted order in memory or on disk, the PMA can be used to support extremely efficient range
queries. Specifically, the cost to scan L consecutive elements is O(1+ L/B) memory transfers.
This paper gives the first adaptive packed-memory array (APMA), which automatically adjusts to the input
pattern. Like the traditional PMA, any pattern of updates costs only O(log^2 N) amortized element moves and
O(1 + (log^2 N)/B) amortized memory transfers per update. However, the APMA performs even better on many
common input distributions, achieving only O(logN) amortized element moves and O(1 + (logN)/B) amortized
memory transfers. The paper analyzes sequential inserts, where the insertions are to the front of the APMA;
hammer inserts, where the insertions “hammer” on one part of the APMA; random inserts, where the insertions
are after random elements in the APMA; and bulk inserts, where for constant α ∈ [0,1], N^α elements are inserted
after random elements in the APMA. The paper then gives simulation results that are consistent with the
asymptotic bounds. For sequential insertions of roughly 1.4 million elements, the APMA has four times fewer
element moves per insertion than the traditional PMA and running times that are more than seven times faster.
Categories and Subject Descriptors: D.1.0 [Programming Techniques]: General; E.1 [Data Structures]: Ar-
rays; E.1 [Data Structures]: Lists, stacks, queues; E.5 [Files]: Sorting/searching; H.3.3 [Information Storage
and Retrieval]: Information Search and Retrieval
General Terms: Algorithms, Experimentation, Performance, Theory.
Additional Key Words and Phrases: Adaptive Packed-Memory Array, Cache Oblivious, Locality Preserving,
Packed-Memory Array, Range Query, Rebalance, Sequential File Maintenance, Sequential Scan, Sparse Array.
1. INTRODUCTION
A classical problem in data structures and databases is how to maintain a dynamic set of
N elements in sorted order in a Θ(N)-sized array. The idea is to intersperse Θ(N) empty
spaces or gaps among the elements so that only a small number of elements need to be
shifted around on an insert or delete. These data structures effectively simulate a library
bookshelf, where gaps on the shelves mean that books are easily added and removed.
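To make the bookshelf analogy concrete, here is a minimal Python sketch (ours, not the paper's code) of an insert into a sorted array with interspersed gaps, where `None` marks an empty slot. Elements are shifted right only until the first gap, so a nearby gap makes the insert cheap.

```python
def insert_after(arr, pos, value):
    """Insert value just after index pos, shifting elements right only
    until the first gap (None); raises if no gap exists to the right."""
    gap = pos + 1
    while gap < len(arr) and arr[gap] is not None:
        gap += 1
    if gap == len(arr):
        raise ValueError("no gap to the right; a rebalance would be needed")
    # Shift the occupied run [pos+1, gap) right by one, then place value.
    arr[pos + 2:gap + 1] = arr[pos + 1:gap]
    arr[pos + 1] = value

# Only one element (the 30) moves, because a gap sits nearby:
a = [10, None, 20, 30, None, 40]
insert_after(a, 2, 25)   # a becomes [10, None, 20, 25, 30, 40]
```

When no gap is near, many elements must shift, which is exactly the situation the PMA's density invariants are designed to keep rare.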
Remarkably, such data structures can be efficient for any pattern of inserts/deletes. Indeed,
it has been known for over two decades that the number of element moves per update is only
O(log^2 N) both amortized [Itai et al. 1981] and in the worst case [Willard 1982; 1986; 1992].
Since these data structures were proposed, this problem has been studied under different
names, including sparse arrays [Itai et al. 1981; Katriel 2002], sequential file maintenance
[Willard 1982; 1986; 1992], and list labeling [Dietz 1982; Dietz and Sleator 1987; Dietz
and Zhang 1990; Dietz et al. 1994]. The problem is also closely related to the
order-maintenance problem [Dietz 1982; Tsakalidis 1984; Dietz and Sleator 1987;
Bender et al. 2002].

Department of Computer Science, Stony Brook University, Stony Brook, NY 11794-4400, USA.
Email: {bender,huhd}@cs.sunysb.edu. This research was supported in part by NSF Grants
CCF 0621439/0621425, CCF 0540897/05414009, CCF 0634793/0632838, and CNS 0627645.
Permission to make digital/hard copy of all or part of this material without fee for personal or classroom use
provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server
notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the
ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific
permission and/or a fee.
© 2007 ACM 1529-3785/2007/0700-0001$5.00
ACM Transactions on Computational Logic, Vol. V, No. N, July 2007, Pages 1–30.
Recently there has been renewed interest in these sparse-array data structures because
of their application in I/O-efficient and cache-oblivious algorithms. The I/O-efficient and
cache-oblivious version of the sparse array is called the packed-memory array (PMA) [Bender
et al. 2000; 2005]. The PMA maintains N elements in sorted order in a Θ(N)-sized
array. It supports the operations insert, delete, and scan. Let B be the number of ele-
ments that fit within a memory block. To insert an element y after a given element x,
when given a pointer to x, or to delete x, costs O(log^2 N) amortized element moves and
O(1 + (log^2 N)/B) amortized memory transfers. The PMA maintains the density invariant
that in any region of size S (for S greater than some small constant value), there are Θ(S)
elements stored in it. To scan L elements after a given element x, when given a pointer to
x, costs Θ(1 + L/B) memory transfers.
The PMA has been used in cache-oblivious B-trees [Bender et al. 2000; Bender et al.
2002; Brodal et al. 2002; Bender et al. 2004; Bender et al. 2005; Bender et al. 2006], con-
current cache-oblivious B-trees [Bender et al. 2005], the cache-oblivious string B-tree [Bender
et al. 2006], and scanning structures [Bender et al. 2002]. A sparse array in the same spirit
as the PMA was independently proposed and used in the locality-preserving B-tree of [Ra-
man 1999], although the asymptotic space bounds are superlinear and therefore inferior to
the linear space bounds of the earlier sparse-array data structures [Itai et al. 1981; Willard
1982; 1986; 1992] and the PMA [Bender et al. 2000; 2005].
We now give more details about how to implement search in a PMA. For example, the
update and scan bounds above assume that we are given a pointer to an element x; we now
show how to find such a pointer. A straightforward approach is to use a standard binary
search, slightly modified to deal with gaps. However, binary search does not have good
data locality. As a result, binary search is not efficient when the PMA resides on disk
because search requires O(1 + log⌈N/B⌉) memory transfers. An alternative approach is to
use a separate index into the array; the index is designed for efficient searches. In [Raman
1999] that index is a B-tree, and in [Bender et al. 2000; Bender et al. 2002; 2004; Bender
et al. 2005] the index is some type of binary search tree, laid out in memory using a so-
called van Emde Boas layout [Prokop 1999; Bender et al. 2000; 2005].
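For illustration only, the "binary search slightly modified to deal with gaps" can be sketched as follows (our code, not the paper's; the paper's indexed variants use a B-tree or a van Emde Boas layout instead). A probe that lands on a gap slides left to the nearest occupied slot.

```python
def search(arr, key):
    """Binary search in a sorted array with None gaps.

    Returns the index of the rightmost element <= key, or -1 if every
    element exceeds key. Gaps are sparse, so sliding off them is cheap."""
    lo, hi = 0, len(arr) - 1
    best = -1
    while lo <= hi:
        mid = (lo + hi) // 2
        probe = mid
        # Slide left off a run of gaps to find a real element.
        while probe >= lo and arr[probe] is None:
            probe -= 1
        if probe < lo:
            # Only gaps in [lo, mid]; the answer lies to the right.
            lo = mid + 1
            continue
        if arr[probe] <= key:
            best = probe
            lo = mid + 1
        else:
            hi = probe - 1
    return best
```

The search visits Θ(logN) widely separated positions, which is exactly why it has poor locality on disk compared with an index laid out for searching.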
The primary use of the PMA in the literature has been for sequential storage in mem-
ory/disk of all the elements of a (cache-oblivious or traditional) B-tree. An early paper
suggesting this idea was [Raman 1999]. The PMA maintains locality of reference at all
granularities and consequently supports extremely efficient sequential scans/range queries
of the elements. The concern with traditional B-trees is that the 2K or 4K sizes of disk
blocks are too small to amortize the cost of disk seeks. Consequently, on modern disks,
random block accesses are well over an order-of-magnitude slower than sequential block
accesses. Thus, locality-preserving B-trees and cache-oblivious B-trees based on PMAs
support range queries that run an order of magnitude faster than those of traditional B-
trees [Bender et al. 2006]. Moreover, since the elements are maintained strictly in sorted
order, these structures do not suffer from aging, unlike most file systems and databases.
The point is that traditional B-trees age: as new blocks are allocated and deallocated to
the B-tree, blocks that are logically near each other end up far from each other on the disk.
The result is that range-query performance suffers.
The PMA is an efficient and promising data structure, but it also has weaknesses. The
main weakness is that the PMA performs relatively poorly on some common insertion
patterns such as sequential inserts. For sequential inserts, the PMA performs near its worst
in terms of the number of elements moved per insert. The PMA's difficulty with sequential
inserts is that the insertions “hammer” on one part of the array, causing many elements to
be shifted around. Although O(log^2 N) amortized element moves and O(1 + (log^2 N)/B)
amortized memory transfers are surprisingly good considering the stringent requirements
on the data order, this is relatively slow compared with traditional B-tree inserts. Moreover,
sequential inserts are common, and B-trees in databases are frequently optimized for this
insertion pattern. It would be better if the PMA could perform near its best, not worst, in
this case.
In contrast, one of the PMA's strengths is its performance on common insertion patterns
such as random inserts. For random inserts, the PMA performs extremely well with only
O(logN) element moves per insert and only O(1 + (logN)/B) memory transfers. This
performance surpasses the guarantees for arbitrary inserts.
Results. This paper proposes an adaptive packed-memory array (abbreviated adaptive
PMA or APMA), which overcomes these deficiencies of the traditional PMA. Our struc-
ture is the first PMA that adapts to insertion patterns, and it gives the largest decrease in
the cost of sparse arrays/sequential-file maintenance in almost two decades. The APMA
retains the same amortized guarantees as the traditional PMA, but adapts to common in-
sertion patterns, such as sequential inserts, random inserts, and bulk inserts, where chunks
of elements are inserted at random locations in the array.
We give the following results for the APMA:
-We first show that the APMA has the “rebalance property”, which ensures that any
pattern of insertions costs only O(1 + (log^2 N)/B) amortized memory transfers and
O(log^2 N) amortized element moves. Because the elements are kept in sorted order in
the APMA, as with the PMA, scans of L elements cost O(1 + L/B) memory transfers.
Thus, the adaptive PMA guarantees performance at least as good as that of the
traditional PMA.
-We next analyze the performance of the APMA under some common insertion patterns.
We show that for sequential inserts, where all the inserts are to the front of the array,
the APMA makes only O(logN) amortized element moves and O(1 + (logN)/B)
amortized memory transfers.
-We generalize this analysis to hammer inserts, where the inserts hammer on any single
element in the array.
-We then turn to random inserts, where each insert occurs after a randomly chosen
element in the array. We establish that the insertion cost is again only O(logN) amortized
element moves and O(1 + (logN)/B) amortized memory transfers.
-We generalize all these previous results by analyzing the case of bulk inserts. In the
bulk-insert insertion pattern, we pick a random element in the array and perform N^α
inserts after it for α ∈ [0,1]. We show that for all values of α ∈ [0,1], the APMA also
only performs O(logN) amortized element moves and O(1 + (logN)/B) amortized
memory transfers.
-We next perform simulations and experiments, measuring the performance of the
APMA on these insertion patterns. For sequential insertions of roughly 1.4 million
elements, the APMA has over four times fewer element moves per insertion than the
traditional PMA and running times that are nearly seven times faster. For bulk
insertions of 1.4 million elements, where f(N) = N^0.6, the APMA has over two times
fewer element moves per insertion than the traditional PMA and running times that
are over three times faster.
2. ADAPTIVE PACKED-MEMORY ARRAY
In this section we introduce the adaptive PMA. We first explain how the adaptive PMA
differs from the traditional PMA. We then show that both PMAs have the same amor-
tized bounds, O(log
2
N) element moves and O(1 + (log
2
N)/B) memory transfers per in-
sert/delete. Thus, adaptivity comes at no extra asymptotic cost.
Description of Traditional and Adaptive PMAs. We first describe how to insert into both
the adaptive and traditional PMAs. Henceforth, PMA with no preceding adjective refers
to either structure. When we insert an element y after an existing element x in the PMA,
we look for a neighborhood around element x that has sufficiently low density, that is,
we look for a subarray that is not storing too many or too few elements. Once we find
a neighborhood of the appropriate density, we rebalance the neighborhood by spacing
out the elements, including y. In the traditional PMA, we rebalance by spacing out the
elements evenly. In the adaptive PMA, we may rebalance the elements unevenly, based on
previous insertions, that is, we leave extra gaps near elements that have recently had inserts
after them.
We deal with a PMA that is too full or too empty as we would with a dynamic hash table.
Namely, we recopy the elements into a new PMA that is a constant factor larger or smaller. In this
paper, this constant is stated as 2. However, the constant could be larger or smaller (say
1.2) with almost no change in running time. This is because most of the cost from element
moves comes from rebalances rather than from recopies.
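As a sketch of this recopying policy (our code, not the paper's; the 0.2 and 0.5 bounds follow the example whole-array densities used later in this section):

```python
def resize_if_needed(arr, d_min=0.2, d_max=0.5, growth=2):
    """Recopy into an array a constant factor larger (or smaller) when
    the overall density leaves [d_min, d_max]; otherwise return arr."""
    elems = [x for x in arr if x is not None]
    cap = len(arr)
    if len(elems) / cap > d_max:
        cap *= growth          # too dense: grow
    elif len(elems) / cap < d_min and cap >= 2 * growth:
        cap //= growth         # too sparse: shrink
    else:
        return arr
    new = [None] * cap
    for k, x in enumerate(elems):       # spread the elements evenly
        new[(k * cap) // len(elems)] = x
    return new
```

Each recopy moves every element once, but recopies happen only after Θ(N) updates, so their amortized cost is dominated by the rebalances.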
We now give some terminology. We divide the PMA into Θ(N/ logN) segments, each
of size Θ(logN), and we let the number of segments be a power of 2. We call a contiguous
group of segments a window. We view the PMA in terms of a tree structure, where the
nodes of the tree are windows. The root node is the window containing all segments, and
a leaf node is a window containing a single segment. A node in the tree that is a window
of 2^i segments has two children, a left child that is the window of the first 2^(i-1) segments
and a right child that is the window of the last 2^(i-1) segments.
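Because windows are aligned runs of a power-of-two number of segments, this tree can stay implicit; simple index arithmetic (our sketch, not the paper's code) recovers any node's segment range and its children:

```python
def window_range(height, index):
    """Segment range (first, last) of the index-th window at this height;
    a window at height i spans 2**i aligned segments."""
    size = 2 ** height
    first = index * size
    return first, first + size - 1

def children(height, index):
    """The two half-windows below a node of height >= 1."""
    assert height >= 1
    return (height - 1, 2 * index), (height - 1, 2 * index + 1)
```

For example, in a PMA with 8 segments, the root is the window at height 3, whose children are the windows over segments 0-3 and 4-7.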
We let the height of the tree be h, so that 2^h = Θ(N/logN) and h = lgN - lglgN + O(1).
The nodes at each height ℓ have an upper density threshold τ_ℓ and a lower density
threshold ρ_ℓ, which together determine the acceptable density of keys within a window of
2^ℓ segments. As the node height increases, the upper density thresholds decrease and the
lower density thresholds increase. Thus, for constant minimum and maximum densities
D_min and D_max, we have

    D_min = ρ_0 < ··· < ρ_h < τ_h < ··· < τ_0 = D_max.    (1)
The density thresholds on windows of intermediate powers of 2 are arithmetically dis-
tributed. For example, the maximum density threshold of a segment can be set to 1.0, the
maximum density threshold of the entire array to 0.5, the minimum density threshold of
the entire array to 0.2, and the minimum density of a segment to 0.1. If the PMA has 32
segments, then the maximum density threshold of a single segment is 1.0, of two segments
is 0.9, of four segments is 0.8, of eight segments is 0.7, of 16 segments is 0.6, and of all 32
segments is 0.5.
More formally, upper and lower density thresholds for nodes at height ℓ are defined as
follows:

    τ_ℓ = τ_h + (τ_0 - τ_h)(h - ℓ)/h,    (2)
    ρ_ℓ = ρ_h - (ρ_h - ρ_0)(h - ℓ)/h.    (3)

Moreover,

    2ρ_h < τ_h,    (4)
because when we double the size of an array that becomes too dense, the new array must
be within the density threshold.¹ Observe that any values of τ_0, τ_h, ρ_0, and ρ_h that satisfy
(1)-(4) and enable the array to have size Θ(N) will work. The important requirement is
that

    τ_(ℓ-1) - τ_ℓ = O(ρ_ℓ - ρ_(ℓ-1)) = O(1/logN).
We now give more details about how to insert element y after an existing element x. If
there is enough space in the leaf (segment) containing x, then we rearrange the elements
within the leaf to make room for y. If the leaf is full, then we find the closest ancestor of the
leaf whose density is within the permitted thresholds and rebalance. To delete an element
x, we remove x from its segment. If the segment falls below its density threshold, then,
as before, we find the smallest enclosing window whose density is within threshold and
rebalance. As described above, if the entire array is above the maximum density threshold
(resp., below the minimum density threshold), then we recopy the keys into a PMA of
twice (resp., half) the size.
We introduce further notation. Let Cap(u_ℓ) denote the number of array positions in
node u_ℓ of height ℓ. Since there are 2^ℓ segments in the node, the capacity is Θ(2^ℓ logN).
Let Gaps(u_ℓ) denote the number of gaps, i.e., unfilled array positions in node u_ℓ. Let
Density(u_ℓ) denote the fraction of elements actually stored in node u_ℓ, i.e., Density(u_ℓ) =
1 - Gaps(u_ℓ)/Cap(u_ℓ).
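Using these definitions, the walk up to the smallest enclosing window that can absorb an insert can be sketched as follows (our code, not the paper's; we use a strict inequality so the chosen window has room for the incoming element, and the τ_ℓ constants of the earlier example):

```python
def density(arr, first, last):
    """Density(u) = 1 - Gaps(u)/Cap(u) over the slots arr[first..last]."""
    cap = last - first + 1
    gaps = sum(1 for x in arr[first:last + 1] if x is None)
    return 1 - gaps / cap

def enclosing_window(arr, seg, seg_size, h, tau0=1.0, tauh=0.5):
    """Walk up from the leaf containing segment seg to the smallest
    window whose density is strictly below its upper threshold."""
    for level in range(h + 1):
        segs = 2 ** level
        first = (seg // segs) * segs * seg_size
        last = first + segs * seg_size - 1
        tau = tauh + (tau0 - tauh) * (h - level) / h   # equation (2)
        if density(arr, first, last) < tau:
            return level, first, last
    return None   # the whole array is too dense: double it

# Four segments of two slots each (h = 2); the leaf and its parent sit
# exactly at their thresholds, so the rebalance window is the whole array.
pma = [1, 2, 3, None, None, None, None, None]
```

Here `enclosing_window(pma, 0, 2, 2)` climbs past the full leaf (density 1.0) and its parent (density 0.75) and settles on the root window.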
Rebalance. We rebalance a node u_ℓ of height ℓ if u_ℓ is within threshold, but we detect
that a child node u_(ℓ-1) is outside of threshold. Any node whose elements are rearranged in
the process of a rebalance is swept. Thus, we sweep a node u_ℓ of height ℓ when we detect
that a child node u_(ℓ-1) is outside of threshold, but now u_ℓ need not be within threshold. Note
that with this rebalance scheme, this tree can be implicitly rather than explicitly maintained.
In this case, a rebalance consists of two scans, one to the left and one to the right of the
insertion point until we find a region of the appropriate density.
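Once the target window is found, the traditional PMA's rebalance simply respaces its elements evenly; a minimal sketch (our code, not the paper's):

```python
def rebalance_evenly(arr, first, last):
    """Respace the elements of arr[first..last] evenly across the window,
    preserving their sorted order (traditional-PMA rebalance)."""
    elems = [x for x in arr[first:last + 1] if x is not None]
    cap = last - first + 1
    arr[first:last + 1] = [None] * cap
    for k, x in enumerate(elems):
        arr[first + (k * cap) // len(elems)] = x

a = [1, 2, 3, None, None, None, None, None]
rebalance_evenly(a, 0, 7)   # a becomes [1, None, 2, None, None, 3, None, None]
```

The adaptive PMA replaces only this last step: it spaces the same elements unevenly, leaving extra gaps near recent insertion points.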
In a traditional PMA we rebalance evenly, whereas in the adaptive PMA we rebalance
unevenly. The idea of the APMA is to store a smaller number of elements in the leaves in
¹There are straightforward ways to generalize (4) to further reduce space usage. Introducing this generalization
here leads to unnecessary complication in presentation.
References
- M. Frigo, C. E. Leiserson, H. Prokop, and S. Ramachandran. Cache-oblivious algorithms.
- P. F. Dietz and D. D. Sleator. Two algorithms for maintaining order in a list.
- P. F. Dietz. Maintaining order in a linked list.
- G. S. Brodal, R. Fagerberg, and R. Jacob. Cache oblivious search trees via binary trees of small height.