Open Access, Proceedings Article

On Optimal Balance in B-Trees: What Does It Cost to Stay in Perfect Shape?

TLDR
A lower bound is proved on the cost of maintaining optimal height ceil[log_B(n)], which shows that this cost must increase from Omega(1/B) to Omega(n/B) rebalancing per update as n grows from one power of B to the next.
Abstract
Any B-tree has height at least ceil[log_B(n)]. Static B-trees achieving this height are easy to build. In the dynamic case, however, standard B-tree rebalancing algorithms only maintain a height within a constant factor of this optimum. We investigate exactly how close to ceil[log_B(n)] the height of dynamic B-trees can be maintained as a function of the rebalancing cost. In this paper, we prove a lower bound on the cost of maintaining optimal height ceil[log_B(n)], which shows that this cost must increase from Omega(1/B) to Omega(n/B) rebalancing per update as n grows from one power of B to the next. We also provide an almost matching upper bound, demonstrating this lower bound to be essentially tight. We then give a variant upper bound which can maintain near-optimal height at low cost. As two special cases, we can maintain optimal height for all but a vanishing fraction of values of n using Theta(log_B(n)) amortized rebalancing cost per update, and we can maintain a height of optimal plus one using O(1/B) amortized rebalancing cost per update. More generally, for any rebalancing budget, we can maintain (as n grows from one power of B to the next) optimal height essentially up to the point where the lower bound requires the budget to be exceeded, after which optimal height plus one is maintained. Finally, we prove that this balancing scheme gives B-trees with very good storage utilization.
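
To make the optimal height ceil[log_B(n)] concrete, here is a minimal Python sketch. It is not taken from the paper. It assumes a leaf-oriented tree in which every node, including each leaf, holds at most B entries, so that a tree with h levels stores at most B^h keys. The function names `optimal_height` and `static_level_sizes` are hypothetical. The second function packs keys B per leaf and builds upward, which is one simple way to obtain a static tree of optimal height for n >= 2.

```python
def optimal_height(n: int, B: int) -> int:
    """Return the smallest h with B**h >= n, i.e. ceil(log_B(n)).

    Integer arithmetic is used to avoid floating-point error near powers of B.
    """
    if n < 1 or B < 2:
        raise ValueError("need n >= 1 and B >= 2")
    h, capacity = 0, 1
    while capacity < n:
        capacity *= B
        h += 1
    return h


def static_level_sizes(n: int, B: int) -> list[int]:
    """Node counts per level, leaves first, for a bottom-up packed static tree.

    Assumption (not from the paper): each node is filled with up to B entries
    before the next node on the same level is started.
    """
    if n < 2 or B < 2:
        raise ValueError("need n >= 2 and B >= 2")
    sizes = []
    count = n
    while count > 1:
        count = -(-count // B)  # ceiling division: nodes needed on this level
        sizes.append(count)
    return sizes


if __name__ == "__main__":
    B = 10
    for n in (1000, 1001):
        print(n, optimal_height(n, B), static_level_sizes(n, B))
```

Under these assumptions, going from n = 1000 to n = 1001 with B = 10 crosses a power of B, so the optimal height increases by one level. Just past that boundary the packed tree has the most slack, and it has none when n reaches the next power of B, which fits the abstract's statement that the cost of staying at optimal height grows as n moves from one power of B to the next.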


Citations
Proceedings Article

Online List Labeling: Breaking the log^2 n Barrier

TL;DR: The solution is history independent, meaning that the state of the data structure is independent of the order in which items are inserted/deleted, and a matching lower bound is proved: for all ε between 1/n^(1/3) and some sufficiently small positive constant, the optimal expected cost for history-independent list-labeling solutions is Θ(ε^(-1) log^(3/2) n).
References
Journal Article

Ubiquitous B-Tree

TL;DR: The major variations of the B-tree are discussed, especially the B+-tree, contrasting the merits and costs of each implementation and illustrating a general-purpose access method that uses a B-tree.
Journal Article

On random 2–3 trees

TL;DR: It is shown that n̄(N), the average number of nodes in an N-key random 2–3 tree, satisfies the inequality 0.70 N < n̄(N) < 0.79 N for large N.
Proceedings Article

Cache oblivious search trees via binary trees of small height

TL;DR: A version of cache oblivious search trees which is simpler than the previous proposal of Bender, Demaine and Farach-Colton and has the same complexity bounds is proposed, and can be implemented as just a single array of data elements without the use of pointers.