Journal ArticleDOI

Average case analysis of heap building by repeated insertion

02 Jan 1991-Journal of Algorithms (Academic Press, Inc.)-Vol. 12, Iss: 1, pp 126-153
TL;DR: It is shown that the average number of swaps required to construct a heap on n keys by Williams’ method of repeated insertion is (ω + o(1))n, where the constant ω is about 1.3.
About: This article was published in the Journal of Algorithms on 1991-01-02 and is currently open access. It has received 19 citations to date. The article focuses on the topic: Heap (data structure).
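
The result concerns Williams’ original construction: keys are inserted one at a time, and each new key is sifted up toward the root until the heap order is restored, with every exchange counting as one swap. The short Python sketch below is not from the paper; it assumes a 1-indexed array and a min-heap convention purely for illustration, and it lets one estimate the swap count empirically.

    import random

    def williams_build(keys):
        """Build a binary heap by repeated insertion (Williams' method) and
        count the swaps performed by the sift-up steps.
        Illustrative sketch: 1-indexed array, min-heap convention."""
        heap = [None]              # index 0 unused; node i has parent i // 2
        swaps = 0
        for key in keys:
            heap.append(key)       # the new key starts at the next leaf position
            i = len(heap) - 1
            # Sift up: swap with the parent while the new key is smaller.
            while i > 1 and heap[i] < heap[i // 2]:
                heap[i], heap[i // 2] = heap[i // 2], heap[i]
                swaps += 1
                i //= 2
        return heap, swaps

    if __name__ == "__main__":
        n = 100_000
        _, swaps = williams_build([random.random() for _ in range(n)])
        # The paper's result: for random inputs, swaps / n tends on average
        # to a constant of roughly 1.3.
        print(f"n = {n}, swaps/n = {swaps / n:.3f}")

For uniformly random keys, the ratio swaps/n printed above should concentrate near the constant ω ≈ 1.3 established in the paper.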

Summary (1 min read)

1 Introduction

  • The heap is a much used and much studied data structure (for example, see Knuth [K]).
  • The key observation of this section, Observation 2.1, is the cornerstone of their analysis.

2 Qualitative results for expected times

  • The authors give some results which will be used later to establish bounds on ω.
  • Another is to assume that the keys are n independent random variables, each uniformly distributed on the interval [0,1].
  • The authors have now completed part (a) of the proof of Theorem 1.3.
  • Let R(n) denote the rank of the nth key among the first n keys.

5 Probability bounds

  • In this section the authors shall prove part (c) of Theorem 1.3.
  • The authors follow roughly the route taken in section 2 for investigating E[Wn].
  • Finally, the authors may use the half of the lemma already proved to handle the sum above, and one further application of inequality (5.1) to handle the last term.

6 Repeated insertion into equi-probable heaps

  • The authors consider the simplified approximation of the average case behaviour of Williams’ heap construction in which they repeatedly insert a uniform random key into a uniform random heap.
  • The main result of this section is the following.
  • The proof of part (c) will then follow from Hoeffding’s inequality (5.1).
  • Proof: Continuing to mimic previous notation, the authors define $\tilde{x}_k$ as the average of the expected insertion costs along the kth level $L_k$, namely $\sum_{t=2^k}^{2^{k+1}-1} \tilde{I}(t)/2^k$. (See also Lemma 2.6.)

7 Concluding remarks

  • The authors have described fairly precisely the average case behaviour of Williams’ method of constructing a heap by repeated insertion.
  • The most interesting related open problem is the average case analysis of heapsort (with any of its trickledown variants).
  • While some empirical data (e.g. see [K]) and partial theoretical results (e.g. [C]) are known, the problem of determining average numbers of comparisons is open.
  • The following series of tables shows the average expected values of the levels of a heap built by Williams’ algorithm, for perfect heaps of sizes 3 to 4095.


Citations
05 Mar 2013
TL;DR: For many applications, a randomized algorithm is either the simplest or the fastest algorithm available, and sometimes both. As discussed by the authors, this book introduces the basic concepts in the design and analysis of randomized algorithms and provides a comprehensive and representative selection of the algorithms that might be used in each area to which they can be applied.
Abstract: For many applications, a randomized algorithm is either the simplest or the fastest algorithm available, and sometimes both. This book introduces the basic concepts in the design and analysis of randomized algorithms. The first part of the text presents basic tools such as probability theory and probabilistic analysis that are frequently used in algorithmic applications. Algorithmic examples are also given to illustrate the use of each tool in a concrete setting. In the second part of the book, each chapter focuses on an important area to which randomized algorithms can be applied, providing a comprehensive and representative selection of the algorithms that might be used in each of these areas. Although written primarily as a text for advanced undergraduates and graduate students, this book should also prove invaluable as a reference for professionals and researchers.

785 citations

Book ChapterDOI
TL;DR: The experiments show that for the problem of searching many exact patterns in a fixed input string, the lazy top-down construction is often faster and more space efficient than other methods.
Abstract: We present an efficient implementation of a write-only top-down construction for suffix trees. Our implementation is based on a new, space-efficient representation of suffix trees which requires only 12 bytes per input character in the worst case, and 8.5 bytes per input character on average for a collection of files of different type. We show how to efficiently implement the lazy evaluation of suffix trees such that a subtree is not evaluated before it is traversed for the first time. Our experiments show that for the problem of searching many exact patterns in a fixed input string, the lazy top-down construction is often faster and more space efficient than other methods.

104 citations

Book ChapterDOI
TL;DR: Analysis of the behaviour of three methods for constructing a binary heap shows that, under reasonable assumptions, repeated insertion and layerwise construction both incur at most cN/B cache misses, whereas repeated merging, as programmed by Floyd, can incur more than (dN log2 B)/B cache misses.
Abstract: The behaviour of three methods for constructing a binary heap is studied. The methods considered are the original one proposed by Williams [1964], in which elements are repeatedly inserted into a single heap; the improvement by Floyd [1964], in which small heaps are repeatedly merged to bigger heaps; and a recent method proposed, e. g., by Fadel et al. [1999] in which a heap is built layerwise. Both the worst-case number of instructions and that of cache misses are analysed. It is well-known that Floyd's method has the best instruction count. Let N denote the size of the heap to be constructed, B the number of elements that fit into a cache line, and let c and d be some positive constants. Our analysis shows that, under reasonable assumptions, repeated insertion and layerwise construction both incur at most cN/B cache misses, whereas repeated merging, as programmed by Floyd, can incur more than (dN log2 B)/B cache misses. However, for a memory-tuned version of repeated merging the number of cache misses incurred is close to the optimal bound N/B.

17 citations


Cites background from "Average case analysis of heap build..."

  • ...There are two reasons for this: (1) In the average case the number of instructions executed by Williams' program is linear [Hayward and McDiarmid 1991], which is guaranteed in the worst case by Floyd's program....

    [...]

Book
24 Sep 1999
TL;DR: Experiments with List Ranking for Explicit Multi-Threaded Instruction Parallelism and Evaluation of an Algorithm for the Transversal Hypergraph Problem.
Abstract: Invited Lectures.- Selecting Problems for Algorithm Evaluation.- BSP Algorithms - "Write Once, Run Anywhere".- Ten Years of LEDA: Some Thoughts.- Contributed Papers.- Computing the K Shortest Paths: A New Algorithm and an Experimental Comparison.- Efficient Implementation of Lazy Suffix Trees.- Experiments with List Ranking for Explicit Multi-Threaded (XMT) Instruction Parallelism.- Finding Minimum Congestion Spanning Trees.- Evaluation of an Algorithm for the Transversal Hypergraph Problem.- Construction Heuristics and Domination Analysis for the Asymmetric TSP.- Counting in Mobile Networks: Theory and Experimentation.- Dijkstra's Algorithm On-Line: An Empirical Case Study from Public Railroad Transport.- Implementation and Experimental Evaluation of Graph Connectivity Algorithms Using LEDA.- On-Line Zone Construction in Arrangements of Lines in the Plane.- The Design and Implementation of Planar Maps in CGAL.- An Easy to Use Implementation of Linear Perturbations within Cupgal.- Analysing Cache Effects in Distribution Sorting.- Fast Regular Expression Search.- An Experimental Evaluation of Hybrid Data Structures for Searching.- LEDA-SM: Extending LEDA to Secondary Memory.- A Priority Queue Transform.- Implementation Issues and Experimental Study of a Wavelength Routing Algorithm for Irregular All-Optical Networks.- Estimating Large Distances in Phylogenetic Reconstruction.- The Performance of Concurrent Red-Black Tree Algorithms.- Performance Engineering Case Study: Heap Construction.- A Fast and Simple Local Search for Graph Coloring.- BALL: Biochemical Algorithms Library.- An Experimental Study of Priority Queues in External Memory.

17 citations

Book ChapterDOI
01 Jan 2002
TL;DR: A number of general (not restricting to special subsequences) asymptotic results are presented that give insight on the difficulties encountered in the asymptotic study of the number of heaps of a given size and of the cost of heap construction.
Abstract: Heaps constitute a well-known data structure allowing the implementation of an efficient O(n log n) sorting algorithm as well as the design of fast priority queues. Although heaps have been known for long, their combinatorial properties are still partially worked out: exact summation formulae have been stated, but most of the asymptotic behaviors are still unknown. In this paper, we present a number of general (not restricting to special subsequences) asymptotic results that give insight on the difficulties encountered in the asymptotic study of the number of heaps of a given size and of the cost of heap construction. In particular, we exhibit the influence of arithmetic functions in the apparently chaotic behavior of these quantities and study their extremal and average properties. It is also shown that the distribution function of the cost of heap construction using Floyd’s algorithm and other variants is asymptotically normal.

14 citations


Cites background or methods from "Average case analysis of heap build..."

  • ...In particular, this rule applies to the heap construction algorithms in [3, 19, 12, 29], the basic ideas of improvement being more or less due to Floyd....

    [...]

  • ...The average case analysis of its behavior is more difficult; see [2, 8, 12]....

    [...]

References
Book
01 Jan 1968

17,939 citations


"Average case analysis of heap build..." refers methods in this paper

  • ...Knuth [K] (see also Doberkat [Do]) shows that the expected number of comparisons for the basic method is about 1....

    [...]

  • ...The heap is a much used and much studied data structure (for example, see Knuth [K])....

    [...]

  • ...Floyd’s method of heap building (see Floyd [F] or Knuth [K]) involves repeatedly merging small heaps to form bigger heaps....

    [...]

  • ...see [K]) and partial theoretical results (e....

    [...]

Book ChapterDOI
TL;DR: In this article, upper bounds are derived for the probability that the sum S of n independent random variables exceeds its mean ES by a positive number nt; these bounds are then used to obtain analogous inequalities for certain sums of dependent random variables such as U statistics.
Abstract: Upper bounds are derived for the probability that the sum S of n independent random variables exceeds its mean ES by a positive number nt. It is assumed that the range of each summand of S is bounded or bounded above. The bounds for Pr {S – ES ≥ nt} depend only on the endpoints of the ranges of the summands and the mean, or the mean and the variance of S. These results are then used to obtain analogous inequalities for certain sums of dependent random variables such as U statistics and the sum of a random sample without replacement from a finite population.
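
For reference, a standard statement of the bound sketched in this abstract, which is of the type the paper invokes as inequality (5.1) (the exact form and labelling used in the paper may differ), is:

    % Hoeffding's inequality (standard form; stated here for reference).
    % X_1, ..., X_n independent with a_i <= X_i <= b_i, and S = X_1 + ... + X_n.
    \[
      \Pr\{\, S - \mathrm{E}[S] \ge nt \,\}
      \;\le\;
      \exp\!\left( - \frac{2 n^{2} t^{2}}{\sum_{i=1}^{n} (b_i - a_i)^{2}} \right),
      \qquad t > 0 .
    \]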

8,655 citations

Book
01 Jan 1973
TL;DR: The first revision of this third volume is a survey of classical computer techniques for sorting and searching that extends the treatment of data structures to consider both large and small databases and internal and external memories.
Abstract: The first revision of this third volume is a survey of classical computer techniques for sorting and searching. It extends the treatment of data structures in Volume 1 to consider both large and small databases and internal and external memories.

1,716 citations

Frequently Asked Questions (10)
Q1. What are the contributions mentioned in the paper "Average case analysis of heap building by repeated insertion" ?

The heap is a much used and much studied data structure ( for example, see Knuth [ K ] ). 

By storing the probability arrays P(n, i, ·) only for nodes i on IP+(n), it is possible to compute E[Wn] in only O(n lg n) space but O(n³ lg n) time.

Observe that during the insertion of some key with R(n) = k, a swap takes place at node i of the insertion path if and only if k ≤ A_{n−1}[i].

When bubbling up the nodes in L_k, the expected number of swaps along the two links incident with the root is $\sum_{j=2^k}^{2^{k+1}-1} 1/j > \ln 2$.
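
A one-line justification of the stated lower bound, included here for completeness: each term 1/j exceeds the integral of 1/x over [j, j+1], so

    \[
      \sum_{j=2^{k}}^{2^{k+1}-1} \frac{1}{j}
      \;>\; \int_{2^{k}}^{2^{k+1}} \frac{dx}{x}
      \;=\; \ln 2 .
    \]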

The rank of a number x in a set of numbers is its placement in the ordered set; thus the smallest number has rank 1, the next smallest rank 2, etc. Let A_n[i] be the rank of A[i] among A[1, ..., n] after exactly n keys have been inserted.

Heap size 31
level   avg. exp. val.
0       0.031250000000000
1       0.092036756202444
2       0.205202826266408
3       0.401927398975604
4       0.703027874420290

Here and for the rest of this section, array indices are understood to be integers; any fraction x/y used as an array index is actually ⌊x/y⌋.

Hence for $j \ge k - k_2 + 1$,
\[
  \mathrm{Prob}\{\, 2^{-j} B_{j,k} > 2^{j-k+2} + 2^{-s+1} \,\}
  \;\le\; (2 \lg \lg k)\, \exp\!\left( -\frac{2^{k}}{k^{2} (\lg k)^{8}} \right).
\]
Now $\Sigma_2 \le \sum_{j=k-k_2+1}^{k-k_1} 2^{-j} B_{j,k}$, so
\[
  \mathrm{Prob}\{\, \Sigma_2 > 2^{-k_1+3} + (k_2 - k_1) 2^{-s+1} \,\}
  \;\le\; (k_2 - k_1)(2 \lg \lg k)\, \exp\!\left( -\frac{2^{k}}{k^{2} (\lg k)^{8}} \right).
\]

For each node t in level L_k (k > 0) let IP(t) = {⌊t/2^j⌋ : j = 1, ..., k} be the set of nodes on the insertion path from node t to the root.
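
The following tiny Python helper (an illustration, not code from the paper) makes the definition concrete: repeatedly halving the index of a node enumerates exactly the ancestors ⌊t/2^j⌋ on the path from node t up to the root.

    def insertion_path(t):
        """Return IP(t) = [floor(t / 2**j) for j = 1, ..., k] for a node t on
        level k of a heap stored with the root at index 1. Illustrative only."""
        path = []
        while t > 1:
            t //= 2            # the parent of node t is floor(t / 2)
            path.append(t)
        return path

    # Example: node 11 lies on level 3; its insertion path is [5, 2, 1].
    assert insertion_path(11) == [5, 2, 1]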

This method requires (2+o(1))n comparisons in the worst case to build a heap with n keys, which is less than Williams’ method takes on average.
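
For comparison, here is a minimal Python sketch of Floyd's bottom-up construction referred to in this answer (sift every internal node down, starting from the last one); the 1-indexed array and min-heap convention are assumptions made for illustration, not the paper's presentation.

    def floyd_build(keys):
        """Build a binary heap bottom-up (Floyd's method): sift down each
        internal node, from the last internal node back to the root.
        Illustrative sketch: 1-indexed array, min-heap convention."""
        heap = [None] + list(keys)     # node i has children 2i and 2i + 1
        n = len(heap) - 1
        for i in range(n // 2, 0, -1):
            j = i
            # Sift heap[j] down until neither child is smaller.
            while 2 * j <= n:
                c = 2 * j
                if c + 1 <= n and heap[c + 1] < heap[c]:
                    c += 1             # choose the smaller child
                if heap[j] <= heap[c]:
                    break
                heap[j], heap[c] = heap[c], heap[j]
                j = c
        return heap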