Author

# Haodong Hu

Other affiliations: Shanghai University of Finance and Economics, Microsoft

Bio: Haodong Hu is an academic researcher from Stony Brook University. The author has contributed to research in topic(s): Upper and lower bounds & Cache-oblivious algorithm. The author has an hindex of 8, co-authored 9 publication(s) receiving 232 citation(s). Previous affiliations of Haodong Hu include Shanghai University of Finance and Economics & Microsoft.

##### Papers

More filters

••

11 Jan 2004TL;DR: This work studies the problem of sorting integer sequences and permutations by length-weighted reversals, and gives polynomial-time algorithms to determine the optimal reversal sequence for a restricted but interesting class of sequences and cost functions.

Abstract: We study the problem of sorting integer sequences and permutations by length-weighted reversals. We consider a wide class of cost functions, namely f(l) = lα for all α ≥ 0, where l is the length of the reversed subsequence. We present tight or nearly tight upper and lower bounds on the worst-case cost of sorting by reversals. Then we develop algorithms to approximate the optimal cost to sort a given input. Furthermore, we give polynomial-time algorithms to determine the optimal reversal sequence for a restricted but interesting class of sequences and cost functions. Our results have direct application in computational biology to the field of comparative genomics.

43 citations

••

TL;DR: The first adaptive packed-memory array (APMA), which automatically adjusts to the input pattern, is given, which has four times fewer element moves per insertion than the traditional PMA and running times that are more than seven times faster.

Abstract: The packed-memory array (PMA) is a data structure that maintains a dynamic set of N elements in sorted order in a Θ(N)-sized array. The idea is to intersperse Θ(N) empty spaces or gaps among the elements so that only a small number of elements need to be shifted around on an insert or delete. Because the elements are stored physically in sorted order in memory or on disk, the PMA can be used to support extremely efficient range queries. Specifically, the cost to scan L consecutive elements is O(1 p LsB) memory transfers.This article gives the first adaptive packed-memory array (APMA), which automatically adjusts to the input pattern. Like the traditional PMA, any pattern of updates costs only O(log2N) amortized element moves and O(1 p (log2N)sB) amortized memory transfers per update. However, the APMA performs even better on many common input distributions achieving only O(log N) amortized element moves and O(1p (logN)sB) amortized memory transfers. The article analyzes sequential inserts, where the insertions are to the front of the APMA, hammer inserts, where the insertions “hammer” on one part of the APMA, random inserts, where the insertions are after random elements in the APMA, and bulk inserts, where for constant α e [0, 1], Nα elements are inserted after random elements in the APMA. The article then gives simulation results that are consistent with the asymptotic bounds. For sequential insertions of roughly 1.4 million elements, the APMA has four times fewer element moves per insertion than the traditional PMA and running times that are more than seven times faster.

40 citations

••

Stony Brook University

^{1}, Aarhus University^{2}, New York University^{3}, University of Waterloo^{4}TL;DR: It is shown that for a multilevel memory hierarchy, a simple cache-oblivious structure almost replicates the performance of an optimal parameterized k-level DAM structure, and it is demonstrated that as k grows, the search costs of the optimal k- level DAM search structure and the optimal cache-OBlivious search structure rapidly converge.

Abstract: Tight bounds on the cost of cache-oblivious searching are proved. It is shown that no cache-oblivious search structure can guarantee that a search performs fewer than lg e log/sub B/N block transfers between any two levels of the memory hierarchy. This lower bound holds even if all of the block sizes are limited to be powers of 2. A modified version of the van Emde Boas layout is proposed, whose expected block transfers between any two levels of the memory hierarchy arbitrarily close to [lg e + O(lg lg B/ lgB)] logB N + O(1). This factor approaches lg e /spl ap/ 1.443 as B increases. The expectation is taken over the random placement of the first element of the structure in memory. As searching in the disk access model (DAM) can be performed in log/sub B/N + 1 block transfers, this result shows a separation between the 2-level DAM and cache-oblivious memory-hierarchy models. By extending the DAM model to k levels, multilevel memory hierarchies can be modeled. It is shown that as k grows, the search costs of the optimal k-level DAM search structure and of the optimal cache-oblivious search structure rapidly converge. This demonstrates that for a multilevel memory hierarchy, a simple cache-oblivious structure almost replicates the performance of an optimal parameterized k-level DAM structure.

33 citations

••

26 Jun 2006TL;DR: The first adaptive packed-memory array (APMA), which automatically adjusts to the input pattern, is given, which has four times fewer element moves per insertion than the traditional PMA and running times that are more than seven times faster.

Abstract: The packed-memory array (PMA) is a data structure that maintains a dynamic set of N elements in sorted order in a Θ(N)-sized array. The idea is to intersperse Θ(N) empty spaces or gaps among the elements so that only a small number of elements need to be shifted around on an insert or delete. Because the elements are stored physically in sorted order in memory or on disk, the PMA can be used to support extremely efficient range queries. Specifically, the cost to scan L consecutive elements is O(1+L/B) memory transfers.This paper gives the first adaptive packed-memory array (APMA), which automatically adjusts to the input pattern. Like the original PMA, any pattern of updates costs only O(log2N) amortized element moves and O(1+(log2N)/B) amortized memory transfers per update. However, the APMA performs even better on many common input distributions achieving only O(logN) amortized element moves and O(1+(logN)/B) amortized memory transfers. The paper analyzes sequential inserts, where the insertions are to the front of the APMA, hammer inserts, where the insertions "hammer" on one part of the APMA, random inserts, where the insertions are after random elements in the APMA, and bulk inserts, where for constant α∈[0,1], Nα elements are inserted after random elements in the APMA. The paper then gives simulation results that are consistent with the asymptotic bounds. For sequential insertions of roughly 1.4 million elements, the APMA has four times fewer element moves per insertion than the traditional PMA and running times that are more than seven times faster.

32 citations

••

05 Jul 2004TL;DR: The main result in this paper is an optimal polynomial-time algorithm for sorting circular 0/1 sequences when the cost function is additive.

Abstract: We consider the problem of sorting linear and circular permutations and 0/1 sequences by reversals in a length-sensitive cost model. We extend the results on sorting by length-weighted reversals in two directions: we consider the signed case for linear sequences and also the signed and unsigned cases for circular sequences. We give lower and upper bounds as well as guaranteed approximation ratios for these three cases. The main result in this paper is an optimal polynomial-time algorithm for sorting circular 0/1 sequences when the cost function is additive.

31 citations

##### Cited by

More filters

•

TL;DR: BLOCKIN BLOCKINÒ BLOCKin× ½¸ÔÔº ¾ßß¿º ¿ ¾ ¾ Ã ¼ Ã Ã 0

Abstract: BLOCKIN BLOCKINÒ BLOCKIN× ½¸ÔÔº ¿ßß¿º ¿

372 citations

•

06 Apr 2010

TL;DR: In this article, a high-performance dictionary data structure is defined for storing data in a disk storage system, which supports full transactional semantics, concurrent access from multiple transactions, and logging and recovery.

Abstract: A method, apparatus and computer program product for storing data in a disk storage system is presented. A high-performance dictionary data structure is defined. The dictionary data structure is stored on a disk storage system. Key-value pairs can be inserted and deleted into the dictionary data structure. Updates run faster than one insertion per disk-head movement. The structure can also be stored on any system with two or more levels of memory. The dictionary is high performance and supports with full transactional semantics, concurrent access from multiple transactions, and logging and recovery. Keys can be looked up with only a logarithmic number of transfers, even for keys that have been recently inserted or deleted. Queries can be performed on ranges of key-value pairs, including recently inserted or deleted pairs, at a constant fraction of the bandwidth of the disk.

146 citations

•

Hewlett-Packard

^{1}TL;DR: This tutorial of B- tree techniques will stimulate research and development of modern B-tree indexing techniques for future data management systems.

Abstract: In summary, the core design of B-trees has remained unchanged in 40 years: balanced trees, pages or other units of I/O as nodes, efficient root-to-leaf search, splitting and merging nodes, etc. On the other hand, an enormous amount of research and development has improved every aspect of B-trees including data contents such as multi-dimensional data, access algorithms such as multi-dimensional queries, data organization within each node such as compression and cache optimization, concurrency control such as separation of latching and locking, recovery such as multi-level recovery, etc. Gray and Reuter believed in 1993 that “B-trees are by far the most important access path structure in database and file systems.” It seems that this statement remains true today. B-tree indexes are likely to gain new importance in relational databases due to the advent of flash storage. Fast access latencies permit many more random I/O operations than traditional disk storage, thus shifting the break-even point between a full-bandwidth scan and a B-tree index search, even if the scan has the benefit of columnar database storage. We hope that this tutorial of B-tree techniques will stimulate research and development of modern B-tree indexing techniques for future data management systems.

137 citations

••

08 Jul 2004TL;DR: An overview of the results achieved on cache-oblivious algorithms and data structures since the seminal paper by Frigo et al. in 1999 is given.

Abstract: Frigo, Leiserson, Prokop and Ramachandran in 1999 introduced the ideal-cache model as a formal model of computation for developing algorithms in environments with multiple levels of caching, and coined the terminology of cache-oblivious algorithms. Cache-oblivious algorithms are described as standard RAM algorithms with only one memory level, i.e. without any knowledge about memory hierarchies, but are analyzed in the two-level I/O model of Aggarwal and Vitter for an arbitrary memory and block size and an optimal off-line cache replacement strategy. The result are algorithms that automatically apply to multi-level memory hierarchies. This paper gives an overview of the results achieved on cache-oblivious algorithms and data structures since the seminal paper by Frigo et al.

109 citations

••

18 Jul 2005TL;DR: The cache-oblivious model is extended to a parallel or distributed setting and three concurrent CO B-trees are presented, showing that these data structures are linearizable, meaning that completed operations appear to an outside viewer as though they occurred in some serialized order.

Abstract: This paper presents concurrent cache-oblivious (CO) B-trees. We extend the cache-oblivious model to a parallel or distributed setting and present three concurrent CO B-trees. Our first data structure is a concurrent lock-based exponential CO B-tree. This data structure supports insertions and non-blocking searches/successor queries. The second and third data structures are lock-based and lock-free variations, respectively, on the packed-memory CO B-tree. These data structures support range queries and deletions in addition to the other operations. Each data structure achieves the same serial performance as the original data structure on which it is based. In a concurrent setting, we show that these data structures are linearizable, meaning that completed operations appear to an outside viewer as though they occurred in some serialized order. The lock-based data structures are also deadlock free, and the lock-free data structure guarantees forward progress by at least one process.

94 citations