Topic

Merge sort

About: Merge sort is a research topic. Over the lifetime, 607 publications have been published within this topic receiving 11296 citations. The topic is also known as: mergesort & MergeSort.

...read moreread less

Papers published on a yearly basis

Papers

PDF

Open Access

More filters

Journal Article•DOI•

The log-structured merge-tree (LSM-tree)

[...]

Patrick O'Neil¹, Edward C. Cheng, Dieter Gawlick², Elizabeth O'Neil¹•Institutions (2)

University of Massachusetts Boston¹, Oracle Corporation²

01 Jun 1996-Acta Informatica

TL;DR: The log-structured mergetree (LSM-tree) is a disk-based data structure designed to provide low-cost indexing for a file experiencing a high rate of record inserts (and deletes) over an extended period.

...read moreread less

Abstract: High-performance transaction system applications typically insert rows in a History table to provide an activity trace; at the same time the transaction system generates log records for purposes of system recovery. Both types of generated information can benefit from efficient indexing. An example in a well-known setting is the TPC-A benchmark application, modified to support efficient queries on the history for account activity for specific accounts. This requires an index by account-id on the fast-growing History table. Unfortunately, standard disk-based index structures such as the B-tree will effectively double the I/O cost of the transaction to maintain an index such as this in real time, increasing the total system cost up to fifty percent. Clearly a method for maintaining a real-time index at low cost is desirable. The log-structured mergetree (LSM-tree) is a disk-based data structure designed to provide low-cost indexing for a file experiencing a high rate of record inserts (and deletes) over an extended period. The LSM-tree uses an algorithm that defers and batches index changes, cascading the changes from a memory-based component through one or more disk components in an efficient manner reminiscent of merge sort. During this process all index values are continuously accessible to retrievals (aside from very short locking periods), either through the memory component or one of the disk components. The algorithm has greatly reduced disk arm movements compared to a traditional access methods such as B-trees, and will improve cost-performance in domains where disk arm costs for inserts with traditional access methods overwhelm storage media costs. The LSM-tree approach also generalizes to operations other than insert and delete. However, indexed finds requiring immediate response will lose I/O efficiency in some cases, so the LSM-tree is most useful in applications where index inserts are more common than finds that retrieve the entries. This seems to be a common property for history tables and log files, for example. The conclusions of Sect. 6 compare the hybrid use of memory and disk components in the LSM-tree access method with the commonly understood advantage of the hybrid method to buffer disk pages in memory.

...read moreread less

1,097 citations

Journal Article•DOI•

Parallel merge sort

[...]

Richard Cole¹•Institutions (1)

New York University¹

01 Aug 1988-SIAM Journal on Computing

TL;DR: A parallel implementation of merge sort on a CREW PRAM that uses n processors and O(logn) time; the constant in the running time is small.

...read moreread less

Abstract: We give a parallel implementation of merge sort on a CREW PRAM that uses n processors and $O(\log n)$ time; the constant in the running time is small. We also give a more complex version of the algorithm for the EREW PRAM; it also uses n processors and $O(\log n)$ time. The constant in the running time is still moderate, though not as small.

...read moreread less

847 citations

Proceedings Article•DOI•

Designing efficient sorting algorithms for manycore GPUs

[...]

Nadathur Satish¹, Mark J. Harris², Michael Garland²•Institutions (2)

University of California, Berkeley¹, Nvidia²

23 May 2009

TL;DR: The design of high-performance parallel radix sort and merge sort routines for manycore GPUs, taking advantage of the full programmability offered by CUDA, are described, which are the fastest GPU sort and the fastest comparison-based sort reported in the literature.

...read moreread less

Abstract: We describe the design of high-performance parallel radix sort and merge sort routines for manycore GPUs, taking advantage of the full programmability offered by CUDA. Our radix sort is the fastest GPU sort and our merge sort is the fastest comparison-based sort reported in the literature. Our radix sort is up to 4 times faster than the graphics-based GPUSort and greater than 2 times faster than other CUDA-based radix sorts. It is also 23% faster, on average, than even a very carefully optimized multicore CPU sorting routine. To achieve this performance, we carefully design our algorithms to expose substantial fine-grained parallelism and decompose the computation into independent tasks that perform minimal global communication. We exploit the high-speed onchip shared memory provided by NVIDIA's GPU architecture and efficient data-parallel primitives, particularly parallel scan. While targeted at GPUs, these algorithms should also be well-suited for other manycore processors.

...read moreread less

684 citations

Proceedings Article•DOI•

A comparison of sorting algorithms for the connection machine CM-2

[...]

Guy E. Blelloch¹, Charles E. Leiserson², Bruce M. Maggs³, C. Greg Plaxton⁴, Stephen J. Smith, Marco Zagha¹ - Show less +2 more•Institutions (4)

Carnegie Mellon University¹, Massachusetts Institute of Technology², Princeton University³, University of Texas at Austin⁴

01 Jun 1991

TL;DR: A fast sorting algorithm for the Connection Machine Supercomputer model CM-2 is developed and it is shown that any U(lg n)-depth family of sorting networks can be used to sort n numbers in U( lg n) time in the bounded-degree fixed interconnection network domain.

...read moreread less

Abstract: Sorting is arguably the most studied problem in computer science, both because it is used as a substep in many applications and because it is a simple, combinatorial problem with many interesting and diverse solutions. Sorting is also an important benchmark for parallel supercomputers. It requires significant communication bandwidth among processors, unlike many other supercomputer benchmarks, and the most efficient sorting algorithms communicate data in irregular patterns. Parallel algorithms for sorting have been studied since at least the 1960’s. An early advance in parallel sorting came in 1968 when Batcher discovered the elegant U(lg2 n)-depth bitonic sorting network [3]. For certain families of fixed interconnection networks, such as the hypercube and shuffle-exchange, Batcher’s bitonic sorting technique provides a parallel algorithm for sorting n numbers in U(lg2 n) time with n processors. The question of existence of a o(lg2 n)-depth sorting network remained open until 1983, when Ajtai, Komlos, and Szemeredi [1] provided an optimal U(lg n)-depth sorting network, but unfortunately, their construction leads to larger networks than those given by bitonic sort for all “practical” values of n. Leighton [15] has shown that any U(lg n)-depth family of sorting networks can be used to sort n numbers in U(lg n) time in the bounded-degree fixed interconnection network domain. Not surprisingly, the optimal U(lg n)-time fixed interconnection sorting networks implied by the AKS construction are also impractical. In 1983, Reif and Valiant proposed a more practical O(lg n)-time randomized algorithm for sorting [19], called flashsort. Many other parallel sorting algorithms have been proposed in the literature, including parallel versions of radix sort and quicksort [5], a variant of quicksort called hyperquicksort [23], smoothsort [18], column sort [15], Nassimi and Sahni’s sort [17], and parallel merge sort [6]. This paper reports the findings of a project undertaken at Thinking Machines Corporation to develop a fast sorting algorithm for the Connection Machine Supercomputer model CM-2. The primary goals of this project were:

...read moreread less

362 citations

Book•

Parallel merge sort

[...]

Richard Cole

09 Sep 2011

TL;DR: In this paper, a parallel implementation of merge sort on a CREW PRAM that uses n processors and O(logn) time is given, and the constant in the running time is small.

...read moreread less

Abstract: We give a parallel implementation of merge sort on a CREW PRAM that uses n processors and O(logn) time; the constant in the running time is small. We also give a more complex version of the algorithm for the EREW PRAM; it also uses n processors and O(logn) time. The constant in the running time is still moderate, though not as small.

...read moreread less

346 citations

Collapse

Network Information

Performance

Metrics

650

Papers

12,249

Citations

No. of papers in the topic in previous years
Year	Papers
2023	9
2022	35
2021	13
2020	33
2019	23
2018	31

Merge sort

Papers published on a yearly basis

Papers

Trending Questions (7)

Network Information

Related Topics (5)

Performance

Metrics