Showing papers on "Bitonic sorter" published in 1999

Dissertation

Efficient Algorithms for Sorting and Synchronization

01 Jan 1999

TL;DR: This thesis presents efficient algorithms for internal and external parallel sorting and remote data update and examines a number of related algorithms for text compression, differencing and incremental backup.

Abstract: This thesis presents efficient algorithms for internal and external parallel sorting and remote data update. The sorting algorithms approach the problem by concentrating first on highly efficient but incorrect algorithms followed by a cleanup phase that completes the sort. The remote data update algorithm, rsync, operates by exchanging block signature information followed by a simple hash search algorithm for block matching at arbitrary byte boundaries. The last chapter of the thesis examines a number of related algorithms for text compression, differencing and incremental backup.
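The block-matching step described in this abstract can be illustrated with a small sketch. It assumes a 16-bit rolling weak checksum for the search at arbitrary byte boundaries and uses SHA-256 as a stand-in strong hash; the function names, the block length, and the choice of SHA-256 are illustrative assumptions, not details taken from the thesis.

```python
import hashlib
from typing import Dict, List, Tuple

MOD = 1 << 16  # assumed modulus for the weak checksum


def weak_checksum(block: bytes) -> Tuple[int, int]:
    """Two-part weak checksum (a, b) of a block, computed from scratch."""
    n = len(block)
    a = sum(block) % MOD
    b = sum((n - i) * x for i, x in enumerate(block)) % MOD
    return a, b


def roll(a: int, b: int, out_byte: int, in_byte: int, n: int) -> Tuple[int, int]:
    """Slide the window one byte to the right in O(1)."""
    a = (a - out_byte + in_byte) % MOD
    b = (b - n * out_byte + a) % MOD
    return a, b


def find_matches(old: bytes, new: bytes, n: int) -> List[Tuple[int, int]]:
    """Return (offset in new, block index in old) for every matched block.

    Only whole blocks of `old` are indexed; a trailing partial block is ignored.
    """
    table: Dict[Tuple[int, int], List[Tuple[int, bytes]]] = {}
    for start in range(0, len(old) - n + 1, n):
        block = old[start:start + n]
        entry = (start // n, hashlib.sha256(block).digest())
        table.setdefault(weak_checksum(block), []).append(entry)

    matches: List[Tuple[int, int]] = []
    if len(new) < n:
        return matches
    pos = 0
    a, b = weak_checksum(new[:n])
    while True:
        hit = None
        for block_idx, strong in table.get((a, b), ()):
            # Confirm a weak-checksum hit with the strong hash.
            if hashlib.sha256(new[pos:pos + n]).digest() == strong:
                hit = block_idx
                break
        if hit is not None:
            matches.append((pos, hit))
            pos += n  # skip past the matched block
            if pos + n > len(new):
                break
            a, b = weak_checksum(new[pos:pos + n])
        else:
            if pos + n >= len(new):
                break
            a, b = roll(a, b, new[pos], new[pos + n], n)
            pos += 1
    return matches
```

With these definitions, `find_matches(b"abcdefghijkl", b"XYabcdZefghijkl", 4)` returns `[(2, 0), (7, 1), (11, 2)]`: all three blocks of the old data are located in the new data even though inserted bytes have shifted them off block boundaries.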

431 citations

Journal Article

Pipelining with Futures

Guy E. Blelloch¹, Margaret Reid-Miller•Institutions (1)

Carnegie Mellon University¹

01 Jun 1999-Theory of Computing Systems / Mathematical Systems Theory

TL;DR: This paper shows how futures (a parallel language construct) can be used to implement pipelining without requiring the user to code it explicitly, allowing for much simpler code and more asynchronous execution.

Abstract: Pipelining has been used in the design of many PRAM algorithms to reduce their asymptotic running time. Paul, Vishkin, and Wagener (PVW) used the approach in a parallel implementation of 2-3 trees. The approach was later used by Cole in the first O(lg n) time sorting algorithm on the PRAM not based on the AKS sorting network, and has since been used to improve the time of several other algorithms. Although the approach has improved the asymptotic time of many algorithms, there are two practical problems: maintaining the pipeline is quite complicated for the programmer, and the pipelining forces highly synchronous code execution. Synchronous execution is less practical on asynchronous machines and makes it difficult to modify a schedule to use less memory or to take better advantage of locality. In this paper we show how futures (a parallel language construct) can be used to implement pipelining without requiring the user to code it explicitly, allowing for much simpler code and more asynchronous execution. A runtime system manages the pipelining implicitly. As with user-managed pipelining, we show how the technique reduces the depth of many algorithms by a logarithmic factor over the nonpipelined version. We describe and analyze four algorithms for which this is the case: a parallel merging algorithm on trees, parallel algorithms for finding the union and difference of two randomized balanced trees (treaps), and insertion into a variant of the PVW 2-3 trees. For three of these, the pipeline delays are data dependent, making them particularly difficult to pipeline by hand. To determine the runtime of algorithms we first analyze the algorithms in a language-based cost model in terms of the work w and depth d of the computations, and then show universal bounds for implementing the language on various machine models.

46 citations

Journal Article

How to sort N items using a sorting network of fixed I/O size

Stephan Olariu¹, Maria Cristina Pinotti, Si-Qing Zheng²•Institutions (2)

Old Dominion University¹, University of Texas at Dallas²

01 May 1999-IEEE Transactions on Parallel and Distributed Systems

TL;DR: This work proposes a simple sorting architecture whose main feature is the pipelined use of a sorting network of fixed I/O size p to sort an arbitrarily large data set of N elements and shows that by using the design N elements can be sorted in Θ(N/p log N/p) time without memory access conflicts.

Abstract: Sorting networks of fixed I/O size p have been used, thus far, for sorting a set of p elements. Somewhat surprisingly, the important problem of using such a sorting network for sorting arbitrarily large datasets has not been addressed in the literature. Our main contribution is to propose a simple sorting architecture whose main feature is the pipelined use of a sorting network of fixed I/O size p to sort an arbitrarily large data set of N elements. A noteworthy feature of our design is that no extra data memory space is required, other than what is used for storing the input. As it turns out, our architecture is feasible for VLSI implementation and its time performance is virtually independent of the cost and depth of the underlying sorting network. Specifically, we show that by using our design N elements can be sorted in Θ(N/p log N/p) time without memory access conflicts. Finally, we show how to use an AT²-optimal sorting network of fixed I/O size p to construct a similar architecture that sorts N elements in Θ(N/p log N/p log p) time.

30 citations

Proceedings Article

A simple and efficient parallel disk mergesort

Rakesh D. Barve¹, Jeffrey Scott Vitter¹•Institutions (1)

Duke University¹

01 Jun 1999

TL;DR: The simple randomized merging (SRM) mergesort algorithm proposed by Barve et al. is the first parallel disk sorting algorithm that requires a provably optimal number of passes and that is fast in practice.

Abstract: External sorting—the process of sorting a file that is too large to fit into the computer's internal memory and must be stored externally on disks—is a fundamental subroutine in database systems [G], [IBM]. Of prime importance are techniques that use multiple disks in parallel in order to speed up the performance of external sorting. The simple randomized merging (SRM) mergesort algorithm proposed by Barve et al. [BGV] is the first parallel disk sorting algorithm that requires a provably optimal number of passes and that is fast in practice. Knuth [K, Section 5.4.9] recently identified SRM (which he calls "randomized striping") as the method of choice for sorting with parallel disks.
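For readers unfamiliar with external sorting, the general run-formation-then-merge pattern can be sketched as follows. This is a plain single-disk external mergesort, not the SRM algorithm: it does no striping, randomization, or parallel I/O. The function name `external_sort`, the `run_size` parameter, and the one-integer-per-line run format are assumptions made for illustration.

```python
import heapq
import tempfile
from typing import Iterable, Iterator


def external_sort(numbers: Iterable[int], run_size: int) -> Iterator[int]:
    """Sort integers while holding at most `run_size` of them in memory per run.

    Phase 1 writes sorted runs to temporary files; phase 2 streams a
    multi-way merge over those runs.
    """
    runs = []
    buffer = []

    def flush() -> None:
        # Sort the in-memory buffer and spill it to a temporary file.
        run = tempfile.TemporaryFile(mode="w+")
        run.writelines(f"{x}\n" for x in sorted(buffer))
        run.seek(0)
        runs.append(run)
        buffer.clear()

    for x in numbers:
        buffer.append(x)
        if len(buffer) == run_size:
            flush()
    if buffer:
        flush()

    try:
        # Each run is read back lazily, one line at a time.
        streams = [(int(line) for line in run) for run in runs]
        yield from heapq.merge(*streams)
    finally:
        for run in runs:
            run.close()
```

For example, `list(external_sort([5, 3, 9, 1, 4, 8, 2], run_size=3))` gives `[1, 2, 3, 4, 5, 8, 9]` after forming three runs and merging them in one pass.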

16 citations

Proceedings Article

Fault tolerance analysis of odd-even transposition sorting networks

Salam N. Salloum¹, A.L. Perrie•Institutions (1)

University of Wisconsin-Madison¹

22 Aug 1999

TL;DR: This paper investigates the fault-tolerance properties of a special class of sorting networks called the odd-even transposition sorting networks, which have a simple and reliable hardware structure that is easy to implement with VLSI technology.

Abstract: Sorting networks are important hardware and software models of parallel sorting operations. They have several applications such as ATM switching, distributed processing, and optical implementation of sorting. In this paper we investigate the fault-tolerance properties of a special class of sorting networks called the odd-even transposition sorting networks. These networks have a simple and reliable hardware structure, which is easy to implement with VLSI technology. A simulation program of these networks' operation has been developed in C++. The simulation results revealed two important properties of odd-even transposition sorting networks: any single stuck-at-X fault occurring in an internal comparator is redundant, and any two stuck-at-X faults occurring in a large number of internal comparators are redundant.
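As background, the fault-free behaviour of an odd-even transposition network can be sketched in a few lines. This is a sequential simulation of the network's compare-exchange phases, written in Python rather than the C++ used by the authors, and it does not model stuck-at faults; the function name is an assumption.

```python
from typing import List, Sequence


def odd_even_transposition_sort(values: Sequence[int]) -> List[int]:
    """Simulate an n-input odd-even transposition network.

    The network has n phases. Even-numbered phases compare pairs
    (0,1), (2,3), ...; odd-numbered phases compare (1,2), (3,4), ....
    All comparators within one phase touch disjoint pairs, so in
    hardware they operate in parallel.
    """
    a = list(values)
    n = len(a)
    for phase in range(n):
        start = phase % 2
        for i in range(start, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a
```

Each comparator only ever connects neighbouring lines, which is the regular structure the abstract refers to when it calls these networks easy to implement with VLSI technology.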

7 citations

Journal Article

An Efficient General In-Place Parallel Sorting Scheme

Si-Qing Zheng¹, Balaji Calidas², Yanjun Zhang³•Institutions (3)

University of Texas at Dallas¹, Louisiana State University², Southern Methodist University³

01 Jul 1999-The Journal of Supercomputing

TL;DR: It is shown that ZZ-sort can be used to convert a non-adaptive parallel sorting algorithm into an in-place and adaptive one by considering the problem of sorting an arbitrarily large input on fixed-size reconfigurable meshes.

Abstract: We present a simple and general parallel sorting scheme, ZZ-sort, which can be used to derive a class of efficient in-place sorting algorithms on realistic parallel machine models. We prove a tight bound for the worst case performance of ZZ-sort. We also demonstrate the average performance of ZZ-sort by experimental results obtained on a MasPar parallel computer. Our experiments indicate that ZZ-sort can be incorporated into a distributed memory parallel computer system as a standard routine, and this routine is useful for space critical situations. Finally, we show that ZZ-sort can be used to convert a non-adaptive parallel sorting algorithm into an in-place and adaptive one by considering the problem of sorting an arbitrarily large input on fixed-size reconfigurable meshes.

6 citations

Journal Article

Design of General-Purpose Bitonic Sorting Algorithms with a Fixed Number of Processors for Shared-Memory Parallel Computers

Jae-Dong Lee

01 Jan 1999-Journal of KIISE:Computer Systems and Theory

3 citations

Journal Article

K-bitonic sort

Qingshi Gao¹, Yue Hu¹, Zhiyong Liu²•Institutions (2)

University of Science and Technology Beijing¹, National Natural Science Foundation of China²

01 Apr 1999-Science China-technological Sciences

TL;DR: A k-bitonic sort, which generalizes the bitonic sort, is proposed. It extends the bitonic sort theorem, which merges two monotonic sequences into one ordered sequence, and reduces to Batcher's bitonic sort when k=1.

Abstract: A k-bitonic sort, which generalizes the bitonic sort, is proposed. The theorem of the bitonic sort, which merges two monotonic sequences into one ordered sequence, is extended into the theorem of the k-bitonic sort. The k-bitonic sort merges K (= 2k or 2k−1) monotonic sequences into one ordered sequence in \(\lceil \log_2 K \rceil \lceil \log_2 N \rceil - \tfrac{\lceil \log_2 K \rceil (\lceil \log_2 K \rceil - 1)}{2}\) steps, where \(k = \lceil K/2 \rceil\) is an integer and k ≥ 1. The k-bitonic sort is Batcher's bitonic sort when k=1.
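The k=1 base case and the step-count formula can be made concrete with a short sketch. The code below implements the classic Batcher bitonic sort for power-of-two input lengths and evaluates the formula from the abstract; it does not implement the k-bitonic generalization itself, and all function names are assumptions.

```python
from math import ceil, log2
from typing import List, Sequence


def k_bitonic_merge_steps(K: int, N: int) -> int:
    """Evaluate the step count stated in the abstract for merging K sequences."""
    lk = ceil(log2(K))
    ln = ceil(log2(N))
    return lk * ln - lk * (lk - 1) // 2


def bitonic_merge(seq: Sequence[int], ascending: bool = True) -> List[int]:
    """Turn a bitonic sequence (power-of-two length) into a monotonic one."""
    a = list(seq)
    n = len(a)
    if n <= 1:
        return a
    half = n // 2
    # One parallel compare-exchange step between the two halves.
    for i in range(half):
        if (a[i] > a[i + half]) == ascending:
            a[i], a[i + half] = a[i + half], a[i]
    return bitonic_merge(a[:half], ascending) + bitonic_merge(a[half:], ascending)


def bitonic_sort(seq: Sequence[int], ascending: bool = True) -> List[int]:
    """Batcher's bitonic sort; the input length must be a power of two."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    half = n // 2
    # Sort the halves in opposite directions to form a bitonic sequence.
    first = bitonic_sort(seq[:half], True)
    second = bitonic_sort(seq[half:], False)
    return bitonic_merge(first + second, ascending)
```

For K=2 the formula reduces to ⌈log₂ N⌉, so `k_bitonic_merge_steps(2, 8)` is 3, matching the three levels of compare-exchange that `bitonic_merge` performs on 8 elements.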

1 citation