scispace - formally typeset
Search or ask a question
Proceedings ArticleDOI

PDM sorting algorithms that take a small number of passes

04 Apr 2005-pp 10-10
TL;DR: This paper presents several novel algorithms for sorting on the PDM that take only a small number of passes through the data and presents a generalization of the zero-one principle for sorting and a shuffling lemma.
Abstract: We live in an era of data explosion that necessitates the discovery of novel out-of-core techniques. The I/O bottleneck has to be dealt with in developing out-of-core methods. The parallel disk model (PDM) has been proposed to alleviate the I/O bottleneck. Sorting is an important problem that has ubiquitous applications. Several asymptotically optimal PDM sorting algorithms are known and now the focus has shifted to developing algorithms for problem sizes of practical interest. In this paper we present several novel algorithms for sorting on the PDM that take only a small number of passes through the data. We also present a generalization of the zero-one principle for sorting. A shuffling lemma is presented as well. These lemmas should be of independent interest for average case analysis of sorting algorithms as well as for the analysis of randomized sorting algorithms.
Citations
More filters
Proceedings ArticleDOI
17 Jan 2010
TL;DR: This algorithm is a simple, randomized, data-oblivious version of the Shellsort algorithm that always runs in O(n log n) time and succeeds in sorting any given input permutation with very high probability.
Abstract: In this paper, we describe a randomized Shellsort algorithm. This algorithm is a simple, randomized, data-oblivious version of the Shellsort algorithm that always runs in O(n log n) time and succeeds in sorting any given input permutation with very high probability. Taken together, these properties imply applications in the design of new efficient privacy-preserving computations based on the secure multi-party computation (SMC) paradigm. In addition, by a trivial conversion of this Monte Carlo algorithm to its Las Vegas equivalent, one gets the first version of Shellsort with a running time that is provably O(n log n) with very high probability.

88 citations

Proceedings Article
01 Jan 1986
Abstract: In this paper, We consider the problem of sorting n2 numbers, initially distributed randomly in an n × n mesh-connected processor array with one element per processor. We show a lower bound, based on distance arguments, of 4n routing steps on mesh-connected processors operating in an SIMD mode with no wraparounds in rows or columns, We present an algorithm using a novel approach, which is optimal upto the conslant of the leading term, and hence, succeed in proving the tightness of the lower bound based on distance. Keeping in mind the practical difficulties in implementation of this algorithm, we also present an extremely practical O(n) algorithm amenable for VLSI implementation and for existing mesh- connected computers. All the results in this paper were derived by using a new method of analysis inspired by the discovery of shear-sort or row-column sort.

46 citations

Journal ArticleDOI
TL;DR: This algorithm is a simple, randomized, data-oblivious version of the Shellsort algorithm that always runs in O(n log n) time and succeeds in sorting any given input permutation with very high probability.
Abstract: In this article, we describe a randomized Shellsort algorithm. This algorithm is a simple, randomized, data-oblivious version of the Shellsort algorithm that always runs in O(n log n) time and succeeds in sorting any given input permutation with very high probability. Taken together, these properties imply applications in the design of new efficient privacy-preserving computations based on the secure multiparty computation (SMC) paradigm. In addition, by a trivial conversion of this Monte Carlo algorithm to its Las Vegas equivalent, one gets the first version of Shellsort with a running time that is provably O(n log n) with very high probability.

44 citations

Journal ArticleDOI
TL;DR: A randomized mergesort algorithm based on a simple idea that sorts using an asymptotically optimal number of I/O operations with high probability and has all of the desirable features for practical implementation is presented.
Abstract: The parallel disks model (PDM) has been proposed to alleviate the I/O bottleneck that arises in the processing of massive data sets. Sorting has been extensively studied on the PDM model due to the fundamental nature of the problem - several asymptotically optimal algorithms are known for sorting. Although randomization has been frequently exploited, most of the prior algorithms suffer from complications in memory layouts, implementation, restrictions in range of parameters, and laborious analysis. In this paper, we present a randomized mergesort algorithm based on a simple idea that sorts using an asymptotically optimal number of I/O operations with high probability and has all of the desirable features for practical implementation. In the second part of the paper, we also present several novel algorithms for sorting on the PDM that take only a small number of passes through the data. Recently, considerable interest has been shown by researchers in developing algorithms for problem sizes of practical interest and we are able to obtain several improvements and simplification, in particular for random input.

17 citations


Cites result from "PDM sorting algorithms that take a ..."

  • ...Preliminary versions of the results in this paper were presented in either [21] or [22]....

    [...]

Journal ArticleDOI
22 Jan 2011
TL;DR: In this article, the authors study sorting algorithms based on randomized round-robin comparisons, such as Spin-the-bottle and Annealing sort, where comparisons are restricted to a distance bounded by a temperature parameter.
Abstract: We study sorting algorithms based on randomized round-robin comparisons. Specifically, we study Spin-the-bottle sort, where comparisons are unrestricted, and Annealing sort, where comparisons are restricted to a distance bounded by a temperature parameter. Both algorithms are simple, randomized, data-oblivious sorting algorithms, which are useful in privacy-preserving computations, but, as we show, Annealing sort is much more efficient. We show that there is an input permutation that causes Spin-the-bottle sort to require Ω(n2 log n) expected time in order to succeed, and that in O(n2 log n) time this algorithm succeeds with high probability for any input. We also show there is an specification of Annealing sort that runs in O(n log n) time and succeeds with very high probability.

10 citations

References
More filters
Book
01 Jan 1968
TL;DR: The arrangement of this invention provides a strong vibration free hold-down mechanism while avoiding a large pressure drop to the flow of coolant fluid.
Abstract: A fuel pin hold-down and spacing apparatus for use in nuclear reactors is disclosed. Fuel pins forming a hexagonal array are spaced apart from each other and held-down at their lower end, securely attached at two places along their length to one of a plurality of vertically disposed parallel plates arranged in horizontally spaced rows. These plates are in turn spaced apart from each other and held together by a combination of spacing and fastening means. The arrangement of this invention provides a strong vibration free hold-down mechanism while avoiding a large pressure drop to the flow of coolant fluid. This apparatus is particularly useful in connection with liquid cooled reactors such as liquid metal cooled fast breeder reactors.

17,939 citations

01 Jan 2001
TL;DR: Here the authors haven’t even started the project yet, and already they’re forced to answer many questions: what will this thing be named, what directory will it be in, what type of module is it, how should it be compiled, and so on.
Abstract: Writers face the blank page, painters face the empty canvas, and programmers face the empty editor buffer. Perhaps it’s not literally empty—an IDE may want us to specify a few things first. Here we haven’t even started the project yet, and already we’re forced to answer many questions: what will this thing be named, what directory will it be in, what type of module is it, how should it be compiled, and so on.

6,547 citations

Book
01 Sep 1991
TL;DR: This chapter discusses sorting on a Linear Array with a Systolic and Semisystolic Model of Computation, which automates the very labor-intensive and therefore time-heavy and expensive process of manually sorting arrays.
Abstract: Preface Acknowledgments Notation 1 Arrays and Trees 1.1 Elementary Sorting and Counting 1.1.1 Sorting on a Linear Array Assessing the Performance of the Algorithm Sorting N Numbers with Fewer Than N Processors 1.1.2 Sorting in the Bit Model 1.1.3 Lower Bounds 1.1.4 A Counterexample-Counting 1.1.5 Properties of the Fixed-Connection Network Model 1.2 Integer Arithmetic 1.2.1 Carry-Lookahead Addition 1.2.2 Prefix Computations-Segmented Prefix Computations 1.2.3 Carry-Save Addition 1.2.4 Multiplication and Convolution 1.2.5 Division and Newton Iteration 1.3 Matrix Algorithms 1.3.1 Elementary Matrix Products 1.3.2 Algorithms for Triangular Matrices 1.3.3 Algorithms for Tridiagonal Matrices -Odd-Even Reduction -Parallel Prefix Algorithms 1.3.4 Gaussian Elimination 1.3.5 Iterative Methods -Jacobi Relaxation -Gauss-Seidel Relaxation Finite Difference Methods -Multigrid Methods 1.4 Retiming and Systolic Conversion 1.4.1 A Motivating Example-Palindrome Recognition 1.4.2 The Systolic and Semisystolic Model of Computation 1.4.3 Retiming Semisystolic Networks 1.4.4 Conversion of a Semisystolic Network into a Systolic Network 1.4.5 The Special Case of Broadcasting 1.4.6 Retiming the Host 1.4.7 Design by Systolic Conversion-A Summary 1.5 Graph Algorithms 1.5.1 Transitive Closure 1.5.2 Connected Components 1.5.3 Shortest Paths 1.5.4 Breadth-First Spanning Trees 1.5.5 Minimum Weight Spanning Trees 1.6 Sorting Revisited 1.6.1 Odd-Even Transposition Sort on a Linear Array 1.6.2 A Simple Root-N(log N + 1)-Step Sorting Algorithm 1.6.3 A (3 Root- N + o(Root-N))-Step Sorting Algorithm 1.6.4 A Matching Lower Bound 1.7 Packet Routing 1.7.1 Greedy Algorithms 1.7.2 Average-Case Analysis of Greedy Algorithms -Routing N Packets to Random Destinations -Analysis of Dynamic Routing Problems 1.7.3 Randomized Routing Algorithms 1.7.4 Deterministic Algorithms with Small Queues 1.7.5 An Off-line Algorithm 1.7.6 Other Routing Models and Algorithms 1.8 Image Analysis and Computational Geometry 1.8.1 Component-Labelling Algorithms -Levialdi's Algorithm -An O (Root-N)-Step Recursive Algorithm 1.8.2 Computing Hough Transforms 1.8.3 Nearest-Neighbor Algorithms 1.8.4 Finding Convex Hulls 1.9 Higher-Dimensional Arrays 1.9.1 Definitions and Properties 1.9.2 Matrix Multiplication 1.9.3 Sorting 1.9.4 Packet Routing 1.9.5 Simulating High-Dimensional Arrays on Low-Dimensional Arrays 1.10 problems 1.11 Bibliographic Notes 2 Meshes of Trees 2.1 The Two-Dimensional Mesh of Trees 2.1.1 Definition and Properties 2.1.2 Recursive Decomposition 2.1.3 Derivation from KN,N 2.1.4 Variations 2.1.5 Comparison With the Pyramid and Multigrid 2.2 Elementary O(log N)-Step Algorithms 2.2.1 Routing 2.2.2 Sorting 2.2.3 Matrix-Vector Multiplication 2.2.4 Jacobi Relaxation 2.2.5 Pivoting 2.2.6 Convolution 2.2.7 Convex Hull 2.3 Integer Arithmetic 2.3.1 Multiplication 2.3.2 Division and Chinese Remaindering 2.3.3 Related Problems -Iterated Products -Rooting Finding 2.4 Matrix Algorithms 2.4.1 The Three-Dimensional Mesh of Trees 2.4.2 Matrix Multiplication 2.4.3 Inverting Lower Triangular Matrices 2.4.4 Inverting Arbitrary Matrices -Csanky's Algorithm -Inversion by Newton Iteration 2.4.5 Related Problems 2.5 Graph Algorithms 2.5.1 Minimum-Weight Spanning Trees 2.5.2 Connected Components 2.5.3 Transitive Closure 2.5.4 Shortest Paths 2.5.5 Matching Problems 2.6 Fast Evaluation of Straight-Line Code 2.6.1 Addition and Multiplication Over a Semiring 2.6.2 Extension to Codes with Subtraction and Division 2.6.3 Applications 2.7 Higher-Dimensional meshes of Trees 2.7.1 Definitions and Properties 2.7.2 The Shuffle-Tree Graph 2.8 Problems 2.9 Bibliographic Notes 3 Hypercubes and Related Networks 3.1 The Hypercube 3.1.1 Definitions and Properties 3.1.2 Containment of Arrays -Higher-Dimensional Arrays -Non-Power-of-2 Arrays 3.1.3 Containment of Complete Binary Trees 3.1.4 Embeddings of Arbitrary Binary Trees -Embeddings with Dilation 1 and Load O(M over N + log N) -Embeddings with Dilation O(1) and Load O (M over N + 1) -A Review of One-Error-Correcting Codes -Embedding Plog N into Hlog N 3.1.5 Containment of Meshes of Trees 3.1.6 Other Containment Results 3.2 The Butterfly, Cube-Connected-Cycles , and Benes Network 3.2.1 Definitions and Properties 3.2.2 Simulation of Arbitrary Networks 3.2.3 Simulation of Normal Hypercube Algorithms 3.2.4 Some Containment and Simulation Results 3.3 The Shuffle-Exchange and de Bruijn Graphs 3.3.1 Definitions and Properties 3.3.2 The Diaconis Card Tricks 3.3.3 Simulation of Normal Hypercube Algorithms 3.3.4 Similarities with the Butterfly 3.3.5 Some Containment and Simulation Results 3.4 Packet-Routing Algorithms 3.4.1 Definitions and Routing Models 3.4.2 Greedy Routing Algorithms and Worst-Case Problems 3.4.3 Packing, Spreading, and Monotone Routing Problems -Reducing a Many-to-Many Routing Problem to a Many-to-One Routing Problem -Reducing a Routing Problem to a Sorting Problem 3.4.4 The Average-Case Behavior of the Greedy Algorithm -Bounds on Congestion -Bounds on Running Time -Analyzing Non-Predictive Contention-Resolution Protocols 3.4.5 Converting Worst-Case Routing Problems into Average-Case Routing Problems -Hashing -Randomized Routing 3.4.6 Bounding Queue Sizes -Routing on Arbitrary Levelled Networks 3.4.7 Routing with Combining 3.4.8 The Information Dispersal Approach to Routing -Using Information Dispersal to Attain Fault-Tolerance -Finite Fields and Coding Theory 3.4.9 Circuit-Switching Algorithms 3.5 Sorting 3.5.1 Odd-Even Merge Sort -Constructing a Sorting Circuit with Depth log N(log N +1)/2 3.5.2 Sorting Small Sets 3.5.3 A Deterministic O(log N log log N)-Step Sorting Algorithm 3.5.4 Randomized O(log N)-Step Sorting Algorithms -A Circuit with Depth 7.45 log N that Usually Sorts 3.6 Simulating a Parallel Random Access Machine 3.6.1 PRAM Models and Shared Memories 3.6.2 Randomized Simulations Based on Hashing 3.6.3 Deterministic Simulations using Replicated Data 3.6.4 Using Information Dispersal to Improve Performance 3.7 The Fast Fourier Transform 3.7.1 The Algorithm 3.7.2 Implementation on the Butterfly and Shuffle-Exchange Graph 3.7.3 Application to Convolution and Polynomial Arithmetic 3.7.4 Application to Integer Multiplication 3.8 Other Hypercubic Networks 3.8.1 Butterflylike Networks -The Omega Network -The Flip Network -The Baseline and Reverse Baseline Networks -Banyan and Delta Networks -k-ary Butterflies 3.8.2 De Bruijn-Type Networks -The k-ary de Bruijn Graph -The Generalized Shuffle-Exchange Graph 3.9 Problems 3.10 Bibliographic Notes Bibliography Index Lemmas, Theorems, and Corollaries Author Index Subject Index

2,895 citations

Proceedings ArticleDOI
30 Apr 1968
TL;DR: To achieve high throughput rates today's computers perform several operations simultaneously; not only are I/O operations performed concurrently with computing, but also, in multiprocessors, several computing operations are done concurrently.
Abstract: To achieve high throughput rates today's computers perform several operations simultaneously. Not only are I/O operations performed concurrently with computing, but also, in multiprocessors, several computing operations are done concurrently. A major problem in the design of such a computing system is the connecting together of the various parts of the system (the I/O devices, memories, processing units, etc.) in such a way that all the required data transfers can be accommodated. One common scheme is a high-speed bus which is time-shared by the various parts; speed of available hardware limits this scheme. Another scheme is a cross-bar switch or matrix; limiting factors here are the amount of hardware (an m × n matrix requires m × n cross-points) and the fan-in and fan-out of the hardware.

2,553 citations


"PDM sorting algorithms that take a ..." refers background or methods in this paper

  • ...Columnsort algorithm [15], odd-even merge sort [6], and the s(2)-way merge sort algorithms are all special cases of LMM sort [23]....

    [...]

  • ...The LMM sort of Rajasekaran [23] is optimal when N, B, and M are polynomially related and is a generalization of Batcher’s odd-even merge sort [6], Thompson and Kung’s s(2)-way merge sort [28], and Leighton’s columnsort [15]....

    [...]

Journal ArticleDOI
TL;DR: Tight upper and lower bounds are provided for the number of inputs and outputs (I/OS) between internal memory and secondary storage required for five sorting-related problems: sorting, the fast Fourier transform (FFT), permutation networks, permuting, and matrix transposition.
Abstract: We provide tight upper and lower bounds, up to a constant factor, for the number of inputs and outputs (I/OS) between internal memory and secondary storage required for five sorting-related problems: sorting, the fast Fourier transform (FFT), permutation networks, permuting, and matrix transposition. The bounds hold both in the worst case and in the average case, and in several situations the constant factors match. Secondary storage is modeled as a magnetic disk capable of transferring P blocks each containing B records in a single time unit; the records in each block must be input from or output to B contiguous locations on the disk. We give two optimal algorithms for the problems, which are variants of merge sorting and distribution sorting. In particular we show for P = 1 that the standard merge sorting algorithm is an optimal external sorting method, up to a constant factor in the number of I/Os. Our sorting algorithms use the same number of I/Os as does the permutation phase of key sorting, except when the internal memory size is extremely small, thus affirming the popular adage that key sorting is not faster. We also give a simpler and more direct derivation of Hong and Kung's lower bound for the FFT for the special case B = P = O(1).

1,344 citations


"PDM sorting algorithms that take a ..." refers background in this paper

  • ...The bundle-sorting algorithm of Matias, Segal and Vitter [20] can be applied to this scenario but it is efficient only for a single disk model....

    [...]

  • ...It has been shown by Aggar- wal and Vitter [1] that Ω ( N DB log(N/B) log(M/B) ) I/O operations are needed to sort N keys (residing in D disks) when the block size is B....

    [...]

  • ...wal and Vitter [1] that Ω ( N DB log(N/B)...

    [...]

  • ...Unlike the bundle-sorting algorithm of Matias, Segal and Vitter [20] that is efficient only for a single disk model, our algorithm achieves full parallelism under very mild assumptions....

    [...]

  • ...Several efficient out-of-core algorithms have been devised by Arge, Ferragina, Grossi, and Vitter [3] for sorting strings....

    [...]