Proceedings ArticleDOI

An O(n log n) sorting network

TL;DR: A sorting network of size O(n log n) and depth O(log n) is described; it is built around an ε-halving procedure and a derived ε-nearsort procedure.
Abstract: The purpose of this paper is to describe a sorting network of size O(n log n) and depth O(log n). A natural way of sorting is through consecutive halvings: determine the upper and lower halves of the set, proceed similarly within the halves, and so on. Unfortunately, while one can halve a set using only O(n) comparisons, this cannot be done in less than log n (parallel) time, and it is known that a halving network needs (½)n log n comparisons. It is possible, however, to construct a network of O(n) comparisons which halves in constant time with high accuracy. This procedure (ε-halving) and a derived procedure (ε-nearsort) are described below, and our sorting network will be centered around these elementary steps.
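The "consecutive halvings" strategy described in the abstract can be written down directly. The sketch below (the function name sort_by_halving is mine, not the paper's) uses an exact split as a stand-in for the ε-halver, so it illustrates the recursion pattern rather than the actual AKS network, in which a constant-depth expander-based ε-halver may misplace a small fraction of keys that later stages correct.

```python
# Illustrative sketch only, NOT the paper's construction: sorting by
# consecutive halvings, with an exact halver standing in for the
# constant-depth epsilon-halver used in the AKS network.

def sort_by_halving(keys):
    """Recursively split into lower and upper halves, then recurse."""
    if len(keys) <= 1:
        return list(keys)
    mid = len(keys) // 2
    ranked = sorted(keys)            # stand-in for an exact halver
    lower, upper = ranked[:mid], ranked[mid:]
    return sort_by_halving(lower) + sort_by_halving(upper)

if __name__ == "__main__":
    print(sort_by_halving([5, 3, 8, 1, 9, 2, 7, 4]))  # [1, 2, 3, 4, 5, 7, 8, 9]
```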
Citations
Journal ArticleDOI
TL;DR: This paper shows how to do an on-line simulation of an arbitrary RAM by a probabilistic oblivious RAM with a polylogarithmic slowdown in the running time, and shows that a logarithmic slowdown is a lower bound.
Abstract: Software protection is one of the most important issues concerning computer practice. There exist many heuristics and ad-hoc methods for protection, but the problem as a whole has not received the theoretical treatment it deserves. In this paper, we provide a theoretical treatment of software protection. We reduce the problem of software protection to the problem of efficient simulation on oblivious RAM. A machine is oblivious if the sequence in which it accesses memory locations is equivalent for any two inputs with the same running time. For example, an oblivious Turing Machine is one for which the movement of the heads on the tapes is identical for each computation. (Thus, the movement is independent of the actual input.) What is the slowdown in the running time of a machine, if it is required to be oblivious? In 1979, Pippenger and Fischer showed how a two-tape oblivious Turing Machine can simulate, on-line, a one-tape Turing Machine, with a logarithmic slowdown in the running time. We show an analogous result for the random-access machine (RAM) model of computation. In particular, we show how to do an on-line simulation of an arbitrary RAM by a probabilistic oblivious RAM with a polylogarithmic slowdown in the running time. On the other hand, we show that a logarithmic slowdown is a lower bound.
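For contrast with the polylogarithmic slowdown proved in the paper, the trivial way to make a RAM oblivious is to touch every memory cell on each logical access. The toy class below (a hypothetical LinearScanObliviousMemory, not taken from the paper) illustrates the definition at a linear cost per access.

```python
# Toy illustration of obliviousness, assuming a simple list-backed memory.
# Every logical access touches cells 0..m-1 in the same fixed order, so the
# physical access pattern is independent of the logical address -- at the
# cost of O(m) work per access instead of the paper's polylog overhead.

class LinearScanObliviousMemory:
    def __init__(self, m):
        self.cells = [0] * m
        self.trace = []                      # physical addresses touched

    def access(self, op, addr, value=None):
        result = None
        for i in range(len(self.cells)):     # fixed, input-independent scan
            self.trace.append(i)
            if i == addr:
                if op == "read":
                    result = self.cells[i]
                else:                        # "write"
                    self.cells[i] = value
        return result

if __name__ == "__main__":
    mem = LinearScanObliviousMemory(8)
    mem.access("write", 3, 42)
    print(mem.access("read", 3))   # 42
    print(mem.trace)               # same pattern regardless of the addresses used
```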

1,752 citations


Cites methods from "An O(n log n) sorting network"

  • ...Yet, each of these oblivious sorting procedures can be implemented while making O(t log² t) actual accesses (to fixed locations independent of the sorted values), where t is the number of words in the array. (We remark again that the above uses the Batcher Sorting Network [Batcher 1968], whereas, for an asymptotically superior result, one may use the AKS Sorting Network [Ajtai et al. 1983], yielding O(t log t) actual accesses.)...


  • ...The complexity estimate is obtained by using the AKS Sorting Network as a basis for the oblivious sorting....


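The excerpts above contrast Batcher's network, which uses O(n log² n) comparators, with the AKS network's O(n log n). As a concrete example of a sorting network whose compare-exchange sequence is fixed in advance (and is therefore usable for oblivious sorting), here is a short sketch of Batcher's odd-even merge sort for power-of-two input sizes; the function names are illustrative, not taken from either paper.

```python
# Sketch of Batcher's odd-even merge sort (assumes len(a) is a power of two).
# The sequence of compare-exchange positions is fixed in advance, which is
# exactly what makes a sorting network usable for oblivious sorting.

def compare_exchange(a, i, j):
    if a[i] > a[j]:
        a[i], a[j] = a[j], a[i]

def odd_even_merge(a, lo, n, step):
    m = step * 2
    if m < n:
        odd_even_merge(a, lo, n, m)          # even subsequence
        odd_even_merge(a, lo + step, n, m)   # odd subsequence
        for i in range(lo + step, lo + n - step, m):
            compare_exchange(a, i, i + step)
    else:
        compare_exchange(a, lo, lo + step)

def odd_even_merge_sort(a, lo=0, n=None):
    if n is None:
        n = len(a)
    if n > 1:
        m = n // 2
        odd_even_merge_sort(a, lo, m)
        odd_even_merge_sort(a, lo + m, m)
        odd_even_merge(a, lo, n, 1)

if __name__ == "__main__":
    data = [7, 3, 6, 1, 8, 2, 5, 4]
    odd_even_merge_sort(data)
    print(data)   # [1, 2, 3, 4, 5, 6, 7, 8]
```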

Book ChapterDOI
17 Aug 2003
TL;DR: This paper proposes several efficient techniques for building private circuits resisting side channel attacks, and provides a formal threat model and proofs of security for their constructions.
Abstract: Can you guarantee secrecy even if an adversary can eavesdrop on your brain? We consider the problem of protecting privacy in circuits, when faced with an adversary that can access a bounded number of wires in the circuit. This question is motivated by side channel attacks, which allow an adversary to gain partial access to the inner workings of hardware. Recent work has shown that side channel attacks pose a serious threat to cryptosystems implemented in embedded devices. In this paper, we develop theoretical foundations for security against side channels. In particular, we propose several efficient techniques for building private circuits resisting this type of attack. We initiate a systematic study of the complexity of such private circuits, and in contrast to most prior work in this area we provide a formal threat model and give proofs of security for our constructions.
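One core ingredient in this line of work is splitting each wire value into random shares so that any bounded set of probed wires is statistically independent of the secret. The sketch below shows plain XOR (additive) secret sharing of a single bit into t+1 shares; it is only the sharing step, not the paper's full gate-by-gate construction, and the function names are illustrative.

```python
# Sketch: XOR secret sharing of one bit into t+1 shares. Any t of the shares
# are uniformly random and reveal nothing; all t+1 together reconstruct the bit.
import secrets

def share_bit(bit, t):
    shares = [secrets.randbelow(2) for _ in range(t)]   # t uniformly random shares
    last = bit
    for s in shares:
        last ^= s                                        # force the XOR of all shares to equal `bit`
    return shares + [last]

def reconstruct(shares):
    out = 0
    for s in shares:
        out ^= s
    return out

if __name__ == "__main__":
    sh = share_bit(1, t=3)
    print(sh, reconstruct(sh))   # four shares, reconstructs to 1
```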

968 citations


Cites background from "An O(n log n) sorting network"

  • ...The celebrated AKS network [1] achieves the optimal parameters of O(ℓ log ℓ) size and O(log ℓ) depth....


Proceedings ArticleDOI
01 Jan 1987
TL;DR: This paper distills and formulates the key problem of learning about a program from its execution, and presents an efficient way of executing programs such that it is infeasible to learn anything about the program by monitoring its executions.
Abstract: Software protection is one of the most important issues concerning computer practice. There exist many heuristics and ad-hoc methods for protection, but the problem as a whole has not received the theoretical treatment it deserves. In this paper, we make the first steps towards a theoretic treatment of software protection: First, we distill and formulate the key problem of learning about a program from its execution. Second, assuming the existence of one-way permutations, we present an efficient way of executing programs such that it is infeasible to learn anything about the program by monitoring its executions. How can one efficiently execute programs without allowing an adversary, monitoring the execution, to learn anything about the program? Traditional cryptographic techniques can be applied to keep the contents of the memory unknown throughout the execution, but are not applicable to the problem of hiding the access pattern. The problem of hiding the access pattern efficiently corresponds to efficient simulation of Random Access Machines (RAM) on an oblivious RAM. We define an oblivious RAM to be a (probabilistic) RAM for which (the distribution of) the memory access pattern is independent of the input. We present an (on-line) simulation of t steps of an arbitrary RAM with m memory cells, by less than t·m^ε steps of an oblivious RAM with 2m memory cells, where ε>0 is an arbitrary constant.
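The obliviousness condition stated in the abstract, that the memory access pattern be independent of the input, can be made concrete on a toy example by recording the physical addresses touched on two different inputs. The sketch below (with hypothetical helper names) does exactly that for a direct lookup versus a scanning lookup; it illustrates the definition only, not the paper's simulation.

```python
# Sketch: checking the obliviousness condition on a toy example.
# A non-oblivious lookup touches an address that depends on the input,
# while the oblivious variant's trace is the same for every input.

def lookup_direct(table, idx, trace):
    trace.append(idx)                 # address depends on the input
    return table[idx]

def lookup_oblivious(table, idx, trace):
    result = None
    for i in range(len(table)):       # fixed scan, independent of idx
        trace.append(i)
        if i == idx:
            result = table[i]
    return result

def access_pattern(fn, idx):
    trace = []
    fn([10, 20, 30, 40], idx, trace)
    return trace

if __name__ == "__main__":
    print(access_pattern(lookup_direct, 1) == access_pattern(lookup_direct, 3))        # False
    print(access_pattern(lookup_oblivious, 1) == access_pattern(lookup_oblivious, 3))  # True
```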

574 citations

Book
01 Jan 1990
TL;DR: A model of parallelism that extends and formalizes the Data-Parallel model on which the Connection Machine and other supercomputers are based is described, and it is argued that data-parallel models are not only practical and can be applied to a surprisingly wide variety of problems, they are also well suited for very-high-level languages and lead to a concise and clear description of algorithms and their complexity.
Abstract: "Vector Models for Data-Parallel Computing "describes a model of parallelism that extends and formalizes the Data-Parallel model on which the Connection Machine and other supercomputers are based. It presents many algorithms based on the model, ranging from graph algorithms to numerical algorithms, and argues that data-parallel models are not only practical and can be applied to a surprisingly wide variety of problems, they are also well suited for very-high-level languages and lead to a concise and clear description of algorithms and their complexity. Many of the author's ideas have been incorporated into the instruction set and into algorithms currently running on the Connection Machine.The book includes the definition of a parallel vector machine; an extensive description of the uses of the scan (also called parallel-prefix) operations; the introduction of segmented vector operations; parallel data structures for trees, graphs, and grids; many parallel computational-geometry, graph, numerical and sorting algorithms; techniques for compiling nested parallelism; a compiler for Paralation Lisp; and details on the implementation of the scan operations.Guy E. Blelloch is an Assistant Professor of Computer Science and a Principal Investigator with the Super Compiler and Advanced Language project at Carnegie Mellon University.Contents: Introduction. Parallel Vector Models. The Scan Primitives. Computational-Geometry Algorithms. Graph Algorithms. Numerical Algorithms. Languages and Compilers. Correction-Oriented Languages. Flattening Nested Parallelism. A Compiler for Paralation Lisp. Paralation-Lisp Code. The Scan Vector Model. Data Structures. Implementing Parallel Vector Models. Implementing the Scan Operations. Conclusions. Glossary.

571 citations

Journal ArticleDOI
TL;DR: A study of the effects of adding two scan primitives as unit-time primitives to PRAM (parallel random access machine) models is presented, and it is shown that the primitives improve the asymptotic running time of many algorithms by an O(log n) factor and greatly simplify the description of many algorithms.
Abstract: A study of the effects of adding two scan primitives as unit-time primitives to PRAM (parallel random access machine) models is presented. It is shown that the primitives improve the asymptotic running time of many algorithms by an O(log n) factor, greatly simplifying the description of many algorithms, and are significantly easier to implement than memory references. It is argued that the algorithm designer should feel free to use these operations as if they were as cheap as a memory reference. The author describes five algorithms that clearly illustrate how the scan primitives can be used in algorithm design: a radix-sort algorithm, a quicksort algorithm, a minimum-spanning-tree algorithm, a line-drawing algorithm, and a merging algorithm. These all run on an EREW (exclusive read, exclusive write) PRAM with the addition of two scan primitives and are either simpler or more efficient than their pure PRAM counterparts. The scan primitives have been implemented in microcode on the Connection Machine system and are available in PARIS (the parallel instruction set of the machine).
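The radix-sort algorithm mentioned in the abstract rests on a split step that uses a prefix sum to route keys with bit 0 ahead of keys with bit 1. The sketch below is a sequential rendering of that general scan-based idea, not the paper's code; the names split and radix_sort are mine.

```python
# Sketch: scan-based split and an LSD radix sort built from it.
# split() stably moves keys with bit value 0 before keys with bit value 1,
# using a prefix sum to compute each element's destination index.

def split(keys, bits):
    n = len(keys)
    # exclusive prefix sum of the complemented flags = destinations of the 0-keys
    zeros_before = [0] * n
    for i in range(1, n):
        zeros_before[i] = zeros_before[i - 1] + (1 - bits[i - 1])
    num_zeros = zeros_before[-1] + (1 - bits[-1])
    out = [None] * n
    ones_seen = 0
    for i in range(n):
        if bits[i] == 0:
            out[zeros_before[i]] = keys[i]
        else:
            out[num_zeros + ones_seen] = keys[i]
            ones_seen += 1
    return out

def radix_sort(keys, num_bits=8):
    for b in range(num_bits):
        bits = [(k >> b) & 1 for k in keys]
        keys = split(keys, bits)
    return keys

if __name__ == "__main__":
    print(radix_sort([5, 3, 8, 1, 9, 2, 7, 4]))   # [1, 2, 3, 4, 5, 7, 8, 9]
```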

543 citations

References
Proceedings ArticleDOI
30 Apr 1968
TL;DR: To achieve high throughput rates, today's computers perform several operations simultaneously; not only are I/O operations performed concurrently with computing, but also, in multiprocessors, several computing operations are done concurrently.
Abstract: To achieve high throughput rates today's computers perform several operations simultaneously. Not only are I/O operations performed concurrently with computing, but also, in multiprocessors, several computing operations are done concurrently. A major problem in the design of such a computing system is the connecting together of the various parts of the system (the I/O devices, memories, processing units, etc.) in such a way that all the required data transfers can be accommodated. One common scheme is a high-speed bus which is time-shared by the various parts; speed of available hardware limits this scheme. Another scheme is a cross-bar switch or matrix; limiting factors here are the amount of hardware (an m × n matrix requires m × n cross-points) and the fan-in and fan-out of the hardware.

2,553 citations

Journal ArticleDOI
TL;DR: It is shown that for each graph with n vertices and maximum in-degree d, there is a pebbling strategy which requires at most c(d)·n/log n pebbles, and this bound is tight to within a constant factor.
Abstract: We study a one-person game played by placing pebbles, according to certain rules, on the vertices of a directed graph. In [3] it was shown that for each graph with n vertices and maximum in-degree d, there is a pebbling strategy which requires at most c(d)·n/log n pebbles. Here we show that this bound is tight to within a constant factor. We also analyze a variety of pebbling algorithms, including one which achieves the O(n/log n) bound.
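The pebble game itself is easy to state: a pebble may be placed on a vertex only when all of its predecessors carry pebbles, pebbles may be removed at any time, and the cost of a strategy is the maximum number of pebbles on the graph at once. The sketch below checks a given move sequence against these rules; it illustrates the game from the abstract, not the paper's c(d)·n/log n strategy, and its names are mine.

```python
# Sketch: the black pebble game. A move is ("place", v) or ("remove", v).
# "place" is legal only if every predecessor of v currently holds a pebble.
# The cost of a strategy is the peak number of pebbles in use at once.

def pebbling_cost(preds, moves):
    """preds: dict vertex -> list of predecessors; moves: list of moves."""
    pebbled, peak = set(), 0
    for op, v in moves:
        if op == "place":
            if not all(p in pebbled for p in preds.get(v, [])):
                raise ValueError(f"illegal placement on {v}")
            pebbled.add(v)
            peak = max(peak, len(pebbled))
        else:
            pebbled.discard(v)
    return peak

if __name__ == "__main__":
    # A tiny path a -> b -> c, pebbled with at most 2 pebbles at a time.
    preds = {"a": [], "b": ["a"], "c": ["b"]}
    moves = [("place", "a"), ("place", "b"), ("remove", "a"), ("place", "c")]
    print(pebbling_cost(preds, moves))   # 2
```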

176 citations

Proceedings ArticleDOI
29 Oct 1979
TL;DR: An explicit construction of an infinite family of N-superconcentrators of density 44 is presented; the most economical previously known explicit graphs of this type have density around 60.
Abstract: We present an explicit construction of an infinite family of N-superconcentrators of density 44. The most economical previously known explicit graphs of this type have density around 60.

93 citations

Proceedings ArticleDOI
05 May 1975
TL;DR: It is shown that the graph of any algorithm for any one of a number of arithmetic problems (e.g. polynomial multiplication, discrete Fourier transforms, matrix multiplication) must have properties closely related to concentration networks.
Abstract: The purpose of this paper is to explore the possibility that purely graph-theoretic reasons may account for the superlinear complexity of wide classes of computational problems. The results are therefore of two kinds: reductions to graph theoretic conjectures on the one hand, and graph theoretic results on the other. We show that the graph of any algorithm for any one of a number of arithmetic problems (e.g. polynomial multiplication, discrete Fourier transforms, matrix multiplication) must have properties closely related to concentration networks.

84 citations


"An 0(n log n) sorting network" refers background in this paper

  • ...For more on expanders and superconcentrators read Valiant (1975), Paul-Tarjan-Celoni (1976), Pippenger (1977), Angluin (1979)....


Journal ArticleDOI

18 citations


"An 0(n log n) sorting network" refers background in this paper

  • ...For more on expanders and superconcentrators read Valiant (1975), Paul-Tarjan-Celoni (1976), Pippenger (1977), Angluin (1979)....


  • ...For more on expanders and superconcentrators read Valiant (1975), Paul-Tarjan-Celoni (1976), Pippenger (1977), Angluin (1979). The known constructions lead to expanders with arbitrarily large λ by forming high powers of the graph....

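The last excerpt notes that the known constructions reach arbitrarily strong expansion by taking high powers of the graph. In the spectral formulation of expansion (an assumption here; the excerpt's parameter may be defined combinatorially), if the normalized second-largest eigenvalue in absolute value of a regular graph is λ < 1, then the k-th power of the graph has second eigenvalue λ^k, so the spectral gap grows with k. The sketch below checks this numerically on a small odd cycle and assumes numpy is available.

```python
# Sketch: powering a d-regular graph shrinks the normalized second eigenvalue
# geometrically (lambda -> lambda**k), i.e. the expansion improves.
import numpy as np

def normalized_second_eigenvalue(adjacency):
    degree = adjacency.sum(axis=1)[0]            # regular graph: constant row sum
    eigenvalues = np.linalg.eigvalsh(adjacency / degree)
    return np.sort(np.abs(eigenvalues))[::-1][1]

def cycle_adjacency(n):
    a = np.zeros((n, n))
    for i in range(n):
        a[i, (i + 1) % n] = a[(i + 1) % n, i] = 1
    return a

if __name__ == "__main__":
    A = cycle_adjacency(15)                      # odd cycle: 2-regular, non-bipartite
    lam = normalized_second_eigenvalue(A)
    A_cubed = np.linalg.matrix_power(A, 3)       # third power of the graph (a multigraph)
    print(lam, lam ** 3, normalized_second_eigenvalue(A_cubed))   # last two agree
```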