Showing papers by "Thomas H. Cormen" published in 2010


Proceedings Article
19 Apr 2010
TL;DR: Experimental results show that, by using multiple pipelines, an out-of-core, distribution-based sorting program outperforms an out-of-core sorting program based on columnsort, taking approximately 75%–85% of the time, despite the advantages that the columnsort-based program holds.
Abstract: We describe the implementation of an out-of-core, distribution-based sorting program on a cluster using FG, a multithreaded programming framework. FG mitigates latency from disk I/O and interprocessor communication by overlapping such high-latency operations with other operations. It does so by constructing and executing a coarse-grained software pipeline on each node of the cluster, where each stage of the pipeline runs in its own thread. The sorting program distributes data among the nodes to create sorted runs, and then it merges sorted runs on each node. When distributing data, the rates at which a node sends and receives data will differ. When merging sorted runs, each node will consume data from each of its sorted runs at varying rates. Under these conditions, a single pipeline running on each node is unwieldy to program and not necessarily efficient. We describe how we have extended FG to support multiple pipelines on each node in two forms. When a node might send and receive data at different rates during interprocessor communication, we use disjoint pipelines on each node: one pipeline to send and one pipeline to receive. When a node consumes and produces data from different streams on the node, we use multiple pipelines that intersect at a particular stage. Experimental results show that, by using multiple pipelines, an out-of-core, distribution-based sorting program outperforms an out-of-core sorting program based on columnsort, taking approximately 75%–85% of the time, despite the advantages that the columnsort-based program holds.
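
The pipeline idea in the abstract, where each stage runs in its own thread so that slow operations overlap with other work, can be illustrated with a minimal Python sketch. This is not FG's API: the names `run_stage`, `run_pipeline`, and `depth` are hypothetical, in-memory lists stand in for disk buffers, and `heapq.merge` stands in for the merge phase on a single node. It shows only a single pipeline, not the disjoint or intersecting pipelines that the paper adds.

```python
import heapq
import queue
import threading

_DONE = object()  # sentinel marking the end of the buffer stream (hypothetical)


def run_stage(func, inbox, outbox):
    """Apply func to each buffer taken from inbox and pass the result on."""
    while True:
        buf = inbox.get()
        if buf is _DONE:
            outbox.put(_DONE)  # forward shutdown to the next stage
            return
        outbox.put(func(buf))


def run_pipeline(buffers, stage_funcs, depth=2):
    """Push buffers through stage_funcs, one thread per stage.

    Bounded queues between stages let a slow stage throttle its
    predecessor while the other stages keep working on other buffers.
    """
    queues = [queue.Queue(maxsize=depth) for _ in range(len(stage_funcs) + 1)]
    threads = [
        threading.Thread(target=run_stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(stage_funcs)
    ]
    results = []

    def drain():
        # Collect finished buffers so the last queue never fills up.
        while True:
            item = queues[-1].get()
            if item is _DONE:
                return
            results.append(item)

    collector = threading.Thread(target=drain)
    for t in threads + [collector]:
        t.start()
    for buf in buffers:
        queues[0].put(buf)
    queues[0].put(_DONE)
    for t in threads + [collector]:
        t.join()
    return results


# Stage 1 copies a buffer (standing in for a read), stage 2 sorts it into a run.
runs = run_pipeline([[3, 1, 2], [9, 7, 8]], [list, sorted])
# The sorted runs are then merged, as in the second phase of the sort.
merged = list(heapq.merge(*runs))
```

Because every stage is a single thread reading from a FIFO queue, buffers leave the pipeline in the order they entered. A real out-of-core sort would replace the stage functions with disk reads, network sends and receives, and disk writes, which is where the differing rates described in the abstract arise.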

3 citations