Showing papers by "Robert E. Tarjan" published in 2004


Journal ArticleDOI
TL;DR: The clustering algorithms satisfy strong theoretical criteria and perform well in practice, and it is shown that the quality of the produced clusters is bounded by strong minimum cut and expansion criteria.
Abstract: In this paper, we introduce simple graph clustering methods based on minimum cuts within the graph. The clustering methods are general enough to apply to any kind of graph but are well suited for graphs where the link structure implies a notion of reference, similarity, or endorsement, such as web and citation graphs. We show that the quality of the produced clusters is bounded by strong minimum cut and expansion criteria. We also develop a framework for hierarchical clustering and present applications to real-world data. We conclude that the clustering algorithms satisfy strong theoretical criteria and perform well in practice.

380 citations


Book ChapterDOI
14 Sep 2004
TL;DR: An experimental study comparing the iterative algorithm of Cooper et al. (and some variants) with careful implementations of both versions of the Lengauer-Tarjan algorithm and with a new hybrid algorithm suggests that, although the performance of all the algorithms is similar, the most consistently fast are the simple Lengauer-Tarjan algorithm and the hybrid algorithm, and their advantage increases as the graph gets bigger or more complicated.
Abstract: The computation of dominators in a flowgraph has applications in program optimization, circuit testing, and other areas. Lengauer and Tarjan [17] proposed two versions of a fast algorithm for finding dominators and compared them experimentally with an iterative bit vector algorithm. They concluded that both versions of their algorithm were much faster than the bit-vector algorithm even on graphs of moderate size. Recently Cooper et al. [9] have proposed a new, simple, tree-based iterative algorithm. Their experiments suggested that it was faster than the simple version of the Lengauer-Tarjan algorithm on graphs representing computer program control flow. Motivated by the work of Cooper et al., we present an experimental study comparing their algorithm (and some variants) with careful implementations of both versions of the Lengauer-Tarjan algorithm and with a new hybrid algorithm. Our results suggest that, although the performance of all the algorithms is similar, the most consistently fast are the simple Lengauer-Tarjan algorithm and the hybrid algorithm, and their advantage increases as the graph gets bigger or more complicated.

54 citations
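
For readers unfamiliar with the problem studied in the entry above, the following is a minimal sketch of a tree-based iterative dominators computation in the style attributed to Cooper et al. It is not code from the paper. The function name `compute_idoms`, the adjacency-dict input format, and the example graph are assumptions made for illustration.

```python
def compute_idoms(succ, root):
    """Return {vertex: immediate dominator} for vertices reachable from root.

    succ maps each vertex to an iterable of its successors (hypothetical
    input format). By convention in this sketch, the root maps to itself.
    """
    # Iterative depth-first search that records vertices in postorder.
    order = []
    visited = {root}
    stack = [(root, iter(succ.get(root, ())))]
    while stack:
        node, it = stack[-1]
        for nxt in it:
            if nxt not in visited:
                visited.add(nxt)
                stack.append((nxt, iter(succ.get(nxt, ()))))
                break
        else:
            stack.pop()
            order.append(node)
    post = {v: i for i, v in enumerate(order)}

    # Predecessor lists restricted to reachable vertices.
    preds = {v: [] for v in order}
    for u in order:
        for v in succ.get(u, ()):
            preds[v].append(u)

    idom = {root: root}

    def intersect(a, b):
        # Walk both vertices up the current dominator tree until they meet.
        while a != b:
            while post[a] < post[b]:
                a = idom[a]
            while post[b] < post[a]:
                b = idom[b]
        return a

    # Repeat passes in reverse postorder until no estimate changes.
    changed = True
    while changed:
        changed = False
        for v in reversed(order):
            if v == root:
                continue
            new = None
            for p in preds[v]:
                if p in idom:  # skip predecessors not yet processed
                    new = p if new is None else intersect(p, new)
            if idom.get(v) != new:
                idom[v] = new
                changed = True
    return idom


if __name__ == "__main__":
    # Diamond-shaped flowgraph: both paths from r to c bypass a or b.
    graph = {"r": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["d"], "d": []}
    print(compute_idoms(graph, "r"))
```

In the diamond example, neither `a` nor `b` lies on every path from `r` to `c`, so the immediate dominator of `c` is `r`, while `d` is immediately dominated by `c`.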


Proceedings Article
15 Apr 2004
TL;DR: A complete, correct, simpler linear-time dominators algorithm, implementable on either a random-access machine or a pointer machine, and a linear-time reduction of the dominators problem to a nearest common ancestors problem.
Abstract: The problem of finding dominators in a flowgraph arises in many kinds of global code optimization and other settings. In 1979 Lengauer and Tarjan gave an almost-linear-time algorithm to find dominators. In 1985 Harel claimed a linear-time algorithm, but this algorithm was incomplete; Alstrup et al. [1999] gave a complete and “simpler” linear-time algorithm on a random-access machine. In 1998, Buchsbaum et al. claimed a “new, simpler” linear-time algorithm with implementations both on a random access machine and on a pointer machine. In this paper, we begin by noting that the key lemma of Buchsbaum et al. does not in fact apply to their algorithm, and their algorithm does not run in linear time. Then we provide a complete, correct, simpler linear-time dominators algorithm. One key result is a linear-time reduction of the dominators problem to a nearest common ancestors problem, implementable on either a random-access machine or a pointer machine.

50 citations


Proceedings ArticleDOI
11 Jan 2004
TL;DR: A complete, correct, simpler linear-time dominators algorithm, implementable on either a random-access machine or a pointer machine; one key result is a linear-time reduction of the dominators problem to a nearest common ancestors problem.
Abstract: The problem of finding dominators in a flowgraph arises in many kinds of global code optimization and other settings. In 1979 Lengauer and Tarjan gave an almost-linear-time algorithm to find dominators. In 1985 Harel claimed a linear-time algorithm, but this algorithm was incomplete; Alstrup et al. [1999] gave a complete and "simpler" linear-time algorithm on a random-access machine. In 1998, Buchsbaum et al. claimed a "new, simpler" linear-time algorithm with implementations both on a random access machine and on a pointer machine. In this paper, we begin by noting that the key lemma of Buchsbaum et al. does not in fact apply to their algorithm, and their algorithm does not run in linear time. Then we provide a complete, correct, simpler linear-time dominators algorithm. One key result is a linear-time reduction of the dominators problem to a nearest common ancestors problem, implementable on either a random-access machine or a pointer machine.

48 citations


Book ChapterDOI
08 Jul 2004
TL;DR: It is shown that any priority queue data structure that supports insert, delete, and find-min operations in pq(n) time, when there are up to n elements in the priority queue, can be converted into a priority queue data structure that also supports meld operations at essentially no extra cost, at least in the amortized sense.
Abstract: We show that any priority queue data structure that supports insert, delete, and find-min operations in pq(n) time, when there are up to n elements in the priority queue, can be converted into a priority queue data structure that also supports meld operations at essentially no extra cost, at least in the amortized sense. More specifically, the new data structure supports insert, meld and find-min operations in O(1) amortized time, and delete operations in O(pq(n) α(n,n/pq(n))) amortized time, where α(m,n) is a functional inverse of the Ackermann function. For all conceivable values of pq(n), the term α(n,n/pq(n)) is constant. This holds, for example, if pq(n)=Ω(log* n). In such cases, adding the meld operation does not increase the amortized asymptotic cost of the priority queue operations. The result is obtained by an improved analysis of a construction suggested recently by three of the authors in [14]. The construction places a non-meldable priority queue at each node of a union-find data structure. We also show that when all keys are integers in [1,N], we can replace n in all the bounds stated above by N.

7 citations


01 Dec 2004
TL;DR: Any nonmeldable priority queue can be converted into a meldable one with essentially no increase in amortized cost; as a by-product, improved algorithms for the minimum directed spanning tree problem on graphs with integer edge weights are obtained, namely, a deterministic O(m log log n)-time algorithm and a randomized O(m√log log n)-time algorithm.
Abstract: We show that any priority queue data structure that supports insert, delete, and find-min operations in pq(n) amortized time, where n is an upper bound on the number of elements in the priority queue, can be converted into a priority queue data structure that also supports fast meld operations with essentially no increase in the amortized cost of the other operations. More specifically, the new data structure supports insert, meld and find-min operations in O(1) amortized time, and delete operations in O(pq(n) + α(n)) amortized time, where α(n) is a functional inverse of the Ackermann function, and where n this time is the total number of operations performed on all the priority queues. The construction is very simple. The meldable priority queues are obtained by placing a nonmeldable priority queue at each node of a union-find data structure. We also show that when all keys are integers in the range [1, N], we can replace n in the bound stated previously by min{n, N}. Applying this result to the nonmeldable priority queue data structures obtained recently by Thorup [2002b] and by Han and Thorup [2002], we obtain meldable RAM priority queues with O(log log n) amortized time per operation, or O(√log log n) expected amortized time per operation, respectively. As a by-product, we obtain improved algorithms for the minimum directed spanning tree problem on graphs with integer edge weights, namely, a deterministic O(m log log n)-time algorithm and a randomized O(m√log log n)-time algorithm. For sparse enough graphs, these bounds improve on the O(m + n log n) running time of an algorithm by Gabow et al. [1986] that works for arbitrary edge weights.

5 citations
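
The "sparse enough" condition in the entry above can be made concrete with a short worked comparison. This is an illustrative calculation, not taken from the paper; the assumption m = O(n) in the first line is chosen only as an example of a sparse graph.

```latex
% Assumption for illustration: a sparse graph with m = O(n) edges.
\[
  m \log\log n = O(n \log\log n) = o(n \log n),
  \qquad\text{whereas}\qquad
  m + n \log n = \Theta(n \log n).
\]
% More generally, the deterministic bound is asymptotically smaller whenever
\[
  m \log\log n = o(n \log n)
  \iff
  m = o\!\left(\frac{n \log n}{\log\log n}\right),
\]
% and the randomized bound whenever
\[
  m \sqrt{\log\log n} = o(n \log n)
  \iff
  m = o\!\left(\frac{n \log n}{\sqrt{\log\log n}}\right).
\]
```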


01 Dec 2004
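TL;DR: An experimental study comparing the iterative dominators algorithm of Cooper et al. (and some variants) with both versions of the Lengauer-Tarjan algorithm and a new hybrid algorithm suggests that, although all the algorithms perform similarly, the simple Lengauer-Tarjan algorithm and the hybrid algorithm are the most consistently fast, and their advantage increases as the graph gets bigger or more complicated.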
Abstract: The computation of dominators in a flowgraph has applications in program optimization, circuit testing, and other areas. Lengauer and Tarjan [17] proposed two versions of a fast algorithm for finding dominators and compared them experimentally with an iterative bit vector algorithm. They concluded that both versions of their algorithm were much faster than the bit-vector algorithm even on graphs of moderate size. Recently Cooper et al. [9] have proposed a new, simple, tree-based iterative algorithm. Their experiments suggested that it was faster than the simple version of the Lengauer-Tarjan algorithm on graphs representing computer program control flow. Motivated by the work of Cooper et al., we present an experimental study comparing their algorithm (and some variants) with careful implementations of both versions of the Lengauer-Tarjan algorithm and with a new hybrid algorithm. Our results suggest that, although the performance of all the algorithms is similar, the most consistently fast are the simple Lengauer-Tarjan algorithm and the hybrid algorithm, and their advantage increases as the graph gets bigger or more complicated.

4 citations