An Optimal Minimum Spanning Tree Algorithm
SETH PETTIE AND VIJAYA RAMACHANDRAN
The University of Texas at Austin, Austin, Texas
Abstract. We establish that the algorithmic complexity of the minimum spanning tree problem is equal
to its decision-tree complexity. Specifically, we present a deterministic algorithm to find a minimum
spanning tree of a graph with n vertices and m edges that runs in time O(T*(m, n)) where T* is the
minimum number of edge-weight comparisons needed to determine the solution. The algorithm is
quite simple and can be implemented on a pointer machine.
Although our time bound is optimal, the exact function describing it is not known at present. The
current best bounds known for T* are T*(m, n) = Ω(m) and T*(m, n) = O(m · α(m, n)), where α is
a certain natural inverse of Ackermann's function.
Even under the assumption that T* is superlinear, we show that if the input graph is selected from
G_{n,m}, our algorithm runs in linear time with high probability, regardless of n, m, or the permutation
of edge weights. The analysis uses a new martingale for G_{n,m} similar to the edge-exposure martingale
for G_{n,p}.
Categories and Subject Descriptors: F.2.0 [Analysis of Algorithms and Problem Complexity]:
General; G.2.2 [Discrete Mathematics]: Graph Theory—graph algorithms; G.3 [Probability and
Statistics]
General Terms: Algorithms, Theory
Additional Key Words and Phrases: Graph algorithms, minimum spanning tree, optimal complexity
1. Introduction
The minimum spanning tree (MST) problem has been studied for much of this
century and yet despite its apparent simplicity, the problem is still not fully under-
stood. Graham and Hell [1985] give an excellent survey of results from the earliest
known algorithm of Borůvka [1926] to the invention of Fibonacci heaps, which
were central to the algorithms in Fredman and Tarjan [1987] and Gabow et al.
[1986]. Chazelle [1997] presented an MST algorithm based on the Soft Heap
[Chazelle 2000a] having complexity O(mα(m, n)log α(m, n)), where α is a cer-
tain inverse of Ackermann’s function. Recently Chazelle [2000b] modified the
A preliminary version of this article appeared in Proceedings of the 27th International Colloquium on Automata,
Languages and Programming (ICALP) (Geneva, Switzerland). Springer-Verlag, New York, 2000.
Part of this work was supported by Texas Advanced Research Program Grant 003658-0029-1999.
S. Pettie was also supported by an MCD Graduate Fellowship.
Authors’ address: The University of Texas at Austin, Department of Computer Science, Taylor Hall 2.124
(Mailcode 0500), Austin, TX 78712, e-mail: {seth;vlr}@cs.utexas.edu.

algorithm in Chazelle [1997] to bring down the running time to O(m · α(m, n)).
Later a similar algorithm of the same running time was presented by Pettie [1999],
which gives an alternate exposition of the O(m ·α(m, n)) result. This is the tightest
time bound for the MST problem to date, though not known to be optimal.
All algorithms mentioned above work on a pointer machine [Tarjan 1979a] un-
der the restriction that edge weights may only be subjected to binary comparisons.
If, in addition, we have access to a stream of perfectly random bits, Karger et al.
[1995] showed that the MST can be computed in linear time with high probabil-
ity. Fredman and Willard [1994] gave a deterministic linear time MST algorithm
under the unit-cost RAM model, assuming edge weights are integers represented
in binary.
It is still unknown whether these more powerful models are necessary to compute
the MST in linear time. However, in this article, we give a deterministic, comparison-
based MST algorithm that runs on a pointer machine in O(T*(m, n)) time, where
T*(m, n) is the number of edge-weight comparisons needed to determine the MST
on any graph with m edges and n vertices. Additionally, we show that our algorithm
runs in linear time for the vast majority of graphs, regardless of the number of edges
in the graph or the permutation of edge weights.
Because of the nature of our algorithm, its exact running time is not known.
This might seem paradoxical at first. The source of our algorithm’s optimality,
and its mysterious running time, is the use of precomputed “MST decision trees”
whose exact depth is unknown but nonetheless provably optimal. The technique
of obtaining optimal algorithms via precomputation was used in a simpler setting
in Larmore [1990] for searching convex matrices and in Dixon et al. [1992] for
MST sensitivity analysis. We should point out that precomputing optimal deci-
sion trees does not increase the constant factor hidden by big-Oh notation, nor
does it result in a nonuniform algorithm. A trivial lower bound on the running
time of our algorithm is Ω(m); the best upper bound, O(m α(m, n)), is due to
Chazelle [2000b].
Our optimal MST algorithm should be contrasted with the complexity-theoretic
result that any optimal verification algorithm for some problem can be used to
construct an optimal algorithm for the same problem [Jones 1997]. Though asymp-
totically optimal, this construction hides astronomical constant factors and proves
nothing about the relationship between algorithmic complexity and decision-tree
complexity. See Section 8 for a discussion of these and other related issues.
In the next sections, we review some well-known MST results that are used by our
algorithm. In Section 3, we prove a key lemma and give a procedure for partitioning
the graph in an MST-respecting manner. Section 4 gives an overview of our optimal
algorithm and discusses the structure and use of precomputed decision-trees for the
MST problem. Section 5 gives the algorithm and a proof of optimality. Section 6
shows how the algorithm may be modified to run on a pointer machine. In Section 7,
we show that our algorithm runs in linear time with high probability if the input graph
is selected at random. Sections 8 and 9 discuss related problems and algorithms,
open questions, and the actual complexity of MST.
2. Preliminaries
The input is an undirected graph G = (V, E) where each edge is assigned a distinct
real-valued weight. By convention, |V| = n and |E| = m. The minimum spanning
forest (MSF) problem asks for a spanning acyclic subgraph of G having the least
total weight. In this article, we assume for convenience that the input graph is
connected, since otherwise we can find its connected components in linear time
and then solve the problem on each connected component. Thus, the MSF problem
is identical to the minimum spanning tree problem.
It is well known that one can identify edges provably in the MSF using the cut
property, and edges provably not in the MSF using the cycle property. The cut
property states that the lightest edge crossing any partition of the vertex set into
two parts must belong to the MSF. The cycle property states that the heaviest edge
in any cycle in the graph cannot be in the MSF.
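
To make the two properties concrete, here is a small illustrative sketch; it is not taken from the paper, and the helper names and toy graph are ours. Each edge is a (weight, u, v) triple with distinct weights.

```python
# Illustrative helpers for the cut and cycle properties (not from the paper).

def lightest_crossing_edge(edges, S):
    """Cut property: the lightest edge with exactly one endpoint in the
    vertex set S must belong to the MSF."""
    crossing = [e for e in edges if (e[1] in S) != (e[2] in S)]
    return min(crossing)            # weights are distinct, so the minimum is unique

def heaviest_cycle_edge(cycle_edges):
    """Cycle property: the heaviest edge on any cycle cannot be in the MSF."""
    return max(cycle_edges)

# Toy example: a 4-cycle 0-1-2-3-0 plus the chord (0, 2).
edges = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 3, 0), (5, 0, 2)]
print(lightest_crossing_edge(edges, {0}))                       # (1, 0, 1) is an MSF edge
print(heaviest_cycle_edge([(1, 0, 1), (2, 1, 2), (5, 0, 2)]))   # (5, 0, 2) is not
```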
2.1. BORŮVKA STEPS. The earliest known MSF algorithm is due to Borůvka
[1926]. The algorithm is quite simple: It proceeds in a sequence of stages, and in each
stage, or Borůvka step, it identifies a forest F consisting of the minimum-weight
edge incident to each vertex in the graph G, then forms the graph G_1 = G\F as the
input to the next stage. Here G\F denotes the graph derived from G by contracting
edges in F (by the cut property, these edges belong to the MSF). Each Borůvka step
takes linear time, and since the number of vertices is reduced by at least half in
each step, Borůvka's algorithm takes O(m log n) time.
Our optimal algorithm uses a procedure called Boruvka2(G; F, G′). This proce-
dure executes two Borůvka steps on the input graph G and returns the contracted
graph G′ as well as the set of edges F identified as part of the MSF during these
two steps.
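
The following is a hedged sketch of how a Borůvka step and a Boruvka2-style wrapper could look; it is our illustration, not the paper's implementation. Contraction is simulated with a union-find structure instead of rebuilding the graph, edge weights are assumed distinct, and edges are (weight, u, v) triples over vertices 0..n-1.

```python
# Hedged sketch of Boruvka steps (not the paper's code). Contraction is
# simulated with union-find; with distinct weights every edge selected
# below is an MSF edge by the cut property.

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]   # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        self.parent[rx] = ry
        return True

def boruvka_step(edges, uf):
    """One Boruvka step: pick the minimum-weight edge incident to each
    current component and contract all of those edges."""
    cheapest = {}
    for w, u, v in edges:
        ru, rv = uf.find(u), uf.find(v)
        if ru == rv:
            continue                        # endpoints already contracted together
        for r in (ru, rv):
            if r not in cheapest or w < cheapest[r][0]:
                cheapest[r] = (w, u, v)
    forest = []
    for w, u, v in set(cheapest.values()):  # the same edge may be picked twice
        if uf.union(u, v):                  # contract; this is an MSF edge
            forest.append((w, u, v))
    return forest

def boruvka2(n, edges):
    """Analogue of Boruvka2(G; F, G'): two Boruvka steps; returns the MSF
    edges F found and the union-find encoding the contracted graph G'."""
    uf = UnionFind(n)
    F = boruvka_step(edges, uf)
    F += boruvka_step(edges, uf)
    return F, uf
```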
2.2. DIJKSTRA-JARNÍK-PRIM ALGORITHM. Another early MSF algorithm that
runs in O(m log n) time is the one by Jarník [1930], rediscovered by Dijkstra [1959]
and Prim [1957]. We refer to this algorithm as the DJP algorithm. Briefly, the DJP
algorithm grows the MSF T one edge at a time. Initially, T is an arbitrary vertex.
In each step of the DJP algorithm, T is augmented with the least-weight edge
(x, y) such that x ∈ T and y ∉ T. By the cut property, all edges added to T are in
the MSF.
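
A compact sketch of the DJP growth rule, using Python's heapq, is given below; it only illustrates the step described above and is not the pointer-machine implementation analysed in the paper. The adjacency-list format is our choice.

```python
# Hedged sketch of the DJP (Jarnik/Dijkstra/Prim) algorithm, not the paper's code.
import heapq

def djp_mst(adj, start=0):
    """adj: {vertex: [(weight, neighbour), ...]} with distinct weights.
    Grows a tree T from `start`, repeatedly adding the least-weight edge
    with exactly one endpoint in T; by the cut property every edge added
    is an MSF edge."""
    in_tree = {start}
    frontier = [(w, start, y) for (w, y) in adj[start]]
    heapq.heapify(frontier)
    tree_edges = []
    while frontier and len(in_tree) < len(adj):
        w, x, y = heapq.heappop(frontier)
        if y in in_tree:
            continue                      # both endpoints already in T; discard
        in_tree.add(y)
        tree_edges.append((w, x, y))
        for w2, z in adj[y]:
            if z not in in_tree:
                heapq.heappush(frontier, (w2, y, z))
    return tree_edges
```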
LEMMA 2.1. Let T be the tree formed after the execution of some number of
steps of the DJP algorithm. Let e and f be two arbitrary edges, each with exactly
one endpoint in T, and let g be the maximum weight edge on the path from e to f
in T. Then g cannot be heavier than both e and f.
PROOF. Let P be the path in T connecting e and f, and assume the contrary,
that g is the heaviest edge in P ∪ {e, f}. Now consider the moment when g is
selected by DJP and let P′ be the portion of P present in the tree. There are exactly
two edges in (P − P′) ∪ {e, f} that are eligible to be chosen by the DJP algorithm
at this moment, one of which is the edge g. If the other edge is in P, then by our
choice of g it must be lighter than g. If the other edge is either e or f, then by our
assumption it must be lighter than g. In both cases, g could not be chosen next by
the DJP algorithm, a contradiction.
2.3. THE DENSE CASE ALGORITHM. The algorithms presented in Fredman and
Tarjan [1987], Gabow et al. [1986], Chazelle [1997, 2000b], and Pettie [1999] will
find the MSF of a graph in linear time if the graph is sufficiently dense, that is,
has a sufficiently large edge-to-vertex ratio. For our purposes, sufficiently dense
will mean m/n ≥ log^(3) n. All of the above algorithms run in linear time for that
density, the simplest of which is easily that of Fredman and Tarjan [1987]. This
algorithm executes a number of phases, where the purpose of each phase is to
amplify the “nominal density” of the graph by contracting a large number of MSF
edges; here the nominal density is the ratio m/n′, where m, as usual, is the number
of edges in the original graph, and n′ is the number of vertices in the current graph.
Each phase of the algorithm runs in O(m + n) time, and works by executing the
DJP algorithm many times, each for a limited number of steps. If n′ is the number
of vertices before a phase, the number of vertices after the phase is no more than
n′/2^(m/n′), hence no more than log* n − log*(m/n) phases are needed.
The procedure DenseCase(G; F) takes as input an n-node graph G and returns
the MSF F of G in linear time for graphs with density at least log^(3) n.
Our optimal algorithm will actually call DenseCase on a graph derived from
an n-node, m-edge graph by contracting vertices so that the number of vertices
is reduced by a factor of at least log^(3) n. The number of edges in the contracted
graph is no more than m. Hence, DenseCase will run in O(m + n) time on such
a graph.
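
As a rough sanity check on the phase bound quoted above (our snippet, not something from the paper), the following simulates the per-phase vertex bound n′ → n′/2^(m/n′) and counts how many phases pass before the nominal density reaches the log^(3) n threshold assumed by DenseCase. The function names are ours.

```python
# Back-of-the-envelope check of the phase count, not code from the paper.
import math

def iterated_log(x):
    """log* x: how many times log2 must be applied before the value is <= 1."""
    count = 0
    while x > 1:
        x = math.log2(x)
        count += 1
    return count

def phases_until_dense(m, n):
    """Apply the per-phase bound n' -> n'/2**(m/n') until the nominal
    density m/n' reaches log^(3) n, the DenseCase threshold."""
    threshold = math.log2(math.log2(math.log2(n)))
    n_prime, phases = float(n), 0
    while m / n_prime < threshold:
        n_prime /= 2 ** (m / n_prime)
        phases += 1
    return phases

m, n = 10**6, 10**6                   # a sparse graph with density m/n = 1
print(phases_until_dense(m, n))       # 2 phases suffice here
print(iterated_log(n))                # compare with the log* n - log*(m/n) bound (= 5)
```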
2.4. SOFT HEAP. The main data structure used by our algorithm is the Soft Heap
[Chazelle 2000a]. The Soft Heap is a kind of priority queue that gives us an optimal
trade-off between accuracy and speed. It is parameterized by an error tolerance ε,
and supports the following operations:
MakeHeap(): returns an empty soft heap.
Insert(S, x): insert item x into heap S.
Findmin(S): returns item with smallest key in heap S.
Delete(S, x): delete x from heap S.
Meld(S_1, S_2): create new heap containing the union of items stored in S_1
and S_2, destroying S_1 and S_2 in the process.
All operations take constant amortized time, except for Insert, which takes
O(log(1/ε)) time. To save time, the Soft Heap allows items to be grouped together
and treated as though they have a single key. An item adopts the largest key of any
item in its group, corrupting the item if its new key differs from its original key.
Thus, the original key of an item returned by Findmin (i.e., any item in the group
with minimum key) is no more than the keys of all uncorrupted items in the heap.
The guarantee is that after n Insert operations, no more than εn corrupted items are
in the heap. The following result is shown in Chazelle [2000a].
LEMMA 2.2. Fix any parameter 0 < ε < 1/2, and beginning with no prior data,
consider a mixed sequence of operations that includes n inserts. On a Soft Heap,
the amortized complexity of each operation is constant, except for insert, which
takes O(log(1/ε)) time. At most εn items are corrupted at any given time.
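
For experimenting with code written against this interface, the stand-in below (our sketch, not Chazelle's data structure) exposes the same five operations on top of an ordinary binary heap. It ignores ε, never corrupts items, and does not achieve the amortized bounds of Lemma 2.2, so it is useful only for prototyping.

```python
# NOT the Soft Heap: an exact stand-in exposing the same interface, for
# prototyping only. Findmin is always exact and no item is ever corrupted.
import heapq
import itertools

class SoftHeapStandIn:
    _tokens = itertools.count()               # shared, so tokens survive melds

    def __init__(self, eps=1.0 / 8):          # MakeHeap(); eps accepted but unused
        self.eps = eps
        self.heap = []                         # entries: (key, token, item)
        self.deleted = set()                   # lazily deleted tokens

    def insert(self, key, item):               # Insert(S, x)
        token = next(SoftHeapStandIn._tokens)
        heapq.heappush(self.heap, (key, token, item))
        return token                           # handle for a later delete()

    def findmin(self):                         # Findmin(S)
        while self.heap and self.heap[0][1] in self.deleted:
            heapq.heappop(self.heap)           # drop stale entries
        return self.heap[0] if self.heap else None

    def delete(self, token):                   # Delete(S, x), by insert() handle
        self.deleted.add(token)

    def meld(self, other):                     # Meld(S1, S2); destroys both inputs
        self.heap += other.heap
        self.deleted |= other.deleted
        heapq.heapify(self.heap)
        other.heap, other.deleted = [], set()
        return self
```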
3. A Key Lemma and Procedure
3.1. A ROBUST CONTRACTION LEMMA. It is well known that if T is a tree
of MSF edges, we can contract T into a single vertex while maintaining the in-
variant that the MSF of the contracted graph plus T gives the MSF for the graph
before contraction.

In our algorithm, we find a tree of MSF edges T in a corrupted graph, where
some of the edge weights have been increased due to the use of a Soft Heap. In the
lemma given below, we show that useful information can be obtained by contracting
certain corrupted trees, in particular those constructed using some number of steps
from the Dijkstra–Jarník–Prim (DJP) algorithm. Ideas similar to these are used in
Chazelle’s [1997] algorithm and more explicitly in the recent algorithm of Chazelle
[2000b] (see also Pettie [1999]).
Before stating the lemma, we need some notation and preliminary concepts. Let
V(G) and E(G) be the vertex and edge sets of G, and n and m be their cardinality,
respectively. Let the G-weight of an edge be its weight in graph G (the G may be
omitted if implied from context).
For the following definitions, M and C are subgraphs of G. Denote by G ⇑ M
some graph derived from G by raising the weight of each edge in M by ar-
bitrary amounts (these edges are said to be corrupted). Let M_C be the set of
edges in M with exactly one endpoint in C. Let G\C denote the graph obtained
by contracting all connected components induced by C, that is, by replacing
each connected component with a single vertex and reassigning edge endpoints
appropriately.
Definition 3.1. A subgraph C is said to be DJP-contractible with respect to G
if after executing the DJP algorithm on G for some number of steps, with a suitable
start vertex in C, the tree that results is a spanning tree for C.
LEMMA 3.2. Let M be a set of edges in a graph G. If C is a subgraph of
G that is DJP-contractible with respect to G ⇑ M, then MSF(G) is a subset of
MSF(C) ∪ MSF(G\C − M_C) ∪ M_C.
PROOF. Each edge in C that is not in MSF(C) is the heaviest edge on some
cycle in C. Since that cycle exists in G as well, that edge is not in MSF(G). So we
need only show that edges in G\C that are not in MSF(G\C − M_C) ∪ M_C are also
not in MSF(G).
Let H = G\C − M_C; hence, we need to show that no edge in H − MSF(H) is
in MSF(G). Let e be in H − MSF(H), that is, e is the heaviest edge on some cycle
χ in H. If χ does not involve the vertex derived by contracting C, then it exists in
G as well and e ∉ MSF(G). Otherwise, χ forms a path P in G whose endpoints,
say x and y, are both in C. Let the end edges of P be (x, w) and (y, z). Since H
includes no corrupted edges with one endpoint in C, the G-weight of these edges
is the same as their (G ⇑ M)-weight.
Let T be the spanning tree of C ⇑ M derived by the DJP algorithm, Q be the
path in T connecting x and y, and g be the heaviest edge in Q. Notice that P ∪ Q
forms a cycle. By our choice of e, it must be heavier than both (x, w) and (y, z),
and by Lemma 2.1, the heavier of (x, w) and (y, z) is heavier than the (G ⇑ M)-
weight of g, which is an upper bound on the G-weights of all edges in Q. So with
respect to G-weights, e is the heaviest edge on the cycle P ∪ Q and cannot be
in MSF(G).
3.2. THE PARTITION PROCEDURE. Our algorithm uses the Partition procedure
that is given below. This procedure finds DJP-contractible subgraphs C_1, ..., C_k in
which edges are progressively being corrupted by the Soft Heap. Let M_{C_i} contain
only those corrupted edges with one endpoint in C_i at the time it is completed.
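
The paper's pseudocode for Partition does not appear in this excerpt. Purely as a reading aid, the sketch below guesses at the general shape suggested by the description above: grow components one at a time with DJP-style steps, stop a component when it reaches a size bound or touches an earlier one, and (with a real Soft Heap) collect the corrupted edges incident to it. The signature, the maxsize parameter, and the use of an exact heap in place of the Soft Heap (so the corrupted set stays empty here) are our assumptions, not the paper's.

```python
# Hedged guess at the shape of Partition, NOT the paper's procedure.
import heapq

def partition_sketch(adj, maxsize):
    """adj: {vertex: [(weight, neighbour), ...]} with distinct weights.
    Returns the grown vertex sets and the (here always empty) corrupted-edge set."""
    live = set(adj)                   # vertices not yet placed in a component
    components = []                   # vertex sets V_1, ..., V_k
    corrupted = set()                 # with a Soft Heap, M_{C_i} edges would land here
    while live:
        v = next(iter(live))
        comp = {v}
        frontier = [(w, v, y) for (w, y) in adj[v]]
        heapq.heapify(frontier)
        while frontier and len(comp) < maxsize:
            w, x, y = heapq.heappop(frontier)
            if y in comp:
                continue              # edge internal to the growing component
            if y not in live:
                break                 # touched an earlier component; stop growing
            comp.add(y)
            for w2, z in adj[y]:
                if z not in comp:
                    heapq.heappush(frontier, (w2, y, z))
        live -= comp
        components.append(comp)
    return components, corrupted
```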
