The Minimum Spanning Tree (MST) problem is a classic computer science problem.
We will study the development of algorithmic ideas for this problem, culminating with Chazelle's O(m α(m,n))-time algorithm, an algorithm that easily meets the "extreme" criterion.
A preview:
Our presentation will pull together material from various sources - see the references below. But most of it will come from [Eisn1997], [Pett1999], [CS153], [Weis2007].
Informal definition:
Formally:
Depicting a graph:
Graph conventions:
Definitions (for undirected graphs):
Data structures for graphs:
0 1 1 0 0 0 0 0 1 0 1 0 0 0 0 0 1 1 0 1 0 1 0 0 0 0 1 0 1 0 1 0 0 0 0 1 0 0 1 0 0 0 1 0 0 0 1 1 0 0 0 1 1 1 0 0 0 0 0 0 0 1 0 0
Note: adjMatrix[i][j] == adjMatrix[j][i] (convention for undirected graphs).
With this background, it's going to be easy to state the MST problem: find the spanning tree of minimum weight (among spanning trees):
Exercise:
Some assumptions and notation for the remainder:
The Cut-Edge Observation:
Exercise: Why?
Prim's algorithm (a preview):
The Cycle Property:
Kruskal's algorithm (a preview):
Key ideas:
Exercise: Show the steps in Kruskal's algorithm for this example:
Implementation:
Algorithm: Kruskal-MST (G) Input: Graph G=(V,E) with edge-weights. 1. Initialize MST to be empty; 2. Place each vertex in its own set; 3. Sort edges of G in increasing-order; 4. for each edge e = (u,v) in order 5. if u and v are not in the same set 6. Add e to MST; 7. Compute the union of the two sets; 8. endif 9. endfor 10. return MST Output: A minimum spanning tree for the graph G.
Algorithm: Kruskal-MST (adjMatrix) Input: Adjacency matrix: adjMatrix[i][j] = weight of edge (i,j) (if nonzero) 1. Initialize MST to be empty; // Initialize union-find algorithm and // place each vertex in its own set. 2. for i=0 to numVertices-1 3. makeSingleElementSet (i) 4. endfor 5. sortedEdgeList = sort edges in increasing-order; 6. Initialize treeMatrix to store MST; // Process edges in order. 7. while sortedEdgeList.notEmpty() 8. edge (i,j) = extract edge from sortedEdgeList. // If edge does not cause a cycle, add it. 9. if find(i) != find(j) 10. union (i, j) 11. treeMatrix[i][j] = treeMatrix[j][i] = adjMatrix[i][j] 12. endif 13. endwhile 14. return treeMatrix Output: adjacency matrix represention of tree.
Analysis: (adjacency list)
Key ideas:
Example:
That's why we associate edge-weights with vertices
→ only the edges from the new vertex add new information at
each step.
Implementation:
Algorithm: Prim-MST (G) Input: Graph G=(V,E) with edge-weights. 1. Initialize MST to vertex 0. 2. priority[0] = 0 3. For all other vertices, set priority[i] = infinity 4. Initialize prioritySet to all vertices; 5. while prioritySet.notEmpty() 6. v = remove minimal-priority vertex from prioritySet; 7. for each neighbor u of v 9. w = weight of edge (v, u) 8. if w < priority[u] 9. priority[u] = w // Decrease the priority. 10. endif 11. endfor 12. endwhile Output: A minimum spanning tree of the graph G.
Exercise: Consider line 6 in the algorithm above. How long does it take to execute?
Exercise: What is the running time when using a list for the priority set?
What is a priority queue?
A priority queue with the decreaseKey operation:
Implementing a priority queue with a binary heap:
Using a priority queue with Prim's algorithm:
Algorithm: Prim-MST (adjList) Input: Adjacency list: adjList[i] has list of edges for vertex i // Initialize vertex priorities and place in priority queue. 1. priority[i] = infinity 2. priorityQueue = all vertices (and their priorities) // Set the priority of vertex 0 to 0. 3. decreaseKey (0, 0) 4. while priorityQueue.notEmpty() // Extract best vertex. 5. v = priorityQueue.extractMin() // Explore edges going out from v. 6. for each edge e=(v, u) in adjList[v] 7. w = weight of edge e; // If there's an edge and it's not a self-loop. 8. if priority[u] > w // New priority. 9. priorityQueue.decreaseKey (u, w) 10. predecessor[u] = v 11. endif 12. endfor 13. endwhile // ... build tree in adjacency list form ... (not shown) Output: Adjacency list representation of MST
It turns out that neither Prim nor Kruskal were the first to address the MST problem [Grah1985]:
Key ideas in Boruvka's algorithm:
Boruvka's algorithm with edge contractions:
Algorithm: Boruvka (G) Input: Graph G = (V, E) 1. Create graph F = (V, φ) //No edges in F. 2. while F not connected 3. for each component in F 4. add cheapest outgoing edge to MST 5. endfor 6. Contract added edges in F and G 7. endwhile
Analyis:
First, the precursor to the fib-heap: the binomial heap
[Vuil1978].
Exercise: Show that the binomial tree B_{k} has 2^{k} nodes. Then argue that there are at most O(log n) trees.
The Fibonacci heap [Fred1987]: an overview
More detail:
for r = 0 to M while # rank-r trees > 2 merge into a rank r+1 tree endwhile endfor
Exercise: Look up Fibonacci numbers. How are they defined recursively? Prove the following:
Analysis:
→
|T_{0}| = F_{1}
and |T_{1}| = F_{2},
the second two Fibonacci numbers.
Back to MST:
A simple improvement with Boruvka's algorithm:
Fredman and Tarjan didn't stop with inventing the Fibonacci heap
[Fred1987].
1. Grow trees until all heaps reach k-limit. 2. Contract all trees to get new graph G' 3. Set G = G' and repeat
Let's try and get a sense of the size of β(m,n):
Exercise: Write down (legibly) the largest number you can think of on a square-inch of paper.
Ackermann's function [Corm2006][WP-3]:
On the topic of big numbers:
History:
Because the algorithms are quite complicated (extreme!), we will only sketch out the key ideas. There are two big innovations:
Key ideas:
1. G = original graph 2. while G not a single vertex 3. Identify sub-graphs according to size parameters 4. for each subgraph 5. Identify "bad" edges 6. Create node in contraction tree 7. Place sub-graph nodes as children 8. Contract into single vertex of G' 9. endfor 10. G = G' 11. endwhile 12 return contraction treeThis results in the contraction tree:
Analysis:
Optimality:
Let's step back and look at the history one more time:
End of story? Various other interesting variations
[WP-1]
Wikipedia entry on Minimum Spanning Trees.
[WP-2]
Wikipedia entry on Otakar Boruvka.
[WP-3]
Wikipedia entry on Ackermann's function.
[Aaro]
S.Aaronson.
Who can name the biggest number?
[Boru1926]
O.Boruvka. (paper written in Czech).
[Chaz1997] B.Chazelle.
A faster deterministic algorithm for minimum spanning trees.
Symp. Foundations of Computer Science, 1997, pp.22-31.
[Chaz2000] B.Chazelle.
The soft heap: an approximate priority queue with optimal error rate.
J. ACM, 47:6, 1012-1027, Nov. 2000.
[Chaz2000b] B.Chazelle.
A Minimum Spanning Tree Algorithm with Inverse-Ackermann Type Complexity.
J. ACM, 47, 2000, pp. 1028-1047.
[Cher1976]
D.Cheriton and R.E.Tarjan.
Finding minimum spanning trees.
SIAM J.Computing, Vol. 5, 1976, pp.724-742.
[Corm2006]
T.H.Cormen, C.E.Leiserson, R.Rivest and C.Stein.
Introduction to Algorithms (2nd Edition), McGraw-Hill, 2006.
[Dixo1992]
B.Dixon, M.Rauch and R.E.Tarjan.
Verification and sensitivity analysis of minimum spanning trees in linear time.
SIAM J.Computing,
21:6, 1992, pp.1184-1192.
[Dris1988]
J.R.Driscoll, H.N.Gabow, R.Shrairman and R.E.Tarjan.
Relaxed heaps: an alternative to Fibonacci heaps with applications
to parallel computation.
Comm.ACM, 31:11, 1988, pp.1343-54.
[Eisn1997]
J.Eisner.
State-of-the-art algorithms for minimum spanning trees: A tutorial discussion.
Manuscript, University of Pennsylvania, April 1997.
[Fred1987]
M.Fredman and R.Tarjan.
Fibonacci heaps and their uses in improved network optimization
algorithms.
J. ACM, 34, 1987, pp.596-615.
[Gabo1986]
H.N.Gabow, Z.Galil, T.Spencer and R.E.Tarjan.
Efficient algorithms for finding minimum spanning trees in
undirected and directed graphs.
Combinatorica,
6:2, 1986, pp.109-122.
[Grah1985]
R.L.Graham and P.Hell.
On the history of the minimum spanning tree problem.
Annals of the History of Computing, 7, 1985, pp.43-57.
[Jarn1930]
V.Jarnik. (paper written in Czech).
[Kapl2009]
H.Kaplan and U.Zwick.
A simpler implementation and analysis of Chazelle's soft heaps.
In Proc. ACM -SIAM Symposium on Discrete Algorithms (SODA),
New York, NY, Jan 2009, pp. 477-485.
[Karg1995]
D.R.Karger, P.N.Klein and R.E.Tarjan.
A randomized linear-time algorithm to find minimum spanning
trees.
J. ACM, 42:2, March 1995, pp.321-328.
[King1995]
V.King.
A simpler minimum spanning tree verification algorithm.
Proc. Int. Workshop on Algorithms and Data Structures,
1995, pp.440-448.
[Krus1956]
J.B.Kruskal.
On the shortest spanning subtree of a graph and the traveling
salesman problem.
Proc. Am. Math. Soc., 7, 1956, pp.48-50.
[Krus1997]
J.B.Kruskal.
A reminiscence about shortest spanning trees.
Archivum Mathematicum, 1997, pp.13-14.
[Pett1999]
S.Pettie.
Finding Minimum Spanning Trees in O(m alpha(m,n)) Time.
Technical report TR-99-23, University of Texas, Austin, 1999.
[Pett2000]
S.Pettie and V.Ramachandran.
An Optimal Minimum Spanning Tree Algorithm.
J. ACM, 49:1, 2000, pp. 16-34.
[Prim1957]
R.C.Prim.
Shortest connection networks and some generalizations.
Bell Sys. Tech. J., 36, 1957, pp.362-391.
[CS153]
R.Simha.
Course notes for CS-153 (Undergraduate algorithms course).
[Weis2007]
M.A.Weiss. Data Structures and Algorithm Analysis in Java,
2nd Ed., Addison-Wesley, 2007.
[Vuil1978]
J.Vuillemin.
A data structure for manipulating priority queues.
Comm. ACM, 21:4, April 1978, pp. 309-315.
[Yao1975]
A.Yao. An O(E loglog(V)) algorithm for finding minimum spanning trees.
Inf.Proc.Lett., Vol. 4, 1975, pp.21-23.