Content
Minimum Spanning Trees Introduction Pirm's Algorithm Kruskal's Algorithm MST Summary <- Go BackUniversity of Michigan at Ann Arbor
Last Edit Date: 06/07/2023
Disclaimer and Term of Use:
We do not guarantee the accuracy and completeness of the summary content. Some of the course material may not be included, and some of the content in the summary may not be correct. You should use this file properly and legally. We are not responsible for any results from using this file
This personal note is adapted from Professor Marcus Darden and David Paoletti. Please contact us to delete this file if you think your rights have been violated.
This work is licensed under a Creative Commons Attribution 4.0 International License.
i. The Minimum Spanning Tree Problem
Given: edge-weighted, undirected graph $G = (V, E)$
Find: subgraph $T = (V, E')$, $E' \subseteq E$ such that
All vertices are pair-wise connected
The sum of all edge weights in T is minimal
See a cycle in T?
Therefore, T must be a tree (no cycles)
Example: Data Centers Connection
ii. Create a MST
Minimum Spanning Tree (MST) if:
All vertices are pair-wise connected
The sum of all edge weights in T is minimal
Step 1 | Step 2 | Step 3 |
---|---|---|
Find MST on edge-weighted, connected, undirected graphs
Greedily select edges one by one and add to a growing sub-graph
Grows a tree from a single vertex
Steps to do Pirm's Algorithm:
Initializa a tree with a single vertex, chosen arbitarily from the graph
Grow the tree by one edge: of the edges that connect the tree to vertices not yet in the tree, find the minimum-weight edge, and add that vertex to the tree
Repeat step 2 (until all vertices are in the tree)
Given graph G = (V, E)
Start with 2 sets of vertices: 'innies' & 'outies'
'Innies' are visited nodes (initially empty)
'Outies' are not yet visited (initially V)
Select first innie arbitrarily (root of MST)
Repeat until no more outies
Choose outie (v') with smallest distance from any innie
Move v' from outies to innies
Implementation issue: use linear search or PQ?
Data Structures:
Three vectors
For each vertex v, record:
$k_v$: Has v been visited? (initially false for all $v \in V$)
$d_v$: What is the minimal edge weight to v? (initially $\infty$ for all $v \in V$, except $v_r = 0$)
$p_v$: What vertex precedes (is parent of) v? (initially unknown for all $v \in V$)
Pirm's (Linear Search) Algorithm
Set starting point dv to 0.
Loop v times (until every k_v is true): // |V| times
1. From the set of vertices for which k_v is
false, select the vertex v having the
smallest tentative distance d_v. // O(|V|)
2. Set k_v to true. // O(1)
3. For each vertex w adjacent to v for
which k_w is false, test whether dw is
greater than distance(v,w). If it is,
set d_w to distance(v,w) and set p_w to v. // Most at this vertex: O(|V|). Cost of each: O(1)
Implementing Prims's Algorithm
Implement in the order listed:
1: Loop over all vertices: find smallest false $k_v$
2: Mark $k_v$ as true
3: Loop over all vertices: update false neighbors of $k_v$
Common Mistake: Set the first vertex to true outside the loop
Reordering this can result in a simple algorithm that simply doesn’t work
Prim's (Heap) Algorithm
Set starting point dv to 0.
Loop v times (until every k_v is true): // |E| times
1. From the set of vertices for which k_v is
false, select the vertex v having the
smallest tentative distance d_v. // O(log |E|)
2. Set k_v to true. // O(1)
3. For each vertex w adjacent to v for
which k_w is false, test whether dw is
greater than distance(v,w). If it is,
set d_w to distance(v,w) and set p_w to v. // Most at this vertex: O(|V|). Cost of each: O(log (|V|))
// Visits every edge once (over all iterations) = O(|E|)
Algorithm Prims_Heaps(G, s0)
//Initialize
n = |V| // O(1)
create_table(n) //stores k,d,p // O(v)
create_pq() //empty heap // O(1)
table[s0].d = 0 // O(1)
insert_pq(0, s0) // O(1)
while (!pq.isempty) // O(E)
v0 = getMin() //heap top() & pop() // O(log E)
if (!table[v0].k) //not known // O(1)
table[v0].k = true // O(1)
for each vi Î Adj[v0] // O(1 + E / V)
distance = weight(v0, vi) // O(1)
if (distance < table[vi].d) // O(1)
table[vi].d = distance // O(1)
table[vi].p = v0 // O(1)
insert_pq(distance, vi) // O(log E)
Complexity Summary
$O(V^2)$ for the simplest nested-loop implementation
$O(E \log E)$ with heaps
Find an MST on edge-weighted, connected, undirected graphs
Greedily select edges one by one and add to a growing sub-graph
Grows a forest of trees that eventually merges into a single tree
Steps to do Kruskal's Algorithm
Presort all edges: $O(E \log E) \approx O(E \log V)$ time
Try inserting in order of increasing weight
Some edges will be discarded so as not to create cycles
Initial edges may be disjoint
Complexity Analysis
Sorting takes $E \log E$
Remaining work: a loop over E edges
Discarding an edge is trivial $O(1)$
Adding an edge is easy $O(1)$
Most time spent testing for cycles
Good news: takes less than $\log E \approx \log V$
Key idea: if vertices $v_i$ and $v_j$ are connected, then a new edge would create a cycle
Maintaining Disjoint Sets
N locations with no connecting roads
Roads are added one by one
Distances are unimportant (for now)
Connectivity is important
Want to connect cities ASAP
Redundant roads would slow us down
Question: For two cities k and j, would road (k, j) be redundant?
Answer: Use a Union-Find data structure.
MST is lowest-cost sub-graph that
Includes all nodes in a graph
Keeps all nodes connected
Two algorithms to find MST
Prim: iteratively adds closest node to current tree - very similar to Dijkstra, $O(V^2)$ or $O(E \log V)$
Kruskal: iteratively builds forest by adding minimal edges, $O(E \log V)$
For dense G, use the nested-loop Prim variant
For sparse G, Kruskal is faster
Relies on the efficiency of sorting algorithms
Relies on the efficiency of union-find