Minimum Spanning Tree — Kruskal and Prim Guide
What is a Spanning Tree?
A spanning tree of a connected, undirected graph is a subgraph that includes all vertices and is still a tree. Since a tree is connected and acyclic, a spanning tree must connect all vertices using exactly V-1 edges.
- If a graph has cycles, a spanning tree removes just enough edges to break those cycles.
- If edge weights are present, we often want the spanning tree with the minimum total weight.
- That special tree is called the Minimum Spanning Tree (MST).
Spanning Tree vs Minimum Spanning Tree
| Term | Meaning |
|---|---|
| Spanning Tree | Any tree that connects all vertices of the graph |
| Minimum Spanning Tree | A spanning tree whose total edge weight is minimum among all spanning trees |
Every MST is a spanning tree, but not every spanning tree is minimum. In an unweighted graph, any spanning tree can be considered minimum because all edges have equal cost.
Why Minimum Spanning Trees Matter
MSTs appear whenever we want to connect many points as cheaply as possible while preserving connectivity.
- Road and cable design: connect cities with minimum total cost.
- Communication networks: build efficient backbone links.
- Clustering: remove the heaviest MST edges to form groups.
- Approximation algorithms: several hard problems use MST ideas.
Properties of a Spanning Tree
| Property | Meaning |
|---|---|
| Contains all vertices | No vertex from the original graph is missing |
| Exactly V-1 edges | Any connected acyclic graph on V vertices has V-1 edges |
| No cycles | Adding any extra edge creates exactly one cycle |
| Connected | Every pair of vertices remains reachable |
When Does MST Apply?
- Graph should be undirected: classical MST is defined for undirected graphs.
- Graph should be connected: otherwise we get a minimum spanning forest.
- Weights may be positive, zero, or negative: MST algorithms still work because they compare edges, not path sums.
- Directed graphs: MST is not the right concept; the related structure is a minimum spanning arborescence.
MST Algorithms at a Glance
| Algorithm | Main Idea | Best For | Time Complexity |
|---|---|---|---|
| Kruskal's | Sort edges and keep the safe ones | Sparse graphs | O(E log E) |
| Prim's | Grow one tree outward from a start vertex | Dense graphs | O((V + E) log V) |
1. Kruskal's Algorithm
Kruskal's algorithm is an edge-based greedy method. It sorts edges by increasing weight and keeps adding the next lightest edge as long as that edge does not create a cycle.
How Kruskal Works
- Sort all edges by weight.
- Start with each vertex in its own component.
- Process edges in sorted order.
- Add an edge only if it connects two different components.
- Stop after selecting V-1 edges.
Why Union-Find is Used
We need a quick way to check whether two vertices are already in the same connected component. If they are, adding the edge would create a cycle. The Union-Find data structure makes these checks efficient.
Intuition Behind Kruskal
Kruskal's algorithm always tries to take the cheapest edge available. If that edge connects two different components, it is safe to include because it improves connectivity without forming a cycle. Over time, many small components merge until one spanning tree remains.
import java.util.*;
public class Kruskal {
static int[] parent, rank;
static int find(int x) {
if (parent[x] != x) parent[x] = find(parent[x]);
return parent[x];
}
static boolean union(int x, int y) {
int px = find(x), py = find(y);
if (px == py) return false;
if (rank[px] < rank[py]) parent[px] = py;
else if (rank[px] > rank[py]) parent[py] = px;
else {
parent[py] = px;
rank[px]++;
}
return true;
}
static int kruskal(int V, int[][] edges) {
parent = new int[V];
rank = new int[V];
for (int i = 0; i < V; i++) parent[i] = i;
Arrays.sort(edges, Comparator.comparingInt(e -> e[2]));
int total = 0;
int used = 0;
for (int[] edge : edges) {
int u = edge[0], v = edge[1], w = edge[2];
if (union(u, v)) {
System.out.println(u + " - " + v + " : " + w);
total += w;
used++;
if (used == V - 1) break;
}
}
return total;
}
public static void main(String[] args) {
int V = 4;
int[][] edges = {
{0, 1, 10}, {0, 2, 6}, {0, 3, 5},
{1, 3, 15}, {2, 3, 4}
};
System.out.println("MST weight = " + kruskal(V, edges));
}
}
import java.util.*;
public class Prim {
static int prim(int V, List<int[]>[] adj) {
boolean[] inMST = new boolean[V];
int[] key = new int[V];
Arrays.fill(key, Integer.MAX_VALUE);
key[0] = 0;
PriorityQueue<int[]> pq =
new PriorityQueue<>(Comparator.comparingInt(a -> a[0]));
pq.offer(new int[]{0, 0});
int total = 0;
while (!pq.isEmpty()) {
int[] current = pq.poll();
int weight = current[0];
int u = current[1];
if (inMST[u]) continue;
inMST[u] = true;
total += weight;
for (int[] edge : adj[u]) {
int v = edge[0], w = edge[1];
if (!inMST[v] && w < key[v]) {
key[v] = w;
pq.offer(new int[]{w, v});
}
}
}
return total;
}
@SuppressWarnings("unchecked")
public static void main(String[] args) {
int V = 5;
List<int[]>[] adj = new ArrayList[V];
for (int i = 0; i < V; i++) adj[i] = new ArrayList<>();
int[][] edges = {
{0, 1, 2}, {0, 3, 6}, {1, 2, 3},
{1, 3, 8}, {1, 4, 5}, {2, 4, 7}, {3, 4, 9}
};
for (int[] e : edges) {
adj[e[0]].add(new int[]{e[1], e[2]});
adj[e[1]].add(new int[]{e[0], e[2]});
}
System.out.println("MST weight = " + prim(V, adj));
}
}
2. Prim's Algorithm
Prim's algorithm is a vertex-based greedy method. It starts from any vertex and repeatedly adds the minimum-weight edge that connects the growing MST to a new vertex outside the tree.
How Prim Works
- Choose a start vertex.
- Maintain a set of vertices already included in the MST.
- Among all edges leaving that set, choose the minimum one.
- Add the new vertex and continue until all vertices are included.
Intuition Behind Prim
Prim's algorithm grows one connected tree step by step. At every stage, it looks only at edges that leave the current tree and chooses the cheapest way to expand it. This makes Prim feel similar to Dijkstra, but the goal here is to minimize the total tree cost, not shortest path distances from a source.
Worked Example with Prim
Suppose we start from vertex 0 in a graph with edges: (0,1,2), (0,3,6), (1,2,3), (1,4,5), (2,4,7), (3,4,9).
- Start with vertex 0.
- The smallest edge leaving the tree is 0-1 with weight 2, so add it.
- Now the cheapest edge leaving {0,1} is 1-2 with weight 3, so add it.
- Next the cheapest edge leaving {0,1,2} is 1-4 with weight 5, so add it.
- Finally add 0-3 with weight 6 to include the last vertex.
The MST edges are {0-1, 1-2, 1-4, 0-3} and the total weight is 2 + 3 + 5 + 6 = 16.
Kruskal vs Prim
| Aspect | Kruskal's | Prim's |
|---|---|---|
| Strategy | Select globally lightest safe edge | Grow one tree from a start vertex |
| Main structure | Union-Find | Priority queue |
| Natural input | Edge list | Adjacency list / matrix |
| Usually preferred for | Sparse graphs | Dense graphs |
Union-Find in Simple Terms
Union-Find maintains groups of connected vertices. It supports two main operations:
- find(x): return the representative of the component containing x.
- union(x, y): merge the two components containing x and y.
With path compression and union by rank, these operations are extremely fast in practice.
Important MST Properties
- Cut property: the lightest edge crossing a cut is safe for some MST.
- Cycle property: the heaviest edge in a cycle cannot belong to an MST.
- Uniqueness: if all edge weights are distinct, the MST is unique.
- Disconnected graph: if the graph is disconnected, we get a minimum spanning forest instead of a single tree.
Why the Cut and Cycle Properties Matter
Cut property: if we divide the vertices into two groups, the lightest edge crossing that division is safe to include in some MST. This property justifies the greedy choice in Prim and Kruskal.
Cycle property: if a cycle exists, the heaviest edge in that cycle is never necessary for an MST. This explains why Kruskal can safely skip expensive edges that would create cycles.
Worked Example
Suppose edges are: (2,3,1), (0,1,2), (1,2,3), (0,3,4), (1,4,5), (0,2,6), (2,4,7).
- Kruskal adds 2-3, 0-1, 1-2, skips 0-3 because it creates a cycle, then adds 1-4.
- The MST edges are {2-3, 0-1, 1-2, 1-4}.
- Total MST weight = 1 + 2 + 3 + 5 = 11.
Applications of MST
| Application | How MST Helps |
|---|---|
| Network design | Connect all sites with minimum cable, wire, or road length |
| Clustering | Build MST, then remove the heaviest edges to form clusters |
| Approximation algorithms | Provides structure for near-optimal solutions in harder problems |
| Image segmentation | Use weighted similarity graph and cut expensive edges |
Complexity Summary
| Algorithm | Dominant Cost | Total Complexity |
|---|---|---|
| Kruskal's | Sorting the edges | O(E log E) |
| Prim's | Priority queue operations | O((V + E) log V) |
Common Questions and Pitfalls
- Can an MST have negative edges? Yes. MST algorithms still work correctly.
- Can there be multiple MSTs? Yes, if several edges share the same weight.
- Does MST apply to directed graphs? No, not in the standard sense.
- Does Prim always start from the same vertex? No. Different start vertices can give different edge choices but the same total minimum cost.
- Does MST find shortest paths from one source? No. That is a shortest path problem, not an MST problem.
Key Takeaways
- A spanning tree connects all vertices with no cycles.
- An MST is the spanning tree with the least total edge weight.
- Kruskal builds the MST by choosing safe edges globally.
- Prim builds the MST by growing one connected tree locally.
Level Up Your Daa Skills
Master Daa with these hand-picked resources