Difference between revisions of "Kruskal's algorithm"

From PEGWiki
Jump to: navigation, search
(Created page with ''''Kruskal's algorithm''' is a general-purpose algorithm for the minimum spanning tree problem, based on the disjoint sets data structure. The existence of very simple al…')
 
Line 1: Line 1:
'''Kruskal's algorithm''' is a general-purpose algorithm for the [[minimum spanning tree]] problem, based on the [[disjoint sets data structure]]. The existence of very simple algorithms to maintain disjoint sets in almost constant time gives rise to simple implementations of Kruskal's algorithm whose running times are close to [[Asymptotic analysis|linear]], usually outperforming [[Prim's algorithm]] in sparse graphs.
+
'''Kruskal's algorithm''' is a general-purpose algorithm for the [[minimum spanning tree]] problem, based on the [[disjoint sets data structure]]. The existence of very simple algorithms to maintain disjoint sets in almost [[Asymptotic analysis|constant time]] gives rise to simple implementations of Kruskal's algorithm whose running times are close to linear, usually outperforming [[Prim's algorithm]] in sparse graphs.
  
 
=Theory of the algorithm=
 
=Theory of the algorithm=
Kruskal's may be characterized as a [[greedy algorithm]], which builds the MST one edge at a time. As befits a MST algorithm, the greedy strategy is to continually add the remaining edge of lowest weight. Unlike Prim's, however, Kruskal's adds edges without regard to the connectivity of the partially built MST; that is, it does not necessarily add an edge emanating from a vertex that is in the partially built MST. Indeed, it may be said that Kruskal's starts with <math>V</math> forests of one vertex each, and adds edges one by one, each one causing two trees in the forest to coalesce into one, until all vertices have been placed in the same connected component and the MST is complete. In doing so, one must be careful not to add an edge between two vertices that are ''already'' in the same component, for doing so would create a cycle. <!--
+
Kruskal's may be characterized as a [[greedy algorithm]], which builds the MST one edge at a time. As befits a MST algorithm, the greedy strategy is to continually add the remaining edge of lowest weight. Unlike Prim's, however, Kruskal's adds edges without regard to the connectivity of the partially built MST; that is, it does not necessarily add an edge emanating from a vertex that is in the partially built MST. Indeed, it may be said that Kruskal's starts with <math>V</math> forests of one vertex each, and adds edges one by one, each one causing two trees in the forest to coalesce into one, until all vertices have been placed in the same connected component and the MST is complete. In doing so, one must be careful not to add an edge between two vertices that are ''already'' in the same component, for doing so would create a cycle. We shall assume that a spanning tree exists for the following sections. (If you find them too difficult, skip them.)
  
always branching out from the part of the tree currently built (and hence keeping the entire partial MST in one connected component, unlike Kruskal's), always adding the lowest-weight edge available. We assume below that a spanning tree exists (that the graph is fully connected). If you find the three sections below too difficult, skip them.
+
==Lemma==
 +
<p>Suppose that a subset <math>S</math> of the edges of a graph <math>G</math> is known to be a subset of the edges of some spanning tree of <math>G</math>. Consider the set of edges <math>T</math> containing exactly those edges <math>\in G\S</math> which, when added to <math>S</math>, do not induce a cycle. If a minimal-weight edge <math>\in T</math> is added to <math>S</math>, the resulting set is also guaranteed to be a subset of the edges of some spanning tree of <math>G</math>.
  
==Lemma 1==
+
''Proof'': Consider the graph <math>G'</math> with vertex set <math>V(G)</math> and edge set <math>S</math>. Now, any edge in <math>G\S</math> connects either two vertices in different connected components in <math>G'</math> or two vertices in the same component. If it connects two vertices in the same component, adding it generates a cycle (since there is already a path from one component to the other that does not use that edge, and adding the edge generates a return path) and it is not in <math>T</math>. Otherwise, it does not generate a cycle because it is a bridge in the resulting single connected component, and it is in <math>T</math>. That is, <math>T</math> consists of exactly those edges linking different connected components in <math>G'</math>. It is clear that <!--
<p>Suppose that a spanning tree <math>T</math> is given of some graph <math>G</math>. Then, the addition of any edge <math>\notin E(T)</math> to <math>E(T)</math>, followed by the removal of any edge from the resulting cycle, yields a spanning tree of <math>G</math>.</p>
+
 
+
<p>''Proof'': Before the operation, the number of vertices is one more than the number of edges. After the operation, this is again true. As the addition of the new edge generates exactly one simple cycle, there are no longer any cycles after an edge on this cycle is removed. So the new <math>T</math> has a vertex count which exceeds its edge count by one and contains no cycles; it must therefore be a tree.</p>
+
  
 
==Lemma 2==
 
==Lemma 2==

Revision as of 06:18, 22 December 2009

Kruskal's algorithm is a general-purpose algorithm for the minimum spanning tree problem, based on the disjoint sets data structure. The existence of very simple algorithms to maintain disjoint sets in almost constant time gives rise to simple implementations of Kruskal's algorithm whose running times are close to linear, usually outperforming Prim's algorithm in sparse graphs.

Theory of the algorithm

Kruskal's may be characterized as a greedy algorithm, which builds the MST one edge at a time. As befits a MST algorithm, the greedy strategy is to continually add the remaining edge of lowest weight. Unlike Prim's, however, Kruskal's adds edges without regard to the connectivity of the partially built MST; that is, it does not necessarily add an edge emanating from a vertex that is in the partially built MST. Indeed, it may be said that Kruskal's starts with V forests of one vertex each, and adds edges one by one, each one causing two trees in the forest to coalesce into one, until all vertices have been placed in the same connected component and the MST is complete. In doing so, one must be careful not to add an edge between two vertices that are already in the same component, for doing so would create a cycle. We shall assume that a spanning tree exists for the following sections. (If you find them too difficult, skip them.)

Lemma

Suppose that a subset S of the edges of a graph G is known to be a subset of the edges of some spanning tree of G. Consider the set of edges T containing exactly those edges \in G\S which, when added to S, do not induce a cycle. If a minimal-weight edge \in T is added to S, the resulting set is also guaranteed to be a subset of the edges of some spanning tree of G. Proof: Consider the graph G' with vertex set V(G) and edge set S. Now, any edge in G\S connects either two vertices in different connected components in G' or two vertices in the same component. If it connects two vertices in the same component, adding it generates a cycle (since there is already a path from one component to the other that does not use that edge, and adding the edge generates a return path) and it is not in T. Otherwise, it does not generate a cycle because it is a bridge in the resulting single connected component, and it is in T. That is, T consists of exactly those edges linking different connected components in G'. It is clear that