Difference between revisions of "Graph search"

From PEGWiki
Jump to: navigation, search
(Pseudocode for generic graph search algorithm)
m (Pseudocode for generic graph search algorithm)
Line 8: Line 8:
 
input G
 
input G
 
for each connected component C ∈ G
 
for each connected component C ∈ G
     for all v ∈ V(C)
+
     for each v ∈ V(C)
 
           let visited[v] = false
 
           let visited[v] = false
 
     let u = some vertex ∈ V(C)
 
     let u = some vertex ∈ V(C)
Line 33: Line 33:
 
<pre>
 
<pre>
 
input G
 
input G
for all u ∈ V(G)
+
for each u ∈ V(G)
 
     let visited[u] = false
 
     let visited[u] = false
for each vertex u ∈ V(G)
+
for each u ∈ V(G)
 
     if visited[u]=false
 
     if visited[u]=false
 
           let F = {u}
 
           let F = {u}

Revision as of 19:17, 14 November 2009

In graph theory, graph search or graph traversal refers to the technique of "visiting" each vertex of the graph in turn, executing some code whenever a vertex is "visited". By varying the order in which vertices are visited and the nature of the action performed when a vertex is visited, many interesting problems in graph theory can be solved efficiently. Conversely, many graph-theoretic algorithms can be classified as graph search algorithms: Dijkstra's algorithm, Prim's algorithm, A*, topological sorting, the algorithms to find bridges and articulation points, the strongly connected component-finding algorithms of Tarjan, Kosaraju, and Gabow, and the augmenting path techniques (Ford-Fulkerson method and Dinic's algorithm) for the maximum flow problem all depend upon graph search.

Basic principles of graph search

Although theoretically one can visit the vertices of a graph in any order, in all useful graph search algorithms the choice of which vertex to visit next is taken from the immediate neighbours of vertices that have already been visited. (Obviously, the first vertex visited does not follow this rule.) This means that usually the algorithm must be invoked separately on each connected component of the graph, because it is impossible to visit a vertex if the algorithm starts in a different component. The set of all vertices which have not yet been visited but may be visited next (because they are immediate neighbours of already-visited vertices) is called the fringe.

Pseudocode for generic graph search algorithm

input G
for each connected component C ∈ G
     for each v ∈ V(C)
          let visited[v] = false
     let u = some vertex ∈ V(C)
     let F = {u}
     while F ≠ {}
          u = some vertex ∈ F
          remove u from F
          if visited[u] = false
               visit u
               visited[u] = true
               for each v ∈ V(C) such that (u,v) ∈ E(G) and visited[v] = false
                    add v to F

Some of the notation used is explained at PEGWiki:Pseudocode conventions.
Here is a line-by-line analysis of the code:

  • We run the graph search exactly once for each connected component in the graph (for each connected component C ∈ G).
  • Initially, no vertices in C (v ∈ V(C)) have been visited yet, so we set each vertex's entry in the visited array to false (let visited[v] = false).
  • The algorithm starts by choosing some vertex in the component (let u = some vertex ∈ V(C)) and placing it in the fringe F (let F = {u}).
  • From then on, the next vertex we visit will always be a vertex in the fringe, so we know we can stop when the fringe becomes empty (while F ≠ {}).
  • When the fringe is not empty, we select any vertex in F (u = some vertex ∈ F) and remove it from F. If u has already been visited, we have nothing more to do for u. Otherwise (if visited[u] = false), we "visit" u, whatever "visit" happens to mean in the particular algorithm we are using. We record that u has been visited (visited[u] = true).
  • After visiting u, we add any neighbours of u (for each v ∈ V(C) such that (u,v) ∈ E(G)) to the fringe F that have not been visited already (and visited[v] = false). Then the loop starts over again, meaning that these newly added vertices are eventually visited.


As it stands, this code has a serious flaw: how do we know what the connected components of the graph are? As a matter of fact, breadth-first search and depth-first search are usually used to find the connected components, but they are themselves graph search! The solution is as follows: if a connected component has already been examined, then all vertices in that component have been visited, or, contrapositively, if a vertex has not been visited, then its component has not been visited either. This allows us to rewrite the code as

input G
for each u ∈ V(G)
     let visited[u] = false
for each u ∈ V(G)
     if visited[u]=false
          let F = {u}
          while F ≠ {}
               v = some vertex ∈ F
               remove v from F
               if visited[v] = false
                    visit v
                    visited[v] = true
                    for each w ∈ V(G) such that (v,w) ∈ E(G) and visited[w] = false
                         add w to F

Notice that F is not necessarily a real set in the mathematical sense; F might contain multiple copies of a vertex. The nature of F is dependent upon what type of graph search algorithm is being used. Also, an actual implementation of graph search need not actually look like this. For example, depth-first search is often implemented recursively. However, all graph search algorithms can be made to fit this form.

Types of graph search algorithms

Many details have been left out of the generic code above. For example, we are free to choose the initial vertex u in any way we want, to iterate over the connected components in any order, and to iterate over the neighbours of u in any order. However, there are two places where variation is most important: the instruction visit u can mean practically anything, leading to endless variety in graph algorithms, and selecting the next vertex from F can be done in various ways. When you learn a new algorithm, try to identify these choices and understand how they are crucial to the correctness of the algorithm. There are three common paradigms for the selection of the next vertex: breadth-first search, depth-first search, and priority-first search.

Breadth-first search

In breadth-first search, the next vertex selected from F is always the oldest remaining vertex in F (the one that was placed into F first.) That is, F is a FIFO (first in, first out) queue.

Depth-first search

In depth-first search, the next vertex selected from F is always the newest remaining vertex (the one most recently added to F). That is, F is a LIFO (last-in, first out) stack.

Priority-first search

In priority-first search, each vertex is assigned a priority, and whenever a vertex is to be selected from F, the one with the highest priority is selected. This is the basis of Dijkstra's algorithm and Prim's algorithm.