# Graph search

In graph theory, graph search or graph traversal refers to the technique of "visiting" each vertex of the graph in turn, executing some code whenever a vertex is "visited". By varying the order in which vertices are visited and the nature of the action performed when a vertex is visited, many interesting problems in graph theory can be solved efficiently. Conversely, many graph-theoretic algorithms can be classified as graph search algorithms (see the Applications below).

## Basic principles of graph search

Although theoretically one can visit the vertices of a graph in any order, in all useful graph search algorithms the choice of which vertex to visit next is taken from the immediate neighbours of vertices that have already been visited. (Obviously, the first vertex visited does not follow this rule.) This means that usually the algorithm must be invoked separately on each connected component of the graph, because it is impossible to visit a vertex if the algorithm starts in a different component. The set of all vertices which have not yet been visited but may be visited next (because they are immediate neighbours of already-visited vertices) is called the fringe.

## Pseudocode for generic graph search algorithm

```input G
for each connected component C ∈ G
for each v ∈ V(C)
let visited[v] = false
let u = some vertex ∈ V(C)
let F = {u}
while F ≠ {}
u = some vertex ∈ F
remove u from F
if visited[u] = false
visit u
visited[u] = true
for each v ∈ V(C) such that (u,v) ∈ E(G) and visited[v] = false
```

Here is a line-by-line analysis of the code:

• We run the graph search exactly once for each connected component in the graph (for each connected component C ∈ G).
• Initially, no vertices in C (v ∈ V(C)) have been visited yet, so we set each vertex's entry in the visited array to false (let visited[v] = false).
• The algorithm starts by choosing some vertex in the component (let u = some vertex ∈ V(C)) and placing it in the fringe F (let F = {u}).
• From then on, the next vertex we visit will always be a vertex in the fringe, so we know we can stop when the fringe becomes empty (while F ≠ {}).
• When the fringe is not empty, we select any vertex in F (u = some vertex ∈ F) and remove it from F. If u has already been visited, we have nothing more to do for u. Otherwise (if visited[u] = false), we "visit" u, whatever "visit" happens to mean in the particular algorithm we are using. We record that u has been visited (visited[u] = true).
• After visiting u, we add any neighbours of u (for each v ∈ V(C) such that (u,v) ∈ E(G)) to the fringe F that have not been visited already (and visited[v] = false). Then the loop starts over again, meaning that these newly added vertices are eventually visited.

As it stands, this code has a serious flaw: how do we know what the connected components of the graph are? As a matter of fact, breadth-first search and depth-first search are usually used to find the connected components, but they are themselves graph search! The solution is as follows: if a connected component has already been examined, then all vertices in that component have been visited, or, contrapositively, if a vertex has not been visited, then its component has not been visited either. This allows us to rewrite the code as

```input G
for each u ∈ V(G)
let visited[u] = false
for each u ∈ V(G)
if visited[u]=false
let F = {u}
while F ≠ {}
v = some vertex ∈ F
remove v from F
if visited[v] = false
visit v
visited[v] = true
for each w ∈ V(G) such that (v,w) ∈ E(G) and visited[w] = false
```

Notice that F is not necessarily a real set in the mathematical sense; F might contain multiple copies of a vertex. The nature of F is dependent upon what type of graph search algorithm is being used. Also, an actual implementation of graph search need not actually look like this. For example, depth-first search is often implemented recursively. However, all graph search algorithms can be made to fit this form.

## Types of graph search algorithms

Many details have been left out of the generic code above. For example, we are free to choose the initial vertex u in any way we want, to iterate over the connected components in any order, and to iterate over the neighbours of u in any order. However, there are two places where variation is most important: the instruction visit u can mean practically anything, leading to endless variety in graph algorithms, and selecting the next vertex from F can be done in various ways. When you learn a new algorithm, try to identify these choices and understand how they are crucial to the correctness of the algorithm. There are three common paradigms for the selection of the next vertex: breadth-first search, depth-first search, and priority-first search.

In breadth-first search, the next vertex selected from F is always the oldest remaining vertex in F (the one that was placed into F first.) That is, F is a FIFO (first in, first out) queue.

### Depth-first search

In depth-first search, the next vertex selected from F is always the newest remaining vertex (the one most recently added to F). That is, F is a LIFO (last-in, first out) stack.

### Priority-first search

In priority-first search, each vertex is assigned a priority, and whenever a vertex is to be selected from F, the one with the highest priority is selected. This is the basis of Dijkstra's algorithm, A*, and Prim's algorithm (see Applications).

## Applications of graph search

Graph search has the following uses, inter alia:

### Components in graphs

#### Connected components in undirected graphs

The problems of finding connected components in undirected graphs is solved using graph search; we start the search at some vertex, and after running the search, a vertex has its visited entry true if and only if it is in the same connected component as the starting vertex. This can be used to assign all vertices connected component IDs, so that two vertices are in the same component (and reachable from each other). (The proof that one pass of graph search will visit all vertices reachable from the starting vertex can be easily executed by induction on the unweighted distance from the starting vertex.) In fact, we can remove the visited array because we will be able to tell if a vertex has been visited yet by examining its ID number; if it has not been visited yet, then and ID number has not been assigned, and vice versa. Pseudocode:

```input G
let cnt = 0
for each u ∈ V(G)
let id[u] = -1
for each u ∈ V(G)
if id[u]=-1
let F = {u}
while F ≠ {}
v = some vertex ∈ F
remove v from F
if id[v] = -1
visit v
id[v] = cnt
for each w ∈ V(G) such that (v,w) ∈ E(G) and id[w] = -1
cnt = cnt + 1
```

We iterate through the vertices one at a time. If, for a given vertex, a component ID has not yet been assigned (it is currently -1), then we assign it the first unused non-negative integer and we do a graph search starting from that vertex, assigning this ID number to all vertices reachable from it. After doing the search, we increment the total number of components found so that the next component will be assigned a higher ID number. But if, when we reach a vertex, a component ID has already been assigned, then we skip that vertex.

##### Flood fill

The special case known as flood fill is often used as an example application of graph search. The task is to implement an algorithm that will accomplish something like the fill tool in a simple raster graphics editor (such as Paint, which will be familiar to users of Microsoft Windows): given a two-dimensional array of numbers (with each number presumably representing a colour), one location in the array (a pixel), and a new number (a new colour), the objective is to fill the entire contiguous single-valued (monochromatic) area containing the given location with the new number.

In this case, the graph is defined implicitly, and there is no need to explicitly construct it. Each element of the array corresponds to a vertex, which can be identified simply by its row and column indices. Most vertices have degree four (with the exception of those on the boundary); the neighbours' indices can be determined by adding one to or subtracting one from either the row or the column.

The article on depth-first search contains code for the typical recursive DFS implementation of flood fill.

The flood fill algorithm can be used to count the number of "blobs" in a grid, too; every time we encounter a nonempty square, we flood fill it to make it empty, and so remove the entire blob. The total number of times we have to do this is the total number of blobs, since each blob will only be removed once.

#### Biconnected components

The more complex problem of finding biconnected components (or finding bridges and articulation points) in an undirected graph can be solved with a depth-first search algorithm.

#### Directed graphs

For directed graphs, the equivalent of connected components are strongly connected components: again, maximal subsets of the vertices such that inside each component any vertex is reachable from any other. The three best-known algorithms for computing these, Tarjan's algorithm, Gabow's algorithm, and Kosaraju's algorithm, all use depth-first searches. The equivalents of articulation points in directed graphs — dominators — are computed using depth-first search too.

### Paths and spanning trees

One of the simplest uses of graph search is finding a path from one vertex s to another vertex t. For example, we might want to find a path out of a maze. The code for such an application might look like this:

```input G,s,t
for each u ∈ V(G)
let visited[u] = false
let F = {(s,s)}
while F ≠ {}
(u,v) = some pair ∈ F
remove (u,v) from F
if visited[v] = false
pred[v] = u
visited[v] = true
for each w ∈ V(G) such that (v,w) ∈ E(G) and visited[w] = false