Editing Graph search
Warning: You are not logged in. Your IP address will be publicly visible if you make any edits. If you log in or create an account, your edits will be attributed to your username, along with other benefits.
The edit can be undone.
Please check the comparison below to verify that this is what you want to do, and then save the changes below to finish undoing the edit.
Latest revision | Your text | ||
Line 1: | Line 1: | ||
In [[graph theory]], '''graph search''' or '''graph traversal''' refers to the technique of "visiting" each vertex of the graph in turn, executing some code whenever a vertex is "visited". By varying the order in which vertices are visited and the nature of the action performed when a vertex is visited, many interesting problems in graph theory can be solved efficiently. Conversely, many graph-theoretic algorithms can be classified as graph search algorithms (see the Applications below). | In [[graph theory]], '''graph search''' or '''graph traversal''' refers to the technique of "visiting" each vertex of the graph in turn, executing some code whenever a vertex is "visited". By varying the order in which vertices are visited and the nature of the action performed when a vertex is visited, many interesting problems in graph theory can be solved efficiently. Conversely, many graph-theoretic algorithms can be classified as graph search algorithms (see the Applications below). | ||
− | + | =Basic principles of graph search= | |
Although theoretically one can visit the vertices of a graph in any order, in all useful graph search algorithms the choice of which vertex to visit next is taken from the immediate neighbours of vertices that have already been visited. (Obviously, the first vertex visited does not follow this rule.) This means that usually the algorithm must be invoked separately on each [[connected component]] of the graph, because it is impossible to visit a vertex if the algorithm starts in a different component. The set of all vertices which have not yet been visited but may be visited next (because they are immediate neighbours of already-visited vertices) is called the ''fringe''. | Although theoretically one can visit the vertices of a graph in any order, in all useful graph search algorithms the choice of which vertex to visit next is taken from the immediate neighbours of vertices that have already been visited. (Obviously, the first vertex visited does not follow this rule.) This means that usually the algorithm must be invoked separately on each [[connected component]] of the graph, because it is impossible to visit a vertex if the algorithm starts in a different component. The set of all vertices which have not yet been visited but may be visited next (because they are immediate neighbours of already-visited vertices) is called the ''fringe''. | ||
− | + | =Pseudocode for generic graph search algorithm= | |
<pre> | <pre> | ||
input G | input G | ||
Line 28: | Line 28: | ||
* When the fringe is not empty, we select any vertex in ''F'' (''u = some vertex ∈ F'') and remove it from ''F''. If ''u'' has already been visited, we have nothing more to do for ''u''. Otherwise (''if visited[u] = false''), we "visit" ''u'', whatever "visit" happens to mean in the particular algorithm we are using. We record that ''u'' has been visited (''visited[u] = true''). | * When the fringe is not empty, we select any vertex in ''F'' (''u = some vertex ∈ F'') and remove it from ''F''. If ''u'' has already been visited, we have nothing more to do for ''u''. Otherwise (''if visited[u] = false''), we "visit" ''u'', whatever "visit" happens to mean in the particular algorithm we are using. We record that ''u'' has been visited (''visited[u] = true''). | ||
* After visiting ''u'', we add any neighbours of ''u'' (''for each v ∈ V(C) such that (u,v) ∈ E(G)'') to the fringe ''F'' that have not been visited already (''and visited[v] = false''). Then the loop starts over again, meaning that these newly added vertices are eventually visited. | * After visiting ''u'', we add any neighbours of ''u'' (''for each v ∈ V(C) such that (u,v) ∈ E(G)'') to the fringe ''F'' that have not been visited already (''and visited[v] = false''). Then the loop starts over again, meaning that these newly added vertices are eventually visited. | ||
− | + | <br/> | |
As it stands, this code has a serious flaw: how do we know what the connected components of the graph are? As a matter of fact, breadth-first search and depth-first search are usually used to ''find'' the connected components, but they are themselves graph search! The solution is as follows: if a connected component has already been examined, then all vertices in that component have been visited, or, contrapositively, if a vertex has not been visited, then its component has not been visited either. This allows us to rewrite the code as | As it stands, this code has a serious flaw: how do we know what the connected components of the graph are? As a matter of fact, breadth-first search and depth-first search are usually used to ''find'' the connected components, but they are themselves graph search! The solution is as follows: if a connected component has already been examined, then all vertices in that component have been visited, or, contrapositively, if a vertex has not been visited, then its component has not been visited either. This allows us to rewrite the code as | ||
<pre> | <pre> | ||
Line 47: | Line 47: | ||
</pre> | </pre> | ||
Notice that ''F'' is not necessarily a real set in the mathematical sense; ''F'' might contain multiple copies of a vertex. The nature of ''F'' is dependent upon what type of graph search algorithm is being used. Also, an actual implementation of graph search need not actually look like this. For example, depth-first search is often implemented [[recursion|recursively]]. However, all graph search algorithms can be made to fit this form.<br/> | Notice that ''F'' is not necessarily a real set in the mathematical sense; ''F'' might contain multiple copies of a vertex. The nature of ''F'' is dependent upon what type of graph search algorithm is being used. Also, an actual implementation of graph search need not actually look like this. For example, depth-first search is often implemented [[recursion|recursively]]. However, all graph search algorithms can be made to fit this form.<br/> | ||
+ | <br/> | ||
− | + | =Types of graph search algorithms= | |
Many details have been left out of the generic code above. For example, we are free to choose the initial vertex ''u'' in any way we want, to iterate over the connected components in any order, and to iterate over the neighbours of ''u'' in any order. However, there are two places where variation is most important: the instruction ''visit u'' can mean practically anything, leading to endless variety in graph algorithms, and selecting the next vertex from ''F'' can be done in various ways. When you learn a new algorithm, try to identify these choices and understand how they are crucial to the correctness of the algorithm. There are three common paradigms for the selection of the next vertex: breadth-first search, depth-first search, and priority-first search. | Many details have been left out of the generic code above. For example, we are free to choose the initial vertex ''u'' in any way we want, to iterate over the connected components in any order, and to iterate over the neighbours of ''u'' in any order. However, there are two places where variation is most important: the instruction ''visit u'' can mean practically anything, leading to endless variety in graph algorithms, and selecting the next vertex from ''F'' can be done in various ways. When you learn a new algorithm, try to identify these choices and understand how they are crucial to the correctness of the algorithm. There are three common paradigms for the selection of the next vertex: breadth-first search, depth-first search, and priority-first search. | ||
− | + | ==Breadth-first search== | |
In [[breadth-first search]], the next vertex selected from ''F'' is always the oldest remaining vertex in ''F'' (the one that was placed into ''F'' first.) That is, ''F'' is a FIFO (first in, first out) [[queue]]. | In [[breadth-first search]], the next vertex selected from ''F'' is always the oldest remaining vertex in ''F'' (the one that was placed into ''F'' first.) That is, ''F'' is a FIFO (first in, first out) [[queue]]. | ||
− | + | ==Depth-first search== | |
In [[depth-first search]], the next vertex selected from ''F'' is always the newest remaining vertex (the one most recently added to ''F''). That is, ''F'' is a LIFO (last-in, first out) [[stack]]. | In [[depth-first search]], the next vertex selected from ''F'' is always the newest remaining vertex (the one most recently added to ''F''). That is, ''F'' is a LIFO (last-in, first out) [[stack]]. | ||
− | + | ==Priority-first search== | |
In priority-first search, each vertex is assigned a priority, and whenever a vertex is to be selected from ''F'', the one with the highest priority is selected. This is the basis of Dijkstra's algorithm, A*, and Prim's algorithm (see Applications). | In priority-first search, each vertex is assigned a priority, and whenever a vertex is to be selected from ''F'', the one with the highest priority is selected. This is the basis of Dijkstra's algorithm, A*, and Prim's algorithm (see Applications). | ||
− | + | =Applications of graph search= | |
Graph search has the following uses, ''inter alia'': | Graph search has the following uses, ''inter alia'': | ||
− | + | ==Components in graphs== | |
− | + | ||
The problems of finding connected components in undirected graphs is solved using graph search; we start the search at some vertex, and after running the search, a vertex has its ''visited'' entry true if and only if it is in the same connected component as the starting vertex. This can be used to assign ''all'' vertices connected component IDs, so that two vertices are in the same component (and reachable from each other). (The proof that one pass of graph search will visit all vertices reachable from the starting vertex can be easily executed by induction on the unweighted distance from the starting vertex.) In fact, we can remove the ''visited'' array because we will be able to tell if a vertex has been visited yet by examining its ID number; if it has not been visited yet, then and ID number has not been assigned, and ''vice versa''. Pseudocode: | The problems of finding connected components in undirected graphs is solved using graph search; we start the search at some vertex, and after running the search, a vertex has its ''visited'' entry true if and only if it is in the same connected component as the starting vertex. This can be used to assign ''all'' vertices connected component IDs, so that two vertices are in the same component (and reachable from each other). (The proof that one pass of graph search will visit all vertices reachable from the starting vertex can be easily executed by induction on the unweighted distance from the starting vertex.) In fact, we can remove the ''visited'' array because we will be able to tell if a vertex has been visited yet by examining its ID number; if it has not been visited yet, then and ID number has not been assigned, and ''vice versa''. Pseudocode: | ||
<pre> | <pre> | ||
Line 84: | Line 84: | ||
cnt = cnt + 1 | cnt = cnt + 1 | ||
</pre> | </pre> | ||
− | We iterate through the vertices one at a time. If, for a given vertex, a component ID has not yet been assigned (it is currently -1), then we assign it the first unused non-negative integer and we do a graph search starting from that vertex, assigning this ID number to all vertices reachable from it. After doing the search, we increment the total number of components found so that the next component will be assigned a higher ID number. But if, when we reach a vertex, a component ID has already been assigned, then we skip that vertex. | + | We iterate through the vertices one at a time. If, for a given vertex, a component ID has not yet been assigned (it is currently -1), then we assign it the first unused non-negative integer and we do a graph search starting from that vertex, assigning this ID number to all vertices reachable from it. After doing the search, we increment the total number of components found so that the next component will be assigned a higher ID number. But if, when we reach a vertex, a component ID has already been assigned, then we skip that vertex.<br/> |
+ | <br/> | ||
+ | The more complex problem of finding [[biconnected component]]s (or finding [[bridge]]s and [[articulation points]]) in an undirected graph can be solved with a depth-first search algorithm.<br/> | ||
+ | <br/> | ||
+ | For directed graphs, the equivalent of connected components are [[strongly connected component]]s: again, maximal subsets of the vertices such that inside each component any vertex is reachable from any other. The three best-known algorithms for computing these, [[Tarjan's strongly connected components algorithm|Tarjan's algorithm]], [[Gabow's algorithm]], and [[Kosaraju's algorithm]], all use depth-first searches. The equivalents of articulation points in directed graphs -- [[Dominator tree|dominators]] - are computed using depth-first search too. | ||
− | + | ==Paths and spanning trees== | |
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
One of the simplest uses of graph search is finding a path from one vertex ''s'' to another vertex ''t''. For example, we might want to find a path out of a maze. The code for such an application might look like this: | One of the simplest uses of graph search is finding a path from one vertex ''s'' to another vertex ''t''. For example, we might want to find a path out of a maze. The code for such an application might look like this: | ||
<pre> | <pre> | ||
Line 117: | Line 106: | ||
add (v,w) to F | add (v,w) to F | ||
</pre> | </pre> | ||
− | Here, instead of simply pushing a vertex's neighbour onto the fringe ''F'', we push a pair containing the original vertex and the neighbour. That way, when this pair is removed from ''F'', we know both which vertex we wish to visit ''and'' where we "came from" (the ''predecessor''). We record the predecessor in the array '' | + | Here, instead of simply pushing a vertex's neighbour onto the fringe ''F'', we push a pair containing the original vertex and the neighbour. That way, when this pair is removed from ''F'', we know both which vertex we wish to visit ''and'' where we "came from" (the ''predecessor''). We record the predecessor in the array ''visited[]''. After this algorithm has terminated, if ''t'' is in the same connected component as ''s'', then at some point ''t'' will have been visited, and at that time ''pred[t]'' will have been set to some vertex which has ''t'' as its neighbour. That vertex, in turn, must have been visited, so its predecessor was recorded also, and we can ultimately trace back the path to ''s''. (In this implementation, ''s'' is its own predecessor; this is not necessary however.) In other words, in the general case, ''s'', ''pred[pred[pred[...t...]]]'', ..., ''pred[pred[pred[t]]]'', ''pred[pred[t]]'', ''pred[t]'', ''t'' will be a path from ''s'' to ''t'' after the algorithm has terminated.<br/> |
− | + | <br/> | |
− | As a matter of fact, this algorithm finds paths from ''s'' to ''all'' other vertices reachable from ''s''. We can replace ''t'' by any other vertex and use the ''pred[]'' array to find a path from ''s'' to this vertex. Since every vertex except for ''s'' has one unique ''parent'' (predecessor), the vertices form a [[tree]], with an edge between two vertices if and only if one is the other's predecessor when this algorithm is run. The root of the tree is the only vertex lacking a parent, or ''s''. The algorithm is said to produce a ''spanning tree'' of the connected component containing ''s'': a subgraph of that connected component which is also a tree. | + | As a matter of fact, this algorithm finds paths from ''s'' to ''all'' other vertices reachable from ''s''. We can replace ''t'' by any other vertex and use the ''pred[]'' array to find a path from ''s'' to this vertex. Since every vertex except for ''s'' has one unique ''parent'' (predecessor), the vertices form a [[tree]], with an edge between two vertices if and only if one is the other's predecessor when this algorithm is run. The root of the tree is the only vertex lacking a parent, or ''s''. The algorithm is said to produce a ''spanning tree'' of the connected component containing ''s'': a subgraph of that connected component which is also a tree.<br/> |
− | + | <br/> | |
− | Breadth-first search will find the ''shortest'' paths from one vertex to all others reachable from it in an ''unweighted'' graph; [[Dijkstra's algorithm]] and [[A*]] will do the same in a ''weighted'' graph. [[Prim's algorithm]] will find a [[minimum spanning tree]] in an undirected graph, a spanning tree whose total weight (the sum of the weights of all its edges) is minimal. Finding a path from one vertex to another is also essential to the ''augmenting path algorithms'', including [[ | + | Breadth-first search will find the ''shortest'' paths from one vertex to all others reachable from it in an ''unweighted'' graph; [[Dijkstra's algorithm]] and [[A*]] will do the same in a ''weighted'' graph. [[Prim's algorithm]] will find a [[minimum spanning tree]] in an undirected graph, a spanning tree whose total weight (the sum of the weights of all its edges) is minimal. Finding a path from one vertex to another is also essential to the ''augmenting path algorithms'', including [[Dinic's algorithm]] and those using the [[Ford-Fulkerson method]], for computing a [[maximum flow]]. |
− | + | ==Topological sort== | |
The problem of [[topological sorting]] can be easily solved using graph search (it does not matter if it is breadth-first, depth-first, or anything else). For reference, the problem is to find, in a [[directed acyclic graph]], an ordering of the vertices such that if an edge (''u'',''v'') exists then ''u'' precedes ''v'' in the ordering. To solve this using graph search, we first add all sources to the fringe, because they can be placed first in the ordering, and subsequently we visit each vertex only after all vertices that must precede it have already been visited, adding it to the end of the ordering as it is visited. | The problem of [[topological sorting]] can be easily solved using graph search (it does not matter if it is breadth-first, depth-first, or anything else). For reference, the problem is to find, in a [[directed acyclic graph]], an ordering of the vertices such that if an edge (''u'',''v'') exists then ''u'' precedes ''v'' in the ordering. To solve this using graph search, we first add all sources to the fringe, because they can be placed first in the ordering, and subsequently we visit each vertex only after all vertices that must precede it have already been visited, adding it to the end of the ordering as it is visited. |