Editing Graph theory

Latest revision Your text
Line 1: Line 1:
A '''graph''' is a mathematical object with '''vertices''' (also known as '''nodes'''), discrete objects, and '''edges''' (also known as '''arcs'''), relationships between pairs of objects. Because of the wide variety of objects and relationships that may be abstracted as vertices and edges, graphs are highly versatile, and may be used to model a great number of different real-world entities, such as cities and highways, social networks, and the positions of the Rubik's Cube. Likewise, many interesting problems in computer science concern graphs themselves. The subdiscipline of mathematics and computer science concerning graphs is known as ''graph theory''.
+
A '''graph''' is a mathematical object with '''vertices''' (also known as '''nodes'''), discrete objects, and '''edges''', relationships between pairs of objects. Because of the wide variety of objects and relationships that may be abstracted as vertices and edges, graphs are highly versatile, and may be used to model a great number of different real-world entities, such as cities and highways, social networks, and the positions of the Rubik's Cube. Likewise, many interesting problems in computer science concern graphs themselves. The subdiscipline of mathematics and computer science concerning graphs is known as ''graph theory''.
  
 
==Structure of a graph==
 
==Structure of a graph==
Line 10: Line 10:
  
 
===Directed and undirected graphs===
 
===Directed and undirected graphs===
Some relationships are two-way, but some are only one-way. For example, suppose that in the social network example given in the preceding section, we place an edge between the vertices representing two people if and only if one of them has a crush on the other. The trouble here is that clearly, Alice having a crush on Bob is different than Bob having a crush on Alice, and hence these two scenarios ought to be represented by ''different graphs''. In order to arrange this, we stipulate that each edge also has a ''direction'', and that an edge from <math>u</math> to <math>v</math> is not the same as an edge from <math>v</math> to <math>u</math>. A graph that encodes this one-way information is known as a '''directed graph''' or '''digraph''', whereas one that does not is an '''undirected graph'''. When an edge from <math>u</math> to <math>v</math> is drawn in the diagram of a graph, generally an arrowhead is added on the end of the line segment representing that edge near <math>v</math>, so that the segment "points" from <math>u</math> to <math>v</math>. Note that it is possible for two-way relationships to be represented by directed graphs; maybe Alice and Bob secretly have a crush on each other. This would be represented by both edges existing in <math>E</math> and a double-ended arrow. The point is that directed graphs must be used when it is not guaranteed that all relationships will be bidirectional. An edge from <math>u</math> to <math>v</math> is said to '''enter''' <math>v</math> and '''leave''' <math>u</math>. An edge in a directed graph is also known as an '''arc'''.
+
Some relationships are two-way, but some are only one-way. For example, suppose that in the social network example given in the preceding section, we place an edge between the vertices representing two people if and only if one of them has a crush on the other. The trouble here is that clearly, Alice having a crush on Bob is different than Bob having a crush on Alice, and hence these two scenarios ought to be represented by ''different graphs''. In order to arrange this, we stipulate that each edge also has a ''direction'', and that an edge from <math>u</math> to <math>v</math> is not the same as an edge from <math>v</math> to <math>u</math>. A graph that encodes this one-way information is known as a '''directed graph''', whereas one that does not is an '''undirected graph'''. When an edge from <math>u</math> to <math>v</math> is drawn in the diagram of a graph, generally an arrowhead is added on the end of the line segment representing that edge near <math>v</math>, so that the segment "points" from <math>u</math> to <math>v</math>. Note that it is possible for two-way relationships to be represented by directed graphs; maybe Alice and Bob secretly have a crush on each other. This would be represented by both edges existing in <math>E</math> and a double-ended arrow. The point is that directed graphs must be used when it is not guaranteed that all relationships will be bidirectional. An edge from <math>u</math> to <math>v</math> is said to '''enter''' <math>v</math> and '''leave''' <math>u</math>.
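The difference between the two kinds of graph can be sketched in code. The following is only an illustration, assuming an adjacency-set representation; the function name build_adjacency and the vertex names are hypothetical and do not come from the article.

```python
def build_adjacency(edges, directed):
    """Map each vertex to the set of vertices it has an edge to.

    edges is a list of (u, v) pairs. In a directed graph the pair means
    "an edge from u to v"; in an undirected graph it is stored both ways.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set())  # make sure v appears even with no outgoing edges
        if not directed:
            adj[v].add(u)
    return adj


# Hypothetical example: Alice has a crush on Bob, but not the reverse.
crushes = [("Alice", "Bob")]
one_way = build_adjacency(crushes, directed=True)
two_way = build_adjacency(crushes, directed=False)
print(one_way)  # {'Alice': {'Bob'}, 'Bob': set()}
print(two_way)  # {'Alice': {'Bob'}, 'Bob': {'Alice'}}
```

In the directed version only the edge leaving Alice and entering Bob is stored, whereas the undirected version cannot tell the two scenarios apart.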
  
 
===Weighted graphs===
 
===Weighted graphs===
Line 48: Line 48:
 
A '''path''' is a [[sequence]] of vertices <math>v_0, v_1, ..., v_n</math> such that for all <math>1 \leq i \leq n</math>, there is an edge from <math>v_{i-1}</math> to <math>v_i</math>. If such a sequence exists, we say it is a path ''from'' <math>v_0</math> to <math>v_n</math> or ''between'' <math>v_0</math> and <math>v_n</math>, or that <math>v_0</math> and <math>v_n</math> are ''connected'' by that path, or that <math>v_n</math> is '''reachable''' from <math>v_0</math>. Note that in a directed graph, saying there is a path from <math>u</math> to <math>v</math> is not the same as saying there is a path from <math>v</math> to <math>u</math>, and the same applies to reachability; if <math>u</math> and <math>v</math> are reachable from each other then they are said to be '''mutually reachable'''. We also say that the path '''visits''' each of the vertices <math>v_0, v_1, ..., v_n</math>. On a diagram of a graph, a path is obtained by "following" the edges (going only in the directions of arrowheads, if the graph is directed). The '''length''' of a path is the number of edges on that path (one less than the number of vertices), and the '''weight''' of the path, if the graph is weighted, is the sum of the weights of the edges along that path. (The definition must be slightly modified when multiple edges are allowed.) Note that in some cases ''length'' can actually mean ''weight'' in a weighted graph; mind the context. A path is said to be a '''simple path''' if it does not visit any vertex more than once.
 
A '''path''' is a [[sequence]] of vertices <math>v_0, v_1, ..., v_n</math> such that for all <math>1 \leq i \leq n</math>, there is an edge from <math>v_{i-1}</math> to <math>v_i</math>. If such a sequence exists, we say it is a path ''from'' <math>v_0</math> to <math>v_n</math> or ''between'' <math>v_0</math> and <math>v_n</math>, or that <math>v_0</math> and <math>v_n</math> are ''connected'' by that path, or that <math>v_n</math> is '''reachable''' from <math>v_0</math>. Note that in a directed graph, saying there is a path from <math>u</math> to <math>v</math> is not the same as saying there is a path from <math>v</math> to <math>u</math>, and the same applies to reachability; if <math>u</math> and <math>v</math> are reachable from each other then they are said to be '''mutually reachable'''. We also say that the path '''visits''' each of the vertices <math>v_0, v_1, ..., v_n</math>. On a diagram of a graph, a path is obtained by "following" the edges (going only in the directions of arrowheads, if the graph is directed). The '''length''' of a path is the number of edges on that path (one less than the number of vertices), and the '''weight''' of the path, if the graph is weighted, is the sum of the weights of the edges along that path. (The definition must be slightly modified when multiple edges are allowed.) Note that in some cases ''length'' can actually mean ''weight'' in a weighted graph; mind the context. A path is said to be a '''simple path''' if it does not visit any vertex more than once.
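As a small illustration of these definitions, the sketch below checks that a sequence of vertices is a path and reports its length and weight. It assumes a directed graph without multiple edges, stored as a dictionary from (u, v) pairs to weights; the names path_length_and_weight and is_simple are hypothetical.

```python
def path_length_and_weight(weights, path):
    """Return (length, weight) of a path given as a non-empty list of vertices.

    weights maps each directed edge (u, v) to its weight.
    Raises ValueError if two consecutive vertices are not joined by an edge.
    """
    total = 0
    for u, v in zip(path, path[1:]):
        if (u, v) not in weights:
            raise ValueError(f"no edge from {u} to {v}")
        total += weights[(u, v)]
    # The length is the number of edges: one less than the number of vertices.
    return len(path) - 1, total


def is_simple(path):
    """A simple path visits no vertex more than once."""
    return len(set(path)) == len(path)


weights = {("a", "b"): 2, ("b", "c"): 5, ("c", "a"): 1}
print(path_length_and_weight(weights, ["a", "b", "c"]))  # (2, 7)
print(is_simple(["a", "b", "c"]))       # True
print(is_simple(["a", "b", "c", "a"]))  # False, since a is visited twice
```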
  
A '''cycle''' is a path in which the first and last vertex are the same. By definition, a cycle can never be a simple path, but if it repeats no vertex other than the first and last, it is known as a '''simple cycle'''. (An exception is that a simple cycle of length two cannot exist in an undirected graph, since this would use an ''edge'' twice.) An '''odd cycle''' is a cycle with odd length, whereas an '''even cycle''' is a cycle with even length. If a graph has no cycles, it is said to be '''acyclic'''.
+
A '''cycle''' is a path in which the first and last vertex are the same. By definition, a cycle can never be a simple path, but if it repeats no vertex other than the first and last, it is known as a '''simple cycle'''. An '''odd cycle''' is a cycle with odd length, whereas an '''even cycle''' is a cycle with even length. If a graph has no cycles, it is said to be '''acyclic'''.
  
 
===Types of graphs===
 
===Types of graphs===
A '''complete graph''' is one in which an edge exists between every pair of distinct vertices. If it is directed, it will have one each way. A complete directed graph on <math>n</math> vertices has <math>n(n-1)</math> edges (an edge from every vertex to every other). A complete undirected graph on <math>n</math> vertices, denoted <math>K_n</math>, will have half that many, <math>n(n-1)/2</math> edges.
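The edge counts can be checked by enumeration. This is a minimal sketch, assuming vertices numbered from 0 to n-1; the function names are hypothetical.

```python
from itertools import combinations, permutations


def complete_undirected_edges(n):
    """One edge for every unordered pair of distinct vertices."""
    return list(combinations(range(n), 2))


def complete_directed_edges(n):
    """One edge each way for every pair of distinct vertices."""
    return list(permutations(range(n), 2))


for n in range(1, 7):
    assert len(complete_undirected_edges(n)) == n * (n - 1) // 2
    assert len(complete_directed_edges(n)) == n * (n - 1)

print(complete_undirected_edges(4))
# [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
```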
 
 
 
Given a graph, if we remove some (possibly zero) vertices and some (possibly zero) edges, we obtain a '''subgraph'''. (Note that when a vertex is removed, all edges incident upon it must be removed too.)
 
Given a graph, if we remove some (possibly zero) vertices and some (possibly zero) edges, we obtain a '''subgraph'''. (Note that when a vertex is removed, all edges incident upon it must be removed too.)
  
An [[undirected graph]] is said to be '''connected''' if and only if every [[vertex]] is reachable from every other vertex. A [[directed graph]] is said to be '''strongly connected''' if and only if each pair of distinct vertices is mutually reachable. If an undirected graph is not connected, it has two or more [[subgraphs]] called '''connected components'''. A [[connected component]] consists of a vertex and all the vertices reachable from it (and all the incident edges); if two vertices are reachable from each other then they will be in the same connected component, but if not, they will be in different connected components. Put another way, define an [[equivalence relation]] on the vertices of the graph so that two vertices are equivalent if and only if one is reachable from the other; then connected components are equivalence classes. If a directed graph is not strongly connected, it has two or more subgraphs called '''strongly connected components'''; these are analogous to connected components, and two vertices are in the same [[strongly connected component]] if and only if they are mutually reachable. <span style="opacity:0">The definition for a strongly connected component in an [[undirected graph]] is uncommon and will have to be clarified in a problem statement.</span>
+
An undirected graph is said to be '''connected''' if and only if every vertex is reachable from every other vertex. A directed graph is said to be '''strongly connected''' if and only if each pair of distinct vertices is mutually reachable. If an undirected graph is not connected, it has two or more subgraphs called '''connected components'''. A connected component consists of a vertex and all the vertices reachable from it (and all the incident edges); if two vertices are reachable from each other then they will be in the same connected component, but if not, they will be in different connected components. Put another way, define an equivalence relation on the vertices of the graph so that two vertices are equivalent if and only if one is reachable from the other; then connected components are equivalence classes. If a directed graph is not strongly connected, it has two or more subgraphs called '''strongly connected components'''; these are analogous to connected components, and two vertices are in the same strongly connected component if and only if they are mutually reachable. Identifying the connected components of a graph can be easily accomplished in linear time ''via'' a [[graph search]]. Identifying strongly connected components is more challenging, but can still be accomplished in linear time ''via'' [[Kosaraju's algorithm]], [[Tarjan's algorithm]], or [[Gabow's algorithm]].
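The connected components of an undirected graph can be labelled with a breadth-first search, roughly as follows. This is a sketch under stated assumptions rather than a reference implementation: the graph is assumed to be a dictionary in which every vertex appears as a key and each edge is listed at both of its endpoints, and the name connected_components is hypothetical.

```python
from collections import deque


def connected_components(adj):
    """Label each vertex with the index of its connected component.

    adj maps every vertex to an iterable of its neighbours (undirected,
    so each edge appears in both endpoint lists).
    Two vertices receive the same label exactly when a path joins them.
    """
    label = {}
    next_label = 0
    for start in adj:
        if start in label:
            continue  # already reached from an earlier start vertex
        label[start] = next_label
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in label:
                    label[v] = next_label
                    queue.append(v)
        next_label += 1
    return label


adj = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4], 6: []}
print(connected_components(adj))
# {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 2}
```

Each vertex is enqueued once and each adjacency list is scanned once, which is where the linear running time comes from.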
  
 
A '''[[tree]]''' is an undirected graph that is both connected and acyclic. A '''forest''' is a graph that consists of one or more trees. If a graph is directed and acyclic, it is simply known as a '''[[directed acyclic graph]]''' or by the initialism '''DAG'''. A directed graph that is like a tree, but in which every edge points ''away'' from one of the tree's vertices, is called an '''arborescence'''.
 
A '''[[tree]]''' is an undirected graph that is both connected and acyclic. A '''forest''' is a graph that consists of one or more trees. If a graph is directed and acyclic, it is simply known as a '''[[directed acyclic graph]]''' or by the initialism '''DAG'''. A directed graph that is like a tree, but in which every edge points ''away'' from one of the tree's vertices, is called an '''arborescence'''.
  
A '''bridge''' or '''cut edge''' is an edge that, when removed, causes an increase in the number of connected components of a graph. If a connected graph has no bridges, then it remains connected when any edge is removed, and is said to be '''edge-connected'''. If a graph is not edge-connected, it has two or more '''edge-connected components''', defined analogously to connected components; two vertices are in the same edge-connected component if they remain connected when any edge is removed. A '''cut vertex''' or '''articulation point''' is a vertex that, when removed, causes an increase in the number of connected components. If a connected graph has no articulation points, then it remains connected when any vertex is removed, and is said to be '''biconnected'''. If a graph is not biconnected, it has two or more '''biconnected components''', defined analogously to edge-connected components.
+
A '''bridge''' or '''cut edge''' is an edge that, when removed, causes an increase in the number of connected components of a graph. If a connected graph has no bridges, then it remains connected when any edge is removed, and is said to be '''edge-connected'''. If a graph is not edge-connected, it has two or more '''edge-connected components''', defined analogously to connected components; two vertices are in the same edge-connected component if they remain connected when any edge is removed. A '''cut vertex''' or '''articulation point''' is a vertex that, when removed, causes an increase in the number of connected components. If a connected graph has no articulation points, then it remains connected when any vertex is removed, and is said to be '''biconnected'''. If a graph is not biconnected, it has two or more '''biconnected components''', defined analogously to edge-connected components. Cut vertices and edges may be identified in linear time using [[Detection of cut vertices and cut edges|an algorithm due to Hopcroft and Tarjan]].
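The definition of a bridge can be applied literally: delete each edge in turn and see whether the number of connected components goes up. The sketch below does exactly that. It is deliberately naive, taking time proportional to the number of edges times the size of the graph, and is not the linear-time algorithm; the function names are hypothetical.

```python
def count_components(vertices, edges):
    """Count connected components of an undirected graph with a depth-first search."""
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = set()
    count = 0
    for s in vertices:
        if s in seen:
            continue
        count += 1
        seen.add(s)
        stack = [s]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    return count


def bridges_naive(vertices, edges):
    """Return the edges whose removal increases the number of components."""
    base = count_components(vertices, edges)
    result = []
    for i, e in enumerate(edges):
        rest = edges[:i] + edges[i + 1:]  # the graph with only edge i removed
        if count_components(vertices, rest) > base:
            result.append(e)
    return result


# A triangle 1-2-3 with a pendant vertex 4 attached to 3.
print(bridges_naive([1, 2, 3, 4], [(1, 2), (2, 3), (3, 1), (3, 4)]))  # [(3, 4)]
```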
  
 
Edges may also be '''spliced''' or '''subdivided''', which refers to removing an edge and connecting its two vertices ''via'' a third vertex inserted "between" them (by adding two new edges). Splicing increases the number of vertices by one and the number of edges by one. If repeated splicing operations on a graph <math>G</math> yield graph <math>G'</math>, then <math>G'</math> is said to be a '''subdivision''' of <math>G</math>, and <math>G</math> is said to be a '''minor''' of <math>G'</math>.
 
Edges may also be '''spliced''' or '''subdivided''', which refers to removing an edge and connecting its two vertices ''via'' a third vertex inserted "between" them (by adding two new edges). Splicing increases the number of vertices by one and the number of edges by one. If repeated splicing operations on a graph <math>G</math> yield graph <math>G'</math>, then <math>G'</math> is said to be a '''subdivision''' of <math>G</math>, and <math>G</math> is said to be a '''minor''' of <math>G'</math>.
  
A '''flow graph''' or '''flow network''' is a directed graph in which two distinct vertices are specifically denoted the '''source''', <math>s</math>, and the '''sink''', <math>t</math>, no edges enter <math>s</math>, and no edges leave <math>t</math>. An <math>s</math>-<math>t</math> cut is a set of edges which, when deleted from a flow graph, cause the sink to become unreachable from the source. As the name suggests, a flow graph is useful for modelling the flow of something (be it concrete or abstract) from the source to sink. For example, a flow network may model a computer network, in which each computer is a vertex and the weight of an edge represents the bandwidth of a cable between two computers.
+
A '''flow graph''' or '''flow network''' is a directed graph in which two distinct vertices are specifically denoted the '''source''', <math>s</math>, and the '''sink''', <math>t</math>, no edges enter <math>s</math>, and no edges leave <math>t</math>. An <math>s</math>-<math>t</math> cut is a set of edges which, when deleted from a flow graph, cause the sink to become unreachable from the source. As the name suggests, a flow graph is useful for modelling the flow of something (be it concrete or abstract) from the source to sink.
 
+
A graph is said to be '''planar''' if it is possible to draw it with no edges crossing.
+
 
+
A '''[[bipartite graph]]''' is a graph in which it is possible to partition the vertices into two sets <math>A</math> and <math>B</math>, such that edges exist only between vertices in <math>A</math> and vertices in <math>B</math>, with no edges within <math>A</math> alone or <math>B</math> alone. This is equivalent to the condition that no odd cycles exist in an undirected graph.
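The equivalence with the absence of odd cycles suggests a simple test: try to assign the vertices alternately to the two sets during a breadth-first search, and give up as soon as an edge joins two vertices on the same side. The following is a minimal sketch, assuming an undirected graph stored as a dictionary of neighbour lists in which every vertex is a key; the name bipartition is hypothetical.

```python
from collections import deque


def bipartition(adj):
    """Return sets (A, B) with every edge running between A and B,
    or None if no such partition exists (the graph has an odd cycle)."""
    side = {}
    for start in adj:
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in side:
                    side[v] = 1 - side[u]  # put v on the opposite side from u
                    queue.append(v)
                elif side[v] == side[u]:
                    return None  # an edge within one side: not bipartite
    a = {v for v, s in side.items() if s == 0}
    b = {v for v, s in side.items() if s == 1}
    return a, b


square = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [3, 1]}
triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
print(bipartition(square))    # ({1, 3}, {2, 4})
print(bipartition(triangle))  # None
```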
+
 
+
==Graph-theoretic algorithms==
+
''Problems relating to trees and bipartite graphs are discussed in the respective articles.''
+
* ''Finding connected components'': Given an undirected graph, label each vertex such that whenever two vertices have the same label, there is a path between them, whereas whenever two vertices have different labels, no such path exists. This can be accomplished in linear time using [[depth-first search]] or [[breadth-first search]].
+
* '''[[Shortest path]] problem''': Given two vertices in a graph, find the path of least length (unweighted) or weight (weighted) from one to the other.
+
:* The '''distance''' between two nodes is the length (unweighted) or weight (weighted) of the shortest path from one to the other.
+
:* The '''diameter''' of a graph is the maximum of the distance function taken over all pairs of nodes. This can be determined by finding [[all-pairs shortest paths]].
+
:* The '''girth''' of an unweighted graph is the length of its shortest simple cycle. This can be determined using all-pairs shortest paths as well, but in an undirected graph, we have to remember that we cannot go backward along an edge we have just used.
+
:* The '''bounded cost shortest path problem''' asks us to find the shortest path that does not exceed some fixed cost, where a cost is assigned to each edge in addition to its weight; this is like finding the shortest sequence of flights that fits within one's budget.
+
:* The problem of finding ''two'' paths between a given pair of vertices such that they have no common edges and the sum of their weights is minimized can be solved using [[Suurballe's algorithm]].
+
* '''Spanning tree''': A tree <math>T</math> is said to '''span''' a graph <math>G</math> when <math>T</math> is a subgraph of <math>G</math> and <math>T</math> contains all of <math>G</math>'s vertices. A spanning tree can be found in linear time using [[depth-first search]] or [[breadth-first search]].
+
:* '''[[Minimum spanning tree]] problem''': The weight of a tree is the sum of the weights of the edges it contains. Find a tree that spans <math>G</math>, while minimizing its weight. This can be accomplished using a priority-first search (as in Prim's algorithm) in place of the depth-first or breadth-first search.
+
:* The '''[[minimum diameter spanning tree]] problem''' is analogous, but now we want a spanning tree with the lowest possible diameter.
+
* The '''[[minimum spanning arborescence]] problem''' is the analogue of the minimum spanning tree problem for directed graphs, and is more difficult.
+
* An '''[[Eulerian path]]''' is a path that uses every edge in a graph exactly once. If it also begins and ends with the same node, it is known as an '''Eulerian circuit''' or '''Eulerian tour'''. It is possible to determine whether a graph has an Eulerian path or Eulerian tour, and find one if one exists, in linear time.
+
* A '''Hamiltonian path''' or '''Hamiltonian circuit''' is defined analogously, but visits every ''vertex'' exactly once, instead of using every edge exactly once. Deciding whether a graph has a Hamiltonian path or circuit, however, is NP-complete, and constructing one is NP-hard.
+
* '''[[Maximum flow]] problem''': In a flow network, let the weight of an edge represent the '''capacity''' of that edge. We want to assign a '''flow''' to each edge which is less than or equal to its capacity, such that the sum of all the flows of edges entering any vertex equals the sum of all the flows of edges leaving (so that nothing "accumulates" in a vertex, nor does the vertex "run out"). The exceptions are the source and the sink; the sum of all the flows out of the source equals the sum of all the flows into the sink. We want to assign flows such that this value is maximized.
+
:* '''Minimum <math>s</math>-<math>t</math> cut problem''': Instead of regarding the weight of an edge as its capacity, regard it as the cost required to delete that edge. Find an <math>s</math>-<math>t</math> cut of minimum cost.
+
:* '''Minimum vertex cut problem''': Find a subset of a flow graph's vertices, of minimum size, such that removing these vertices from the graph causes the sink to become unreachable from the source. (We are not allowed to remove the source or the sink.) This can be solved by replacing all nodes in the flow graph except the source and the sink by a pair of nodes, one for all the in-edges and one for all the out-edges, with an edge of cost 1 from the in-node to the out-node, and then finding a minimum edge cut, as above. (We set the costs of all other edges to infinity.)
+
:* '''[[Minimum cost maximum flow]] problem''': Maximize the flow, but also minimize the cost; the cost of sending flow along an edge is the product of the amount of flow along that edge and some constant specific to that edge.
+
* ''Finding strongly connected components'': Analogous to finding connected components, but in a directed graph. This is a bit trickier but can still be accomplished in linear time using [[Kosaraju's algorithm]], [[Tarjan's strongly connected components algorithm|Tarjan's algorithm]], or [[Gabow's algorithm]].
+
* ''Finding edge-connected and biconnected components'': We can identify all the cut vertices and cut edges of an undirected graph in linear time using [[Finding cut vertices and edges|a depth-first search algorithm due to Hopcroft and Tarjan]].
+
* ''Dominators'': These are like articulation points, but for directed graphs instead. In a control flow graph with source <math>s</math>, we say that a vertex <math>u</math> '''dominates''' a vertex <math>v</math> if every path from <math>s</math> to <math>v</math> must visit <math>u</math>. Every vertex dominates itself, but for all <math>v \neq s</math>, there is also an '''immediate dominator''' <math>u</math> such that <math>u \neq v</math>, <math>u</math> dominates <math>v</math>, and any other dominator of <math>v</math> that is not <math>v</math> itself also dominates <math>u</math>. Computing all immediate dominators gives a '''dominator tree''', which can be computed in nearly linear time using the [[Lengauer–Tarjan algorithm]].
+
* A '''matching''' is a subset of a graph's edges such that no two edges in the subset have a common vertex. The '''maximum matching problem''' is that of finding a matching of a graph with the maximum possible number of edges. A maximum matching can be found in polynomial time using [[Edmond's matching algorithm|Edmonds' matching algorithm]].
+
* A '''vertex cover''' is a subset of a graph's vertices such that every edge is incident upon at least one vertex in that subset. The '''minimum vertex cover problem''' is that of finding a vertex cover of a graph of minimum size. This problem is NP-hard in general.
+
* An '''edge cover''' is a subset of a graph's edges such that each vertex is an endpoint for at least one edge in that subset. The '''minimum edge cover problem''' is that of finding an edge cover of a graph of minimum size. This can be solved by first finding a maximum matching, as above, and then, for each vertex left unmatched, adding any one edge incident upon it. (If some vertex has no incident edges, no edge cover exists.) If a matching is also an edge cover, it is called a '''perfect matching'''.
+
* An '''independent set''' is a subset of a graph's vertices such that no two vertices in the subset are adjacent. The '''maximum independent set problem''' is that of finding an independent set of maximum size. This problem is NP-hard.
+
* It is possible to determine whether a graph is planar in linear time using an algorithm due to Hopcroft and Tarjan, but the algorithm is extremely complex.
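As one illustration of the linear-time claims in the list above, the existence of an Eulerian path or circuit in an undirected graph can be decided by checking connectivity and counting vertices of odd degree. The sketch below relies on the standard degree criterion, which the list does not state: a circuit exists when all vertices with edges are connected and every degree is even, and a path exists when exactly two vertices have odd degree. It assumes a graph without self-loops stored as a dictionary of neighbour lists, only decides existence rather than constructing the path, and the name eulerian_status is hypothetical.

```python
def eulerian_status(adj):
    """Return 'circuit', 'path', or 'none' for an undirected graph.

    adj maps every vertex to a list of its neighbours; each edge appears
    in both endpoint lists, and self-loops are assumed absent.
    """
    with_edges = [v for v in adj if adj[v]]
    if not with_edges:
        return "circuit"  # no edges at all: the empty circuit uses every edge

    # All vertices that have edges must lie in one connected component.
    seen = {with_edges[0]}
    stack = [with_edges[0]]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    if any(v not in seen for v in with_edges):
        return "none"

    odd = sum(1 for v in adj if len(adj[v]) % 2 == 1)
    if odd == 0:
        return "circuit"
    if odd == 2:
        return "path"
    return "none"


triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
chain = {1: [2], 2: [1, 3], 3: [2]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(eulerian_status(triangle))  # circuit
print(eulerian_status(chain))     # path
print(eulerian_status(star))      # none
```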
+
  
[[Category:Graph theory]]
+
==Problems in graph theory==
[[Category:Pages needing diagrams]]
+
