# Directed acyclic graph

A directed acyclic graph (DAG) is a directed graph that contains no cycles. DAGs arise in a natural way in modelling situations in which, in some sense, going "forward" is sometimes possible but going "backward" is definitely not, so that if $v$ is reachable from $u$, we know that $u$ is not reachable from $v$ (unless $u = v$). An example is given by Water Park from CCC 2007. In the absence of mechanical intervention, water always flows from high altitudes to low altitudes, so that, when travelling along the waterslides, one can never arrive at the same spot twice. If we create a graph that has a vertex for each marked point in the water park and a directed edge from $u$ to $v$ whenever there is a water slide from $u$ to $v$, this graph will be acyclic, and hence a DAG. DAGs do not necessarily resemble undirected acyclic graphs (that is, forests) in form or in their properties; there may be multiple paths between a given pair of nodes.

## As a partial order

DAGs are often considered representations of partial orderings. We can establish a correspondence as follows:

1. Give each vertex of the DAG a unique label (such as the letters A, B, C, etc.). Each label corresponds to an element in our set.
2. We define our partial ordering as follows: $a \leq b$ in the ordering if and only if there exists a path from $u$ to $v$, where $u$ is the vertex in the DAG labelled with $a$ and $v$ is the vertex in the DAG labelled with $b$.

The ordering thus defined is reflexive, since every vertex is reachable from itself. It is antisymmetric, because if $a \leq b$ and $b \leq a$ with $a \neq b$, then the corresponding vertices $u$ and $v$ are distinct and each is reachable from the other, so there is a cycle that contains $u$ and $v$, a contradiction. And it is transitive because reachability is transitive; if there are paths from $u$ to $v$ and from $v$ to $w$ then we can concatenate them to obtain a path from $u$ to $w$. So the ordering we have defined is a partial ordering.

Note that whereas a given DAG will correspond to exactly one partial ordering (up to isomorphism) using this convention, there may be more than one DAG that corresponds to a given partial ordering. The DAG with the most edges that corresponds to a given partial ordering is known as the transitive closure, whereas the one with the fewest edges is known as the transitive reduction.
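The relationship between a DAG, its transitive closure, and its transitive reduction can be illustrated with a minimal Python sketch. The function names and the adjacency-dictionary representation (every vertex appears as a key) are assumptions made for this example. The closure computed here is strict, that is, it records only paths of length at least one, so $a \leq b$ holds exactly when $a = b$ or `b in closure[a]`.

```python
def transitive_closure(adj):
    """Map each vertex to the set of vertices reachable by a path of length >= 1.

    Assumes adj describes a DAG; the memoised recursion relies on acyclicity.
    """
    reach = {}

    def visit(u):
        if u in reach:
            return reach[u]
        result = set()
        for v in adj[u]:
            result.add(v)
            result |= visit(v)
        reach[u] = result
        return result

    for u in adj:
        visit(u)
    return reach


def transitive_reduction(adj):
    """Keep edge (u, v) only if no other successor of u can also reach v."""
    reach = transitive_closure(adj)
    reduced = {}
    for u in adj:
        reduced[u] = {
            v for v in adj[u]
            if not any(v in reach[w] for w in adj[u] if w != v)
        }
    return reduced


# Three DAGs on {A, B, C} that represent the same ordering A <= B <= C.
dag = {"A": ["B", "C"], "B": ["C"], "C": []}
closure = transitive_closure(dag)      # A -> {B, C}, B -> {C}, C -> {}
reduction = transitive_reduction(dag)  # A -> {B},    B -> {C}, C -> {}
```

In this small example the input happens to equal its own transitive closure, and the reduction drops the edge from A to C because that edge is implied by the path through B.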

## Properties

A source is a node with zero in-degree; all of its edges point outward. Likewise, a sink is a node with zero out-degree. Every finite DAG has at least one source and at least one sink. To see this, choose any node. If it is a sink we are done; otherwise select any outgoing edge and follow it to another node. Repeating this process, we will never revisit a node, because the graph has no cycles, so after at most $V-1$ steps (where $V$ is the number of nodes in the graph) the process must terminate at a sink. The proof that there must be at least one source is analogous, following incoming edges backward instead.
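The walk used in this argument can be written out directly. The following is a sketch only; `find_sink` and the adjacency-dictionary format are names chosen for this example, and the function assumes its input really is a DAG (on a graph with a cycle it could loop forever).

```python
def find_sink(adj, start):
    """Follow outgoing edges from start until a vertex with none is reached.

    Returns the sink and the number of edges followed.
    """
    node = start
    steps = 0
    while adj[node]:          # node still has an outgoing edge
        node = adj[node][0]   # any outgoing edge will do; take the first
        steps += 1
    return node, steps


dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
sink, steps = find_sink(dag, "a")
```

Here the walk goes from `a` to `b` to `d`, which is within the bound of $V-1 = 3$ steps.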

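In fact, a stronger statement holds: a directed graph may be topologically ordered (that is, its vertices may be listed so that every edge points from an earlier vertex to a later one) if and only if it is acyclic (a DAG).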
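One way to see this equivalence in practice is Kahn's algorithm, which repeatedly removes a source from the graph. The sketch below is an illustration, not the only way to topologically sort; the function name and the convention of returning `None` for a cyclic graph are assumptions made for this example.

```python
from collections import deque


def topological_order(adj):
    """Return a topological order of the vertices, or None if there is a cycle."""
    indegree = {u: 0 for u in adj}
    for u in adj:
        for v in adj[u]:
            indegree[v] += 1

    # Start with all sources; removing a source may expose new sources.
    queue = deque(u for u in adj if indegree[u] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)

    # If some vertices were never removed, they lie on or behind a cycle.
    return order if len(order) == len(adj) else None


dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
cyclic = {"x": ["y"], "y": ["z"], "z": ["x"]}
order = topological_order(dag)
no_order = topological_order(cyclic)
```

If the graph is acyclic every vertex is eventually emitted; if it contains a cycle, no vertex on the cycle ever reaches in-degree zero, so the function reports failure.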

Because a DAG has no cycles, each vertex in a DAG forms a separate strongly connected component.

A DAG with $V$ vertices cannot have more than $V(V-1)/2$ edges. This bound is attained by taking a complete undirected graph on $V$ vertices, assigning a different number to each vertex, and orienting all edges so that they point from the lower-numbered vertex to the higher-numbered vertex. Such a DAG represents a total order. The proof of the bound is not hard and is left as an exercise to the reader.
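The construction that attains the bound is short enough to write down. This is a sketch with a hypothetical helper name, numbering the vertices $0, \ldots, V-1$:

```python
def total_order_dag(n):
    """Edges of the DAG on vertices 0..n-1 with an edge i -> j whenever i < j."""
    return [(i, j) for i in range(n) for j in range(i + 1, n)]


edges = total_order_dag(5)
edge_count = len(edges)  # 5 * 4 / 2 = 10
```

Every edge points from a lower number to a higher one, so the numbering itself is a topological order and the graph cannot contain a cycle.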

## Problems

The following problems either involve DAGs specifically or have simpler/faster solutions for DAGs than for other graphs.

• Single-source shortest paths can be computed in $O(E+V)$ time by topologically sorting first and then using dynamic programming.
• All-pairs shortest paths can be computed in $O(V(E+V))$ time by $V$ invocations of the single-source algorithm. This is, however, no faster than running a breadth-first search from each vertex when the DAG is unweighted, and no faster than the Floyd–Warshall algorithm when the graph is dense (so that it is an improvement only for sparse weighted DAGs).
• The same applies to the transitive closure problem – it is not easy to improve over the $O(V^3)$ bound given by Warshall's algorithm, although other methods may perform better under certain circumstances. The bounds on the transitive reduction problem match those on transitive closure.
• Longest path problem: This can also be solved efficiently using dynamic programming. Note that we only need to consider paths that start at a source.
• Identifying all strongly connected components in a digraph and contracting each component to a single vertex produces a DAG, known as the kernel DAG. Adapting the special transitive closure algorithm on DAGs to the kernel DAG gives an algorithm which may be fast for some classes of digraphs. The kernel DAG is also useful as an intermediate structure in the solutions of various other problems.
• The minimum spanning arborescence problem does not have much to do with DAGs in general, but an arborescence is a specific kind of DAG that looks like a tree.
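The shortest-path and longest-path items above share one idea: process the vertices in topological order and relax each outgoing edge once. The sketch below illustrates it under some assumptions made for this example: the graph is a dictionary mapping each vertex to a list of `(neighbour, weight)` pairs, the function names are hypothetical, and the longest-path variant computes longest paths from a single chosen start vertex.

```python
import math
from collections import deque


def topological_order(adj):
    """Kahn's algorithm on a weighted adjacency dict; raises on a cycle."""
    indegree = {u: 0 for u in adj}
    for u in adj:
        for v, _ in adj[u]:
            indegree[v] += 1
    queue = deque(u for u in adj if indegree[u] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in adj[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if len(order) != len(adj):
        raise ValueError("graph contains a cycle")
    return order


def dag_distances(adj, source, longest=False):
    """Shortest (or longest) path weights from source, in O(E + V) time."""
    unreachable = -math.inf if longest else math.inf
    better = max if longest else min
    dist = {u: unreachable for u in adj}
    dist[source] = 0
    for u in topological_order(adj):
        if dist[u] == unreachable:
            continue  # u cannot be reached from source
        for v, w in adj[u]:
            dist[v] = better(dist[v], dist[u] + w)
    return dist


dag = {
    "s": [("a", 2), ("b", 6)],
    "a": [("b", 3), ("t", 8)],
    "b": [("t", 1)],
    "t": [],
}
shortest = dag_distances(dag, "s")
longest = dag_distances(dag, "s", longest=True)
```

Because every predecessor of a vertex appears before it in the topological order, its distance is final by the time its own outgoing edges are relaxed. This is why negative edge weights cause no difficulty here, and why the same loop works for longest paths.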