Heavy-light decomposition
==Properties==
<p>Denote the size of a subtree rooted at vertex <math>v</math> as <math>\operatorname{size}(v)</math>.</p>
<p>Suppose that a vertex <math>v</math> has two children <math>u</math> and <math>w</math> and the edges <math>v</math>-<math>u</math> and <math>v</math>-<math>w</math> are both heavy. Then, <math>\operatorname{size}(u) + \operatorname{size}(w) > \frac{1}{2}\operatorname{size}(v)+\frac{1}{2}\operatorname{size}(v) = \operatorname{size}(v)</math>. This is a contradiction, since we know that <math>\operatorname{size}(u) + \operatorname{size}(w) + 1 \le \operatorname{size}(v)</math>. (<math>v</math>, of course, may have more than two children.) We conclude that '''there may be at most one heavy edge of all the edges joining any given vertex to its children.'''</p>
<p>At most two edges incident upon a given vertex may then be heavy: the one joining it to its parent, and at most one joining it to a child. Consider the subgraph of the tree in which all light edges are removed. Then, all resulting connected components are paths (although some contain only one vertex and no edges at all) and two neighboring vertices' depths differ by one. We conclude that '''the heavy edges, along with the vertices upon which they are incident, partition the tree into disjoint paths, each of which is part of some path from the root to a leaf.'''</p>
<p>Suppose a tree contains <math>N</math> vertices. If we follow a light edge from the root, the subtree rooted at the resulting vertex has size at most <math>N/2</math>; if we repeat this, we reach a vertex with subtree size at most <math>N/4</math>, and so on. It follows that '''the number of light edges on any path from root to leaf is at most <math>\lg N</math>.'''</p>
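<p>These definitions translate directly into code. Below is a minimal Python sketch (function and variable names are illustrative, not from the text) that computes subtree sizes and marks each vertex's heavy child, if any; an edge to a child <code>c</code> of <code>v</code> is heavy exactly when <math>\operatorname{size}(c) > \frac{1}{2}\operatorname{size}(v)</math>, so by the argument above at most one child per vertex qualifies:</p>

```python
from collections import defaultdict

def heavy_light(edges, root=0):
    """Compute subtree sizes and the heavy child of each vertex.

    The edge from v to a child w is labelled heavy when
    2 * size(w) > size(v); at most one child of v can satisfy this.
    Returns (parent, size, heavy), where heavy[v] is v's heavy child
    or None. The dict-based representation is an illustrative choice.
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    parent, order = {root: None}, [root]
    for v in order:                      # BFS gives a top-down ordering
        for w in adj[v]:
            if w != parent[v]:
                parent[w] = v
                order.append(w)
    size = dict.fromkeys(order, 1)
    for v in reversed(order):            # accumulate subtree sizes bottom-up
        if parent[v] is not None:
            size[parent[v]] += size[v]
    heavy = dict.fromkeys(order)         # all None initially
    for v in order:
        for w in adj[v]:
            if w != parent[v] and 2 * size[w] > size[v]:
                heavy[v] = w             # at most one child can exceed half
    return parent, size, heavy
```

<p>On the path graph 0–1–2–3 rooted at 0, for example, the edges 0–1 and 1–2 come out heavy, while 2–3 does not (<math>2 \cdot 1 \not> 2</math>, since the inequality is strict).</p>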
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
==Applications==
<p>The utility of the H-L decomposition lies in the fact that problems involving paths between nodes can often be solved more efficiently by "skipping over" heavy paths rather than considering each edge in the path individually. This is because long paths in trees that are very poorly balanced tend to consist mostly of heavy edges.</p>
<p>To "skip" from any node in the tree to the root:
<pre>
while current_node ≠ root
     if color(current_node, parent(current_node)) = light
          current_node = parent(current_node)
     else
          current_node = skip(current_node)
</pre>
In the above, <code>color(u,v)</code> represents the color (heavy or light) of the edge from <code>u</code> to <code>v</code>, <code>parent(v)</code> represents the parent of a node <code>v</code>, and <code>skip(v)</code> represents the parent of the highest (''i.e.'', least deep) node that is located on the same heavy path as node <code>v</code>. (It is so named because it allows us to "skip" through heavy paths.) In order for the latter to be implemented efficiently, every node whose parent link is heavy must be augmented with a pointer to the result of the <code>skip</code> function, but that is trivial to achieve. This might not seem like much, but let's see how it can be used to solve a simple problem.</p>
− | |||
− | |||
− | |||
− | |||
− | |||
===Lowest common ancestor===
<p>The H-L decomposition gives an efficient solution to the [[lowest common ancestor]] (LCA) problem. Consider first a naive solution:
<pre>
while u ≠ v
     if depth(u) > depth(v)
          u = parent(u)
     else
          v = parent(v)
return u
</pre>
This advances one node by a single edge in each iteration, so it may take time proportional to the height of the tree. Using the decomposition, we instead advance through entire heavy paths with <code>skip</code>, keeping track of the previous position of the advancing node in a variable <code>last</code>:
<pre>
last = root
while u ≠ v
     if depth(u) > depth(v)
          swap(u, v)
     if color(v, parent(v)) = light
          last = root
          v = parent(v)
     else
          last = v
          v = skip(v)
if last = root
     last = u
return last
</pre>
Here's how this code works: in each iteration of the while loop, the deeper of the two nodes is advanced up the tree (that is, if its parent link is light, it becomes its parent; otherwise, it "skips", as previously described). Note that if the two nodes are equally deep, one of them still gets advanced; this is intentional, and it does not matter which one is advanced. When the two nodes become equal, we are done.</p>
<p>Or are we? Suppose that the highest edges on the paths from each node to the LCA are light. Then, it is clear that the while loop ends with both nodes now pointing to the LCA, after having advanced through the respective light edges. If, however, one is heavy, there is a possibility that we "overshoot"; that is, that we end up at a node higher in the tree than the actual LCA. If this occurs, then the last advancement before its occurrence is from the actual LCA to the ending node (convince yourself of this by drawing a diagram if necessary), so that all we have to do is "remember" the last node visited in this case. That is what the <code>last</code> variable holds. (If both final links to the LCA are light, it is set to the root of the tree, which is otherwise impossible, to indicate this.) Notice that we do not have to consider the case in which ''both'' final links to the LCA are heavy, since this is impossible; two child links from a node cannot both be heavy, so these two heavy links would have to be the same link, making the child of that link a lower common ancestor (a contradiction).</p>
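<p>The same computation is often organized directly in terms of heavy-path tops: repeatedly lift whichever node sits on the deeper path, one whole path at a time, until both nodes share a heavy path, then take the shallower of the two. A runnable Python sketch of that formulation (with <code>parent</code>, <code>depth</code>, and <code>top</code> assumed precomputed; names are illustrative):</p>

```python
def lca(u, v, parent, depth, top):
    """LCA by jumping whole heavy paths.

    top[v] is the highest vertex on v's heavy path. Each iteration crosses
    the light edge above one path, so the loop runs O(log N) times.
    """
    while top[u] != top[v]:
        if depth[top[u]] < depth[top[v]]:
            u, v = v, u                  # make u the node with the deeper top
        u = parent[top[u]]               # cross the light edge above u's path
    # u and v now lie on the same heavy path; the shallower one is the LCA
    return u if depth[u] <= depth[v] else v
```

<p>For instance, in the tree with edges 0–1, 0–2, 1–3, 1–4, 3–5 rooted at 0, this returns 1 for the pair (5, 4) and 0 for the pair (2, 5).</p>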
− | |||
===Dynamic distance query===
<p>Now consider the following problem: A weighted, rooted tree is given followed by a large number of queries and modifications interspersed with each other. A ''query'' asks for the distance between a given pair of nodes and a ''modification'' changes the weight of a specified edge to a new (specified) value. How can the queries and modifications be performed efficiently?</p>
<p>First of all, we notice that the distance between nodes <math>u</math> and <math>v</math> is given by <math>\operatorname{dist}(u,v) = \operatorname{dist}(root,u) + \operatorname{dist}(root,v) - 2\,\operatorname{dist}(root,\operatorname{LCA}(u,v))</math>. We already know how to efficiently solve lowest common ancestor queries, so if we can efficiently solve queries of a node's distance from the root, a solution to this problem follows trivially.</p>
<p>To solve this simpler problem of querying the distance from the root to a given node, we augment our walk procedure as follows: when following a light edge, simply add its weight to a running total; when following a heavy path, add all the weights along the edges traversed to the running total (as well as the weight of the light edge at the top, if any). The latter can be performed efficiently, along with modifications, if each heavy path is augmented with a data structure such as a [[segment tree]] or [[binary indexed tree]] (in which the individual array elements correspond to the weights of the edges of the heavy path).</p>
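<p>A minimal Python sketch of this scheme, under assumptions not fixed by the text: the decomposition is precomputed as <code>parent</code>, <code>top</code>, and <code>pos</code> maps (with <code>pos[v]</code> the position of <code>v</code> on its heavy path, 0 at the top), and each edge is keyed by its lower endpoint. Each heavy path stores its edge weights in a binary indexed tree, so a root-distance query costs <math>O(\log^2 N)</math> and a weight change costs <math>O(\log N)</math>:</p>

```python
class Fenwick:
    """Binary indexed tree over 1-based positions."""
    def __init__(self, n):
        self.t = [0] * (n + 1)
    def add(self, i, delta):
        while i < len(self.t):
            self.t[i] += delta
            i += i & -i
    def prefix(self, i):
        s = 0
        while i > 0:
            s += self.t[i]
            i -= i & -i
        return s

class RootDistance:
    """Distance-to-root queries with edge-weight modifications.

    The weight of a heavy edge (parent(v), v) lives at index pos[v] of its
    path's Fenwick tree; light-edge weights stay in a plain dictionary.
    """
    def __init__(self, parent, top, pos, weight):
        self.parent, self.top, self.pos = parent, top, pos
        self.w = dict(weight)                     # current weight per edge
        path_len = {}
        for v, t in top.items():
            path_len[t] = max(path_len.get(t, 0), pos[v])
        self.fen = {t: Fenwick(n) for t, n in path_len.items()}
        for v, wt in weight.items():
            if top[v] == top[parent[v]]:          # same path: a heavy edge
                self.fen[top[v]].add(pos[v], wt)

    def update(self, v, new_weight):
        """Change the weight of the edge between v and its parent."""
        if self.top[v] == self.top[self.parent[v]]:
            self.fen[self.top[v]].add(self.pos[v], new_weight - self.w[v])
        self.w[v] = new_weight

    def dist_to_root(self, v):
        """Sum edge weights from v to the root, one heavy path at a time."""
        total = 0
        while True:
            t = self.top[v]
            total += self.fen[t].prefix(self.pos[v])  # heavy edges on this path
            if self.parent[t] is None:
                return total
            total += self.w[t]                        # light edge above the top
            v = self.parent[t]
```

<p>A full distance query then combines three such calls with an LCA query, per the identity above.</p>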
==References== | ==References== | ||
* Wang, Hanson. (2009). Personal communication. | * Wang, Hanson. (2009). Personal communication. | ||
* D. D. Sleator and R. E. Tarjan, A data structure for dynamic trees, ''in'' "Proc. Thirteenth Annual ACM Symp. on Theory of Computing," pp. 114–122, 1981. | * D. D. Sleator and R. E. Tarjan, A data structure for dynamic trees, ''in'' "Proc. Thirteenth Annual ACM Symp. on Theory of Computing," pp. 114–122, 1981. |