==History==
The convex hull trick is perhaps best known in algorithm competitions from being required to obtain full marks in several USACO problems, such as [http://tjsct.wikidot.com/usaco-mar08-gold MAR08 "acquire"], which began to be featured in national olympiads after its debut in the IOI '02 task {{problem|ioi0221|Batch Scheduling}}, which itself credits the technique to a 1995 paper (see references). Very few online sources mention it, and almost none describe it. It has been suggested (van den Hooff, 2010) that this is because the technique is "obvious" to anybody who has learned the [[sweep line algorithm]] for the [[line segment intersection problem]].
  
 
==The problem==
 
  
 
<p>[[File:Convex_hull_trick1.png|200px|thumb|right|Graphical representation of the lines in this example]]
 
When the lines are graphed, this is easy to see: we want to determine, at the <math>x</math>-coordinate 1 (shown by the red vertical line), which line is "lowest" (has the lowest <math>y</math>-coordinate). Here it is the heavy dashed line, <math>y=4/3+2/3\,x</math>.</p>
  
 
==Naive algorithm==
 
For each of the <math>Q</math> queries, of course, we may simply evaluate every one of the linear functions, and determine which one has the least value for the given <math>x</math>-value. If <math>M</math> lines are given along with <math>Q</math> queries, the complexity of this solution is <math>\mathcal{O}(MQ)</math>. The "trick" enables us to speed up the time for this computation to <math>\mathcal{O}((Q+M)\lg M)</math>, a significant improvement.
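For concreteness, the naive approach might be sketched as follows in C++ (the container layout and the name <code>naiveMin</code> are illustrative, not taken from the reference solutions):

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <utility>
#include <vector>

// Evaluate every line y = m*x + b at the query point and keep the
// least value: O(M) work per query, hence O(MQ) in total.
long long naiveMin(const std::vector<std::pair<long long, long long>>& lines,
                   long long x) {
    long long best = LLONG_MAX;
    for (auto [m, b] : lines)  // each pair is (slope, y-intercept)
        best = std::min(best, m * x + b);
    return best;
}
```

For example, with the (illustrative) lines <math>y=2x</math>, <math>y=x+1</math>, <math>y=4</math>, and <math>y=-x+8</math>, a query at <math>x=3</math> scans all four and returns 4.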
  
 
==The technique==
 
<p>Consider the diagram above. Notice that the line <math>y=4</math> will ''never'' be the lowest one, regardless of the <math>x</math>-value. Of the remaining three lines, ''each one is the minimum in a single contiguous interval'' (possibly having plus or minus infinity as one bound). That is, the heavy dotted line is the best line at all <math>x</math>-values left of its intersection with the heavy solid line; the heavy solid line is the best line between that intersection and its intersection with the light solid line; and the light solid line is the best line at all <math>x</math>-values greater than that. Notice also that, as <math>x</math> increases, the slope of the minimal line decreases: 2/3, -1/2, -3. Indeed, it is not difficult to see that this is ''always'' true.</p>
 
<p>Thus, if we remove "irrelevant" lines such as <math>y=4</math> in this example (the lines which will never give the minimum <math>y</math>-coordinate, regardless of the query value) and sort the remaining lines by slope, we obtain a collection of <math>N</math> intervals (where <math>N</math> is the number of lines remaining), in each of which one of the lines is the minimal one. If we can determine the endpoints of these intervals, it becomes a simple matter to use [[binary search]] to answer each query.</p>
 
 
===Etymology===
 
 
<pre>
input N
for i ∈ [1..N]
     input rect[i].h
     input rect[i].w
let cost[0] = 0
for i ∈ [1..N]
     let cost[i] = ∞
     for j ∈ [0..i-1]
          cost[i] = min(cost[i], cost[j] + rect[i].h * rect[j+1].w)
</pre>
 
Let <math>m_j=\mathrm{rect}[j+1].w</math>, <math>b_j=\mathrm{cost}[j]</math>, and <math>x=\mathrm{rect}[i].h</math>. Then, it is clear that the inner loop in the above DP solution is actually trying to minimize the function <math>y=m_j x + b_j</math> by choosing <math>j</math> appropriately. That is, it is trying to solve exactly the problem discussed in this article. Thus, assuming we have implemented the lower envelope data structure discussed in this article, the improved code looks as follows:
 
 
<pre>
input N
for i ∈ [1..N]
     input rect[i].h
     input rect[i].w
let E = empty lower envelope structure
let cost[0] = 0
for i ∈ [1..N]
     add the line y = (rect[i].w)x + cost[i-1] to E
     let cost[i] = E.query(rect[i].h)
</pre>
 
Notice that the lines are already being given in descending order of slope, so that each line is added "at the right"; this is because we already sorted them by width. The query step can be performed in logarithmic time, as discussed, and the addition step in amortized constant time, giving a <math>\Theta(N\lg N)</math> solution. We can modify our data structure slightly to take advantage of the fact that query values are non-decreasing (that is, no query occurs further left than its predecessor, so that no line chosen has a greater slope than the previous one chosen), and replace the binary search with a [[pointer walk]], reducing query time to amortized constant as well and giving a <math>\Theta(N)</math> solution for the DP step. The overall complexity, however, is still <math>O(N\lg N)</math>, due to the sorting step.
 
 
==Another example: APIO 2010 Commando==
 
You are given:
 
* A sequence of <math>N</math> '''positive''' integers (<math>1 \le N \le 10^6</math>).
* The integer coefficients of a quadratic function <math>f(x) = ax^2 + bx + c</math>, where <math>a < 0</math>.
 
The objective is to partition the sequence into contiguous subsequences such that the sum of <math>f</math>, evaluated at the sum of the elements of each subsequence, is maximized.
 
 
An <math>O(N^2)</math> dynamic programming approach is not hard to see.
 
We define:
 
# <code>sum(i,j) = x[i] + x[i+1] + ... + x[j]</code>
# <code>adjust(i,j) = a*sum(i,j)^2 + b*sum(i,j) + c</code>
 
Then let
 
:<math>\operatorname{dp}(n) = \max_{0 \le k < n} [\operatorname{dp}(k) + \operatorname{adjust}(k+1, n)]</math>.
 
Pseudocode:
 
<pre>
dp[0] ← 0
for n ∈ [1..N]
    for k ∈ [0..n-1]
        dp[n] ← max(dp[n], dp[k] + adjust(k + 1, n))
</pre>
 
Now let's play around with the function "adjust".
 
 
Denote <math>\operatorname{sum}(1,x)</math> by <math>\sigma(x)</math>. Then, for some value of <math>k</math>, we can write
 
:<math>
\begin{array}{rl}
\operatorname{dp}(n) &= \operatorname{dp}(k) + a(\sigma(n)-\sigma(k))^2 + b(\sigma(n)-\sigma(k)) + c \\
&= \operatorname{dp}(k) + a(\sigma(n)^2 - 2\sigma(n)\sigma(k) + \sigma(k)^2) + b(\sigma(n) - \sigma(k)) + c \\
&= (a\sigma(n)^2 + b\sigma(n) + c) + \operatorname{dp}(k) - 2a\sigma(n)\sigma(k) + a\sigma(k)^2 - b\sigma(k)
\end{array}
</math>
 
Now, let
 
# <math>z = \sigma(n)</math>
# <math>m = -2a\sigma(k)</math>
# <math>p = \operatorname{dp}(k) + a\sigma(k)^2 - b\sigma(k)</math>
 
Then, we see that <math>mz + p</math> is the quantity we aim to maximize by our choice of <math>k</math>. Adding <math>a\sigma(n)^2 + b\sigma(n) + c</math> (which is independent of <math>k</math>) to the maximum gives the correct value of <math>\operatorname{dp}(n)</math>. Also, <math>z</math> is independent of <math>k</math>, whereas <math>m</math> and <math>p</math> are independent of <math>n</math>, as required.
 
 
Unlike in task "acquire", we are interested in building the "upper envelope". We notice that the slope of the "maximal" line increases as <math>x</math> increases. Since the problem statement indicates <math>a < 0</math>, we have <math>-2a > 0</math>, and hence each slope <math>m = -2a\sigma(k)</math> is nonnegative.
 
 
Suppose <math>k_2 > k_1</math>. The corresponding slopes are <math>m_1 = -2a \sigma(k_1)</math> and <math>m_2 = -2a \sigma(k_2)</math>. Since the given sequence is positive, <math>\sigma(k_2) > \sigma(k_1)</math>, and since <math>-2a > 0</math>, it follows that <math>m_2 > m_1</math>. We conclude that lines are added to our data structure in increasing order of slope.
 
 
Since <math>\sigma(n) > \sigma(n-1)</math>, query values are given in increasing order and a [[pointer walk]] suffices (it is not necessary to use binary search).
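Putting the pieces together, the whole solution might be sketched as follows in C++ (names such as <code>solveCommando</code> are ours; fast input and overflow hardening are omitted):

```cpp
#include <cassert>
#include <vector>

// DP over an upper envelope: lines are added in increasing order of
// slope, and queries arrive at increasing z = sigma(n), so a pointer
// walk replaces binary search.
long long solveCommando(const std::vector<long long>& x,
                        long long a, long long b, long long c) {
    int n = (int)x.size();
    std::vector<long long> sigma(n + 1, 0), dp(n + 1, 0);
    for (int i = 1; i <= n; i++) sigma[i] = sigma[i - 1] + x[i - 1];

    std::vector<long long> M, P;  // slopes m and intercepts p of kept lines
    int ptr = 0;                  // pointer-walk position on the envelope

    for (int i = 1; i <= n; i++) {
        // Line for k = i-1: m = -2a*sigma(k), p = dp(k) + a*sigma(k)^2 - b*sigma(k).
        long long s = sigma[i - 1];
        long long m = -2 * a * s, p = dp[i - 1] + a * s * s - b * s;
        while ((int)M.size() >= 2) {
            int t = (int)M.size();
            // The last kept line is irrelevant if the new line overtakes
            // the second-to-last one no later than the last one does.
            if ((p - P[t - 2]) * (M[t - 1] - M[t - 2]) >=
                (P[t - 1] - P[t - 2]) * (m - M[t - 2])) {
                M.pop_back(); P.pop_back();
            } else break;
        }
        M.push_back(m); P.push_back(p);
        if (ptr >= (int)M.size()) ptr = (int)M.size() - 1;

        // Query at z = sigma(n): walk right while the next line is better.
        long long z = sigma[i];
        while (ptr + 1 < (int)M.size() &&
               M[ptr + 1] * z + P[ptr + 1] >= M[ptr] * z + P[ptr]) ptr++;
        dp[i] = (M[ptr] * z + P[ptr]) + a * z * z + b * z + c;
    }
    return dp[n];
}
```

For instance (our own tiny check, not an official test case), with <math>f(x) = -x^2</math> the optimum splits the sequence <math>\{1,2,3\}</math> into singletons, for a total of <math>-14</math>.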
 
  
 
==Fully dynamic variant==
 
 
<p>It turns out, however, that it is possible to support arbitrary insertions in amortized logarithmic time. To do this, we store the lines in an ordered dynamic set (such as C++'s <code>std::set</code>). Each line possesses the attributes of slope and y-intercept, the former being the key, as well as an extra <math>left</math> field, the minimum <math>x</math>-coordinate at which this line is the lowest in the set. To insert a new line, we merely insert it into its correct position in the set, and then all the lines to be removed, if any, are contiguous with it. The procedure is then largely the same as for the case in which we always inserted lines of minimal slope: if the line to be added is <math>l_3</math>, the line to the left is <math>l_2</math>, and the line to the left of that is <math>l_1</math>, then we check if the <math>l_1</math>-<math>l_3</math> intersection is to the left of the <math>l_1</math>-<math>l_2</math> intersection; if so, <math>l_2</math> is discarded and we repeat; similarly, if lines <math>l_4</math> and <math>l_5</math> are on the right, then <math>l_4</math> can be removed if the <math>l_3</math>-<math>l_5</math> intersection is to the left of the <math>l_3</math>-<math>l_4</math> intersection, and this too is performed repeatedly until no more lines are to be discarded. We compute the new <math>left</math> values (for <math>l_3</math>, it is the <math>l_2</math>-<math>l_3</math> intersection, and for <math>l_4</math>, it is the <math>l_3</math>-<math>l_4</math> intersection).</p>
 
  
<p>To handle queries, we keep ''another'' set, storing the same data but this time ordered by the <math>left</math> value. When we insert or remove lines from that set (or update, in the case of <math>l_4</math> above), we use the <math>left</math> value from the element in that set to associate it with an element in this one. Whenever a query is made, therefore, all we have to do is to find the greatest <math>left</math> value in this set that is less than the query value; the corresponding line is the optimal one.</p>
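The two-set description above can also be compressed into a single ordered set if each line instead stores the ''rightmost'' <math>x</math>-coordinate at which it is optimal. The sketch below follows the widely used "LineContainer" formulation from the KACTL team notebook (it maintains ''maxima''; negate slopes and intercepts for minima), so take it as an alternative realization of the same idea, not a transcription of the two-set scheme:

```cpp
#include <cassert>
#include <climits>
#include <set>

// Each line stores slope m, intercept b, and p: the greatest x at which
// it is maximal (the mirror image of the `left` field described above).
struct Line {
    mutable long long m, b, p;
    bool operator<(const Line& o) const { return m < o.m; }  // order by slope
    bool operator<(long long x) const { return p < x; }      // order by p for queries
};

// Dynamic upper envelope: arbitrary insertions and maximum queries,
// both in O(log n) (insertions amortized).
struct DynamicEnvelope : std::multiset<Line, std::less<>> {
    static constexpr long long INF = LLONG_MAX;
    // Floored division, so integer boundaries are exact.
    static long long fdiv(long long a, long long b) {
        return a / b - ((a ^ b) < 0 && a % b);
    }
    // Recompute x's boundary against its successor y; true if y is irrelevant.
    bool isect(iterator x, iterator y) {
        if (y == end()) { x->p = INF; return false; }
        if (x->m == y->m) x->p = x->b > y->b ? INF : -INF;
        else x->p = fdiv(y->b - x->b, x->m - y->m);
        return x->p >= y->p;
    }
    void add(long long m, long long b) {
        auto z = insert({m, b, 0}), y = z++, x = y;
        while (isect(y, z)) z = erase(z);                           // discard to the right
        if (x != begin() && isect(--x, y)) isect(x, y = erase(y));  // new line irrelevant?
        while ((y = x) != begin() && (--x)->p >= y->p)              // discard to the left
            isect(x, erase(y));
    }
    long long query(long long x) {  // maximum of m*x + b over all lines
        assert(!empty());
        auto l = *lower_bound(x);
        return l.m * x + l.b;
    }
};
```

The heterogeneous <code>lower_bound(x)</code> (enabled by <code>std::less<></code>) finds the first line whose right boundary <code>p</code> is at least the query value, which is exactly the interval lookup described above.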
  
 
==References==
 
 
==Code==
 
 
* [[Convex hull trick/acquire.cpp|C++ solution for "acquire"]]
 
* [[Convex hull trick/commando.cpp|C++ solution for "commando"]]
 
 
==External links==
 
* [http://www.apio.olympiad.org/2010/ APIO 2010]
 
 
[[Category:Algorithms]]
 
[[Category:Data structures]]
 
