Difference between revisions of "Convex hull trick"
(→Case study: USACO MAR08 acquire) |
(→The technique) |
||
Line 14: | Line 14: | ||
==Naive algorithm== | ==Naive algorithm== | ||
For each of the <math>Q</math> queries, of course, we may simply evaluate every one of the linear functions, and determine which one has the least value for the given <math>x</math>-value. If <math>M</math> lines are given along with <math>Q</math> queries, the complexity of this solution is <math>\mathcal{O}(MQ)</math>. The "trick" enables us to speed up the time for this computation to <math>\mathcal{O}((Q+M)\lg M)</math>, a significant improvement. | For each of the <math>Q</math> queries, of course, we may simply evaluate every one of the linear functions, and determine which one has the least value for the given <math>x</math>-value. If <math>M</math> lines are given along with <math>Q</math> queries, the complexity of this solution is <math>\mathcal{O}(MQ)</math>. The "trick" enables us to speed up the time for this computation to <math>\mathcal{O}((Q+M)\lg M)</math>, a significant improvement. | ||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
==Another example: APIO 2010 Commando== | ==Another example: APIO 2010 Commando== |
Revision as of 16:05, 29 August 2023
The convex hull trick is a technique (perhaps best classified as a data structure) used to determine efficiently, after preprocessing, which member of a set of linear functions in one variable attains an extremal value for a given value of the independent variable. It has little to do with convex hull algorithms.
Contents
History
The convex hull trick is perhaps best known in algorithm competitions from being required to obtain full marks in several USACO problems, such as MAR08 "acquire", which began to be featured in national olympiads after its debut in the IOI '02 task Batch Scheduling, which itself credits the technique to a 1995 paper (see references). Very few online sources mention it, and almost none describe it. It has been suggested (van den Hooff, 2010) that this is because the technique is "obvious" to anybody who has learned the sweep line algorithm for the line segment intersection problem.
The problem
Suppose that a large set of linear functions in the form is given along with a large number of queries. Each query consists of a value of and asks for the minimum value of that can be obtained if we select one of the linear functions and evaluate it at . For example, suppose our functions are , , , and and we receive the query . We have to identify which of these functions assumes the lowest -value for , or what that value is. (It is the function , assuming a value of 2.)
The variation in which we seek the maximal line, not the minimal one, is a trivial modification and we will focus our attention on finding the minimal line.
When the lines are graphed, this is easy to see: we want to determine, at the -coordinate 1 (shown by the red vertical line), which line is "lowest" ( has the lowest -coordinate). Here it is the heavy dashed line, .Naive algorithm
For each of the queries, of course, we may simply evaluate every one of the linear functions, and determine which one has the least value for the given -value. If lines are given along with queries, the complexity of this solution is . The "trick" enables us to speed up the time for this computation to , a significant improvement.
Another example: APIO 2010 Commando
You are given:
- A sequence of positive integers ().
- The integer coefficients of a quadratic function , where .
The objective is to partition the sequence into contiguous subsequences such that the sum of taken over all subsequences is maximized, where the value of a subsequence is the sum of its elements.
An dynamic programming approach is not hard to see. We define:
-
sum(i,j) = x[i] + x[i+1] + ... + x[j]
-
adjust(i,j) = a*sum(i,j)^2 + b*sum(i,j) + c
Then let
- .
Pseudocode:
dp[0] ← 0 for n ∈ [1..N] for k ∈ [0..n-1] dp[n] ← max(dp[n], dp[k] + adjust(k + 1 , n))
Now let's play around with the function "adjust".
Denote by . Then, for some value of , we can write
Now, let
Then, we see that is the quantity we aim to maximize by our choice of . Adding (which is independent of ) to the maximum gives the correct value of . Also, is independent of , whereas and are independent of , as required.
Unlike in task "acquire", we are interested in building the "upper envelope". We notice that the slope of the "maximal" line increases as increases. Since the problem statement indicates , the slope of each line is positive.
Suppose . For we have slope . For we have slope . But , and since the given sequence is positive, so . We conclude that lines are added to our data structure in increasing order of slope.
Since , query values are given in increasing order and a pointer walk suffices (it is not necessary to use binary search.)
Fully dynamic variant
The convex hull trick is easy to implement when all insertions are given before all queries (offline version) or when each new line inserted has a lower slope than any line currently in the envelope. Indeed, by using a deque, we can easily allow insertion of lines with higher slope than any other line as well. However, in some applications, we might have no guarantee of either condition holding. That is, each new line to be added may have any slope whatsoever, and the insertions may be interspersed with queries, so that sorting the lines by slope ahead of time is impossible, and scanning through an array to find the lines to be removed could take linear time per insertion
It turns out, however, that it is possible to support arbitrary insertions in amortized logarithmic time. To do this, we store the lines in an ordered dynamic set (such as C++'s std::set
). Each line possesses the attributes of slope and y-intercept, the former being the key, as well as an extra field, the minimum -coordinate at which this line is the lowest in the set. To insert a new line, we merely insert it into its correct position in the set, and then all the lines to be removed, if any, are contiguous with it. The procedure is then largely the same as for the case in which we always inserted lines of minimal slope: if the line to be added is , the line to the left is , and the line to the left of that is , then we check if the - intersection is to the left of the - intersection; if so, is discarded and we repeat; similarly, if lines and are on the right, then can be removed if the - intersection is to the left of the - intersection, and this too is performed repeatedly until no more lines are to be discarded. We compute the new values (for , it is the - intersection, and for , it is the - intersection).
To handle queries, we keep another set, storing the same data but this time ordered by the value. When we insert or remove lines from that set (or update, in the case of above), we use the value from the element in that set to associate it with an element in this one. Whenever a query is made, therefore, all we have to do is to find the greatest value in this set that is less than the query value; the corresponding line is the optimal one .
References
- Brucker, P. (1995). Efficient algorithms for some path partitioning problems. Discrete Applied Mathematics, 62(1-3), 77-85.
- Christiano, Paul. (2007). Land acquisition. Retrieved from an archived copy of the competition problem set at http://tjsct.wikidot.com/usaco-mar08-gold
- Peng, Richard. (2008). USACO MAR08 problem 'acquire' analysis. Retrieved from http://ace.delos.com/TESTDATA/MAR08.acquire.htm
- van den Hooff, Jelle. (2010). Personal communication.
- Wang, Hanson. (2010). Personal communication.