Convex hull trick

From PEGWiki
Revision as of 05:03, 3 January 2010 by Brian (Talk | contribs)

Jump to: navigation, search

The convex hull trick is a technique (perhaps best classified as a data structure) used to determine efficiently, after preprocessing, which member of a set of linear functions in one variable attains an extremal value for a given value of the independent variable. It has little to do with convex hull algorithms.

History

The convex hull trick is perhaps best known in algorithm competition from being required to obtain full marks in several USACO problems, such as MAR08 "acquire", which began to be featured in national olympiads after its debut in the IOI '02 task Batch Scheduling, which itself credits the technique to a 1995 paper (see references). Very few online sources mention it, and almost none describe it. (The technique might not be considered important enough, or people might want to avoid providing the information to other countries' IOI teams.)

The problem

Suppose that a large set of linear functions in the form y = m_i x + b_i is given along with a large number of queries. Each query consists of a value of x and asks for the minimum value of y that can be obtained if we select one of the linear functions and evaluate it at x. For example, suppose our functions are y=4, y=4/3+2/3\,x, y=12-3x, and y=3-1/2\,x and we receive the query x=1. We have to identify which of these functions assumes the lowest y-value for x=1, or what that value is. (It is the function y=4/3+2/3\,x, assuming a value of 2.)

Graphical representation of the lines in this example
When the lines are graphed, this is easy to see: we want to determine, at the x-coordinate 1 (shown by the red vertical line), which line is "lowest" (has the lowest y-coordinate). Here it is the heavy dashed line, y=4/3+2/3\,x.

Naive algorithm

For each of the Q queries, of course, we may simply evaluate every one of the linear functions, and determine which one has the least value for the given x-value. If M lines are given along with Q queries, the complexity of this solution is \mathcal{O}(MQ). The "trick" enables us to speed up the time for this computation to \mathcal{O}((Q+M)\lg Q), a significant improvement.

The technique

Consider the diagram above. Notice that the line y=4 will never be the lowest one, regardless of the x-value. Of the remaining three lines, each one is the minimum in a single contiguous interval (possibly having plus or minus infinity as one bound). That is, the heavy dotted line is the best line at all x-values left of its intersection with the heavy solid line; the heavy solid line is the best line between that intersection and its intersection with the light solid line; and the light solid line is the best line at all x-values greater than that. Notice also that, as x increases, the slope of the minimal line decreases: 4/3, -1/2, -3. Indeed, it is not difficult to see that this is always true.

Thus, if we remove "irrelevant" lines such as y=4 in this example (the lines which will never give the minimum y-coordinate, regardless of the query value) and sort the remaining lines by slope, we obtain a collection of N intervals (where N is the number of lines remaining), in each of which one of the lines is the minimal one. If we can determine the endpoints of these intervals, it becomes a simple matter to use binary search to answer each query.

Etymology

The term convex hull is sometimes misused to mean upper/lower envelope. If we consider the "optimal" segment of each of the three lines (ignoring y=4), what we see is the lower envelope of the lines: nothing more or less than the set of points obtained by choosing the lowest point for every possible x-coordinate. (The lower envelope is highlighted in green in the diagram above.) Notice that the set bounded above by the lower envelope is convex. We could imagine the lower envelope being the upper convex hull of some set of points, and thus the name convex hull trick arises.

References

  • Brucker, P. (1995). Efficient algorithms for some path partitioning problems. Discrete Applied Mathematics, 62(1-3), 77-85.