Editing Maximum subvector sum

The '''maximum subvector sum''' problem is that of finding a segment of a vector ([[array]] of numbers)<ref>In computer science, ''vector'' is often used to mean ''array of real numbers''; sometimes it is simply synonymous with ''array''. Here, ''array'' is used in contradistinction with ''[[linked list]]'', which does not support efficient random access. This meaning is related to but different from the meaning of the word in mathematics and physics.</ref>, possibly empty, with maximum sum. If all elements of the array are nonnegative, then the problem may be trivially solved by taking the entire vector; if all elements are negative, then the problem is again trivially solved by taking the empty segment (whose sum is [[Empty sum|conventionally defined to be zero]]). The problem requires more thought, however, when the vector contains a mixture of positive and negative numbers.

This problem originated in the domain of image processing;<ref name="bentley">Bentley, Jon (1984), "Programming pearls: algorithm design techniques", ''Communications of the ACM'' '''27''' (9): 865–873, doi:10.1145/358234.381162.</ref> applications have also been found in data mining.<ref name="takaoka">Takaoka, T. (2002), "Efficient algorithms for the maximum subarray problem by distance matrix multiplication", ''Electronic Notes in Theoretical Computer Science'' '''61'''.</ref> In algorithmic programming competitions, Kadane's [[linear time]] algorithm for this problem is often useful as a building block within a more complex algorithm for processing a multidimensional array.

==One-dimensional problem==
Bentley<ref name="bentley"/> describes four algorithms for this problem, of running time <math>O(n^3)</math>, <math>O(n^2)</math>, <math>O(n\log n)</math>, and <math>O(n)</math>. We discuss only the latter in this article, which is known as ''Kadane's algorithm'' after its discoverer.

Kadane's algorithm is a classic example of [[dynamic programming]]. It works by scanning the vector from left to right and computing the maximum-sum subvector ''ending at'' each of the vector's entries; denote the best sum ending at entry <math>i</math> by <math>M_i</math>. Also, for convenience, set <math>M_0 = 0</math> for the empty subvector. The maximum subvector sum for the entire array <math>A</math> is then, of course, <math>\max(M_0, M_1, M_2, \ldots, M_n)</math>. <math>M_i</math> for <math>i > 0</math> may then be computed as follows:
:<math>M_i = \max(A_i, A_i + M_{i-1}) = A_i + \max(0, M_{i-1})</math>
This formula is based on the following optimal substructure: The maximum-sum subvector ending at position <math>i</math> consists of either only the element <math>A_i</math> itself, or that element plus one or more elements <math>A_j, A_{j+1}, \ldots, A_{i-1}</math> (that is, ending at the previous position <math>i-1</math>). But the sum obtained in the latter case is simply <math>A_i</math> plus the sum of the subvector ending at <math>A_{i-1}</math>, so we want to make the latter as great as possible, requiring us to choose the maximum-sum subvector ending at <math>A_{i-1}</math>. This accounts for the <math>A_i + M_{i-1}</math> term. Of course, if <math>M_{i-1}</math> turned out to be negative, then there is no point in including any terms before <math>A_i</math> at all. This is why we must take the greater of <math>A_i</math> and <math>A_i + M_{i-1}</math>.

===Implementation (C++)===
This snippet is a direct translation of the algorithm given in the text:
<syntaxhighlight lang="cpp">
template<class Vector>
int max_subvector_sum(Vector V)
{
    int* M = new int[V.size() + 1];         // dynamically allocate array M
    M[0] = 0;
    for (int i = 0; i < V.size(); i++)
        M[i+1] = V[i] + max(0, M[i]);       // apply the formula, but with zero-indexed arrays
    int res = max_element(M, M+V.size()+1);
    delete[] M;
    return res;
}
</syntaxhighlight>
With a bit of thought, this can be simplified. In particular, we don't need to keep the entire array <math>M</math>; we only need to remember the last element computed.
<syntaxhighlight lang="cpp">
template<class Vector>
int max_subvector_sum(Vector V)
{
    int res = 0, cur = 0;
    for (int i = 0; i < V.size(); i++)
        res = max(res, cur = V[i] + max(0, cur));
    return res;
}
</syntaxhighlight>

==Higher dimensions==
The obvious generalization of the problem is as follows: given a <math>d</math>-dimensional array (or ''tensor'') of dimensions <math>n_1 \leq n_2 \leq \ldots \leq n_d</math>, find indices <math>1 \leq a_1 \leq b_1 \leq n_1, 1 \leq a_2 \leq b_2 \leq n_2, \ldots, 1 \leq a_d \leq b_d \leq n_d</math> such that the sum <math>\sum_{i_1=a_1}^{b_1} \sum_{i_2=a_2}^{b_2} \ldots \sum_{i_d=a_d}^{b_d} A_{i_1, i_2, \ldots, i_d}</math> is maximized (or return 0 if all entries are negative). If we imagine a two-dimensional array as a matrix, then the problem is to pick some axis-aligned rectangle within the matrix with maximum sum.

We describe a simple algorithm for this problem. It works by recursively reducing a <math>d</math>-dimensional problem to <math>O(n_1^2)</math> simpler, <math>(d-1)</math>-dimensional problems, terminating at <math>d=1</math> which can be solved in <math>O(n_d)</math> time using the one-dimensional algorithm. Evidently, this algorithm takes time <math>O(n_1^2 n_2^2 \ldots n_{d-1}^2 n_d)</math>. In the case with all dimensions equal, this reduces to <math>O(n^{2d-1})</math>, or <math>O(n^3)</math> in the two-dimensional case.

The details are as follows. We try all possible sets of bounds <math>[a_1, b_1] \in [1, n_1]</math> for the last index. For each such interval, we create a <math>(d-1)</math>-dimensional tensor <math>B</math> where <math>B_{i_2, i_3, \ldots, i_d} = \sum_{i_1=a_1}^{b_1} A_{i_1, i_2, \ldots, i_d}</math> and compute the maximum subtensor sum in <math>B</math>. The range of indices that this represents in the original array will be the Cartesian product of the indices in the maximum-sum subtensor of <math>B</math> and the original range <math>[a_1, b_1]</math>, so that by trying all possibilities for the latter, we will account for all possible subtensors of the original array <math>A</math>.

This algorithm is not [[asymptotically optimal]]; for example, the <math>d=2</math> case can be solved in <math>O(n^3 (\log n)/(\log \log n))</math> time by a result due to Takaoka.<ref name="takaoka"/> This implies that for <math>d > 1</math>, we can always achieve <math>O(n^{2d-1} (\log n)/(\log \log n))</math>, which is slightly better than the algorithm given above. In practice, however, the gains are not large.

==Problems==
* {{Problem|monument|Baltic OI '09: Monument}}
* {{Problem|wc02Sp1|WC '02: Dark Lord}}

==Notes and references==
<references/>
@@ Line 9: / Line 9: @@
 :<math>M_i = \max(A_i, A_i + M_{i-1}) = A_i + \max(0, M_{i-1})</math>
 This formula is based on the following optimal substructure: The maximum-sum subvector ending at position <math>i</math> consists of either only the element <math>A_i</math> itself, or that element plus one or more elements <math>A_j, A_{j+1}, \ldots, A_{i-1}</math> (that is, ending at the previous position <math>i-1</math>). But the sum obtained in the latter case is simply <math>A_i</math> plus the sum of the subvector ending at <math>A_{i-1}</math>, so we want to make the latter as great as possible, requiring us to choose the maximum-sum subvector ending at <math>A_{i-1}</math>. This accounts for the <math>A_i + M_{i-1}</math> term. Of course, if <math>M_{i-1}</math> turned out to be negative, then there is no point in including any terms before <math>A_i</math> at all. This is why we must take the greater of <math>A_i</math> and <math>A_i + M_{i-1}</math>.
-===Example===
-In the vector [5, -3, -3, 3, 0, -1, 4, -5, 5], the maximal subvector is [3, 0, -1, 4]; this subvector has a sum of 6, which is greater than that of any other subvector. Notice that in order to construct this subvector, we had to take a negative entry because it lay between two large positive entries; if we refused to include any negative entries, the maximum sum obtainable would be only 5 (by taking one of the elements at the ends).
 ===Implementation (C++)===
@@ Line 41: / Line 38: @@
 ==Higher dimensions==
-The obvious generalization of the problem is as follows: given a <math>d</math>-dimensional array (or ''tensor'') of dimensions <math>n_1 \leq n_2 \leq \ldots \leq n_d</math>, find indices <math>1 \leq a_1 \leq b_1 \leq n_1, 1 \leq a_2 \leq b_2 \leq n_2, \ldots, 1 \leq a_d \leq b_d \leq n_d</math> such that the sum <math>\sum_{i_1=a_1}^{b_1} \sum_{i_2=a_2}^{b_2} \ldots \sum_{i_d=a_d}^{b_d} A_{i_1, i_2, \ldots, i_d}</math> is maximized (or return 0 if all entries are negative).
+The obvious generalization of the problem is as follows: given a <math>d</math>-dimensional array (or ''tensor'') of dimensions <math>n_1 \leq n_2 \leq \ldots \leq n_d</math>, find indices <math>1 \leq a_1 \leq b_1 \leq n_1, 1 \leq a_2 \leq b_2 \leq n_2, \ldots, 1 \leq a_d \leq b_d \leq n_d</math> such that the sum <math>\sum_{i_1=a_1}^{b_1} \sum_{i_2=a_2}^{b_2} \ldots \sum_{i_d=a_d}^{b_d} A_{i_1, i_2, \ldots, i_d}</math> is maximized (or return 0 if all entries are negative). If we imagine a two-dimensional array as a matrix, then the problem is to pick some axis-aligned rectangle within the matrix with maximum sum.
-We describe a simple algorithm for this problem. It works by recursively reducing a <math>d</math>-dimensional problem to <math>O(n_1^2)</math> simpler, <math>(d-1)</math>-dimensional problems, terminating at <math>d=1</math> which can be solved in <math>O(n_d)</math> time using the one-dimensional algorithm. Evidently, this algorithm takes time <math>O(n_1^2 n_2^2 \ldots n_{d-1}^2 n_d)</math>. In the case with all dimensions equal, this reduces to <math>O(n^{2d-1})</math>.
+We describe a simple algorithm for this problem. It works by recursively reducing a <math>d</math>-dimensional problem to <math>O(n_1^2)</math> simpler, <math>(d-1)</math>-dimensional problems, terminating at <math>d=1</math> which can be solved in <math>O(n_d)</math> time using the one-dimensional algorithm. Evidently, this algorithm takes time <math>O(n_1^2 n_2^2 \ldots n_{d-1}^2 n_d)</math>. In the case with all dimensions equal, this reduces to <math>O(n^{2d-1})</math>, or <math>O(n^3)</math> in the two-dimensional case.
-The details are as follows. We try all possible sets of bounds <math>[a_1, b_1] \subseteq [1, n_1]</math> for the first index. For each such interval, we create a <math>(d-1)</math>-dimensional tensor <math>B</math> where <math>B_{i_2, i_3, \ldots, i_d} = \sum_{i_1=a_1}^{b_1} A_{i_1, i_2, \ldots, i_d}</math> and compute the maximum subtensor sum in <math>B</math>. The range of indices that this represents in the original array will be the Cartesian product of the indices in the maximum-sum subtensor of <math>B</math> and the original range <math>[a_1, b_1]</math>, so that by trying all possibilities for the latter, we will account for all possible subtensors of the original array <math>A</math>.
+The details are as follows. We try all possible sets of bounds <math>[a_1, b_1] \in [1, n_1]</math> for the last index. For each such interval, we create a <math>(d-1)</math>-dimensional tensor <math>B</math> where <math>B_{i_2, i_3, \ldots, i_d} = \sum_{i_1=a_1}^{b_1} A_{i_1, i_2, \ldots, i_d}</math> and compute the maximum subtensor sum in <math>B</math>. The range of indices that this represents in the original array will be the Cartesian product of the indices in the maximum-sum subtensor of <math>B</math> and the original range <math>[a_1, b_1]</math>, so that by trying all possibilities for the latter, we will account for all possible subtensors of the original array <math>A</math>.
-===Two-dimensional implementation===
-In two dimensions, the time required is <math>O(m^2 n)</math>, where <math>m < n</math>, or <math>O(n^3)</math> if <math>m = n</math>. If we imagine a two-dimensional array as a matrix, then the problem is to pick some axis-aligned rectangle within the matrix with maximum sum. The algorithm described above can be written as follows:
-<syntaxhighlight lang="cpp">
-template<class Matrix>
-int max_submatrix_sum(Matrix M)
-{
-    int m = M.size();
-    int n = M[0].size();
-    vector<int> B(n);
-    int res = 0;
-    for (int a = 0; a < m; a++)
-    {
-        fill(B.begin(), B.end(), 0);
-        for (int b = a; b < m; b++)
-        {
-            for (int i = 0; i < n; i++)
-                B[i] += M[b][i];
-            res = max(res, max_subvector_sum(B));
-        }
-    }
-    return res;
-}
-</syntaxhighlight>
-Observe that, in the above code, the outer loop fixes <math>a</math>, the starting row, and the middle loop fixes <math>b</math>, the ending row. For every combination of starting and ending row, we compute the sum of the elements in those rows ''for each column''; these sums are stored in the vector <math>B</math>, which is then passed to the one-dimensional subroutine given earlier in the article. That is, <math>B_i = A_{a, i} + A_{a+1, i} + \ldots + A_{b, i}</math>. However, we cannot afford to actually ''compute'' each entry of <math>B</math> in this way; that would give <math>O(m^3 n)</math> time. Instead, we compute <math>B</math> dynamically, by re-using the values from the previous <math>b</math>, since <math>A_{a, i} + \ldots + A_{b, i} = (A_{a, i} + \ldots + A_{b-1, i}) + A_{b, i}</math>. This accounts for the line <code>B[i] += M[b][i]</code> in the code above.
-===Faster algorithms===
 This algorithm is not [[asymptotically optimal]]; for example, the <math>d=2</math> case can be solved in <math>O(n^3 (\log n)/(\log \log n))</math> time by a result due to Takaoka.<ref name="takaoka"/> This implies that for <math>d > 1</math>, we can always achieve <math>O(n^{2d-1} (\log n)/(\log \log n))</math>, which is slightly better than the algorithm given above. In practice, however, the gains are not large.