Editing Prefix sum array and difference array

Jump to: navigation, search

Warning: You are not logged in. Your IP address will be publicly visible if you make any edits. If you log in or create an account, your edits will be attributed to your username, along with other benefits.

The edit can be undone. Please check the comparison below to verify that this is what you want to do, and then save the changes below to finish undoing the edit.
Latest revision Your text
Line 67: Line 67:
 
To solve this, we will first transform array <math>S</math> into its prefix sum array <math>P(0,S)</math>. Notice that the sum of each contiguous subsequence <math>S_i + S_{i+1} + S_{i+2} + ... + S_{j-1}</math> corresponds to the difference of two elements of <math>P</math>, that is, <math>P_j - P_i</math>. So what we want to find is the number of pairs <math>(i,j)</math> with <math>P_j - P_i = 47</math> and <math>i < j</math>. (Note that if <math>i > j</math>, we will instead get a subsequence with sum -47.)
 
To solve this, we will first transform array <math>S</math> into its prefix sum array <math>P(0,S)</math>. Notice that the sum of each contiguous subsequence <math>S_i + S_{i+1} + S_{i+2} + ... + S_{j-1}</math> corresponds to the difference of two elements of <math>P</math>, that is, <math>P_j - P_i</math>. So what we want to find is the number of pairs <math>(i,j)</math> with <math>P_j - P_i = 47</math> and <math>i < j</math>. (Note that if <math>i > j</math>, we will instead get a subsequence with sum -47.)
  
However, this is quite easy to do. We sweep through <math>P</math> from left to right, keeping a [[map]] of all elements of <math>P</math> we've seen so far, along with their frequencies; and for each element <math>P_j</math> we count the number of times <math>P_j - 47</math> has appeared so far, by looking up that value in our map; this tells us how many contiguous subsequences ending at <math>S_{j-1}</math> have sum 47. And finally, adding the number of contiguous subsequences with sum 47 ending at each entry of <math>S</math> gives the total number of such subsequences in the array. Total time taken is <math>O(N)</math>, if we use a [[hash table]] implementation of the map.
+
However, this is quite easy to do. We sweep through <math>P</math> from left to right, keeping a [[map]] of all elements of <math>P</math> we've seen so far, along with their frequencies; and for each element <math>P_j</math> we count the number of times <math>P_j - 48</math> has appeared so far, by looking up that value in our map; this tells us how many contiguous subsequences ending at <math>S_{j-1}</math> have sum 47. And finally, adding the number of contiguous subsequences with sum 47 ending at each entry of <math>S</math> gives the total number of such subsequences in the array. Total time taken is <math>O(N)</math>, if we use a [[hash table]] implementation of the map.
  
 
==Use of difference array==
 
==Use of difference array==
The difference array is used to keep track of an array when ranges of said array can be updated all at once. If we have array <math>A</math> and add an increment <math>k</math> to elements <math>A_i, A_{i+1}, ..., A_{i-1}</math>, then notice that <math>D_0, D_1, ..., D_{i-2}</math> are not affected; that <math>D_{i-1} = A_i - A_{i-1}</math> is increased by <math>k</math>; that <math>D_i, D_{i+1}, ..., D_{j-2}</math> are not affected; that <math>D_{j-1} = A_j - A_{j-1}</math> is decreased by <math>k</math>; and that <math>D_j, D_{j+1}, ...</math> are unaffected. Thus, if we are required to update many ranges of an array in this manner, we should keep track of <math>D</math> rather than <math>A</math> itself, and then integrate at the end to reconstruct <math>A</math>.
+
The difference array is used to keep track of an array when ranges of said array can be updated all at once. If we have array <math>A</math> and add an increment <math>k</math> to elements <math>A_i, A_{i+1}, ..., A_{j-1}</math>, then notice that <math>D_0, D_1, ..., D_{i-2}</math> are not affected; that <math>D_{i-1} = A_i - A{i-1}</math> is increased by <math>k</math>; that <math>D_i, D_{i+1}, ..., D_{j-2}</math> are not affected; that <math>D_{j-1} = A_j - A_{j-1}</math> is decreased by <math>k</math>; and that <math>D_j, D_{j+1}, ...</math> are unaffected. Thus, if we are required to update many ranges of an array in this manner, we should keep track of <math>D</math> rather than <math>A</math> itself, and then integrate at the end to reconstruct <math>A</math>.
  
 
===Example: Wireless (CCC)===
 
===Example: Wireless (CCC)===
Line 110: Line 110:
 
After the prefix sum array has been computed, we can use it to add together any rectangle in <math>A</math>, that is, all the elements with their indices in <math>[x_1,x_2), [y_1,y_2)</math>. To do so, we first observe that <math>P_{x_2,y_2} - P_{x_1,y_2}</math> gives the sum of all the elements with indices in the box <math>[x_1, x_2) \times [0, y_2)</math>. Then we similarly observe that <math>P_{x_2,y_1} - P_{x_1,y_1}</math> corresponds to the box <math>[x_1, x_2) \times [0, y_1)</math>. Subtracting gives the box <math>[x_1, x_2) \times [y_1, y_2)</math>. This gives a final formula of <math>P_{x_2,y_2} - P_{x_1,y_2} - P_{x_2,y_1} + P_{x_1,y_1}</math>.
 
After the prefix sum array has been computed, we can use it to add together any rectangle in <math>A</math>, that is, all the elements with their indices in <math>[x_1,x_2), [y_1,y_2)</math>. To do so, we first observe that <math>P_{x_2,y_2} - P_{x_1,y_2}</math> gives the sum of all the elements with indices in the box <math>[x_1, x_2) \times [0, y_2)</math>. Then we similarly observe that <math>P_{x_2,y_1} - P_{x_1,y_1}</math> corresponds to the box <math>[x_1, x_2) \times [0, y_1)</math>. Subtracting gives the box <math>[x_1, x_2) \times [y_1, y_2)</math>. This gives a final formula of <math>P_{x_2,y_2} - P_{x_1,y_2} - P_{x_2,y_1} + P_{x_1,y_1}</math>.
  
In the <math>n</math>-dimensional case, to sum the elements of <math>A</math> with indices in the box <math>[x_{1,0}, x_{1,1}) \times [x_{2,0}, x_{2,1}) \times ... \times [x_{n,0}, x_{n,1})</math>, we will use the formula <math>\sum_{k_1=0}^1 \sum_{k_2=0}^1 ... \sum_{k_n=0}^1 (-1)^{n - k_1 - k_2 - ... - k_n} P_{x_{1,k_1}, x_{2,k_2}, ..., x_{n,k_n}}</math>. We will not state the proof, instead noting that it is a form of the inclusion&ndash;exclusion principle.
+
In the <math>n</math>-dimensional case, to sum the elements of <math>A</math> with indices in the box <math>[x_{1,0}, x_{1,1}) \times [x_{2,0}, x_{2,1}) \times ... \times [x_{n,0}, x_{n,1})</math>, we will use the formula <math>(-1)^n \sum_{k_1=0}^1 \sum_{k_2=0}^1 ... \sum_{k_n=0}^1 P_{x_{1,k_1}, x_{2,k_2}, ..., x_{n,k_n}}</math>. We will not state the proof, instead noting that it is a form of the inclusion&ndash;exclusion principle.
  
 
====Example: Diamonds (BOI)====
 
====Example: Diamonds (BOI)====
Line 119: Line 119:
  
 
We will simply define the difference array <math>D</math> to be the array whose prefix sum array is <math>A</math>. This means in particular that <math>D_{i,j}</math> is the sum of elements of <math>D</math> with indices in the box <math>[i,i+1) \times [j,j+1)</math> (this box contains only the single element <math>D_{i,j}</math>). Using the prefix sum array <math>A</math>, we obtain <math>D_{i,j} = A_{i+1,j+1} - A_{i,j+1} - A_{i+1,j} + A_{i,j}</math>. In general:
 
We will simply define the difference array <math>D</math> to be the array whose prefix sum array is <math>A</math>. This means in particular that <math>D_{i,j}</math> is the sum of elements of <math>D</math> with indices in the box <math>[i,i+1) \times [j,j+1)</math> (this box contains only the single element <math>D_{i,j}</math>). Using the prefix sum array <math>A</math>, we obtain <math>D_{i,j} = A_{i+1,j+1} - A_{i,j+1} - A_{i+1,j} + A_{i,j}</math>. In general:
:<math>D_{i_1, i_2, ..., i_n} = \sum_{k_1=0}^1 \sum_{k_2=0}^1 ... \sum_{k_n=0}^1 (-1)^{n - k_1 - k_2 - ... - k_n} A_{i_1+k_1,i_2+k_2,...,i_n+k_n}</math>.
+
:<math>D_{i_1, i_2, ..., i_n} = (-1)^n \sum_{k_1=0}^1 \sum_{k_2=0}^1 ... \sum_{k_n=0}^1 A_{i_1+k_1,i_2+k_2,...,i_n+k_n}</math>.
  
 
Should we actually need to ''compute'' the difference array, on the other hand, the easiest way to do so is by reversing the computation of the prefix sum array:
 
Should we actually need to ''compute'' the difference array, on the other hand, the easiest way to do so is by reversing the computation of the prefix sum array:
Line 145: Line 145:
  
 
====Example: The Cake is a Dessert====
 
====Example: The Cake is a Dessert====
{{Problem|cake|The Cake is a Dessert}} from the Waterloo&ndash;Woburn 2011 Mock CCC is a straightforward application of the difference array in two dimensions.
+
{{Problem|cake|The Cake is a Dessert}} from the Woburn&ndash;Waterloo 2011 Mock CCC is a straightforward application of the difference array in two dimensions.
  
 
==A word on the dynamic case==
 
==A word on the dynamic case==
The dynamic version of the problem that the prefix sum array is intended to solve requires us to be able to carry out an intermixed sequence of operations, where each operation either changes an element of <math>A</math> or asks us to determine the sum <math>A_0 + A_1 + ... + A_{i-1}</math> for some <math>i</math>. That is, we need to be able to compute entries of the prefix sum array of an array that is changing (dynamic). If we can do this, then we can do range sum queries easily on a dynamic array, simply by taking the difference of prefix sums as in the static case. This is a bit trickier to do, but is most easily and efficiently accomplished using the [[binary indexed tree]] data structure, which is specifically designed to solve this problem and can be updated in <math>O(\log(m_1) \log(m_2) ... \log(m_n))</math> time, where <math>n</math> is the number of dimensions and each <math>m</math> is a dimension. Each prefix sum query also takes that amount of time, but we need to perform <math>2^n</math> of these to find a box sum using the formula given previously, so the running time of the query is asymptotically <math>2^n</math> times greater than that of the update. In the most commonly encountered one-dimensional case, both query and update are simply <math>O(\log m)</math>.
+
The dynamic version of the problem that the prefix sum array is intended to solve requires us to be able to carry out an intermixed sequence of operations, where each operation either changes an element of <math>A</math> or asks us to determine the sum <math>A_0 + A_1 + ... + A_{i-1}</math> for some <math>i</math>. That is, we need to be able to compute entries of the prefix sum array of an array that is changing (dynamic). If we can do this, then we can do range sum queries easily on a dynamic array, simply by taking the difference of prefix sums as in the static case. This is a bit trickier to do, but is most easily and efficiently accomplished using the [[binary indexed tree]] data structure, which allows both queries and updates in <math>O(2^n \log(m_1) \log(m_2) ... \log(m_n))</math> time, where <math>n</math> is the number of dimensions and each <math>m</math> is a dimension. In the most commonly encountered one-dimensional case, this is just logarithmic in the size of the array.
  
The dynamic version of the problem that the difference array is intended to solve is analogous: we want to be able to increment entire ranges of an array, but at any point we also want to be able to determine any element of the array. Here, we simply use a binary indexed tree for the difference array, and compute prefix sums whenever we want to access an element. Here, it is the update that is slower by a factor of <math>2^n</math> than the query (because we have to update <math>2^n</math> entries at a time, but only query one.)
+
The dynamic version of the problem that the difference array is intended to solve is analogous: we want to be able to increment entire ranges of an array, but at any point we also want to be able to determine any element of the array. Here, we simply use a binary indexed tree for the difference array, and compute prefix sums whenever we want to access an element. (Note that this incurs an additional factor of <math>2^n</math> in the running time of the update operation.)
  
On the other hand, if we combine the two, that is, we wish to be able to increment ranges of some array <math>A</math> as well as query prefix sums of <math>A</math> in an intermixed sequence, matters become more complicated. Notice that in the static case we would just track the difference array of <math>A</math> and then integrate twice at the end. In the dynamic case, however, we need to use a [[segment tree]] with lazy propagation. This will give us <math>O(2^n \log(m_1) \log(m_2) ... \log(m_n))</math> time on both the query and the update, but is somewhat more complex and slower than the binary indexed tree on the [[invisible constant factor]].
+
On the other hand, if we combine the two, that is, we wish to be able to increment ranges of some array <math>A</math> as well as query prefix sums of <math>A</math> in an intermixed sequence, matters become more complicated. Notice that in the static case we would just track the difference array of <math>A</math> and then integrate twice at the end. In the dynamic case, however, we need to use a [[segment tree]] with lazy propagation. This will give us <math>O(2^n \log(m_1) \log(m_2) ... \log(m_n))</math> time on both the query and the update, but is somewhat more complex and slower than the binary indexed tree.
  
 
[[Category:Pages needing diagrams]]
 
[[Category:Pages needing diagrams]]

Please note that all contributions to PEGWiki are considered to be released under the Attribution 3.0 Unported (see PEGWiki:Copyrights for details). If you do not want your writing to be edited mercilessly and redistributed at will, then do not submit it here.
You are also promising us that you wrote this yourself, or copied it from a public domain or similar free resource. Do not submit copyrighted work without permission!

Cancel | Editing help (opens in new window)