Dynamic programming
Dynamic programming (DP) is a technique for solving problems that involves computing the solution to a large problem using previously-computed solutions to smaller problems. DP, however, differs from recursion in that in DP, one solves the smallest possible subproblems (the base cases) first, and works up to the original problem, storing the solution to each subproblem in a table (usually a simple array or equivalent data structure) as it is computed so that its value can be quickly looked up when computing the solutions to larger subproblems. DP is especially well-suited to optimization problems. A number of examples are presented in this article to help explain exactly what DP is.
Contents
Basic principles of DP
Not every problem can be solved using DP, just as not every problem can be solved using recursion. In order for a problem to be solvable using DP, it must have at least the properties of overlapping subproblems and optimal substructure. We will examine the problem Plentiful Paths from the 1996 Canadian Computing Competition and show how it satisfies these criteria.
Optimal substructure
An optimization problem is said to have optimal substructure when it can be broken down into subproblems and the optimal solution to the original problem can be computed from the optimal solutions of the subproblems. Another way of saying this is that the optimal solution to a large problem contains an optimal solution (sometimes more than one) for a subproblem (maybe more than one). Let's put this into the context of Plentiful Paths.
The Plentiful Paths problem asks us to optimize the number of apples collected in travelling from the lower-left square to the upper-right square. We start at (1,1) and end up eventually at (M,N). It is fairly clear that at some point we must pass through either (M-1,N) or (M,N-1). (Think about this carefully if you don't see why this is.) Suppose that the optimal solution involved travelling through (M-1,N). So we take some path to reach (M-1,N) and then we take one step to reach the end, (M,N). A bit of thought should convince you that we must collect as many apples as possible on the path from (1,1) to (M-1,N) if we want to collect the maximum possible number of apples on the path from (1,1) to (M,N). This is because if we found a better path to get to (M-1,N) - that is, one with more apples along the way - then we would also collect more apples in total getting to (M,N).
Having said that, the optimal path to (M-1,N) must itself consist of an optimal path to one of the squares immediately before it, for the same reason: either (M-2,N) or (M-1,N-1)... and so on.
(Notice that if A is a subproblem of B, and B is a subproblem of C, then A is a subproblem of C as well. Then, a DP solution would first solve A, then use this to solve B, and finally use that to solve C. But if A is a subproblem of B, B cannot be a subproblem of A. This is because, in order to find an optimal solution to B, we would first need an optimal solution to A, but to find an optimal solution to A, we would first need an optimal solution to B, so ultimately we can solve neither problem.)
The Plentiful Paths problem has optimal substructure because any optimal solution to an instance of Plentiful Paths contains an optimal solution to a subproblem. In general, in a DP solution, we start by solving the smallest possible subproblems (such as: "how many apples can we collect going from (1,1) to (1,1)?") and work our way up to larger problems.
Overlapping subproblems
A problem is said to have overlapping subproblems when some of its subproblems are repeated. For example, in Plentiful Paths, in order to find an optimal path to (M,N), we must find the optimal paths to (M-1,N) and (M,N-1). To find the optimal path to (M-1,N), we need to find the optimal paths to both (M-2,N) and (M-1,N-1); to find the optimal path to (M,N-1) we must find the optimal path to both (M-1,N-1) and (M,N-2). Notice that the subproblem (M-1,N-1) was repeated. It is the fact that subproblems are repeated that gives DP its massive efficiency advantage over the recursive solution in the next section.
Recursive solution to Plentiful Paths
Based on the optimal substructure property of the Plentiful Paths problem, we might write the following pseudocode:
function plentiful_paths(A,x,y) if x=1 if y=1 then return A[x,y] else return A[x,y]+plentiful_paths(A,x,y-1) else if y=1 then return A[x,y]+plentiful_paths(A,x-1,y) else return A[x,y]+max(plentiful_paths(A,x-1,y),plentiful_paths(A,x,y-1)) input M,N,A print plentiful_paths(A,M,N)
Here, A is a matrix (two-dimensional array) in which an entry is 1 if the corresponding position in the grid contains an apple and 0 if it does not contain an apple; the function plentiful_paths(A,x,y) returns the maximum number of apples that can be retrieved in going from (1,1) to (x,y) in grid A.
If x and y are both 1, then the question is how many apples we can collect if we travel from (1,1) to (1,1); this will simply be 1 if there is an apple at (1,1) and 0 otherwise. (That is the quantity A[x,y].)
If x is 1 but y is not, then in order to travel from (1,1) to (x,y), we must first travel from (1,1) to (x,y-1), then take one step to reach (x,y). The maximum number of apples we can collect in travelling to (x,y) is equal to the maximum number of apples we can collect in travelling to (x,y-1) (this is plentiful_paths(A,x,y-1)), plus one if there is an apple at (x,y).
Similarly, if x is not 1 and y is 1, then we collect the maximum number of apples in travelling to (x-1,y), then we add one apple if there is an apple in the square (x,y).
If x and y are both greater than 1, then the optimal path to (x,y) goes through either (x-1,y) or (x,y-1). The maximum number of apples collected in reaching (x,y) will be either the maximum number of apples collected in reaching (x-1,y) or the same for (x,y-1), whichever is greater (that is to say, max(plentiful_paths(A,x-1,y),plentiful_paths(A,x,y-1))), plus one if there is an apple at (x,y).
The problem with this recursive solution is that, even though the value of plentiful_paths(A,i,j) will be the same every time it is calculated as long as A, i, and j stay the same, it might calculate that value hundreds of times. For example, in calculating plentiful_paths(A,5,5), we will actually calculate plentiful_paths(A,1,1) seventy times. This is because of the overlapping subproblems. For larger grids, we waste so much time recalculating values that the algorithm becomes impractically slow and therefore unusable. We can avoid this by using memoization. However, DP presents an alternative solution.
(Note that in a real recursive solution we would not pass A by value, since its value never changes and thus passing by value would slow down the solution massively by copying A over and over again. Instead, A would be either passed by reference or stored as a global variable. In the latter case, it would disappear from the parameter list altogether.)
Dynamic solution to Plentiful Paths
input M,N,A for x = 1 to M for y = 1 to N if x=1 then if y=1 then dp[x,y]=A[x,y] else dp[x,y]=A[x,y]+dp[x,y-1] else if y=1 then dp[x,y]=A[x,y]+dp[x-1,y] else dp[x,y]=A[x,y]+max(dp[x-1,y],dp[x,y-1]) print dp[M,N]
In the dynamic solution, dp[x,y] represents the maximum possible number of apples that can be obtained on some path from (1,1) to (x,y). As soon as we calculate this value, we store it in the two-dimensional array dp[]. We see that this solution follows essentially the same logic as the recursive solution, and our final answer is dp[M,N]. However, no entry in the dp[] table is calculated more than once. When M and N are each 1000, it would take far longer than the age of the universe for the recursive solution to finish executing even if the entire surface fo the Earth were covered in transistors being used exclusively to execute the algorithm, but the dynamic solution would run in under a second on a modern desktop computer.
This solution exemplifies both the optimal substructure property (because in calculating dp[x,y], an optimal solution for the square (x,y) we potentially examine the values of dp[x-1,y] and dp[x,y-1], optimal solutions for possible preceding squares, or subproblems) and the overlapping subproblems property (most entries in the dp[] table will be examined twice, but only calculated once). It is the presence of overlapping subproblems that gives DP such a huge efficiency advantage over the naive recursive solution: we will only calculate the solution to each subproblem once, and from then on we will look it up whenever it is needed. Finally, we have computed the solution bottom-up, by solving the smallest subproblems first so that by the time we reach a particular subproblem, we have already computed the solutions to any subproblems on which that subproblem depends. Contrast this with the recursive solution, which considers the largest subproblem (the original problem) first; it is top-down.