Difference between revisions of "Recursive function"

From PEGWiki
Jump to: navigation, search
m
 
(One intermediate revision by the same user not shown)
Line 59: Line 59:
 
Recursion is a powerful tool for solving problems, but it is not always obvious how to approach a problem recursively. The general principle, often called '''divide and conquer''', is as follows:
 
Recursion is a powerful tool for solving problems, but it is not always obvious how to approach a problem recursively. The general principle, often called '''divide and conquer''', is as follows:
  
<div style="border: 1px solid #ccf; width: 60%; padding: 5px; margin-left:auto; margin-right:auto">Identify instances of your problem that are so small or simple that they are trivial. Determine how to break a large or complex instance into one or more smaller or simpler instances so that the solution to the latter can be used to determine a solution to the former. By repeating this process enough times, it is possible to solve any instance, no matter how large, by reducing it to trivial pieces.</div><!-- todo: move this styling to a template, to separate it from content -->
+
{{ImportantConceptBox|Identify instances of your problem that are so small or simple that they are trivial. Determine how to break a large or complex instance into one or more smaller or simpler instances so that the solution to the latter can be used to determine a solution to the former. By repeating this process enough times, it is possible to solve any instance, no matter how large, by reducing it to trivial pieces.}}
  
 
===Non-computer science examples===
 
===Non-computer science examples===
Line 169: Line 169:
 
   print "File not found"
 
   print "File not found"
 
</syntaxhighlight>
 
</syntaxhighlight>
 +
 +
[[Category:Incomplete]]

Latest revision as of 02:05, 14 February 2012

Recursion is the property exhibited by entities that are defined in terms of themselves, that is, recur in themselves. In computer science, the most important recursive entities are recursively defined functions. Mathematical functions are usually thought of as sets of ordered pairs, the first element in the pair from the domain, the second from the range, and, hence, a function cannot be intrinsically recursive. However, functions in computer programs are instead defined in a form that allows computers to evaluate them (efficiently, it is hoped), and the definition of a function in a computer program may reference the function itself. Such a function, that occurs in its own definition, is said to be recursively defined, and, with the understanding that we are speaking of the definition of a function rather than its abstract mathematical essence, we will refer to them simply as recursive functions. The term recursion is then defined also as the process of evaluating recursive functions, and, by extension, the formulation of algorithms in terms of recursive functions.

Notation[edit]

In mathematics[edit]

Recursive functions are notated similarly to piecewise functions. An example of a piecewise function is the absolute value function:

\operatorname{abs}(x) =
\begin{cases}
x & \text{if }x \geq 0 \\
-x & \text{if }x < 0
\end{cases}

To evaluate this function, we find the first line within the curly braces for which the condition on the right hand side is satisfied, and use the expression on the left hand side as the value. Thus, if x \geq 0, then the value of \operatorname{abs}(x) is x, whereas if x < 0, then the value of \operatorname{abs}(x) is instead -x.

Often, the condition on the last line will be replaced with the word "otherwise". This is like the else condition in a case statement (switch in C and related languages); it is chosen when none of the preceding conditions are satisfied. Thus we could rewrite the above as

\operatorname{abs}(x) =
\begin{cases}
x & \text{if }x \geq 0 \\
-x & \text{otherwise}
\end{cases}

It is good form, but not strictly necessary, to make the conditions in the definition exhaustive and mutually exclusive, that is, exactly one condition should be true for each element in the domain. If they are not exhaustive, it is a sign that the domain was poorly specified, whereas if they are not mutually exclusive then it can be slightly confusing.

A recursive function is characterized by the recurrence of the function name in one or more of the left-hand expressions within the brace. Here is a classic example:

F(x) =
\begin{cases}
1 & \text{if }x = 1\text{ or }x = 2 \\
F(x-1) + F(x-2) & \text{otherwise}
\end{cases}

The domain of the function F is \mathbb{N}_1. We see that F(1) = 1 and F(2) = 1. To evaluate F(3), we must use the "otherwise" clause, and hence we see that

F(3) = F(3-1) + F(3-2) = F(2) + F(1) = 1 + 1 = 2
F(4) = F(4-1) + F(4-2) = F(3) + F(2) = 2 + 1 = 3
F(5) = F(5-1) + F(5-2) = F(4) + F(3) = 3 + 2 = 5
(and so on)

The cases in the definition that call the defined function are called recursive cases; the "otherwise" line in the definition of F is the recursive case. Those that do not are called base cases; the line that defines F(1) = F(2) = 1 is the base case. Every recursive function has at least one recursive case (if it did not, it would not be called recursive) and at least one base case. This is because a recursive function definition without a base case would be an endlessly circular definition; attempting to evaluate the function would lead to endless or infinite recursion.

In code[edit]

Nearly all mainstream programming languages that support functions support recursive functions. (A language such as HTML is an obvious exception; it is intended to describe a document instead of a computation, and hence lacks functions altogether). In these languages, in any place in the code where a function calls another function, it can just as well call itself using the same notation. The code parallels the mathematical notation, for example:

C:

int F(int x)
{
    if (x == 1 || x == 2)
        return 1;
    else // i.e., "otherwise"
        return F(x - 1) + F(x - 2);
}

Haskell:

f :: Int -> Int
f x | x == 1 || x == 2   =   1
    | otherwise          =   f (x-1) + f (x-2)

(Note that function names are not allowed to start with uppercase letters in Haskell.)

Philosophy[edit]

Statement[edit]

Recursion is a powerful tool for solving problems, but it is not always obvious how to approach a problem recursively. The general principle, often called divide and conquer, is as follows:

Identify instances of your problem that are so small or simple that they are trivial. Determine how to break a large or complex instance into one or more smaller or simpler instances so that the solution to the latter can be used to determine a solution to the former. By repeating this process enough times, it is possible to solve any instance, no matter how large, by reducing it to trivial pieces.

Non-computer science examples[edit]

This philosophy is not restricted to computer science. Below we give some examples from other fields that may assist the reader in internalizing the divide and conquer philosophy and appreciating its ubiquity. When the solution to an instance requires the solutions to simpler instances, those simpler instances will be indented an additional level to show this relationship.

Problem: Climb a ladder. The generic form of this problem is Be at the top of the ladder. The instance is where you already are. Hence, climbing from the bottom to the top and climbing from the middle to the top are both the same problem, but different instances of it.

Solution: If you are already at the top, this is a trivial instance; you don't actually have to do anything. If you are not at the top, suppose there are x rungs remaining, where x > 0. Notice that the problem would be easier to solve if there were only x-1 rungs separating you from the top of the ladder. So climb up one rung; this will reduce the instance at hand to an easier one. Repeat as necessary.

Example instance:

You are 4 rungs away from the top of the ladder. The instance would become simpler if you were only 3 rungs away from the top, so climb up one rung.
You are 3 rungs away from the top of the ladder. The instance would become simpler if you were only 2 rungs away from the top, so climb up one rung.
You are 2 rungs away from the top of the ladder. The instance would become simpler if you were only 1 rung away from the top, so climb up one rung.
You are 1 rung away from the top of the ladder. The instance would become simpler if you were only 0 rungs away from the top, so climb up one rung.
You are at the top of the ladder. Congratulations! You've solved the problem.


Problem: Differentiate a function with respect to x. The generic form of this problem is Differentiate with respect to x. The instance is the specific function you want to differentiate.

Solution: If the function to be differentiated is a power of x, a constant, or a standard function of x (for example, \sin x), this is a trivial instance; you can simply use the power rule or look up the derivative in your calculus textbook. Otherwise, find the last step in computing the function; it may be addition, subtraction, multiplication, division, or a function application. Then use the sum rule, difference rule, product rule, quotient rule, or chain rule, as appropriate, on the subexpression(s), which is/are necessarily simpler than the original function. By applying these rules enough times, any elementary function can be differentiated.

Example instance:

We wish to differentiate \log x + e^{x^2/2}. This is a complex instance. The lowest-precedence operation is the addition, so we will use the sum rule; hence, \frac{d}{dx} \left(\log x + e^{x^2/2}\right) = \frac{d}{dx} \log x + \frac{d}{dx} e^{x^2/2}. Each of the two derivatives on the right side is a simpler instance of this problem.
We wish to differentiate \log x. This is a trivial instance, since the textbook will tell you that \frac{d}{dx} \log x = 1/x.
We wish to differentiate e^{x^2/2}. The lowest-precedence operation is the exponential; it is done last. So we will use the chain rule, hence, \frac{d}{dx} e^{x^2/2} = \left(\lambda x.\frac{d}{dx}e^x\right)\frac{x^2}{2} \frac{d}{dx} \frac{x^2}{2}.
We wish to differentiate e^x. This is a trivial instance; the textbook indicates that its derivative is e^x.
We wish to differentiate x^2/2. This is a complex instance; we use the rule \frac{d}{dx} kf(x) = k\frac{d}{dx}f(x), where k is a constant. Thus, \frac{d}{dx} x^2/2 = \frac{1}{2} \frac{d}{dx}x^2.
We wish to differentiate x^2. This is a trivial instance; the power rule gives its derivative as 2x.
Therefore, the derivative of x^2/2 is \frac{1}{2}(2x) = x.
Therefore, the derivative of e^{x^2/2} is \left(\lambda x.e^x\right)\frac{x^2}{2}\cdot x = xe^{x^2/2}.
Therefore, the derivative of \log x + e^{x^2/2} is 1/x + xe^{x^2/2}. Congratulations! You've solved the problem.


Problem: Obtain a certain item. The generic form of this problem is Obtain. The instance is the specific item you wish to obtain.

Solution: If you already have the item, then you don't need to do anything; this is a trivial instance. If the item is commercially available and you wouldn't mind buying it, then simply do so; this is also a trivial instance. Otherwise, build it yourself. Identify simpler components of which the item is composed. Obtaining these components comprises easier instances of the given problem. After obtaining the required components, assemble them into the item desired.

Example instance:

You live on a farm and want to prepare a rack of ribs for dinner. This is a complex instance. To solve it, you will have to first obtain the ribs and the barbecue sauce.
You wish to obtain a rack of ribs. This is a complex instance; you will need need to slaughter and butcher a pig.
You wish to obtain a pig. You don't raise pigs on your farm, but your neighbour raises pigs. This is a trivial instance, as you can simply buy a pig from your neighbour.
You have obtained a pig. Now slaughter and butcher it. You have obtained a rack of ribs.
You wish to obtain barbecue sauce. Unfortunately, you don't have any left, and your neighbour doesn't either, and it's a holiday, so the market is closed. This is a complex instance. You look in your cookbook, and find a recipe for barbecue sauce consisting of ketchup, sugar, and vinegar. You will need to obtain these and mix them.
You wish to obtain ketchup. You already have ketchup, so this is a trivial instance.
You wish to obtain sugar. You already have sugar, so this is a trivial instance.
You wish to obtain vinegar. You already have vinegar, so this is a trivial instance.
Having obtained ketchup, sugar, and vinegar, you mix them in the correct ratios, thereby obtaining barbecue sauce.
Having obtained ribs and barbecue sauce, you apply the sauce to the ribs, apply heat, and then apply more sauce. Congratulations! You have obtained a dish of ribs, ready to enjoy.

Examples from computer science[edit]

Problem: Given two non-negative integers x and y, find their greatest common divisor (GCD), denoted \gcd(x,y). The general form of the problem is compute the GCD. The specific instance is given by x and y.

Solution: If x=0, we have a trivial case, and the answer is y. If y=0, again it is a trivial case, and the answer is x. Otherwise, we have a complex case. But if we can somehow make one of the numbers smaller, the problem will become simpler, and ultimately one of the two numbers will have to reach zero. It turns out that \gcd(x,y) = \gcd(x,y-x), if x \leq y (and the reverse, \gcd(x,y) = \gcd(x-y,y) if x \geq y). By reducing the problem to one of these two possible subproblems, and repeating enough times, we eventually discover the answer.

Example instance:

We wish to find the GCD of 30 and 48. Since neither is 0, this is a complex case. We note that \gcd(30,48) = \gcd(30,18).
We wish to find the GCD of 30 and 18. Since neither is 0, this is a complex case. We note that \gcd(30,18) = \gcd(12,18).
We wish to find the GCD of 12 and 18. Since neither is 0, this is a complex case. We note that \gcd(12,18) = \gcd(12,6).
We wish to find the GCD of 12 and 6. Since neither is 0, this is a complex case. We note that \gcd(12,6) = \gcd(6,6).
We wish to find the GCD of 6 and 6. Since neither is 0, this is a complex case. We note that \gcd(6,6) = \gcd(6,0).
We wish to find the GCD of 6 and 0. This is a trivial case, and the answer is 6.

Implementation (C):

int gcd(int x, int y)
{
  if (x == 0) return y;
  else if (y == 0) return x;
  else if (x >= y) return gcd(x-y, y);
  else return gcd(x, y-x);
}


Problem: Locate a file by name on a filesystem, or conclude that the file does not exist. The problem here is find if exists; the instance consists of the filename of the file we wish to find, and the directory that we wish to search.

Solution: If the directory is empty, this is a trivial instance, and we give up. Otherwise, we will have to examine the contents of the directory. For each entry of the directory, if it is a file, check whether it matches the given filename; if so, we are done. If not, go on to the next one. If the entry is a directory itself, this is a subinstance of the problem: attempt to locate the desired file in the subdirectory; again, if we succeed, then we are done, and otherwise, we must go on to the next entry.

Example instance:

We wish to locate the file apache2.conf somewhere under the root directory of a Unix-like system. The root directory is not empty; it contains the entries: bin/ dev/ etc/ home/ lib/ proc/ root/ tmp/ usr/ var/
We wish to locate the file apache2.conf somewhere in /bin. This directory is not empty; it contains the entries: cat chgrp chmod chown cp date dd df echo ... and no subdirectories. None of these matches the desired filename, so we give up. (This turned out to be a trivial instance, since this directory contains no subdirectories on which to recurse.)
We wish to locate the file apache2.conf somewhere in /dev. This directory is not empty; it contains the entries: console full mem null port ptmx pts/ random sda sda1 stderr stdin stdout tty zero. We check each of these in turn: no, no, no, no, no, no... a directory. We have to search pts/ for the file.
We wish to locate the file apache2.conf somewhere in /dev/pts. This directory is empty, so this is a trivial instance, and we give up.
We continue trying to match against random and so on: no, no, no, no, no, no, no, no. We give up; the file apache2.conf is not in /dev.
We wish to locate the file apache2.conf somewhere in /etc. This directory is not empty; it contains the entries: apache2/ crontab fstab group gshadow hostname hosts inittab localtime motd passwd profile resolv.conf shadow shells X11/.
We wish to locate the file apache2.conf somewhere in /etc/apache2. This directory is not empty; it contains the entries: apache2.conf conf.d/ envvars httpd.conf magic mods-available/ mods-enabled/ ports.conf sites-available/ sites-enabled/. We find a match at the first entry; success!

Implementation (Python):

import os
 
def find(path, name):
  for entry in os.listdir(path):                               # iterate through every entry
    newpath = os.path.join(path, entry)                        # get the absolute path to this entry
    if os.path.isdir(newpath) and not os.path.islink(newpath): # descend into directory
      print "Descending into directory " + newpath
      ret = find(newpath, name)                                # complex case; recurse into subproblem
      if ret:
        return ret
    else:                                                      # it's a file; just check whether it matches
      if entry == name:
        return newpath
  return False                                                 # give up
 
res = find("/", "apache2.conf")
if res:
  print "File found at " + res
else:
  print "File not found"