Preprocessing and postprocessing

In other cases, preprocessing is required because the algorithm is easier to implement when its input is a kind of "special case", but the general case may easily be reduced to it. For example, the stack-based algorithm for the [[all nearest smaller values]] problem can be simplified by introducing a sentinel value at the beginning of the list that is smaller than all other elements; this makes it unnecessary to check whether the stack ever becomes empty, and naturally allows recovery of the elements in the list with no preceding smaller value. This effectively reduces the general case to the specific case in which the first element in the list is the smallest one.
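The sentinel idea above can be sketched as follows. This is only an illustrative sketch, not a reference implementation: the function name `all_nearest_smaller_values` is hypothetical, the input is assumed to consist of finite numbers (so that negative infinity is strictly smaller than every element), and `None` is used to mark elements with no preceding smaller value.

```python
import math


def all_nearest_smaller_values(values):
    """For each element, find the nearest preceding element that is strictly smaller.

    Assumes every element is a finite number, so -inf can serve as a sentinel.
    Returns None for elements that have no preceding smaller value.
    """
    sentinel = -math.inf
    stack = [sentinel]  # the sentinel sits at the bottom and is never popped
    result = []
    for x in values:
        # No emptiness check is needed: the loop always stops at the sentinel.
        while stack[-1] >= x:
            stack.pop()
        nearest = stack[-1]
        # Reaching the sentinel means no real element before x is smaller.
        result.append(None if nearest == sentinel else nearest)
        stack.append(x)
    return result
```

Without the sentinel, the inner loop would also have to test whether the stack is empty, and the "no smaller value" case would need separate handling after the loop.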
 
  
More frequently, the data contained in the input must be reorganized into a more useful form, that is, a form that makes the algorithm more efficient, or that is essential to the correct functioning of the algorithm itself. Some of these modifications are again quite trivial; for example, in ACM-style problems, the input often contains names of imaginary people, each of whom has some attribute such as an amount of money; it is usually more convenient to assign the people unique consecutive integer IDs starting from zero than to work with their names. Other modifications are less trivial conceptually, and are often known as '''precomputation'''; for example, many problems involving arrays of numbers are more easily solved by using the [[prefix sum array]] rather than the given array itself, usually because the algorithm must frequently compute the sums of segments of the array. A great many algorithms require their input to be sorted. In the [[Knuth–Morris–Pratt algorithm]], the preprocessing of the needle is the most difficult step; it augments the needle with information about how it matches shifts of itself. Likewise, constructing the [[suffix tree]] of a string is a form of preprocessing, and one that is very difficult to do efficiently.
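As a small illustration of precomputation, a prefix sum array might be built once and then used to answer segment-sum queries in constant time. The names `build_prefix_sums` and `range_sum` are hypothetical, and the half-open indexing convention is an assumption made for this sketch.

```python
from itertools import accumulate


def build_prefix_sums(values):
    """Return a list p where p[i] is the sum of values[0:i], so p[0] == 0."""
    return [0] + list(accumulate(values))


def range_sum(prefix, lo, hi):
    """Sum of the original values[lo:hi] (half-open), computed in O(1)."""
    return prefix[hi] - prefix[lo]


values = [3, 1, 4, 1, 5, 9]
prefix = build_prefix_sums(values)  # O(n) preprocessing, done once
segment_total = range_sum(prefix, 1, 4)  # sum of values[1:4], i.e. 1 + 4 + 1
```

Each query costs one subtraction after the linear-time preprocessing, instead of a loop over the segment.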
 
Occasionally, the output of an algorithm may have to undergo '''postprocessing'''. An obvious example: if the input consists of names, we decide to work with numerical IDs instead, and names are expected to appear in the output, then we must convert the output of our algorithm from numerical IDs back to names before printing it. Often, the output is expected to be sorted, even if the input was not. Sometimes, duplicate data points are expected to be removed, and we might have the option of doing this in either the preprocessing stage or the postprocessing stage.
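The round trip from names to IDs and back might look like the following sketch. The task itself (report everyone with the largest total amount, in sorted order) and the names `assign_ids` and `richest_people` are invented for illustration; the input is assumed to be a list of `(name, amount)` pairs.

```python
def assign_ids(names):
    """Preprocessing: map each distinct name to a consecutive integer ID from 0."""
    name_to_id = {}
    id_to_name = []
    for name in names:
        if name not in name_to_id:
            name_to_id[name] = len(id_to_name)
            id_to_name.append(name)
    return name_to_id, id_to_name


def richest_people(records):
    """Return the sorted names of the people with the largest total amount."""
    name_to_id, id_to_name = assign_ids(name for name, _ in records)
    if not id_to_name:
        return []
    # Core computation works purely on integer IDs.
    totals = [0] * len(id_to_name)
    for name, amount in records:
        totals[name_to_id[name]] += amount
    best = max(totals)
    winner_ids = [i for i, total in enumerate(totals) if total == best]
    # Postprocessing: convert IDs back to names and sort the output.
    return sorted(id_to_name[i] for i in winner_ids)
```

Here duplicate names are merged during preprocessing, while sorting is left to the postprocessing step; either choice could be moved to the other stage.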
 
