Longest common substring
Not to be confused with Longest common subsequence.
The longest common substring problem is the problem of finding a string of maximum length which is simultaneously a substring of two or more given strings (usually two). For example, the longest common substring of atlas and elastic is las. This problem is much less applicable (in real life) than the more error-tolerant longest common subsequence problem, but it is theoretically interesting.
Discussion
We can observe immediately that this problem is solvable in polynomial time, because a string of length has substrings, allowing us to simply list all substrings of each string and then find the longest entry(ies) common to both strings. This, however, gives time (assuming we have two strings of lengths and and we use an efficient string search algorithm). In addressing the question of whether it is possible to do better, we at first consider three variations on the problem:
- Return all longest common substrings. There may only be of these, where is the length of the first string and is the length of the second. This is because there are only substrings of any given length in the first string, and in the second string.
- Return any longest common substring.
- Return only the length of a longest common substring.
We shall see that all three variations can be solved in linear time, no matter how many strings are involved.