Difference between revisions of "Longest palindromic subsequence"
Revision as of 20:07, 3 April 2011
Not to be confused with Longest palindromic substring.
The longest palindromic subsequence problem is the problem of finding the longest subsequence of a string (a subsequence is obtained by deleting some of the characters from a string without reordering the remaining characters) which is also a palindrome. In general, the longest palindromic subsequence is not unique. For example, the string alfalfa has four palindromic subsequences of length 5: alala, afafa, alfla, and aflfa. However, it does not have any palindromic subsequences longer than five characters. Therefore all four are considered longest palindromic subsequences of alfalfa.
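The alfalfa example can be checked by brute force. The following is a minimal sketch, not part of the original article: it tries every choice of positions, longest first, and keeps the distinct palindromes it finds. The function name is chosen here for illustration, and the approach takes exponential time, so it is only suitable for very short strings.

```python
from itertools import combinations


def longest_palindromic_subsequences(s: str) -> set:
    """Return the set of distinct longest palindromic subsequences of s.

    Brute force over all choices of positions, longest first (exponential time).
    """
    for k in range(len(s), 0, -1):
        # Every subsequence of length k corresponds to a choice of k positions.
        candidates = {
            "".join(s[i] for i in idx)
            for idx in combinations(range(len(s)), k)
        }
        palindromes = {t for t in candidates if t == t[::-1]}
        if palindromes:
            return palindromes
    return {""}  # only the empty string has no non-empty palindromic subsequence


print(sorted(longest_palindromic_subsequences("alfalfa")))
# ['afafa', 'aflfa', 'alala', 'alfla']
```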
Precise statement
Three variations of this problem may be distinguished:
- Find the maximum possible length for a palindromic subsequence.
- Find some palindromic subsequence of maximal length.
- Find all longest palindromic subsequences.
Theorem: Returning all longest palindromic subsequences cannot be accomplished in worst-case polynomial time.
Proof[1]: Consider a string made up of <math>N/2</math> ones, followed by <math>N/4</math> zeroes, and finally <math>N/4</math> ones. (Assume <math>N</math> is a multiple of 4, although it does not really matter.) Any palindromic subsequence either does not contain any zeroes, in which case its length is at most <math>3N/4</math>, or it contains at least one zero. If it contains at least one zero, it must be of the form <math>1^a0^b1^c</math>, and <math>a</math> and <math>c</math> must be equal. (This is because the middle of the palindrome must lie somewhere within the zeroes, otherwise there would be no zeroes on one side of it and at least one zero on the other side; but as long as the middle lies within the zeroes, there must be an equal number of ones on each side.) But <math>c</math> can only be up to <math>N/4</math>, and likewise with <math>b</math>, so again the palindrome cannot be longer than <math>3N/4</math> characters. However, there are <math>\binom{N/2}{N/4}+1</math> ways to select a palindromic subsequence of length <math>3N/4</math>: we can either take all the ones, or we can take all <math>N/4</math> zeroes, all <math>N/4</math> terminal ones, and <math>N/4</math> out of the <math>N/2</math> initial ones. Thus the output size is not polynomial in <math>N</math>, and so neither can the running time of the algorithm be in the worst case. <math>_\blacksquare</math>
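The count in the proof can be checked for small <math>N</math>. The sketch below is an illustration added here, with a function name of our own choosing. It counts, by brute force, the choices of positions that give a palindrome of maximal length in the string from the proof, and compares the result with <math>\binom{N/2}{N/4}+1</math>. Selections that use different positions are counted as distinct, as in the proof.

```python
from itertools import combinations
from math import comb


def count_longest_palindromic_selections(s: str) -> tuple:
    """Return (maximal length, number of position choices that achieve it)."""
    n = len(s)
    for k in range(n, 0, -1):
        count = sum(
            1
            for idx in combinations(range(n), k)
            # The selected characters must read the same in both directions.
            if all(s[idx[a]] == s[idx[k - 1 - a]] for a in range(k // 2))
        )
        if count:
            return k, count
    return 0, 1  # empty string: only the empty selection


N = 8
s = "1" * (N // 2) + "0" * (N // 4) + "1" * (N // 4)  # "11110011"
print(count_longest_palindromic_selections(s))  # (6, 7)
print(comb(N // 2, N // 4) + 1)                 # 7
```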
However, this does not rule out the existence of a polynomial-time algorithm for the first two variations on the problem. We now present such an algorithm.
Theoretical background
(Note: these Lemmas are "obvious" and their proofs will probably not help you intuitively understand how the algorithm works, so skip them if they are too heavy in mathematical notation for you.)
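Lemma 1: Any palindromic subsequence of a string <math>S</math> is a common subsequence of <math>S</math> and its reverse <math>S'</math>.
Proof: Let <math>P</math> be a palindromic subsequence of <math>S</math>. Since <math>P</math> is a subsequence of <math>S</math>, its reverse <math>P'</math> is a subsequence of <math>S'</math>. But <math>P = P'</math> since <math>P</math> is a palindrome, so <math>P</math> is a subsequence of <math>S'</math>, and hence a common subsequence.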
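Lemma 2: If there exists a common subsequence <math>X</math> of length <math>L</math> of <math>S</math> and its reverse <math>S'</math>, then there exists a palindromic subsequence <math>P</math> of <math>S</math> of length greater than or equal to <math>L</math>.
Proof: Write <math>S = s_0 s_1 \ldots s_{N-1}</math>, so that <math>S' = s'_0 s'_1 \ldots s'_{N-1}</math> with <math>s'_j = s_{N-1-j}</math>, and write <math>X = x_1 x_2 \ldots x_L</math>. Since <math>X</math> is a subsequence of <math>S</math>, there are indices <math>i_1 < i_2 < \cdots < i_L</math> with <math>s_{i_k} = x_k</math>. Since <math>X</math> is a subsequence of <math>S'</math>, there are indices <math>j_1 < j_2 < \cdots < j_L</math> with <math>s'_{j_k} = x_k</math>. Let <math>p_k = N-1-j_k</math>, so that <math>s_{p_k} = x_k</math> and <math>p_1 > p_2 > \cdots > p_L</math>.
As <math>k</math> increases, <math>i_k</math> increases and <math>p_k</math> decreases, so the values of <math>k</math> for which <math>i_k < p_k</math> are exactly <math>1, 2, \ldots, m</math> for some <math>0 \le m \le L</math>. Two palindromic subsequences of <math>S</math> can now be read off:
- Since <math>i_1 < \cdots < i_m < p_m < \cdots < p_1</math>, the characters at these positions spell <math>x_1 \ldots x_m x_m \ldots x_1</math>, a palindrome of length <math>2m</math>.
- Since <math>p_L < \cdots < p_{m+1} \le i_{m+1} < \cdots < i_L</math>, the characters at these positions spell <math>x_L \ldots x_{m+1} x_{m+1} \ldots x_L</math>, a palindrome of length <math>2(L-m)</math>. If <math>p_{m+1} = i_{m+1}</math>, that position can only be used once, and the palindrome is <math>x_L \ldots x_{m+2} x_{m+1} x_{m+2} \ldots x_L</math>, of length <math>2(L-m)-1</math>.
In the case <math>p_{m+1} = i_{m+1}</math>, we have <math>i_m < i_{m+1} = p_{m+1} < p_m</math>, so <math>x_{m+1}</math> can be inserted into the middle of the first palindrome, which then has length <math>2m+1</math>. Either way, the lengths of the two palindromes add up to <math>2L</math>, so at least one of them has length greater than or equal to <math>L</math>.
Note that <math>X</math> itself need not be palindromic, and <math>P</math> need not contain <math>X</math>: for <math>S = \mathrm{ACBAC}</math> and <math>S' = \mathrm{CABCA}</math>, the string <math>\mathrm{ABC}</math> is a common subsequence of length 3, while the palindromic subsequences of <math>S</math> of length 3 include <math>\mathrm{ABA}</math> and <math>\mathrm{CBC}</math>.
Theorem: The length of a longest common subsequence of <math>S</math> and its reverse <math>S'</math> is equal to the length of a longest palindromic subsequence of <math>S</math>. In particular, any longest common subsequence of <math>S</math> and <math>S'</math> that is palindromic is a longest palindromic subsequence of <math>S</math>.
Proof: Let <math>X</math> be a longest common subsequence of <math>S</math> and <math>S'</math>, and let <math>L</math> be its length. By Lemma 2, <math>S</math> has a palindromic subsequence <math>P</math> of length greater than or equal to <math>L</math>. By Lemma 1, <math>P</math> is a common subsequence of <math>S</math> and <math>S'</math>, so it cannot be longer than <math>X</math>; hence <math>P</math> has length exactly <math>L</math>.
Likewise, suppose <math>S</math> had a palindromic subsequence longer than <math>L</math>. By Lemma 1, this would give a longer common subsequence of <math>S</math> and <math>S'</math>, a contradiction. So the longest palindromic subsequences of <math>S</math> have length exactly <math>L</math>, and if <math>X</math> is palindromic, it is one of them.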
Algorithm
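A corollary of the Theorem is that a longest palindromic subsequence of <math>S</math> can be found in <math>O(N^2)</math> time, where <math>N</math> is the length of <math>S</math>, by finding a longest common subsequence of <math>S</math> and its reverse <math>S'</math>. Its length answers the first variation of the problem, and a palindrome of that length is obtained from it as in Lemma 2.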
Note that there exist more efficient algorithms for finding longest common subsequences, which also give more efficient means of computing longest palindromic subsequences.
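The following is a minimal sketch of this approach, under assumptions of our own: the function names are illustrative, and the longest common subsequence is computed with the textbook quadratic dynamic program rather than one of the faster algorithms mentioned above. Because a recovered common subsequence is not guaranteed to read the same in both directions, the sketch mirrors its first half, which yields a palindromic subsequence of the same length.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b in O(len(a) * len(b)) time."""
    n, m = len(a), len(b)
    # dp[i][j] = length of a longest common subsequence of a[i:] and b[j:]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk the table from the top-left corner to recover one optimal subsequence.
    out = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return "".join(out)


def longest_palindromic_subsequence(s: str) -> str:
    """Return some longest palindromic subsequence of s."""
    x = lcs(s, s[::-1])
    length = len(x)
    # Keep the first ceil(length / 2) characters and mirror the first floor(length / 2).
    half = x[: (length + 1) // 2]
    return half + half[: length // 2][::-1]


p = longest_palindromic_subsequence("alfalfa")
print(len(p), p == p[::-1])  # 5 True
```

Which of the longest palindromic subsequences is returned depends on how ties are broken while walking the table, so only the length and the palindrome property are shown above.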
Shortest palindromic supersequence
It can also be shown that the shortest palindromic supersequence of a string <math>S</math> can be found by taking the shortest common supersequence of <math>S</math> and its reverse <math>S'</math>. The proof is left as an exercise to the reader.
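As an illustration, the length of a shortest palindromic supersequence can be computed without constructing it. The sketch below is an addition made here, with function names of our own choosing. It relies on the standard identity that the shortest common supersequence of two strings <math>A</math> and <math>B</math> has length <math>|A| + |B| - |\mathrm{LCS}(A, B)|</math>, applied to <math>S</math> and its reverse. It returns only the length, not the supersequence itself.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of a longest common subsequence, keeping only one table row in memory."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def shortest_palindromic_supersequence_length(s: str) -> int:
    # |SCS(A, B)| = |A| + |B| - |LCS(A, B)|, with A = s and B = reversed s.
    return 2 * len(s) - lcs_length(s, s[::-1])


print(shortest_palindromic_supersequence_length("alfalfa"))  # 9
```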
References
- [1] Jonathan T. Schneider (2010). Personal communication.
External links
- IOI '00 - Palindrome
- SPOJ:
  - Palindrome 2000 (a duplicate of the problem above)
  - Aibohphobia