Class LongestCommonSubsequence
 All Implemented Interfaces:
SimilarityScore<Integer>
public class LongestCommonSubsequence extends Object implements SimilarityScore<Integer>
The Longest common subsequence algorithm returns the length of the longest subsequence that two strings have in
common. Two strings that are entirely different, return a value of 0, and two strings that return a value
of the commonly shared length implies that the strings are completely the same in value and position.
Note. Generally this algorithm is fairly inefficient, as for length m, n of the input
CharSequence
's left
and right
respectively, the runtime of the
algorithm is O(m*n).
This implementation is based on the Longest Commons Substring algorithm from https://en.wikipedia.org/wiki/Longest_common_subsequence_problem.
For further reading see:
Lothaire, M. Applied combinatorics on words. New York: Cambridge U Press, 2005. 1213
 Since:
 1.0

Constructor Summary
Constructors Constructor Description LongestCommonSubsequence()

Method Summary
Modifier and Type Method Description Integer
apply(CharSequence left, CharSequence right)
Calculates longest common subsequence similarity score of twoCharSequence
's passed as input.CharSequence
logestCommonSubsequence(CharSequence left, CharSequence right)
Deprecated.Deprecated as of 1.2 due to a typo in the method name.CharSequence
longestCommonSubsequence(CharSequence left, CharSequence right)
Computes the longest common subsequence between the twoCharSequence
's passed as input.int[][]
longestCommonSubstringLengthArray(CharSequence left, CharSequence right)
Computes the lcsLengthArray for the sake of doing the actual lcs calculation.

Constructor Details

LongestCommonSubsequence
public LongestCommonSubsequence()


Method Details

apply
Calculates longest common subsequence similarity score of twoCharSequence
's passed as input. Specified by:
apply
in interfaceSimilarityScore<Integer>
 Parameters:
left
 first character sequenceright
 second character sequence Returns:
 longestCommonSubsequenceLength
 Throws:
IllegalArgumentException
 if either String inputnull

logestCommonSubsequence
Deprecated.Deprecated as of 1.2 due to a typo in the method name. UselongestCommonSubsequence(CharSequence, CharSequence)
instead. This method will be removed in 2.0.Computes the longest common subsequence between the twoCharSequence
's passed as input.Note, a substring and subsequence are not necessarily the same thing. Indeed,
abcxyzqrs
andxyzghfm
have both the same common substring and subsequence, namelyxyz
. However,axbyczqrs
andabcxyzqtv
have the longest common subsequencexyzq
because a subsequence need not have adjacent characters.For reference, we give the definition of a subsequence for the reader: a subsequence is a sequence that can be derived from another sequence by deleting some elements without changing the order of the remaining elements.
 Parameters:
left
 first character sequenceright
 second character sequence Returns:
 The longest common subsequence found
 Throws:
IllegalArgumentException
 if either String inputnull

longestCommonSubsequence
Computes the longest common subsequence between the twoCharSequence
's passed as input.Note, a substring and subsequence are not necessarily the same thing. Indeed,
abcxyzqrs
andxyzghfm
have both the same common substring and subsequence, namelyxyz
. However,axbyczqrs
andabcxyzqtv
have the longest common subsequencexyzq
because a subsequence need not have adjacent characters.For reference, we give the definition of a subsequence for the reader: a subsequence is a sequence that can be derived from another sequence by deleting some elements without changing the order of the remaining elements.
 Parameters:
left
 first character sequenceright
 second character sequence Returns:
 The longest common subsequence found
 Throws:
IllegalArgumentException
 if either String inputnull
 Since:
 1.2

longestCommonSubstringLengthArray
Computes the lcsLengthArray for the sake of doing the actual lcs calculation. This is the dynamic programming portion of the algorithm, and is the reason for the runtime complexity being O(m*n), where m=left.length() and n=right.length(). Parameters:
left
 first character sequenceright
 second character sequence Returns:
 lcsLengthArray
