Package org.apache.commons.collections4.sequence


Compares two sequences of objects.

The two sequences can hold any object type, as only the equals method is used to compare the elements of the sequences. It is guaranteed that the comparisons will always be done as o1.equals(o2) where o1 belongs to the first sequence and o2 belongs to the second sequence. This can be important if subclassing is used for some elements in the first sequence and the equals method is specialized.

Comparison can be seen from two points of view: either as giving the smallest set of modifications that transforms the first sequence into the second one, or as giving the longest sequence which is a subsequence of both initial sequences. The transformation is expressed as commands that delete, insert or keep one object at a time, starting from the beginning of the first sequence. Like most algorithms of the same type, object transpositions are not supported. This means that if a sequence (A, B) is compared to (B, A), the result will be either the sequence of three commands delete A, keep B, insert A or the sequence insert B, keep A, delete B.
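The delete/keep/insert view can be illustrated without the library at all. The following is a minimal sketch that builds an edit script from a classic dynamic-programming longest-common-subsequence table; it is not the Myers algorithm used by this package, and the class name EditScriptSketch and the string form of the commands are hypothetical names chosen for the illustration. Elements are compared as o1.equals(o2) with o1 taken from the first sequence.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EditScriptSketch {

    /** Returns commands such as "delete A", "keep B", "insert A". */
    public static <T> List<String> diff(List<T> first, List<T> second) {
        int n = first.size();
        int m = second.size();
        // lcs[i][j] = length of the longest common subsequence of
        // first[i..n) and second[j..m)
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                if (first.get(i).equals(second.get(j))) {
                    lcs[i][j] = lcs[i + 1][j + 1] + 1;
                } else {
                    lcs[i][j] = Math.max(lcs[i + 1][j], lcs[i][j + 1]);
                }
            }
        }
        // Walk both sequences from the start, following the table.
        List<String> script = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < n && j < m) {
            if (first.get(i).equals(second.get(j))) {
                script.add("keep " + first.get(i));
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                script.add("delete " + first.get(i));
                i++;
            } else {
                script.add("insert " + second.get(j));
                j++;
            }
        }
        // Whatever remains in one sequence is deleted or inserted.
        while (i < n) {
            script.add("delete " + first.get(i++));
        }
        while (j < m) {
            script.add("insert " + second.get(j++));
        }
        return script;
    }

    public static void main(String[] args) {
        // prints [delete A, keep B, insert A]
        System.out.println(diff(Arrays.asList("A", "B"), Arrays.asList("B", "A")));
    }
}
```

With the tie-breaking rule used in this sketch (prefer a deletion when both choices preserve the same common subsequence length), the (A, B) versus (B, A) case yields the first of the two scripts mentioned above; the opposite rule would yield the second.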

The package uses the comparison algorithm designed by Eugene W. Myers and described in his paper "An O(ND) Difference Algorithm and Its Variations". Its running time is O(ND), where N is the sum of the lengths of the two sequences and D is the size of the shortest edit script, so it is particularly fast when the sequences are similar. This algorithm produces the shortest possible edit script containing all the commands needed to transform the first sequence into the second one. The entry point for the user to this algorithm is the SequencesComparator class.

As explained in Myers' paper, the edit script is equivalent to all other representations and contains all the information needed either to perform the transformation or, for example, to retrieve the longest common subsequence.

For very fine-grained access to the comparison result, the user walks through this script by providing a visitor implementing the CommandVisitor interface.
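As a minimal sketch of the visitor approach, the example below compares two lists of strings and records every command of the edit script in order. It assumes commons-collections4 is on the classpath; VisitorExample, RecordingVisitor and describe are hypothetical names introduced for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.commons.collections4.sequence.CommandVisitor;
import org.apache.commons.collections4.sequence.EditScript;
import org.apache.commons.collections4.sequence.SequencesComparator;

public class VisitorExample {

    /** Records each command of the edit script as a readable string. */
    static class RecordingVisitor implements CommandVisitor<String> {
        final List<String> log = new ArrayList<>();

        @Override
        public void visitInsertCommand(String object) {
            log.add("insert " + object);
        }

        @Override
        public void visitKeepCommand(String object) {
            log.add("keep " + object);
        }

        @Override
        public void visitDeleteCommand(String object) {
            log.add("delete " + object);
        }
    }

    /** Builds the edit script for the two lists and returns it as text. */
    public static List<String> describe(List<String> first, List<String> second) {
        EditScript<String> script =
                new SequencesComparator<String>(first, second).getScript();
        RecordingVisitor visitor = new RecordingVisitor();
        script.visit(visitor);
        return visitor.log;
    }

    public static void main(String[] args) {
        // prints [keep a, delete b, keep c]
        System.out.println(describe(Arrays.asList("a", "b", "c"),
                                    Arrays.asList("a", "c")));
    }
}
```

Collecting only the objects passed to visitKeepCommand gives a longest common subsequence of the two lists, which is one way to use the script for something other than performing the transformation.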

Sometimes, however, a higher-level view is needed. A user who prefers to see the differences between the two sequences as global replacement operations acting on complete subsequences of the original sequences can provide an object implementing the simple ReplacementsHandler interface, using an instance of the ReplacementsFinder class as a command-converting layer between that object and the edit script. The number of objects which are common to both initial sequences, and hence are skipped between each call to the user's handleReplacement method, is also provided. This allows the user to keep track of the current index in both sequences if needed.
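The replacement-oriented view might be used as in the following sketch, which again assumes commons-collections4 is on the classpath; ReplacementsExample and describe are hypothetical names. The sample lists deliberately end with a common element, on the assumption that a pending replacement is reported when the next kept object is visited; that behaviour is worth verifying against the library version in use. The handler formats the lists immediately rather than retaining them, in case the finder reuses them between calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.commons.collections4.sequence.EditScript;
import org.apache.commons.collections4.sequence.ReplacementsFinder;
import org.apache.commons.collections4.sequence.ReplacementsHandler;
import org.apache.commons.collections4.sequence.SequencesComparator;

public class ReplacementsExample {

    /** Returns one line of text per replacement reported to the handler. */
    public static List<String> describe(List<String> first, List<String> second) {
        final List<String> log = new ArrayList<>();

        ReplacementsHandler<String> handler = new ReplacementsHandler<String>() {
            @Override
            public void handleReplacement(int skipped, List<String> from, List<String> to) {
                // skipped = number of common objects since the previous replacement
                log.add("skip " + skipped + ", replace " + from + " with " + to);
            }
        };

        EditScript<String> script =
                new SequencesComparator<String>(first, second).getScript();
        // The finder is itself a CommandVisitor that groups raw commands.
        script.visit(new ReplacementsFinder<String>(handler));
        return log;
    }

    public static void main(String[] args) {
        System.out.println(describe(Arrays.asList("a", "b", "c", "d"),
                                    Arrays.asList("a", "x", "d")));
    }
}
```

Adding the skipped count, plus the size of from or to, to a running index for each sequence is enough to recover the position of every replacement in both original lists.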

  • Class
    Description
  • CommandVisitor<T>
    This interface should be implemented by user objects to walk through EditScript objects.
  • DeleteCommand<T>
    Command representing the deletion of one object of the first sequence.
  • EditCommand<T>
    Abstract base class for all commands used to transform one sequence of objects into another.
  • EditScript<T>
    This class gathers all the commands needed to transform one sequence of objects into another.
  • InsertCommand<T>
    Command representing the insertion of one object of the second sequence.
  • KeepCommand<T>
    Command representing the keeping of one object present in both sequences.
  • ReplacementsFinder<T>
    This class handles sequences of replacements resulting from a comparison.
  • ReplacementsHandler<T>
    This interface handles synchronized replacement sequences.
  • SequencesComparator<T>
    This class compares two sequences of objects.