Class AlphabetConverter

java.lang.Object
org.apache.commons.text.AlphabetConverter

public final class AlphabetConverter
extends Object

Convert from one alphabet to another, with the possibility of leaving certain characters unencoded.

The target and 'do not encode' languages must be within the Unicode BMP; the source language need not be.
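
Because the source language may contain characters outside the BMP, it can be easier to describe an alphabet as code points rather than chars. The following is a minimal sketch that uses only the JDK; the class and helper names (CodePointArrays, toCodePoints, allInBmp) are made up for illustration and are not part of Commons Text. Arrays built this way have the shape that createConverter(Integer[], Integer[], Integer[]) accepts.

```java
import java.util.Arrays;

public class CodePointArrays {

    /** Splits a string into its Unicode code points (hypothetical helper). */
    static Integer[] toCodePoints(String alphabet) {
        return alphabet.codePoints().boxed().toArray(Integer[]::new);
    }

    /** Returns true if every code point lies in the Basic Multilingual Plane. */
    static boolean allInBmp(Integer[] codePoints) {
        return Arrays.stream(codePoints).allMatch(Character::isBmpCodePoint);
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the BMP and occupies two Java chars.
        Integer[] source = toCodePoints("ab\uD83D\uDE00");
        Integer[] target = toCodePoints("01");

        System.out.println(source.length);    // 3 code points, although the string has 4 chars
        System.out.println(allInBmp(source)); // false: acceptable for a source alphabet
        System.out.println(allInBmp(target)); // true: required for the target alphabet
    }
}
```

A check like allInBmp could be run on the target and 'do not encode' arrays before constructing a converter, so that a violation of the BMP requirement is caught early.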

All encodings have the same fixed length, except for the 'do not encode' chars, which are left as they are and therefore have length 1.
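
One consequence of this length rule is that the size of an encoded string can be predicted without encoding it. The sketch below is an illustration only, not library code: predictEncodedLength is a hypothetical helper, and its encodedCharLength parameter stands in for the value returned by getEncodedCharLength(). The numbers match the sample on this page, where a, b and c each take two chars and d is left alone.

```java
import java.util.Collections;
import java.util.Set;

public class EncodedLengthSketch {

    /**
     * Predicts the length, in chars, of an encoded string (hypothetical helper).
     * Every 'do not encode' code point contributes 1; every other code point
     * contributes the fixed encodedCharLength.
     */
    static int predictEncodedLength(String original, Set<Integer> doNotEncode, int encodedCharLength) {
        return original.codePoints()
                .map(cp -> doNotEncode.contains(cp) ? 1 : encodedCharLength)
                .sum();
    }

    public static void main(String[] args) {
        Set<Integer> doNotEncode = Collections.singleton((int) 'd');
        // a, b, c -> 2 chars each; d -> 1 char
        System.out.println(predictEncodedLength("abcd", doNotEncode, 2)); // 7
    }
}
```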

Sample usage

 Character[] originals   = {'a', 'b', 'c', 'd'};
 Character[] encoding    = {'0', '1', 'd'};
 Character[] doNotEncode = {'d'};

 AlphabetConverter ac = AlphabetConverter.createConverterFromChars(originals,
     encoding, doNotEncode);

 ac.encode("a");    // 00
 ac.encode("b");    // 01
 ac.encode("c");    // 0d
 ac.encode("d");    // d
 ac.encode("abcd"); // 00010dd
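
The sample output above can be decoded unambiguously because each encoded group has a fixed length and, in this sample, no group starts with the 'do not encode' char d. The following is a minimal sketch of such a decoding loop, written against a hand-built copy of the sample mapping. It is an illustration under those assumptions, not the library's implementation; the class, method and parameter names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class FixedLengthDecodeSketch {

    /** Reads either one pass-through char or one fixed-length group at a time. */
    static String decode(String encoded, Map<String, String> encodedToOriginal,
                         String passThrough, int groupLength) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < encoded.length()) {
            String first = encoded.substring(i, i + 1);
            if (passThrough.contains(first)) {
                // A 'do not encode' char stands for itself and has length 1.
                out.append(first);
                i += 1;
            } else {
                if (i + groupLength > encoded.length()) {
                    throw new IllegalArgumentException("Truncated group at index " + i);
                }
                String group = encoded.substring(i, i + groupLength);
                String original = encodedToOriginal.get(group);
                if (original == null) {
                    throw new IllegalArgumentException("Unknown group: " + group);
                }
                out.append(original);
                i += groupLength;
            }
        }
        return out.toString();
    }

    /** The sample mapping from this page, built by hand. */
    static Map<String, String> sampleMapping() {
        Map<String, String> map = new HashMap<>();
        map.put("00", "a");
        map.put("01", "b");
        map.put("0d", "c");
        return map;
    }

    public static void main(String[] args) {
        System.out.println(decode("00010dd", sampleMapping(), "d", 2)); // abcd
    }
}
```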
 

#ThreadSafe# AlphabetConverter class methods are thread-safe as they do not change internal state.

Since:
1.0
  • Method Details

    • encode

      public String encode(String original) throws UnsupportedEncodingException
      Encode a given string.
      Parameters:
      original - the string to be encoded
      Returns:
      The encoded string, or null if the given string is null
      Throws:
      UnsupportedEncodingException - if the string contains chars that this converter does not support
    • decode

      public String decode(String encoded) throws UnsupportedEncodingException
      Decode a given string.
      Parameters:
      encoded - a string that has been encoded using this AlphabetConverter
      Returns:
      The decoded string, or null if the given string is null
      Throws:
      UnsupportedEncodingException - if unexpected characters that cannot be handled are encountered
    • getEncodedCharLength

      public int getEncodedCharLength()
      Get the number of characters in the encoded alphabet that are needed to represent each character of the original alphabet.
      Returns:
      The fixed length of the encoding of a single character
    • getOriginalToEncoded

      public Map<Integer,String> getOriginalToEncoded()
      Get the mapping from the integer code points of the source language to their encoded strings. Use it to reconstruct a converter from a serialized map.
      Returns:
      The map from original code points to encoded strings
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • equals

      public boolean equals(Object obj)
      Overrides:
      equals in class Object
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object
    • createConverterFromMap

      public static AlphabetConverter createConverterFromMap(Map<Integer,String> originalToEncoded)
      Create a new converter from a map.
      Parameters:
      originalToEncoded - a map returned from getOriginalToEncoded()
      Returns:
      The reconstructed AlphabetConverter
      See Also:
      getOriginalToEncoded()
    • createConverterFromChars

      public static AlphabetConverter createConverterFromChars(Character[] original, Character[] encoding, Character[] doNotEncode)
      Create an alphabet converter for converting from the original alphabet to the encoded alphabet, while leaving the characters in doNotEncode as they are (if possible).

      Duplicate letters in either original or encoding will be ignored.

      Parameters:
      original - an array of chars representing the original alphabet
      encoding - an array of chars representing the alphabet to be used for encoding
      doNotEncode - an array of chars to be left as they are, that is, encoded as themselves; every char here must appear in both of the previous params
      Returns:
      The AlphabetConverter
      Throws:
      IllegalArgumentException - if an AlphabetConverter cannot be constructed
    • createConverter

      public static AlphabetConverter createConverter(Integer[] original, Integer[] encoding, Integer[] doNotEncode)
      Create an alphabet converter for converting from the original alphabet to the encoded alphabet, while leaving the characters in doNotEncode as they are (if possible).

      Duplicate letters in either original or encoding will be ignored.

      Parameters:
      original - an array of ints representing the original alphabet in codepoints
      encoding - an array of ints representing the alphabet to be used for encoding, in codepoints
      doNotEncode - an array of ints, in codepoints, representing the chars to be left as they are, that is, encoded as themselves; every char here must appear in both of the previous params
      Returns:
      The AlphabetConverter
      Throws:
      IllegalArgumentException - if an AlphabetConverter cannot be constructed
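
The pairing of getOriginalToEncoded() and createConverterFromMap described above is meant for persisting a converter and rebuilding it later. The sketch below shows only the persistence half, using plain JDK object serialization on a hand-built Map<Integer,String> shaped like the sample mapping; the class and helper names are hypothetical, and the choice of Java serialization is an assumption (any format that preserves the code point keys and encoded strings would do). The restored map is the kind of value one would then pass to createConverterFromMap.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

public class MapRoundTripSketch {

    /** Writes the mapping to a byte array (hypothetical helper). */
    static byte[] serialize(Map<Integer, String> originalToEncoded) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            // Copy into a LinkedHashMap so the written object is Serializable
            // regardless of the concrete Map type passed in.
            out.writeObject(new LinkedHashMap<>(originalToEncoded));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the mapping back from a byte array (hypothetical helper). */
    @SuppressWarnings("unchecked")
    static Map<Integer, String> deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Map<Integer, String>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** A hand-built stand-in for the map a converter would return. */
    static Map<Integer, String> sampleMap() {
        Map<Integer, String> map = new LinkedHashMap<>();
        map.put((int) 'a', "00");
        map.put((int) 'b', "01");
        map.put((int) 'c', "0d");
        map.put((int) 'd', "d");
        return map;
    }

    public static void main(String[] args) {
        Map<Integer, String> restored = deserialize(serialize(sampleMap()));
        // The restored map would be the argument to createConverterFromMap.
        System.out.println(restored.equals(sampleMap())); // true
    }
}
```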