java.lang.Object
org.apache.commons.compress.harmony.pack200.Codec
org.apache.commons.compress.harmony.pack200.BHSDCodec

public final class BHSDCodec extends Codec
A BHSD codec is a means of encoding integer values as a sequence of bytes, or vice versa, using a specified "BHSD" encoding mechanism. It uses a variable-length encoding and a modified sign representation such that small numbers are represented as a single byte, whilst larger numbers take more bytes to encode. The number may be signed or unsigned; if it is signed, it can be weighted towards positive numbers or equally distributed using a one's complement. The Codec also supports delta coding, where a sequence of numbers is represented as a series of first-order differences. So a delta encoding of the integers [1..10] would be represented as a sequence of ten 1s. This allows the absolute value of a coded integer to fall outside of the 'small number' range, whilst still being encoded as a single byte.
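
The delta coding described above can be illustrated with a minimal, library-independent sketch. The class and method names (DeltaSketch, toDeltas, fromDeltas) are hypothetical and are not part of this API; the sketch assumes the value preceding the first element is 0, which matches the [1..10] example.

```java
import java.util.Arrays;

// Hypothetical illustration of first-order differences; not part of Commons Compress.
final class DeltaSketch {

    /** Replaces each value with its difference from the previous value (assumed initial previous value: 0). */
    static int[] toDeltas(int[] values) {
        int[] deltas = new int[values.length];
        int previous = 0;
        for (int i = 0; i < values.length; i++) {
            deltas[i] = values[i] - previous;
            previous = values[i];
        }
        return deltas;
    }

    /** Rebuilds the original values by accumulating the deltas. */
    static int[] fromDeltas(int[] deltas) {
        int[] values = new int[deltas.length];
        int previous = 0;
        for (int i = 0; i < deltas.length; i++) {
            previous += deltas[i];
            values[i] = previous;
        }
        return values;
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        int[] deltas = toDeltas(values);
        System.out.println(Arrays.toString(deltas));             // ten 1s
        System.out.println(Arrays.toString(fromDeltas(deltas))); // the original values
    }
}
```

Because every delta here is 1, each one fits in a single byte even though the absolute values keep growing.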

A BHSD codec is configured with four parameters:

B
The maximum number of bytes that each value is encoded as. B must be a value in the range [1..5]. For a pass-through coding (where each byte is encoded as itself, aka Codec.BYTE1), B is 1 (each value takes a maximum of 1 byte).
H
The radix of the integer. Values are defined as a sequence of values, where value n is multiplied by H^n. So the number 1234 may be represented as the sequence 4 3 2 1 with a radix (H) of 10. Note that other sequences are also possible; 4 3 12 will also encode 1234 (4 + 3×10 + 12×100). The co-parameter L is defined as 256-H. This is important because only the last value in a sequence may be < L; all prior values must be >= L.
S
Whether the codec represents signed values (or not). S may take one of three values: 0 (unsigned), 1 (signed, one's complement) or 2 (signed, two's complement).
D
Whether the codec represents a delta encoding. This may be 0 (no delta) or 1 (delta encoding). A D value of 1 indicates that values are cumulative; a sequence of 1 1 1 1 1 will represent the sequence 1 2 3 4 5. For this reason, the codec supports two variants of decode: one with and one without a last parameter. If the codec is a non-delta encoding, then the last value is ignored if passed. If the codec is a delta encoding, it is a run-time error to call decode without the extra parameter; the previously decoded value should be passed in. (It was designed this way to support multi-threaded access without requiring a new instance of the Codec to be cloned for each use.)
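
How B, H and L interact may be easier to follow with a small standalone sketch of the unsigned, non-delta case (S=0, D=0). It is not the library's implementation: the class UnsignedBhSketch and its methods are hypothetical, and it assumes that a byte below L ends a value, that bytes of L or more signal continuation, and that the B-th byte always ends a value.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch of unsigned (B,H) coding; S and D are ignored here.
final class UnsignedBhSketch {

    /** Encodes a non-negative value as at most b bytes, where byte n carries weight h^n. */
    static byte[] encode(long value, int b, int h) {
        if (value < 0) {
            throw new IllegalArgumentException("this sketch only handles unsigned values");
        }
        int l = 256 - h;
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long z = value;
        for (int n = 0; n < b; n++) {
            long digit;
            if (z < l) {
                digit = z;            // small enough to be the terminating byte
            } else {
                digit = z % h;        // base-h digit ...
                while (digit < l) {
                    digit += h;       // ... lifted into [L, 255] to mark continuation
                }
            }
            out.write((int) digit);
            if (digit < l) {
                return out.toByteArray();
            }
            z = (z - digit) / h;
        }
        if (z != 0) {
            throw new IllegalArgumentException("value does not fit in " + b + " bytes");
        }
        return out.toByteArray();
    }

    /** Decodes one value from the start of bytes; stops at a byte below L or after b bytes. */
    static long decode(byte[] bytes, int b, int h) {
        int l = 256 - h;
        long value = 0;
        long weight = 1;
        for (int n = 0; n < b; n++) {
            if (n >= bytes.length) {
                throw new IllegalArgumentException("truncated input");
            }
            int x = bytes[n] & 0xFF;  // treat the byte as unsigned
            value += x * weight;
            weight *= h;
            if (x < l) {
                break;
            }
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] encoded = encode(1234, 5, 10);
        for (byte x : encoded) {
            System.out.print((x & 0xFF) + " ");
        }
        System.out.println("-> " + decode(encoded, 5, 10));
    }
}
```

Under these assumptions, 1234 with (B,H) = (5,10) becomes the two bytes 254 and 98, since L is 246 and 254 + 98×10 = 1234.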

Codecs are notated as (B,H,S,D) and either D or S,D may be omitted if zero. Thus Codec.BYTE1 is denoted (1,256,0,0) or (1,256). The toString() method prints out the condensed form of the encoding. Often, the last character in the name (Codec.BYTE1, Codec.UNSIGNED5) gives a clue as to the B value. Those that start with U (Codec.UDELTA5, Codec.UNSIGNED5) are unsigned; otherwise, in most cases, they are signed. The presence of the word Delta (Codec.DELTA5, Codec.UDELTA5) indicates a delta encoding is used.
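
The condensed notation can be sketched as follows. This is only an illustration of the rule stated above (drop D, or S and D, when they are zero); the class BhsdNotationSketch and its notation method are hypothetical and do not reproduce the library's toString() code.

```java
// Hypothetical formatter for the (B,H,S,D) notation; not part of Commons Compress.
final class BhsdNotationSketch {

    /** Formats the parameters, omitting D when it is zero, and S as well when both S and D are zero. */
    static String notation(int b, int h, int s, int d) {
        StringBuilder sb = new StringBuilder();
        sb.append('(').append(b).append(',').append(h);
        if (s != 0 || d != 0) {
            sb.append(',').append(s);   // S must be shown whenever D is shown
        }
        if (d != 0) {
            sb.append(',').append(d);
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        System.out.println(notation(1, 256, 0, 0)); // (1,256)
        System.out.println(notation(5, 64, 1, 0));  // (5,64,1)
        System.out.println(notation(1, 64, 1, 1));  // (1,64,1,1)
    }
}
```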

  • Field Summary

    Fields inherited from class org.apache.commons.compress.harmony.pack200.Codec

    BCI5, BRANCH5, BYTE1, CHAR3, DELTA5, lastBandLength, MDELTA5, SIGNED5, UDELTA5, UNSIGNED5
  • Constructor Summary

    Constructors
    Constructor
    Description
    BHSDCodec(int b, int h)
    Constructs an unsigned, non-delta Codec with the given B and H values.
    BHSDCodec(int b, int h, int s)
    Constructs a non-delta Codec with the given B, H and S values.
    BHSDCodec(int b, int h, int s, int d)
    Constructs a Codec with the given B, H, S and D values.
  • Method Summary

    Modifier and Type
    Method
    Description
    long
    cardinality()
    Returns the cardinality of this codec; that is, the number of distinct values that it can contain.
    int
    decode(InputStream in)
    Decodes a sequence of bytes from the given input stream, returning the decoded value as an int.
    int
    decode(InputStream in, long last)
    Decodes a sequence of bytes from the given input stream, returning the decoded value as an int.
    int[]
    decodeInts(int n, InputStream in)
    Decodes a sequence of n values from in.
    int[]
    decodeInts(int n, InputStream in, int firstValue)
    Decodes a sequence of n values from in.
    byte[]
    encode(int value)
    Encodes a single value into a sequence of bytes.
    byte[]
    encode(int value, int last)
    Encodes a single value into a sequence of bytes.
    boolean
    encodes(long value)
    Returns true if this encoding can encode the given value.
    boolean
    equals(Object o)
    int
    getB()
    Gets the B.
    int
    getH()
    Gets the H.
    int
    getL()
    Gets the L.
    int
    getS()
    Gets the S.
    int
    hashCode()
    boolean
    isDelta()
    Returns true if this codec is a delta codec.
    boolean
    isSigned()
    Returns true if this codec is a signed codec.
    long
    largest()
    Returns the largest value that this codec can represent.
    long
    smallest()
    Returns the smallest value that this codec can represent.
    String
    toString()
    Returns the codec in the form (1,256) or (1,64,1,1).

    Methods inherited from class org.apache.commons.compress.harmony.pack200.Codec

    encode

    Methods inherited from class java.lang.Object

    clone, finalize, getClass, notify, notifyAll, wait, wait, wait
  • Constructor Details

    • BHSDCodec

      public BHSDCodec(int b, int h)
      Constructs an unsigned, non-delta Codec with the given B and H values.
      Parameters:
      b - the maximum number of bytes that a value can be encoded as [1..5]
      h - the radix of the encoding [1..256]
    • BHSDCodec

      public BHSDCodec(int b, int h, int s)
      Constructs a non-delta Codec with the given B, H and S values.
      Parameters:
      b - the maximum number of bytes that a value can be encoded as [1..5]
      h - the radix of the encoding [1..256]
      s - whether the encoding represents signed numbers (s=0 is unsigned; s=1 is signed with 1s complement; s=2 is signed with ?)
    • BHSDCodec

      public BHSDCodec(int b, int h, int s, int d)
      Constructs a Codec with the given B, H, S and D values.
      Parameters:
      b - the maximum number of bytes that a value can be encoded as [1..5]
      h - the radix of the encoding [1..256]
      s - whether the encoding represents signed numbers (s=0 is unsigned; s=1 is signed with 1s complement; s=2 is signed with ?)
      d - whether this is a delta encoding (d=0 is non-delta; d=1 is delta)
  • Method Details

    • cardinality

      public long cardinality()
      Returns the cardinality of this codec; that is, the number of distinct values that it can contain.
      Returns:
      the cardinality of this codec
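
      One way to reason about the cardinality is to count the byte sequences that a (B,H) coding allows. The sketch below is a hypothetical illustration, not the library's code: it assumes L = 256-H, that a value shorter than B bytes ends with a byte below L after continuation bytes of L or more, that the B-th byte is unrestricted, and that every such sequence decodes to a distinct value.

```java
// Hypothetical counting argument for the number of codable values; not part of Commons Compress.
final class CardinalitySketch {

    /** Counts the valid byte sequences of length 1..b for radix h. */
    static long cardinality(int b, int h) {
        int l = 256 - h;
        long total = 0;
        long prefixes = 1;                 // ways to choose the continuation bytes so far (H choices each)
        for (int length = 1; length < b; length++) {
            total += prefixes * l;         // sequences of exactly `length` bytes end with a byte below L
            prefixes *= h;
        }
        total += prefixes * 256;           // sequences of exactly b bytes: the last byte is unrestricted
        return total;
    }

    public static void main(String[] args) {
        System.out.println(cardinality(1, 256)); // 256
        System.out.println(cardinality(2, 16));  // 4336
    }
}
```

      For (1,256) this gives 256, one value per byte, and for (2,16) it gives 240 + 16×256 = 4336.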
    • decode

      public int decode(InputStream in) throws IOException, Pack200Exception
      Description copied from class: Codec
      Decodes a sequence of bytes from the given input stream, returning the decoded value as an int. Note that this method can only be applied for non-delta encodings.
      Specified by:
      decode in class Codec
      Parameters:
      in - the input stream to read from
      Returns:
      the value as an int
      Throws:
      IOException - if there is a problem reading from the underlying input stream
      Pack200Exception - if the encoding is a delta encoding
    • decode

      public int decode(InputStream in, long last) throws IOException, Pack200Exception
      Description copied from class: Codec
      Decodes a sequence of bytes from the given input stream, returning the decoded value as an int. If this encoding is a delta encoding (d=1) then the previous value must be passed in as a parameter. If it is a non-delta encoding, then it does not matter what value is passed in, so it makes sense to always pass in the previous value, using code similar to:
       long last = 0;
       while (condition) {
           last = codec.decode(in, last);
           // do something with last
       }
       
      Specified by:
      decode in class Codec
      Parameters:
      in - the input stream to read from
      last - the previous value read, which must be supplied if the codec is a delta encoding
      Returns:
      the value as an int
      Throws:
      IOException - if there is a problem reading from the underlying input stream
      Pack200Exception - if there is a problem decoding the value or if the value is invalid
    • decodeInts

      public int[] decodeInts(int n, InputStream in) throws IOException, Pack200Exception
      Description copied from class: Codec
      Decodes a sequence of n values from in. This should probably be used in most cases, since some codecs (such as PopulationCodec) only work when the number of values to be read is known.
      Overrides:
      decodeInts in class Codec
      Parameters:
      n - the number of values to decode
      in - the input stream to read from
      Returns:
      an array of int values corresponding to values decoded
      Throws:
      IOException - if there is a problem reading from the underlying input stream
      Pack200Exception - if there is a problem decoding the value or if the value is invalid
    • decodeInts

      public int[] decodeInts(int n, InputStream in, int firstValue) throws IOException, Pack200Exception
      Description copied from class: Codec
      Decodes a sequence of n values from in.
      Overrides:
      decodeInts in class Codec
      Parameters:
      n - the number of values to decode
      in - the input stream to read from
      firstValue - the first value in the band if it has already been read
      Returns:
      an array of int values corresponding to values decoded, with firstValue as the first value in the array.
      Throws:
      IOException - if there is a problem reading from the underlying input stream
      Pack200Exception - if there is a problem decoding the value or if the value is invalid
    • encode

      public byte[] encode(int value) throws Pack200Exception
      Description copied from class: Codec
      Encodes a single value into a sequence of bytes. Note that this method can only be used for non-delta encodings.
      Specified by:
      encode in class Codec
      Parameters:
      value - the value to encode
      Returns:
      the encoded bytes
      Throws:
      Pack200Exception - TODO
    • encode

      public byte[] encode(int value, int last) throws Pack200Exception
      Description copied from class: Codec
      Encodes a single value into a sequence of bytes.
      Specified by:
      encode in class Codec
      Parameters:
      value - the value to encode
      last - the previous value encoded (for delta encodings)
      Returns:
      the encoded bytes
      Throws:
      Pack200Exception - TODO
    • encodes

      public boolean encodes(long value)
      Returns true if this encoding can encode the given value.
      Parameters:
      value - the value to check
      Returns:
      true if the encoding can encode this value
    • equals

      public boolean equals(Object o)
      Overrides:
      equals in class Object
    • getB

      public int getB()
      Gets the B.
      Returns:
      the b
    • getH

      public int getH()
      Gets the H.
      Returns:
      the h
    • getL

      public int getL()
      Gets the L.
      Returns:
      the l
    • getS

      public int getS()
      Gets the S.
      Returns:
      the s
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object
    • isDelta

      public boolean isDelta()
      Returns true if this codec is a delta codec
      Returns:
      true if this codec is a delta codec
    • isSigned

      public boolean isSigned()
      Returns true if this codec is a signed codec
      Returns:
      true if this codec is a signed codec
    • largest

      public long largest()
      Returns the largest value that this codec can represent.
      Returns:
      the largest value that this codec can represent.
    • smallest

      public long smallest()
      Returns the smallest value that this codec can represent.
      Returns:
      the smallest value that this codec can represent.
    • toString

      public String toString()
      Returns the codec in the form (1,256) or (1,64,1,1). Note that trailing zero fields are not shown.
      Overrides:
      toString in class Object