Package org.apache.commons.compress.compressors.pack200

Provides stream classes for compressing and decompressing streams using the Pack200 algorithm, which is used to compress Java archives.

The streams of this package only work on JAR archives: a Pack200CompressorOutputStream expects a valid JAR archive to be written to it and writes the compressed form to the stream it wraps, while a Pack200CompressorInputStream reads Pack200-compressed data and provides a stream from which the JAR archive can be read.
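The wrapping described above can be sketched as follows. This is a minimal, illustrative example: the file names and the helper class are hypothetical, it requires Commons Compress on the classpath, and it assumes a runtime that still provides a Pack200 implementation (the API was removed from the JDK in Java 14).

```java
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.commons.compress.compressors.pack200.Pack200CompressorInputStream;
import org.apache.commons.compress.compressors.pack200.Pack200CompressorOutputStream;

public class Pack200Example {

    // Compress an existing JAR file to Pack200 format. The JAR bytes are
    // written into the compressor stream; the compressed result goes to the
    // wrapped stream. Nothing reaches the underlying stream until
    // finish()/close() is called.
    static void pack(String jar, String packed) throws IOException {
        try (InputStream in = new FileInputStream(jar);
             OutputStream out = new Pack200CompressorOutputStream(
                     new BufferedOutputStream(new FileOutputStream(packed)))) {
            in.transferTo(out);
        }
    }

    // Decompress back to a JAR. Note that the result may differ
    // byte-for-byte from the original archive.
    static void unpack(String packed, String jar) throws IOException {
        try (InputStream in = new Pack200CompressorInputStream(
                     new FileInputStream(packed));
             OutputStream out = new FileOutputStream(jar)) {
            in.transferTo(out);
        }
    }
}
```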

JAR archives compressed with Pack200 will in general be different from the original archive when decompressed again. For details see the API documentation of Pack200.

The streams of this package work on non-deflated streams, i.e. archives like those created with the --no-gzip option of the JDK's pack200 command line tool. If you want to work on deflated streams you must use an additional stream layer - for example by using Apache Commons Compress' gzip package.
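Reading a ".pack.gz" file therefore means stacking two stream layers: undo the gzip layer first, then the Pack200 layer. A minimal sketch, assuming Commons Compress' gzip package and a hypothetical file name:

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;
import org.apache.commons.compress.compressors.pack200.Pack200CompressorInputStream;

public class PackGzExample {

    // Open a ".pack.gz" file (as produced by the pack200 tool without
    // --no-gzip). The gzip layer is removed first, then Pack200, so the
    // returned stream yields the bytes of the JAR archive.
    static InputStream openPackGz(String file) throws IOException {
        return new Pack200CompressorInputStream(
                new GzipCompressorInputStream(new FileInputStream(file)));
    }
}
```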

The Pack200 API provided by the Java class library doesn't lend itself to real stream processing. Pack200CompressorInputStream will uncompress its input immediately and then provide an InputStream to a cached result. Likewise Pack200CompressorOutputStream will not write anything to the given OutputStream until finish() or close() is called - at which point the cached output written so far gets compressed.

Two different caching modes are available - "in memory", which is the default, and "temporary file". By default data is cached in memory but you should switch to the temporary file option if your archives are really big.
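The caching mode is selected via the Pack200Strategy enum passed to the stream constructors. A minimal sketch of the temporary-file option; the class and file names are illustrative:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

import org.apache.commons.compress.compressors.pack200.Pack200CompressorOutputStream;
import org.apache.commons.compress.compressors.pack200.Pack200Strategy;

public class Pack200TempFileExample {

    // Compress a JAR while caching the intermediate result in a temporary
    // file (Pack200Strategy.TEMP_FILE) instead of heap memory - useful
    // when the archive is very large. The default is
    // Pack200Strategy.IN_MEMORY.
    static void packLarge(Path jarSource, Path packed) throws IOException {
        try (OutputStream out = new Pack200CompressorOutputStream(
                Files.newOutputStream(packed), Pack200Strategy.TEMP_FILE)) {
            Files.copy(jarSource, out);
        }
    }
}
```

Pack200CompressorInputStream offers a matching constructor taking a Pack200Strategy for the decompression direction.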

Given that there is always an intermediate result, the getBytesRead and getCount methods of Pack200CompressorInputStream are meaningless (should they count bytes read from the real stream or from the intermediate result?) and always return 0.

During development of the initial version several attempts were made to use a real streaming API, based for example on Piped(In|Out)putStream or on explicit stream pumping like Commons Exec's InputStreamPumper. They all failed because they rely on the output end being consumed completely, or else the (un)pack operation will block forever. For Pack200CompressorInputStream in particular it is very likely that it will be wrapped in a ZipArchiveInputStream, which will never read the archive completely as it is not interested in the ZIP central directory data at the end of the JAR archive.