Class ZstdCompressorOutputStream.Builder

All Implemented Interfaces:
IOSupplier<ZstdCompressorOutputStream>
Enclosing class:
ZstdCompressorOutputStream

Builds a new ZstdCompressorOutputStream.

For example:


 ZstdCompressorOutputStream s = ZstdCompressorOutputStream.builder()
   .setPath(path)
   .setLevel(3)
   .setStrategy(0)
   .setWorkers(0)
   .get();
 
 

This class avoids making the underlying zstd classes part of the public or protected API.

Since:
1.28.0
See Also:
  • Constructor Details

  • Method Details

    • get

      Throws:
      IOException
    • setChainLog

      Sets the size of the multi-probe search table, as a power of 2.

      The value 0 means use the default chainLog.

      The resulting memory usage is (in C) (1 << (chainLog + 2)). The input must be between ZstdConstants.ZSTD_CHAINLOG_MIN and ZstdConstants.ZSTD_CHAINLOG_MAX. Larger tables result in better but slower compression. This parameter is useless for the "fast" strategy but still useful with the "dfast" strategy, in which case it defines a secondary probe table.
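
      The memory formula can be checked with a little arithmetic. The following is a minimal sketch, not part of Commons Compress: the class ZstdTableMemory and the method tableBytes are hypothetical names, and the result is assumed to be in bytes (table entries being 4-byte indices in the C implementation). Only the expression (1 << (log + 2)) comes from the documentation.

```java
/** Hypothetical helper (not part of Commons Compress) illustrating the documented formula. */
final class ZstdTableMemory {

    private ZstdTableMemory() {
        // static helper only
    }

    /**
     * Evaluates the documented C expression (1 << (log + 2)), widened to long
     * so that large log values do not overflow an int.
     */
    static long tableBytes(final int log) {
        if (log < 0 || log > 60) {
            throw new IllegalArgumentException("log out of range for this sketch: " + log);
        }
        return 1L << (log + 2);
    }

    public static void main(final String[] args) {
        // Each increment of the log value doubles the table.
        System.out.println(tableBytes(16)); // 262144
        System.out.println(tableBytes(17)); // 524288
    }
}
```

      The same expression is documented for hashLog, so the sketch applies to both tables.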

      Parameters:
      chainLog - the size of the multi-probe search table, as a power of 2.
      Returns:
      this instance.
      See Also:
    • setChecksum

      Sets whether a 32-bit checksum of the content is written at the end of the frame (defaults to false).

      The value false means no checksum.

      Parameters:
      checksum - Whether a 32-bit checksum of the content is written at the end of the frame.
      Returns:
      this instance.
      See Also:
    • setCloseFrameOnFlush

      public ZstdCompressorOutputStream.Builder setCloseFrameOnFlush(boolean closeFrameOnFlush)
      Sets whether to close the frame on flush.

      This guarantees that the output can be read fully if the process crashes before closing the stream. The downside is that it negatively affects the compression ratio.

      The value false means don't close on flush.

      Parameters:
      closeFrameOnFlush - whether to close the frame on flush.
      Returns:
      this instance.
      See Also:
    • setDict

      Sets an internal CDict from the given dict buffer.

      Decompression will have to use the same dictionary.

      Using a dictionary
      • Loading a null (or 0-length) dictionary invalidates the previous dictionary, returning to no-dictionary mode.
      • A dictionary is sticky: it will be used for all future compressed frames. To return to no-dictionary mode, load a null dictionary.
      • Loading a dictionary builds tables. This is a CPU-intensive operation with a non-negligible impact on latency. Because the tables depend on the compression parameters, those parameters can no longer be changed after loading a dictionary.
      • The dictionary content will be copied internally.
      Parameters:
      dict - The dictionary buffer.
      Returns:
      this instance.
      See Also:
    • setHashLog

      Sets the size of the initial probe table, as a power of 2.

      The value 0 means "use default hashLog".

      The resulting memory usage is (in C) (1 << (hashLog + 2)). This value must be between ZstdConstants.ZSTD_HASHLOG_MIN and ZstdConstants.ZSTD_HASHLOG_MAX. Using a larger table improves the compression ratio of strategies <= dFast, and improves speed of strategies > dFast.

      Parameters:
      hashLog - Size of the initial probe table, as a power of 2.
      Returns:
      this instance.
      See Also:
    • setJobSize

      Sets the size of a compression job.

      This value is enforced only when workers >= 1. Each compression job is completed in parallel, so this value can indirectly impact the number of active threads. A value of 0 uses a default behavior, which is dynamically determined based on compression parameters. The job size must be at least the overlap size or ZSTDMT_JOBSIZE_MIN (= 512 KB), whichever is larger. The minimum size is automatically and transparently enforced.
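
      The minimum rule can be sketched as follows. This is an illustration only: the class ZstdJobSize and its methods are hypothetical names, the 512 KB constant is taken from the text and assumed to mean 512 * 1024 bytes, and the real enforcement happens inside the native library.

```java
/** Hypothetical helper (not part of Commons Compress) modeling the documented minimum job size. */
final class ZstdJobSize {

    /** ZSTDMT_JOBSIZE_MIN as quoted in the documentation, assumed to be 512 * 1024 bytes. */
    static final long JOB_SIZE_MIN = 512L * 1024L;

    private ZstdJobSize() {
        // static helper only
    }

    /** The larger of the overlap size and JOB_SIZE_MIN. */
    static long minimumJobSize(final long overlapBytes) {
        return Math.max(overlapBytes, JOB_SIZE_MIN);
    }

    /**
     * Returns the job size that would be used under the documented rule.
     * A request of 0 is returned unchanged: it means "library default",
     * which this sketch does not try to model.
     */
    static long effectiveJobSize(final long requestedBytes, final long overlapBytes) {
        if (requestedBytes < 0 || overlapBytes < 0) {
            throw new IllegalArgumentException("sizes must not be negative");
        }
        if (requestedBytes == 0) {
            return 0;
        }
        return Math.max(requestedBytes, minimumJobSize(overlapBytes));
    }

    public static void main(final String[] args) {
        // A 100 kB request with no overlap is raised to the 512 KB minimum.
        System.out.println(effectiveJobSize(100_000L, 0L)); // 524288
    }
}
```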

      This is a multi-threading parameter and is only active if multi-threading is enabled (that is, if the underlying native library is compiled with the build macro ZSTD_MULTITHREAD).

      Parameters:
      jobSize - Size of a compression job.
      Returns:
      this instance.
      See Also:
    • setLevel

      Sets compression parameters according to a pre-defined cLevel table, from 0 to 9.

      The exact compression parameters are dynamically determined, depending on both the compression level and srcSize (when known). The default level is ZstdConstants.ZSTD_CLEVEL_DEFAULT.

      • The value 0 means use the default, which is controlled by ZstdConstants.ZSTD_CLEVEL_DEFAULT.
      • You may pass a negative compression level.
      • Setting a level does not automatically reset all other compression parameters to their defaults. It does dynamically affect the compression parameters that have not been set manually; manually set values are kept.
      Parameters:
      level - The compression level, from 0 to 9, where the default is ZstdConstants.ZSTD_CLEVEL_DEFAULT.
      Returns:
      this instance
      See Also:
    • setMinMatch

      Sets minimum match size for long distance matcher.

      Zstd can still find matches of smaller size, by updating its search algorithm to look for this size and larger. Using larger values increases compression and decompression speed, but decreases the ratio. The value must be between ZstdConstants.ZSTD_MINMATCH_MIN and ZstdConstants.ZSTD_MINMATCH_MAX. Note that currently, for all strategies < btopt, the effective minimum is 4, and for all strategies > fast, the effective maximum is 6.

      The value 0 means use the default minMatchLength.
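
      The note about effective limits can be read as a clamp on the requested value. The sketch below is a literal reading of that note, not the library's code: the class ZstdMinMatch and the method effectiveMinMatch are hypothetical, the strategy numbers follow the ZSTD_strategy values (fast = 1, btopt = 7), and the default values (0) are deliberately not modeled.

```java
/** Hypothetical helper (not part of Commons Compress) modeling the documented effective limits. */
final class ZstdMinMatch {

    /** ZSTD_fast from the C enum. */
    static final int FAST = 1;

    /** ZSTD_btopt from the C enum. */
    static final int BTOPT = 7;

    private ZstdMinMatch() {
        // static helper only
    }

    /**
     * Applies the documented rules: strategies below btopt use at least 4,
     * strategies above fast use at most 6.
     */
    static int effectiveMinMatch(final int strategy, final int minMatch) {
        if (strategy < 1 || strategy > 9) {
            throw new IllegalArgumentException("explicit strategy 1..9 expected: " + strategy);
        }
        if (minMatch <= 0) {
            throw new IllegalArgumentException("explicit minMatch expected: " + minMatch);
        }
        int effective = minMatch;
        if (strategy < BTOPT) {
            effective = Math.max(effective, 4);
        }
        if (strategy > FAST) {
            effective = Math.min(effective, 6);
        }
        return effective;
    }

    public static void main(final String[] args) {
        System.out.println(effectiveMinMatch(3, 3)); // 4: greedy is below btopt
        System.out.println(effectiveMinMatch(9, 7)); // 6: btultra2 is above fast
    }
}
```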

      Parameters:
      minMatch - minimum match size for long distance matcher.
      Returns:
      this instance.
      See Also:
    • setOverlapLog

      Sets the overlap size, as a fraction of window size.

      The overlap size is the amount of data reloaded from the previous job at the beginning of a new job. It helps preserve the compression ratio while each job is compressed in parallel. This value is enforced only when workers >= 1. Larger values increase the compression ratio, but decrease speed. Possible values range from 0 to 9:

      • 0 means "default": the value will be determined by the library, depending on the strategy
      • 1 means "no overlap"
      • 9 means "full overlap", using a full window size.

      Each intermediate rank increases/decreases the load size by a factor of 2:

      • 9: full window
      • 8: w / 2
      • 7: w / 4
      • 6: w / 8
      • 5: w / 16
      • 4: w / 32
      • 3: w / 64
      • 2: w / 128
      • 1: no overlap
      • 0: default

      The default value varies between 6 and 9, depending on the strategy.
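
      The table above amounts to a right shift of the window size. The following sketch shows that arithmetic; the class ZstdOverlap and the method overlapBytes are hypothetical names, and the value 0 is rejected because the library-chosen default is not modeled here.

```java
/** Hypothetical helper (not part of Commons Compress) turning an overlapLog into a byte count. */
final class ZstdOverlap {

    private ZstdOverlap() {
        // static helper only
    }

    /**
     * Returns the overlap size in bytes for a window of (1 << windowLog) bytes:
     * 9 is the full window, each lower rank halves it, and 1 means no overlap.
     */
    static long overlapBytes(final int windowLog, final int overlapLog) {
        if (windowLog < 0 || windowLog > 62) {
            throw new IllegalArgumentException("windowLog out of range for this sketch: " + windowLog);
        }
        if (overlapLog < 1 || overlapLog > 9) {
            throw new IllegalArgumentException("explicit overlapLog 1..9 expected: " + overlapLog);
        }
        if (overlapLog == 1) {
            return 0;
        }
        return (1L << windowLog) >> (9 - overlapLog);
    }

    public static void main(final String[] args) {
        // With an 8 MiB window (windowLog = 23):
        System.out.println(overlapBytes(23, 9)); // 8388608 (full window)
        System.out.println(overlapBytes(23, 6)); // 1048576 (w / 8)
        System.out.println(overlapBytes(23, 1)); // 0 (no overlap)
    }
}
```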

      This is a multi-threading parameter and is only active if multi-threading is enabled (that is, if the underlying native library is compiled with the build macro ZSTD_MULTITHREAD).

      Parameters:
      overlapLog - the overlap size, as a fraction of window size.
      Returns:
      this instance.
      See Also:
    • setSearchLog

      Sets the number of search attempts, as a power of 2.

      More attempts result in better but slower compression. This parameter is useless for the "fast" and "dFast" strategies.

      The value 0 means use the default searchLog.

      Parameters:
      searchLog - number of search attempts, as a power of 2.
      Returns:
      this instance.
      See Also:
    • setStrategy

      Sets the ZSTD_strategy from the C enum definition.

      The higher the value of the selected strategy, the more complex it is, resulting in stronger but slower compression.

      The value 0 means use the default strategy.

      • ZSTD_fast = 1
      • ZSTD_dfast = 2
      • ZSTD_greedy = 3
      • ZSTD_lazy = 4
      • ZSTD_lazy2 = 5
      • ZSTD_btlazy2 = 6
      • ZSTD_btopt = 7
      • ZSTD_btultra = 8
      • ZSTD_btultra2 = 9
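
      Because the builder takes the strategy as a plain number, callers may prefer named constants over literals. The enum below is a hypothetical convenience type, not part of Commons Compress; only the numeric values come from the list above.

```java
/** Hypothetical enum (not part of Commons Compress) naming the ZSTD_strategy values listed above. */
enum ZstdStrategy {
    DEFAULT(0),
    FAST(1),
    DFAST(2),
    GREEDY(3),
    LAZY(4),
    LAZY2(5),
    BTLAZY2(6),
    BTOPT(7),
    BTULTRA(8),
    BTULTRA2(9);

    private final int value;

    ZstdStrategy(final int value) {
        this.value = value;
    }

    /** The numeric value from the C enum definition. */
    int value() {
        return value;
    }

    /** Looks up a constant by its numeric value. */
    static ZstdStrategy fromValue(final int value) {
        for (final ZstdStrategy strategy : values()) {
            if (strategy.value == value) {
                return strategy;
            }
        }
        throw new IllegalArgumentException("Unknown ZSTD_strategy value: " + value);
    }

    public static void main(final String[] args) {
        System.out.println(BTOPT.value()); // 7
        System.out.println(fromValue(2));  // DFAST
    }
}
```

      Assuming setStrategy accepts an int, as the class-level example suggests, a call could then read setStrategy(ZstdStrategy.BTOPT.value()).
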
      Parameters:
      strategy - the ZSTD_strategy from the C enum definition.
      Returns:
      this instance.
      See Also:
    • setTargetLength

      Sets a value that depends on the strategy, see ZSTD_c_targetLength.

      For strategies btopt, btultra and btultra2:

      • Length of Match considered "good enough" to stop search.
      • Larger values make compression stronger, and slower.

      For strategy fast:

      • Distance between match sampling.
      • Larger values make compression faster, and weaker.

      The value 0 means use the default targetLength.

      Parameters:
      targetLength - a value that depends on the strategy, see ZSTD_c_targetLength.
      Returns:
      this instance.
      See Also:
    • setWindowLog

      Sets the maximum allowed back-reference distance, expressed as a power of 2.

      This sets a memory budget for streaming decompression, with larger values requiring more memory and typically compressing more. This value must be between ZstdConstants.ZSTD_WINDOWLOG_MIN and ZstdConstants.ZSTD_WINDOWLOG_MAX.

      Note: Using a windowLog greater than ZstdConstants.ZSTD_WINDOWLOG_LIMIT_DEFAULT requires explicitly allowing such size at streaming decompression stage.

      The value 0 means use the default windowLog.
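
      To make the power-of-2 relationship and the note about the decompression limit concrete, here is a small sketch. The class ZstdWindow and its methods are hypothetical names; the limit is passed in as a parameter standing in for ZstdConstants.ZSTD_WINDOWLOG_LIMIT_DEFAULT, and the value 27 used in main is an assumption made for illustration.

```java
/** Hypothetical helper (not part of Commons Compress) illustrating windowLog arithmetic. */
final class ZstdWindow {

    private ZstdWindow() {
        // static helper only
    }

    /** Maximum back-reference distance in bytes for an explicit windowLog. */
    static long windowBytes(final int windowLog) {
        if (windowLog < 1 || windowLog > 62) {
            throw new IllegalArgumentException("explicit windowLog expected: " + windowLog);
        }
        return 1L << windowLog;
    }

    /**
     * Returns true if a streaming decompressor would have to explicitly allow
     * this window size, per the documented note.
     */
    static boolean needsExplicitDecoderLimit(final int windowLog, final int limitDefault) {
        return windowLog > limitDefault;
    }

    public static void main(final String[] args) {
        final int assumedLimit = 27; // assumption standing in for ZSTD_WINDOWLOG_LIMIT_DEFAULT
        System.out.println(windowBytes(27));                             // 134217728
        System.out.println(needsExplicitDecoderLimit(28, assumedLimit)); // true
    }
}
```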

      Parameters:
      windowLog - maximum allowed back-reference distance, expressed as power of 2.
      Returns:
      this instance.
      See Also:
    • setWorkers

      Sets how many threads will be spawned to compress in parallel.

      When workers >= 1, compression runs in asynchronous mode: each call consumes input and flushes output if possible, but immediately gives back control to the caller while compression is performed in parallel within worker threads. More workers improve speed, but also increase memory usage.

      The value 0 means "single-threaded mode": nothing is spawned, compression is performed from the calling thread, and all invocations are blocking.

      This is a multi-threading parameter and is only active if multi-threading is enabled (that is, if the underlying native library is compiled with the build macro ZSTD_MULTITHREAD).

      Parameters:
      workers - How many threads will be spawned to compress in parallel.
      Returns:
      this instance.
      See Also: