Class ZstdCompressorOutputStream.Builder
- All Implemented Interfaces:
IOSupplier<ZstdCompressorOutputStream>
- Enclosing class:
ZstdCompressorOutputStream
Builds a new ZstdCompressorOutputStream.
For example:
ZstdCompressorOutputStream s = ZstdCompressorOutputStream.builder()
    .setPath(path)
    .setLevel(3)
    .setStrategy(0)
    .setWorkers(0)
    .get();
This class avoids making the underlying zstd
classes part of the public or protected API.
- Since:
- 1.28.0
-
Constructor Summary
Constructors -
Method Summary
Method - Description
get()
setChainLog(int chainLog) - Sets the size of the multi-probe search table, as a power of 2.
setChecksum(boolean checksum) - Sets whether a 32-bit checksum of content is written at end of frame (defaults to false).
setCloseFrameOnFlush(boolean closeFrameOnFlush) - Sets whether to close the frame on flush.
setDict(byte[] dict) - Sets an internal CDict from the given dict buffer.
setHashLog(int hashLog) - Size of the initial probe table, as a power of 2.
setJobSize(int jobSize) - Size of a compression job.
setLevel(int level) - Sets compression parameters according to a pre-defined cLevel table, from 0 to 9.
setMinMatch(int minMatch) - Sets minimum match size for long distance matcher.
setOverlapLog(int overlapLog) - Sets the overlap size, as a fraction of window size.
setSearchLog(int searchLog) - Sets number of search attempts, as a power of 2.
setStrategy(int strategy) - Sets the ZSTD_strategy from the C enum definition.
setTargetLength(int targetLength) - Sets a value that depends on the strategy, see ZSTD_c_targetLength.
setWindowLog(int windowLog) - Sets maximum allowed back-reference distance, expressed as power of 2.
setWorkers(int workers) - Sets how many threads will be spawned to compress in parallel.
Methods inherited from class org.apache.commons.io.build.AbstractStreamBuilder
getBufferSize, getBufferSizeDefault, getCharSequence, getCharset, getCharsetDefault, getFile, getInputStream, getOpenOptions, getOutputStream, getPath, getRandomAccessFile, getReader, getWriter, setBufferSize, setBufferSize, setBufferSizeChecker, setBufferSizeDefault, setBufferSizeMax, setCharset, setCharset, setCharsetDefault, setOpenOptions
Methods inherited from class org.apache.commons.io.build.AbstractOriginSupplier
checkOrigin, getOrigin, hasOrigin, newByteArrayOrigin, newCharSequenceOrigin, newFileOrigin, newFileOrigin, newInputStreamOrigin, newOutputStreamOrigin, newPathOrigin, newPathOrigin, newRandomAccessFileOrigin, newRandomAccessFileOrigin, newReaderOrigin, newURIOrigin, newWriterOrigin, setByteArray, setCharSequence, setFile, setFile, setInputStream, setOrigin, setOutputStream, setPath, setPath, setRandomAccessFile, setRandomAccessFile, setReader, setURI, setWriter
Methods inherited from class org.apache.commons.io.build.AbstractSupplier
asThis
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.commons.io.function.IOSupplier
asSupplier, getUnchecked
-
Constructor Details
-
Builder
public Builder()
Constructs a new builder of ZstdCompressorOutputStream.
-
-
Method Details
-
get
- Throws:
IOException
-
setChainLog
Sets the size of the multi-probe search table, as a power of 2.
The value 0 means use the default chainLog.
The resulting memory usage is (in C) (1 << (chainLog + 2)). The input must be between ZstdConstants.ZSTD_CHAINLOG_MIN and ZstdConstants.ZSTD_CHAINLOG_MAX. Larger tables result in better and slower compression. This parameter is useless for the "fast" strategy but still useful with the "dfast" strategy, in which case it defines a secondary probe table.
- Parameters:
chainLog - the size of the multi-probe search table, as a power of 2.
- Returns:
- this instance.
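As a rough illustration of the memory formula above, the following sketch evaluates (1 << (log + 2)) in Java. The class ZstdTableMemory and its method are hypothetical helpers, not part of Commons Compress, and the sketch assumes the result is a byte count.

```java
// Hypothetical helper for illustration only; not part of Commons Compress.
final class ZstdTableMemory {

    private ZstdTableMemory() {
    }

    /**
     * Evaluates the documented formula (1 << (log + 2)) using a long so that
     * large log values do not overflow an int. Assumes the unit is bytes.
     */
    static long tableBytes(final int log) {
        if (log < 0 || log > 60) {
            throw new IllegalArgumentException("log out of range: " + log);
        }
        return 1L << (log + 2);
    }

    public static void main(final String[] args) {
        // Print the estimated table size for a few sample log values.
        for (int log = 16; log <= 24; log += 4) {
            System.out.printf("log=%d -> %d bytes%n", log, tableBytes(log));
        }
    }
}
```

The same arithmetic applies to any parameter documented with this formula; the actual valid range should be read from the ZstdConstants bounds rather than from this sketch.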
-
setChecksum
Sets whether a 32-bit checksum of content is written at end of frame (defaults to false).
The value false means no checksum.
- Parameters:
checksum - Whether a 32-bit checksum of content is written at end of frame.
- Returns:
- this instance.
-
setCloseFrameOnFlush
Sets whether to close the frame on flush.
This guarantees that the output can be read fully if the process crashes before closing the stream. The downside is that this negatively affects the compression ratio.
The value false means don't close on flush.
- Parameters:
closeFrameOnFlush - whether to close the frame on flush.
- Returns:
- this instance.
-
setDict
Sets an internal CDict from the given dict buffer.
Decompression will have to use the same dictionary.
Using a dictionary
- Loading a null (or 0-length) dictionary invalidates the previous dictionary, returning to no-dictionary mode.
- A dictionary is sticky: it will be used for all future compressed frames. To return to the no-dictionary mode, load a null dictionary.
- Loading a dictionary builds tables. This is a CPU-consuming operation, with non-negligible impact on latency. Tables are dependent on compression parameters, and for this reason, compression parameters can no longer be changed after loading a dictionary.
- The dictionary content will be copied internally.
- Parameters:
dict - The dictionary buffer.
- Returns:
- this instance.
-
setHashLog
Sets the size of the initial probe table, as a power of 2.
The value 0 means "use default hashLog".
The resulting memory usage is (in C) (1 << (hashLog + 2)). This value must be between ZstdConstants.ZSTD_HASHLOG_MIN and ZstdConstants.ZSTD_HASHLOG_MAX. Using a larger table improves the compression ratio of strategies <= dFast, and improves speed of strategies > dFast.
- Parameters:
hashLog - Size of the initial probe table, as a power of 2.
- Returns:
- this instance.
-
setJobSize
Sets the size of a compression job.
This value is enforced only when workers >= 1. Each compression job is completed in parallel, so this value can indirectly impact the number of active threads. A value of 0 uses a default behavior, which is dynamically determined based on compression parameters. The job size must be at least the overlap size or ZSTDMT_JOBSIZE_MIN (= 512 KB), whichever is larger. The minimum size is automatically and transparently enforced.
This is a multi-threading parameter and is only active if multi-threading is enabled (that is, if the underlying native library is compiled with the build macro ZSTD_MULTITHREAD).
- Parameters:
jobSize - Size of a compression job.
- Returns:
- this instance.
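The minimum-size rule described above can be sketched as a small clamp. This only models the rule as stated in the text; the class ZstdJobSize, its constant and its method are hypothetical, and the native library applies its own enforcement internally.

```java
// Hypothetical model of the documented job-size rule; not library code.
final class ZstdJobSize {

    /** 512 KB, the value the text gives for ZSTDMT_JOBSIZE_MIN. */
    static final long JOB_SIZE_MIN = 512L * 1024;

    private ZstdJobSize() {
    }

    /**
     * Returns the job size that would be in effect under the documented rule.
     * A requested size of 0 is returned unchanged, because 0 means the library
     * chooses a default that this sketch cannot predict.
     */
    static long effectiveJobSize(final long requested, final long overlapSize) {
        if (requested < 0 || overlapSize < 0) {
            throw new IllegalArgumentException("sizes must be non-negative");
        }
        if (requested == 0) {
            return 0;
        }
        // The floor is the overlap size or 512 KB, whichever is larger.
        final long floor = Math.max(overlapSize, JOB_SIZE_MIN);
        return Math.max(requested, floor);
    }

    public static void main(final String[] args) {
        System.out.println(effectiveJobSize(1024, 0));
        System.out.println(effectiveJobSize(4L << 20, 1L << 20));
    }
}
```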
-
setLevel
Sets compression parameters according to a pre-defined cLevel table, from 0 to 9.
The exact compression parameters are dynamically determined, depending on both compression level and srcSize (when known). The default level is ZstdConstants.ZSTD_CLEVEL_DEFAULT.
- The value 0 means use the default, which is controlled by ZstdConstants.ZSTD_CLEVEL_DEFAULT.
- You may pass a negative compression level.
- Setting a level does not automatically set all other compression parameters to defaults. It dynamically affects only the compression parameters that have not been set manually; manually set values are kept.
- Parameters:
level - The compression level, from 0 to 9, where the default is ZstdConstants.ZSTD_CLEVEL_DEFAULT.
- Returns:
- this instance.
-
setMinMatch
Sets minimum match size for long distance matcher.
Zstd can still find matches of smaller size, by updating its search algorithm to look for this size and larger. Larger values increase compression and decompression speed, but decrease the ratio. The value must be between ZstdConstants.ZSTD_MINMATCH_MIN and ZstdConstants.ZSTD_MINMATCH_MAX. Note that currently, for all strategies < btopt, the effective minimum is 4; for all strategies > fast, the effective maximum is 6.
The value 0 means use the default minMatchLength.
- Parameters:
minMatch - minimum match size for long distance matcher.
- Returns:
- this instance.
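The strategy-dependent limits noted above can be modelled as follows. This is a sketch of the stated rule only: the class ZstdMinMatch and its method are hypothetical, the strategy codes 1 (fast) and 7 (btopt) follow the ZSTD_strategy numbering, and the sketch assumes both arguments are explicit non-zero values, since 0 means "use the default".

```java
// Hypothetical model of the documented effective minMatch limits; not library code.
final class ZstdMinMatch {

    static final int STRATEGY_FAST = 1;
    static final int STRATEGY_BTOPT = 7;

    private ZstdMinMatch() {
    }

    /**
     * Applies the documented limits: for strategies below btopt the effective
     * minimum is 4, and for strategies above fast the effective maximum is 6.
     */
    static int effectiveMinMatch(final int minMatch, final int strategy) {
        int effective = minMatch;
        if (strategy < STRATEGY_BTOPT) {
            effective = Math.max(effective, 4);
        }
        if (strategy > STRATEGY_FAST) {
            effective = Math.min(effective, 6);
        }
        return effective;
    }

    public static void main(final String[] args) {
        // Strategy 3 (greedy) is both below btopt and above fast.
        System.out.println(effectiveMinMatch(3, 3));
        System.out.println(effectiveMinMatch(7, 3));
    }
}
```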
-
setOverlapLog
Sets the overlap size, as a fraction of window size.
The overlap size is the amount of data reloaded from the previous job at the beginning of a new job. It helps preserve compression ratio, while each job is compressed in parallel. This value is enforced only when workers >= 1. Larger values increase compression ratio, but decrease speed. Possible values range from 0 to 9:
- 0 means "default": the value will be determined by the library, depending on strategy
- 1 means "no overlap"
- 9 means "full overlap", using a full window size.
Each intermediate rank increases/decreases the load size by a factor of 2:
- 9: full window
- 8: w / 2
- 7: w / 4
- 6: w / 8
- 5: w / 16
- 4: w / 32
- 3: w / 64
- 2: w / 128
- 1: no overlap
- 0: default
The default value varies between 6 and 9, depending on the strategy.
This is a multi-threading parameter and is only active if multi-threading is enabled (that is, if the underlying native library is compiled with the build macro ZSTD_MULTITHREAD).
- Parameters:
overlapLog - the overlap size, as a fraction of window size.
- Returns:
- this instance.
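The rank table above maps to a simple shift. The following sketch, using a hypothetical class ZstdOverlap, computes the overlap for a given window and rank; it assumes the window size is (1 << windowLog) bytes and uses -1 as a placeholder for "default", which only the library can resolve.

```java
// Hypothetical helper mirroring the documented overlapLog table; not library code.
final class ZstdOverlap {

    private ZstdOverlap() {
    }

    /**
     * Returns the overlap in bytes for the given window and overlap rank,
     * or -1 when overlapLog is 0 (default chosen by the library).
     */
    static long overlapBytes(final int windowLog, final int overlapLog) {
        if (windowLog < 0 || windowLog > 62) {
            throw new IllegalArgumentException("windowLog out of range: " + windowLog);
        }
        if (overlapLog < 0 || overlapLog > 9) {
            throw new IllegalArgumentException("overlapLog must be between 0 and 9");
        }
        if (overlapLog == 0) {
            return -1; // default, depends on the strategy
        }
        if (overlapLog == 1) {
            return 0; // no overlap
        }
        // Ranks 2..9 halve the window once per step below 9: 9 is w, 8 is w / 2, 2 is w / 128.
        return (1L << windowLog) >> (9 - overlapLog);
    }

    public static void main(final String[] args) {
        for (int rank = 1; rank <= 9; rank++) {
            System.out.printf("overlapLog=%d -> %d bytes%n", rank, overlapBytes(20, rank));
        }
    }
}
```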
-
setSearchLog
Sets number of search attempts, as a power of 2.
More attempts result in better and slower compression. This parameter is useless for "fast" and "dFast" strategies.
The value 0 means use the default searchLog.
- Parameters:
searchLog - number of search attempts, as a power of 2.
- Returns:
- this instance.
-
setStrategy
Sets the ZSTD_strategy from the C enum definition.
The higher the value of the selected strategy, the more complex it is, resulting in stronger and slower compression.
The value 0 means use the default strategy.
ZSTD_fast = 1
ZSTD_dfast = 2
ZSTD_greedy = 3
ZSTD_lazy = 4
ZSTD_lazy2 = 5
ZSTD_btlazy2 = 6
ZSTD_btopt = 7
ZSTD_btultra = 8
ZSTD_btultra2 = 9
- Parameters:
strategy - the ZSTD_strategy from the C enum definition.
- Returns:
- this instance.
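Because the setter takes a raw int, callers may prefer named values. The enum below is a hypothetical convenience, not something Commons Compress provides; its codes simply mirror the list above, with DEFAULT standing for the value 0.

```java
// Hypothetical enum mirroring the documented ZSTD_strategy values; not part of Commons Compress.
enum ZstdStrategy {
    DEFAULT(0),
    FAST(1),
    DFAST(2),
    GREEDY(3),
    LAZY(4),
    LAZY2(5),
    BTLAZY2(6),
    BTOPT(7),
    BTULTRA(8),
    BTULTRA2(9);

    private final int code;

    ZstdStrategy(final int code) {
        this.code = code;
    }

    /** Returns the int value to pass to a setter that expects the C enum value. */
    int code() {
        return code;
    }

    /** Looks up the constant for a raw strategy value. */
    static ZstdStrategy fromCode(final int code) {
        for (final ZstdStrategy s : values()) {
            if (s.code == code) {
                return s;
            }
        }
        throw new IllegalArgumentException("Unknown strategy: " + code);
    }
}
```

A caller could then write setStrategy(ZstdStrategy.LAZY2.code()) instead of setStrategy(5).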
-
setTargetLength
Sets a value that depends on the strategy, see ZSTD_c_targetLength.
For strategies btopt, btultra and btultra2:
- Length of match considered "good enough" to stop search.
- Larger values make compression stronger, and slower.
For strategy fast:
- Distance between match sampling.
- Larger values make compression faster, and weaker.
The value 0 means use the default targetLength.
- Parameters:
targetLength - a value that depends on the strategy, see ZSTD_c_targetLength.
- Returns:
- this instance.
-
setWindowLog
Sets maximum allowed back-reference distance, expressed as power of 2.
This will set a memory budget for streaming decompression, with larger values requiring more memory and typically compressing more. This value must be between ZstdConstants.ZSTD_WINDOWLOG_MIN and ZstdConstants.ZSTD_WINDOWLOG_MAX.
Note: Using a windowLog greater than ZstdConstants.ZSTD_WINDOWLOG_LIMIT_DEFAULT requires explicitly allowing such size at streaming decompression stage.
The value 0 means use the default windowLog.
- Parameters:
windowLog - maximum allowed back-reference distance, expressed as power of 2.
- Returns:
- this instance.
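To make the memory budget concrete, this sketch converts a non-zero windowLog into a window size and checks it against a decoder limit. The class ZstdWindow and its methods are hypothetical; the sketch assumes the distance is measured in bytes, and the limit is passed in as a parameter rather than hard-coded, since real code would read it from ZstdConstants.ZSTD_WINDOWLOG_LIMIT_DEFAULT.

```java
// Hypothetical helper for reasoning about windowLog; not part of Commons Compress.
final class ZstdWindow {

    private ZstdWindow() {
    }

    /** Returns (1 << windowLog) as a long, assuming the window is measured in bytes. */
    static long windowBytes(final int windowLog) {
        if (windowLog < 1 || windowLog > 62) {
            // 0 means "use the default" and cannot be converted here.
            throw new IllegalArgumentException("windowLog out of range: " + windowLog);
        }
        return 1L << windowLog;
    }

    /**
     * Returns true when the decoder would have to explicitly allow the window,
     * that is, when windowLog is greater than the given default limit.
     */
    static boolean exceedsDecoderLimit(final int windowLog, final int limitLog) {
        return windowLog > limitLog;
    }

    public static void main(final String[] args) {
        final int limitLog = 27; // sample value for illustration only
        for (int log = 26; log <= 28; log++) {
            System.out.printf("windowLog=%d -> %d bytes, needs explicit limit: %b%n",
                    log, windowBytes(log), exceedsDecoderLimit(log, limitLog));
        }
    }
}
```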
-
setWorkers
Sets how many threads will be spawned to compress in parallel.
When workers >= 1, compression runs in asynchronous mode: each call consumes input and flushes output if possible, but immediately gives back control to the caller, while compression is performed in parallel within worker threads. More workers improve speed, but also increase memory usage.
The value 0 means "single-threaded mode": nothing is spawned, compression is performed from the calling thread, and all invocations are blocking.
This is a multi-threading parameter and is only active if multi-threading is enabled (that is, if the underlying native library is compiled with the build macro ZSTD_MULTITHREAD).
- Parameters:
workers - How many threads will be spawned to compress in parallel.
- Returns:
- this instance.
-