Interface CountingBloomFilter
- All Superinterfaces:
BitMapExtractor
,BloomFilter
,CellExtractor
,IndexExtractor
- All Known Implementing Classes:
ArrayCountingBloomFilter
A counting Bloom filter is expected to function identically to a standard Bloom filter that is the merge of all the Bloom filters that have been added to and not later subtracted from the counting Bloom filter. The functional state of a CountingBloomFilter at the start and end of a series of merge and subsequent remove operations of the same Bloom filters, irrespective of remove order, is expected to be the same.
Removal of a filter that has not previously been merged results in an invalid state where the cells no longer represent a sum of merged Bloom filters. It is impossible to validate merge and remove exactly without explicitly storing all filters. Consequently such an operation may go undetected. The CountingBloomFilter maintains a state flag that is used as a warning that an operation was performed that resulted in invalid cells and thus an invalid state. For example this may occur if a cell for an index was set to negative following a remove operation.
Implementations should document the expected state of the filter after an operation that generates invalid cells, and any potential recovery options. An implementation may support a reversal of the operation to restore the state to that prior to the operation. In the event that invalid cells are adjusted to a valid range then it should be documented if there has been irreversible information loss.
Implementations may choose to throw an exception during an operation that generates invalid cells. Implementations should document the expected state of the filter after such an operation. For example are the cells not updated, partially updated or updated entirely before the exception is raised.
- Since:
- 4.5
- See Also:
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.commons.collections4.bloomfilter.CellExtractor
CellExtractor.CellPredicate
-
Field Summary
Fields inherited from interface org.apache.commons.collections4.bloomfilter.BloomFilter
SPARSE
-
Method Summary
Modifier and TypeMethodDescriptionboolean
add
(CellExtractor other) Adds the specified CellExtractor to this Bloom filter.copy()
Creates a new instance of the CountingBloomFilter with the same properties as the current one.int
Returns the maximum allowable value for a cell count in this Counting filter.default int
getMaxInsert
(BitMapExtractor bitMapExtractor) Determines the maximum number of times the BitMapExtractor could have been merged into this counting filter.default int
getMaxInsert
(BloomFilter bloomFilter) Determines the maximum number of times the Bloom filter could have been merged into this counting filter.int
getMaxInsert
(CellExtractor cellExtractor) Determines the maximum number of times the Cell Extractor could have been added.default int
getMaxInsert
(Hasher hasher) Determines the maximum number of times the Hasher could have been merged into this counting filter.default int
getMaxInsert
(IndexExtractor indexExtractor) Determines the maximum number of times the IndexExtractor could have been merged into this counting filter.boolean
isValid()
Returnstrue
if the internal state is valid.default boolean
merge
(BitMapExtractor bitMapExtractor) Merges the specified BitMap extractor into this Bloom filter.default boolean
merge
(BloomFilter other) Merges the specified Bloom filter into this Bloom filter.default boolean
Merges the specified Hasher into this Bloom filter.default boolean
merge
(IndexExtractor indexExtractor) Merges the specified index extractor into this Bloom filter.default boolean
remove
(BitMapExtractor bitMapExtractor) Removes the specified BitMapExtractor from this Bloom filter.default boolean
remove
(BloomFilter other) Removes the specified Bloom filter from this Bloom filter.default boolean
Removes the unique values from the specified hasher from this Bloom filter.default boolean
remove
(IndexExtractor indexExtractor) Removes the values from the specified IndexExtractor from the Bloom filter from this Bloom filter.boolean
subtract
(CellExtractor other) Adds the specified CellExtractor to this Bloom filter.default IndexExtractor
The default implementation is a no-op since the counting bloom filter returns an unique IndexExtractor by default.Methods inherited from interface org.apache.commons.collections4.bloomfilter.BitMapExtractor
asBitMapArray, processBitMapPairs, processBitMaps
Methods inherited from interface org.apache.commons.collections4.bloomfilter.BloomFilter
cardinality, characteristics, clear, contains, contains, contains, contains, estimateIntersection, estimateN, estimateUnion, getShape, isEmpty, isFull
Methods inherited from interface org.apache.commons.collections4.bloomfilter.CellExtractor
processCells, processIndices
Methods inherited from interface org.apache.commons.collections4.bloomfilter.IndexExtractor
asIndexArray
-
Method Details
-
add
Adds the specified CellExtractor to this Bloom filter.Specifically all cells for the indexes identified by the
other
will be incremented by their corresponding values in theother
.This method will return
true
if the filter is valid after the operation.- Parameters:
other
- the CellExtractor to add.- Returns:
true
if the addition was successful and the state is valid- See Also:
-
copy
Creates a new instance of the CountingBloomFilter with the same properties as the current one.- Specified by:
copy
in interfaceBloomFilter
- Returns:
- a copy of this CountingBloomFilter
-
getMaxCell
int getMaxCell()Returns the maximum allowable value for a cell count in this Counting filter.- Returns:
- the maximum allowable value for a cell count in this Counting filter.
-
getMaxInsert
Determines the maximum number of times the BitMapExtractor could have been merged into this counting filter.- Parameters:
bitMapExtractor
- the BitMapExtractor to provide the indices.- Returns:
- the maximum number of times the BitMapExtractor could have been inserted.
-
getMaxInsert
Determines the maximum number of times the Bloom filter could have been merged into this counting filter.- Parameters:
bloomFilter
- the Bloom filter the check for.- Returns:
- the maximum number of times the Bloom filter could have been inserted.
-
getMaxInsert
Determines the maximum number of times the Cell Extractor could have been added.- Parameters:
cellExtractor
- the extractor of cells.- Returns:
- the maximum number of times the CellExtractor could have been inserted.
-
getMaxInsert
Determines the maximum number of times the Hasher could have been merged into this counting filter.- Parameters:
hasher
- the Hasher to provide the indices.- Returns:
- the maximum number of times the hasher could have been inserted.
-
getMaxInsert
Determines the maximum number of times the IndexExtractor could have been merged into this counting filter.To determine how many times an indexExtractor could have been added create a CellExtractor from the indexExtractor and check that
- Parameters:
indexExtractor
- the extractor to drive the count check.- Returns:
- the maximum number of times the IndexExtractor could have been inserted.
- See Also:
-
isValid
boolean isValid()Returnstrue
if the internal state is valid.This flag is a warning that an addition or subtraction of cells from this filter resulted in an invalid cell for one or more indexes. For example this may occur if a cell for an index was set to negative following a subtraction operation, or overflows the value specified by
getMaxCell()
following an addition operation.A counting Bloom filter that has an invalid state is no longer ensured to function identically to a standard Bloom filter instance that is the merge of all the Bloom filters that have been added to and not later subtracted from this counting Bloom filter.
Note: The change to an invalid state may or may not be reversible. Implementations are expected to document their policy on recovery from an addition or removal operation that generated an invalid state.
- Returns:
true
if the state is valid
-
merge
Merges the specified BitMap extractor into this Bloom filter.Specifically: all cells for the indexes identified by the
bitMapExtractor
will be incremented by 1.This method will return
true
if the filter is valid after the operation.- Specified by:
merge
in interfaceBloomFilter
- Parameters:
bitMapExtractor
- the BitMapExtractor- Returns:
true
if the removal was successful and the state is valid- See Also:
-
merge
Merges the specified Bloom filter into this Bloom filter.Specifically: all cells for the indexes identified by the
other
filter will be incremented by 1.Note: If the other filter is a counting Bloom filter the other filter's cells are ignored and it is treated as an IndexExtractor.
This method will return
true
if the filter is valid after the operation.- Specified by:
merge
in interfaceBloomFilter
- Parameters:
other
- the other Bloom filter- Returns:
true
if the removal was successful and the state is valid- See Also:
-
merge
Merges the specified Hasher into this Bloom filter.Specifically: all cells for the unique indexes identified by the
hasher
will be incremented by 1.This method will return
true
if the filter is valid after the operation.- Specified by:
merge
in interfaceBloomFilter
- Parameters:
hasher
- the hasher- Returns:
true
if the removal was successful and the state is valid- See Also:
-
merge
Merges the specified index extractor into this Bloom filter.Specifically: all unique cells for the indices identified by the
indexExtractor
will be incremented by 1.This method will return
true
if the filter is valid after the operation.Notes:
- If indices that are returned multiple times should be incremented multiple times convert the IndexExtractor to a CellExtractor and add that.
- Implementations should throw
IllegalArgumentException
and no other exception on bad input.
- Specified by:
merge
in interfaceBloomFilter
- Parameters:
indexExtractor
- the IndexExtractor- Returns:
true
if the removal was successful and the state is valid- See Also:
-
remove
Removes the specified BitMapExtractor from this Bloom filter.Specifically all cells for the indices produced by the
bitMapExtractor
will be decremented by 1.This method will return
true
if the filter is valid after the operation.- Parameters:
bitMapExtractor
- the BitMapExtractor to provide the indexes- Returns:
true
if the removal was successful and the state is valid- See Also:
-
remove
Removes the specified Bloom filter from this Bloom filter.Specifically: all cells for the indexes identified by the
other
filter will be decremented by 1.Note: If the other filter is a counting Bloom filter the other filter's cells are ignored and it is treated as an IndexExtractor.
This method will return
true
if the filter is valid after the operation.- Parameters:
other
- the other Bloom filter- Returns:
true
if the removal was successful and the state is valid- See Also:
-
remove
Removes the unique values from the specified hasher from this Bloom filter.Specifically all cells for the unique indices produced by the
hasher
will be decremented by 1.This method will return
true
if the filter is valid after the operation.- Parameters:
hasher
- the hasher to provide the indexes- Returns:
true
if the removal was successful and the state is valid- See Also:
-
remove
Removes the values from the specified IndexExtractor from the Bloom filter from this Bloom filter.Specifically all cells for the unique indices produced by the
hasher
will be decremented by 1.This method will return
true
if the filter is valid after the operation.Note: If indices that are returned multiple times should be decremented multiple times convert the IndexExtractor to a CellExtractor and subtract that.
- Parameters:
indexExtractor
- the IndexExtractor to provide the indexes- Returns:
true
if the removal was successful and the state is valid- See Also:
-
subtract
Adds the specified CellExtractor to this Bloom filter.Specifically all cells for the indexes identified by the
other
will be decremented by their corresponding values in theother
.This method will return true if the filter is valid after the operation.
- Parameters:
other
- the CellExtractor to subtract.- Returns:
true
if the subtraction was successful and the state is valid- See Also:
-
uniqueIndices
The default implementation is a no-op since the counting bloom filter returns an unique IndexExtractor by default.- Specified by:
uniqueIndices
in interfaceBloomFilter
- Specified by:
uniqueIndices
in interfaceCellExtractor
- Specified by:
uniqueIndices
in interfaceIndexExtractor
- Returns:
- this counting Bloom filter.
-