Class FastLoadedDiceRollerDiscreteSampler
- java.lang.Object
-
- org.apache.commons.rng.sampling.distribution.FastLoadedDiceRollerDiscreteSampler
-
- All Implemented Interfaces:
DiscreteSampler
,SharedStateDiscreteSampler
,SharedStateSampler<SharedStateDiscreteSampler>
public abstract class FastLoadedDiceRollerDiscreteSampler extends Object implements SharedStateDiscreteSampler
Distribution sampler that uses the Fast Loaded Dice Roller (FLDR). It can be used to sample fromn
values each with an associated relative weight. If all unique items are assigned the same weight it is more efficient to use theDiscreteUniformSampler
.Given a list
L
ofn
positive numbers, whereL[i]
represents the relative weight of thei
th side, FLDR returns integeri
with relative probabilityL[i]
.FLDR produces exact samples from the specified probability distribution.
- For integer weights, the probability of returning
i
is precisely equal to the rational numberL[i] / m
, wherem
is the sum ofL
. - For floating-points weights, each weight
L[i]
is converted to the corresponding rational numberp[i] / q[i]
wherep[i]
is a positive integer andq[i]
is a power of 2. The rational weights are then normalized (exactly) to sum to unity.
Note that if exact samples are not required then an alternative sampler that ignores very small relative weights may have improved sampling performance.
This implementation is based on the algorithm in:
Feras A. Saad, Cameron E. Freer, Martin C. Rinard, and Vikash K. Mansinghka. The Fast Loaded Dice Roller: A Near-Optimal Exact Sampler for Discrete Probability Distributions. In AISTATS 2020: Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, Proceedings of Machine Learning Research 108, Palermo, Sicily, Italy, 2020.
Sampling uses
UniformRandomProvider.nextInt()
as the source of random bits.
-
-
Method Summary
All Methods Static Methods Instance Methods Abstract Methods Concrete Methods Modifier and Type Method Description static FastLoadedDiceRollerDiscreteSampler
of(UniformRandomProvider rng, double[] weights)
Creates a sampler.static FastLoadedDiceRollerDiscreteSampler
of(UniformRandomProvider rng, double[] weights, int alpha)
Creates a sampler.static FastLoadedDiceRollerDiscreteSampler
of(UniformRandomProvider rng, long[] frequencies)
Creates a sampler.abstract FastLoadedDiceRollerDiscreteSampler
withUniformRandomProvider(UniformRandomProvider rng)
Create a new instance of the sampler with the same underlying state using the given uniform random provider as the source of randomness.-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.commons.rng.sampling.distribution.DiscreteSampler
sample, samples, samples
-
-
-
-
Method Detail
-
withUniformRandomProvider
public abstract FastLoadedDiceRollerDiscreteSampler withUniformRandomProvider(UniformRandomProvider rng)
Create a new instance of the sampler with the same underlying state using the given uniform random provider as the source of randomness.- Specified by:
withUniformRandomProvider
in interfaceSharedStateSampler<SharedStateDiscreteSampler>
- Parameters:
rng
- Generator of uniformly distributed random numbers.- Returns:
- the sampler
-
of
public static FastLoadedDiceRollerDiscreteSampler of(UniformRandomProvider rng, long[] frequencies)
Creates a sampler.Note: The discrete distribution generating (DDG) tree requires
(n + 1) * k
entries wheren
is the number of categories,k == ceil(log2(m))
andm
is the sum of the observed frequencies. An exception is raised if this cannot be allocated as a single array.For reference the sum is limited to
Long.MAX_VALUE
and the valuek
to 63. The number of categories is limited to approximately((2^31 - 1) / k) = 34,087,042
when the sum of frequencies is large enough to create k=63.- Parameters:
rng
- Generator of uniformly distributed random numbers.frequencies
- Observed frequencies of the discrete distribution.- Returns:
- the sampler
- Throws:
IllegalArgumentException
- iffrequencies
is null or empty, a frequency is negative, the sum of all frequencies is either zero or aboveLong.MAX_VALUE
, or the size of the discrete distribution generating tree is too large.
-
of
public static FastLoadedDiceRollerDiscreteSampler of(UniformRandomProvider rng, double[] weights)
Creates a sampler.Weights are converted to rational numbers
p / q
whereq
is a power of 2. The numeratorsp
are scaled to use a common denominator before summing.All weights are used to create the sampler. Weights with a small magnitude relative to the largest weight can be excluded using the constructor method with the relative magnitude parameter
alpha
(seeof(UniformRandomProvider, double[], int)
).- Parameters:
rng
- Generator of uniformly distributed random numbers.weights
- Weights of the discrete distribution.- Returns:
- the sampler
- Throws:
IllegalArgumentException
- ifweights
is null or empty, a weight is negative, infinite orNaN
, the sum of all weights is zero, or the size of the discrete distribution generating tree is too large.- See Also:
of(UniformRandomProvider, double[], int)
-
of
public static FastLoadedDiceRollerDiscreteSampler of(UniformRandomProvider rng, double[] weights, int alpha)
Creates a sampler.Weights are converted to rational numbers
p / q
whereq
is a power of 2. The numeratorsp
are scaled to use a common denominator before summing.Note: The discrete distribution generating (DDG) tree requires
(n + 1) * k
entries wheren
is the number of categories,k == ceil(log2(m))
andm
is the sum of the weight numeratorsq
. An exception is raised if this cannot be allocated as a single array.For reference the value
k
is equal to or greater than the ratio of the largest to the smallest weight expressed as a power of 2. ForDouble.MAX_VALUE / Double.MIN_VALUE
this is ~2098. The valuek
increases with the sum of the weight numerators. A number of weights in excess of 1,000,000 with values equal toDouble.MAX_VALUE
would be required to raise an exception when the minimum weight isDouble.MIN_VALUE
.Weights with a small magnitude relative to the largest weight can be excluded using the relative magnitude parameter
alpha
. This will set any weight to zero if the magnitude is approximately 2alpha smaller than the largest weight. This comparison is made using only the exponent of the input weights. Thealpha
parameter is ignored if not above zero. Note that a smallalpha
parameter will exclude more weights than a largealpha
parameter.The alpha parameter can be used to exclude categories that have a very low probability of occurrence and will improve the construction performance of the sampler. The effect on sampling performance depends on the relative weights of the excluded categories; typically a high
alpha
is used to exclude categories that would be visited with a very low probability and the sampling performance is unchanged.Implementation Note
This method creates a sampler with exact samples from the specified probability distribution. It is recommended to use this method:
- if the weights are computed, for example from a probability mass function; or
- if the weights sum to an infinite value.
If the weights are computed from empirical observations then it is recommended to use the factory method
accepting frequencies
. This requires the total number of observations to be representable as a long integer.Note that if all weights are scaled by a power of 2 to be integers, and each integer can be represented as a positive 64-bit long value, then the sampler created using this method will match the output from a sampler created with the scaled weights converted to long values for the factory method
accepting frequencies
. This assumes the sum of the integer values does not overflow.It should be noted that the conversion of weights to rational numbers has a performance overhead during construction (sampling performance is not affected). This may be avoided by first converting them to integer values that can be summed without overflow. For example by scaling values by
2^62 / sum
and converting to long by casting or rounding.This approach may increase the efficiency of construction. The resulting sampler may no longer produce exact samples from the distribution. In particular any weights with a converted frequency of zero cannot be sampled.
- Parameters:
rng
- Generator of uniformly distributed random numbers.weights
- Weights of the discrete distribution.alpha
- Alpha parameter.- Returns:
- the sampler
- Throws:
IllegalArgumentException
- ifweights
is null or empty, a weight is negative, infinite orNaN
, the sum of all weights is zero, or the size of the discrete distribution generating tree is too large.- See Also:
of(UniformRandomProvider, long[])
-
-