Class ZipfDistribution
- java.lang.Object
-
- org.apache.commons.statistics.distribution.ZipfDistribution
-
- All Implemented Interfaces:
DiscreteDistribution
public final class ZipfDistribution extends Object
Implementation of the Zipf distribution.
The probability mass function of \( X \) is:
\[ f(k; N, s) = \frac{1/k^s}{H_{N,s}} \]
for \( N \in \{1, 2, 3, \dots\} \) the number of elements, \( s \gt 0 \) the exponent characterizing the distribution, \( k \in \{1, 2, \dots, N\} \) the element rank, and \( H_{N,s} = \sum_{i=1}^{N} 1/i^s \) the normalizing constant, which is the generalized harmonic number of order \( N \) of \( s \).
- See Also:
- Zipf distribution (Wikipedia)
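The probability mass function above can be sketched in plain Java by direct summation. This is a minimal illustration of the formula, not the library's implementation; the names ZipfPmfSketch, generalizedHarmonic and pmf are hypothetical and do not exist in the library.

```java
/** Illustrative sketch only; names are hypothetical, not library API. */
final class ZipfPmfSketch {
    private ZipfPmfSketch() {}

    /** Generalized harmonic number H_{n,m} = sum_{k=1}^{n} 1 / k^m. */
    static double generalizedHarmonic(int n, double m) {
        double sum = 0;
        // Descending k: for m > 0 this adds the smallest terms first.
        for (int k = n; k >= 1; k--) {
            sum += Math.pow(k, -m);
        }
        return sum;
    }

    /** f(k; N, s) = (1 / k^s) / H_{N,s}; zero outside the support [1, N]. */
    static double pmf(int k, int n, double s) {
        if (k < 1 || k > n) {
            return 0;
        }
        return Math.pow(k, -s) / generalizedHarmonic(n, s);
    }

    public static void main(String[] args) {
        // N = 3, s = 1: H = 1 + 1/2 + 1/3 = 11/6, so f(1) is mathematically 6/11.
        System.out.println(pmf(1, 3, 1.0));
    }
}
```

Recomputing the harmonic number on every call keeps the sketch short; a real implementation would more plausibly cache the normalizing constant at construction time.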
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface org.apache.commons.statistics.distribution.DiscreteDistribution
DiscreteDistribution.Sampler
-
-
Method Summary
Modifier and Type, Method, Description
- DiscreteDistribution.Sampler createSampler(org.apache.commons.rng.UniformRandomProvider rng): Creates a sampler.
- double cumulativeProbability(int x): For a random variable X whose values are distributed according to this distribution, this method returns P(X <= x).
- double getExponent(): Gets the exponent parameter of this distribution.
- double getMean(): Gets the mean of this distribution.
- int getNumberOfElements(): Gets the number of elements parameter of this distribution.
- int getSupportLowerBound(): Gets the lower bound of the support.
- int getSupportUpperBound(): Gets the upper bound of the support.
- double getVariance(): Gets the variance of this distribution.
- int inverseCumulativeProbability(double p): Computes the quantile function of this distribution.
- int inverseSurvivalProbability(double p): Computes the inverse survival probability function of this distribution.
- double logProbability(int x): For a random variable X whose values are distributed according to this distribution, this method returns log(P(X = x)), where log is the natural logarithm.
- static ZipfDistribution of(int numberOfElements, double exponent): Creates a Zipf distribution.
- double probability(int x): For a random variable X whose values are distributed according to this distribution, this method returns P(X = x).
- double probability(int x0, int x1): For a random variable X whose values are distributed according to this distribution, this method returns P(x0 < X <= x1).
- double survivalProbability(int x): For a random variable X whose values are distributed according to this distribution, this method returns P(X > x).
-
-
-
Method Detail
-
of
public static ZipfDistribution of(int numberOfElements, double exponent)
Creates a Zipf distribution.
- Parameters:
numberOfElements - Number of elements.
exponent - Exponent.
- Returns:
- the distribution
- Throws:
IllegalArgumentException - if numberOfElements <= 0 or exponent <= 0.
-
getNumberOfElements
public int getNumberOfElements()
Gets the number of elements parameter of this distribution.
- Returns:
- the number of elements.
-
getExponent
public double getExponent()
Gets the exponent parameter of this distribution.
- Returns:
- the exponent.
-
probability
public double probability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns P(X = x). In other words, this method represents the probability mass function (PMF) for the distribution.
- Parameters:
x - Point at which the PMF is evaluated.
- Returns:
- the value of the probability mass function at x.
-
logProbability
public double logProbability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns log(P(X = x)), where log is the natural logarithm.
- Parameters:
x - Point at which the PMF is evaluated.
- Returns:
- the logarithm of the value of the probability mass function at x.
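One reason to expose the logarithm separately is that it stays finite when the probability itself underflows to zero. The sketch below evaluates log f(k; N, s) = -s log(k) - log(H_{N,s}) directly from the class-level formula. It is an assumption about how such a method could be written, not the library's code, and the names ZipfLogPmfSketch, generalizedHarmonic and logPmf are hypothetical.

```java
/** Illustrative sketch only; names are hypothetical, not library API. */
final class ZipfLogPmfSketch {
    private ZipfLogPmfSketch() {}

    /** Generalized harmonic number H_{n,m} = sum_{k=1}^{n} 1 / k^m. */
    static double generalizedHarmonic(int n, double m) {
        double sum = 0;
        for (int k = n; k >= 1; k--) {
            sum += Math.pow(k, -m);
        }
        return sum;
    }

    /** log f(k; N, s) = -s * log(k) - log(H_{N,s}); -infinity outside [1, N]. */
    static double logPmf(int k, int n, double s) {
        if (k < 1 || k > n) {
            return Double.NEGATIVE_INFINITY;
        }
        return -s * Math.log(k) - Math.log(generalizedHarmonic(n, s));
    }

    public static void main(String[] args) {
        // With N = 1000 and s = 400, 1000^-400 underflows to 0 as a double,
        // but its logarithm is still representable.
        System.out.println(logPmf(1000, 1000, 400.0));
    }
}
```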
-
cumulativeProbability
public double cumulativeProbability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns P(X <= x). In other words, this method represents the (cumulative) distribution function (CDF) for this distribution.
- Parameters:
x - Point at which the CDF is evaluated.
- Returns:
- the probability that a random variable with this distribution takes a value less than or equal to x.
-
survivalProbability
public double survivalProbability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns P(X > x). In other words, this method represents the complementary cumulative distribution function.
By default, this is defined as 1 - cumulativeProbability(x), but the specific implementation may be more accurate.
- Parameters:
x - Point at which the survival function is evaluated.
- Returns:
- the probability that a random variable with this distribution takes a value greater than x.
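For the Zipf distribution the CDF is a ratio of generalized harmonic numbers, P(X <= x) = H_{x,s} / H_{N,s}, and the default survival function is its complement. The following sketch shows both under that reading of the formulas. It is not the library's implementation (which may compute the survival function more accurately), and the names ZipfCdfSketch, generalizedHarmonic, cdf and survival are hypothetical.

```java
/** Illustrative sketch only; names are hypothetical, not library API. */
final class ZipfCdfSketch {
    private ZipfCdfSketch() {}

    /** Generalized harmonic number H_{n,m} = sum_{k=1}^{n} 1 / k^m. */
    static double generalizedHarmonic(int n, double m) {
        double sum = 0;
        for (int k = n; k >= 1; k--) {
            sum += Math.pow(k, -m);
        }
        return sum;
    }

    /** P(X <= x) = H_{x,s} / H_{N,s}, clamped to 0 below and 1 above the support. */
    static double cdf(int x, int n, double s) {
        if (x < 1) {
            return 0;
        }
        if (x >= n) {
            return 1;
        }
        return generalizedHarmonic(x, s) / generalizedHarmonic(n, s);
    }

    /** P(X > x) using the default definition 1 - P(X <= x). */
    static double survival(int x, int n, double s) {
        return 1 - cdf(x, n, s);
    }

    public static void main(String[] args) {
        // N = 3, s = 1: mathematically 9/11 and 2/11.
        System.out.println(cdf(2, 3, 1.0));
        System.out.println(survival(2, 3, 1.0));
    }
}
```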
-
getMean
public double getMean()
Gets the mean of this distribution.
For number of elements \( N \) and exponent \( s \), the mean is:
\[ \frac{H_{N,s-1}}{H_{N,s}} \]
where \( H_{N,k} \) is the generalized harmonic number of order \( N \) of \( k \).
- Returns:
- the mean.
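The mean formula can be checked numerically with a few lines of plain Java. This is a sketch of the formula only, assuming direct summation for the harmonic numbers; the names ZipfMeanSketch, generalizedHarmonic and mean are hypothetical and not part of the library.

```java
/** Illustrative sketch only; names are hypothetical, not library API. */
final class ZipfMeanSketch {
    private ZipfMeanSketch() {}

    /** Generalized harmonic number H_{n,m} = sum_{k=1}^{n} 1 / k^m (m may be <= 0). */
    static double generalizedHarmonic(int n, double m) {
        double sum = 0;
        for (int k = n; k >= 1; k--) {
            sum += Math.pow(k, -m);
        }
        return sum;
    }

    /** Mean = H_{N,s-1} / H_{N,s}. */
    static double mean(int n, double s) {
        return generalizedHarmonic(n, s - 1) / generalizedHarmonic(n, s);
    }

    public static void main(String[] args) {
        // N = 3, s = 1: H_{3,0} = 3 and H_{3,1} = 11/6, so the mean is mathematically 18/11.
        System.out.println(mean(3, 1.0));
    }
}
```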
-
getVariance
public double getVariance()
Gets the variance of this distribution.
For number of elements \( N \) and exponent \( s \), the variance is:
\[ \frac{H_{N,s-2}}{H_{N,s}} - \frac{H_{N,s-1}^2}{H_{N,s}^2} \]
where \( H_{N,k} \) is the generalized harmonic number of order \( N \) of \( k \).
- Returns:
- the variance.
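The variance formula is E[X^2] - (E[X])^2 written in terms of harmonic numbers. A minimal sketch, assuming direct summation and using the hypothetical names ZipfVarianceSketch, generalizedHarmonic and variance (none of which are library API):

```java
/** Illustrative sketch only; names are hypothetical, not library API. */
final class ZipfVarianceSketch {
    private ZipfVarianceSketch() {}

    /** Generalized harmonic number H_{n,m} = sum_{k=1}^{n} 1 / k^m (m may be <= 0). */
    static double generalizedHarmonic(int n, double m) {
        double sum = 0;
        for (int k = n; k >= 1; k--) {
            sum += Math.pow(k, -m);
        }
        return sum;
    }

    /** Variance = H_{N,s-2} / H_{N,s} - (H_{N,s-1} / H_{N,s})^2. */
    static double variance(int n, double s) {
        double hs = generalizedHarmonic(n, s);
        double secondMoment = generalizedHarmonic(n, s - 2) / hs;
        double mean = generalizedHarmonic(n, s - 1) / hs;
        return secondMoment - mean * mean;
    }

    public static void main(String[] args) {
        // N = 3, s = 1: 36/11 - (18/11)^2, which is mathematically 72/121.
        System.out.println(variance(3, 1.0));
    }
}
```

Subtracting two nearly equal moments can lose precision when the variance is small relative to the mean; the sketch does not attempt to guard against that.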
-
getSupportLowerBound
public int getSupportLowerBound()
Gets the lower bound of the support. This method must return the same value as inverseCumulativeProbability(0), i.e. \( \inf \{ x \in \mathbb Z : P(X \le x) \gt 0 \} \). By convention, Integer.MIN_VALUE should be substituted for negative infinity.
The lower bound of the support is always 1.
- Returns:
- 1.
-
getSupportUpperBound
public int getSupportUpperBound()
Gets the upper bound of the support. This method must return the same value as inverseCumulativeProbability(1), i.e. \( \inf \{ x \in \mathbb Z : P(X \le x) = 1 \} \). By convention, Integer.MAX_VALUE should be substituted for positive infinity.
The upper bound of the support is the number of elements.
- Returns:
- number of elements.
-
createSampler
public DiscreteDistribution.Sampler createSampler(org.apache.commons.rng.UniformRandomProvider rng)
Creates a sampler.
- Specified by:
createSampler in interface DiscreteDistribution
- Parameters:
rng - Generator of uniformly distributed numbers.
- Returns:
- a sampler that produces random numbers according to this distribution.
-
probability
public double probability(int x0, int x1)
For a random variable X whose values are distributed according to this distribution, this method returns P(x0 < X <= x1). The default implementation uses the identity P(x0 < X <= x1) = P(X <= x1) - P(X <= x0).
Special cases:
- returns 0.0 if x0 == x1;
- returns probability(x1) if x0 + 1 == x1;
- Specified by:
probability in interface DiscreteDistribution
- Parameters:
x0 - Lower bound (exclusive).
x1 - Upper bound (inclusive).
- Returns:
- the probability that a random variable with this distribution takes a value between x0 and x1, excluding the lower and including the upper endpoint.
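The identity and its two special cases can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the names ZipfIntervalSketch, generalizedHarmonic, pmf, cdf and interval are hypothetical, and rejecting x0 > x1 with an IllegalArgumentException is an assumption of this sketch, since the text above does not list that case.

```java
/** Illustrative sketch only; names are hypothetical, not library API. */
final class ZipfIntervalSketch {
    private ZipfIntervalSketch() {}

    static double generalizedHarmonic(int n, double m) {
        double sum = 0;
        for (int k = n; k >= 1; k--) {
            sum += Math.pow(k, -m);
        }
        return sum;
    }

    static double pmf(int k, int n, double s) {
        return (k < 1 || k > n) ? 0 : Math.pow(k, -s) / generalizedHarmonic(n, s);
    }

    static double cdf(int x, int n, double s) {
        if (x < 1) {
            return 0;
        }
        return x >= n ? 1 : generalizedHarmonic(x, s) / generalizedHarmonic(n, s);
    }

    /** P(x0 < X <= x1) with the two special cases handled first. */
    static double interval(int x0, int x1, int n, double s) {
        if (x0 > x1) {
            // Assumption of this sketch: an inverted range is an error.
            throw new IllegalArgumentException("x0 > x1: " + x0 + " > " + x1);
        }
        if (x0 == x1) {
            return 0.0;                // empty interval
        }
        if ((long) x0 + 1 == x1) {
            return pmf(x1, n, s);      // single point; long arithmetic avoids overflow
        }
        return cdf(x1, n, s) - cdf(x0, n, s);
    }

    public static void main(String[] args) {
        // N = 3, s = 1: P(1 < X <= 2) = f(2), which is mathematically 3/11.
        System.out.println(interval(1, 2, 3, 1.0));
    }
}
```

Routing the single-point case through the PMF avoids the cancellation that the CDF difference would incur for adjacent arguments.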
-
inverseCumulativeProbability
public int inverseCumulativeProbability(double p)
Computes the quantile function of this distribution. For a random variable X distributed according to this distribution, the returned value is:
\[ x = \begin{cases} \inf \{ x \in \mathbb Z : P(X \le x) \ge p\} & \text{for } 0 \lt p \le 1 \\ \inf \{ x \in \mathbb Z : P(X \le x) \gt 0 \} & \text{for } p = 0 \end{cases} \]
If the result exceeds the range of the data type int, then Integer.MIN_VALUE or Integer.MAX_VALUE is returned. In this case the result of cumulativeProbability(x) called using the returned p-quantile may not compute the original p.
The default implementation returns:
- DiscreteDistribution.getSupportLowerBound() for p = 0,
- DiscreteDistribution.getSupportUpperBound() for p = 1, or
- the result of a binary search between the lower and upper bound using cumulativeProbability(x). The bounds may be bracketed for efficiency.
- Specified by:
inverseCumulativeProbability in interface DiscreteDistribution
- Parameters:
p - Cumulative probability.
- Returns:
- the smallest p-quantile of this distribution (largest 0-quantile for p = 0).
- Throws:
IllegalArgumentException - if p < 0 or p > 1
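The three default cases above can be sketched as a plain binary search over the support [1, N]. This is a simplified illustration, not the library's implementation: it omits the bracketing optimization, and the names ZipfQuantileSketch, generalizedHarmonic, cdf and quantile are hypothetical.

```java
/** Illustrative sketch only; names are hypothetical, not library API. */
final class ZipfQuantileSketch {
    private ZipfQuantileSketch() {}

    static double generalizedHarmonic(int n, double m) {
        double sum = 0;
        for (int k = n; k >= 1; k--) {
            sum += Math.pow(k, -m);
        }
        return sum;
    }

    static double cdf(int x, int n, double s) {
        if (x < 1) {
            return 0;
        }
        return x >= n ? 1 : generalizedHarmonic(x, s) / generalizedHarmonic(n, s);
    }

    /** Smallest x in [1, N] with P(X <= x) >= p. */
    static int quantile(double p, int n, double s) {
        if (!(p >= 0 && p <= 1)) {          // also rejects NaN
            throw new IllegalArgumentException("p out of [0, 1]: " + p);
        }
        if (p == 0) {
            return 1;                        // support lower bound
        }
        if (p == 1) {
            return n;                        // support upper bound
        }
        int lo = 1;
        int hi = n;
        // Invariant: cdf(hi) >= p, and cdf(x) < p for every x < lo.
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;       // overflow-safe midpoint
            if (cdf(mid, n, s) >= p) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        // N = 3, s = 1: the CDF takes the values 6/11, 9/11 and 1 at x = 1, 2, 3.
        System.out.println(quantile(0.6, 3, 1.0));
    }
}
```

Each CDF evaluation here costs O(N); caching cumulative sums or narrowing the initial bounds would reduce that, which is presumably what the bracketing remark refers to.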
-
inverseSurvivalProbability
public int inverseSurvivalProbability(double p)
Computes the inverse survival probability function of this distribution. For a random variable X distributed according to this distribution, the returned value is:
\[ x = \begin{cases} \inf \{ x \in \mathbb Z : P(X \gt x) \le p\} & \text{for } 0 \le p \lt 1 \\ \inf \{ x \in \mathbb Z : P(X \gt x) \lt 1 \} & \text{for } p = 1 \end{cases} \]
If the result exceeds the range of the data type int, then Integer.MIN_VALUE or Integer.MAX_VALUE is returned. In this case the result of survivalProbability(x) called using the returned (1-p)-quantile may not compute the original p.
By default, this is defined as inverseCumulativeProbability(1 - p), but the specific implementation may be more accurate.
The default implementation returns:
- DiscreteDistribution.getSupportLowerBound() for p = 1,
- DiscreteDistribution.getSupportUpperBound() for p = 0, or
- the result of a binary search between the lower and upper bound using survivalProbability(x). The bounds may be bracketed for efficiency.
- Specified by:
inverseSurvivalProbability in interface DiscreteDistribution
- Parameters:
p - Survival probability.
- Returns:
- the smallest (1-p)-quantile of this distribution (largest 0-quantile for p = 1).
- Throws:
IllegalArgumentException - if p < 0 or p > 1
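The inverse survival function mirrors the quantile search with the comparison reversed, because the survival function is non-increasing in x. The sketch below is a simplified illustration, not the library's implementation: it omits bracketing, defines the survival function as 1 minus the CDF, and uses the hypothetical names ZipfInverseSurvivalSketch, generalizedHarmonic, survival and inverseSurvival.

```java
/** Illustrative sketch only; names are hypothetical, not library API. */
final class ZipfInverseSurvivalSketch {
    private ZipfInverseSurvivalSketch() {}

    static double generalizedHarmonic(int n, double m) {
        double sum = 0;
        for (int k = n; k >= 1; k--) {
            sum += Math.pow(k, -m);
        }
        return sum;
    }

    /** P(X > x) = 1 - H_{x,s} / H_{N,s}, clamped outside the support. */
    static double survival(int x, int n, double s) {
        if (x < 1) {
            return 1;
        }
        return x >= n ? 0 : 1 - generalizedHarmonic(x, s) / generalizedHarmonic(n, s);
    }

    /** Smallest x in [1, N] with P(X > x) <= p. */
    static int inverseSurvival(double p, int n, double s) {
        if (!(p >= 0 && p <= 1)) {          // also rejects NaN
            throw new IllegalArgumentException("p out of [0, 1]: " + p);
        }
        if (p == 1) {
            return 1;                        // support lower bound
        }
        if (p == 0) {
            return n;                        // support upper bound
        }
        int lo = 1;
        int hi = n;
        // Invariant: survival(hi) <= p, and survival(x) > p for every x < lo.
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (survival(mid, n, s) <= p) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        // N = 3, s = 1: the survival function takes the values 5/11, 2/11 and 0 at x = 1, 2, 3.
        System.out.println(inverseSurvival(0.3, 3, 1.0));
    }
}
```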
-
-