org.apache.commons.statistics.distribution

Class HypergeometricDistribution

• java.lang.Object
• org.apache.commons.statistics.distribution.HypergeometricDistribution
• All Implemented Interfaces:
DiscreteDistribution

public final class HypergeometricDistribution
extends Object
Implementation of the hypergeometric distribution.

The probability mass function of $$X$$ is:

$f(k; N, K, n) = \frac{\binom{K}{k} \binom{N - K}{n-k}}{\binom{N}{n}}$

for $$N \in \{0, 1, 2, \dots\}$$ the population size, $$K \in \{0, 1, \dots, N\}$$ the number of success states, $$n \in \{0, 1, \dots, N\}$$ the number of samples, $$k \in \{\max(0, n+K-N), \dots, \min(n, K)\}$$ the number of successes, and

$\binom{a}{b} = \frac{a!}{b! \, (a-b)!}$

is the binomial coefficient.

See Also:
Hypergeometric distribution (Wikipedia), Hypergeometric distribution (MathWorld)
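As an illustration of the probability mass function defined above, the following standalone sketch evaluates $$f(k; N, K, n)$$ directly from exact binomial coefficients. It uses only the JDK and is not the library's implementation; the class name HypergeometricPmfSketch and its methods are hypothetical names chosen for this example.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;

/** Hypothetical helper, not part of the library: evaluates the PMF from its definition. */
final class HypergeometricPmfSketch {

    /** Exact binomial coefficient C(a, b); zero when b is outside [0, a]. */
    static BigInteger binomial(int a, int b) {
        if (b < 0 || b > a) {
            return BigInteger.ZERO;
        }
        BigInteger result = BigInteger.ONE;
        for (int i = 1; i <= b; i++) {
            // After step i the running value equals C(a - b + i, i), so the division is exact.
            result = result.multiply(BigInteger.valueOf(a - b + i))
                           .divide(BigInteger.valueOf(i));
        }
        return result;
    }

    /** f(k; N, K, n) = C(K, k) C(N - K, n - k) / C(N, n). */
    static double pmf(int k, int populationSize, int numberOfSuccesses, int sampleSize) {
        BigInteger numerator = binomial(numberOfSuccesses, k)
            .multiply(binomial(populationSize - numberOfSuccesses, sampleSize - k));
        BigInteger denominator = binomial(populationSize, sampleSize);
        return new BigDecimal(numerator)
            .divide(new BigDecimal(denominator), MathContext.DECIMAL64)
            .doubleValue();
    }
}
```

For $$N = 10$$, $$K = 4$$, $$n = 3$$ and $$k = 1$$ the definition gives $$4 \cdot 15 / 120 = 0.5$$. Values of $$k$$ outside the support yield zero because one of the binomial coefficients in the numerator vanishes.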

• Nested classes/interfaces inherited from interface org.apache.commons.statistics.distribution.DiscreteDistribution

DiscreteDistribution.Sampler
• Method Summary

All Methods
Modifier and Type Method and Description
DiscreteDistribution.Sampler createSampler(UniformRandomProvider rng)
Creates a sampler.
double cumulativeProbability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns P(X <= x).
double getMean()
Gets the mean of this distribution.
int getNumberOfSuccesses()
Gets the number of successes parameter of this distribution.
int getPopulationSize()
Gets the population size parameter of this distribution.
int getSampleSize()
Gets the sample size parameter of this distribution.
int getSupportLowerBound()
Gets the lower bound of the support.
int getSupportUpperBound()
Gets the upper bound of the support.
double getVariance()
Gets the variance of this distribution.
int inverseCumulativeProbability(double p)
Computes the quantile function of this distribution.
int inverseSurvivalProbability(double p)
Computes the inverse survival probability function of this distribution.
double logProbability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns log(P(X = x)), where log is the natural logarithm.
static HypergeometricDistribution of(int populationSize, int numberOfSuccesses, int sampleSize)
Creates a hypergeometric distribution.
double probability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns P(X = x).
double probability(int x0, int x1)
For a random variable X whose values are distributed according to this distribution, this method returns P(x0 < X <= x1).
double survivalProbability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns P(X > x).
• Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
• Method Detail

• of

public static HypergeometricDistribution of(int populationSize,
int numberOfSuccesses,
int sampleSize)
Creates a hypergeometric distribution.
Parameters:
populationSize - Population size.
numberOfSuccesses - Number of successes in the population.
sampleSize - Sample size.
Returns:
the distribution
Throws:
IllegalArgumentException - if populationSize <= 0, numberOfSuccesses < 0, numberOfSuccesses > populationSize, or sampleSize > populationSize.
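The documented argument checks can be read as a small set of guard clauses. The sketch below encodes only the conditions listed in the Throws clause; it is a standalone illustration, and the class HypergeometricParametersSketch with its methods validate and isValid is hypothetical, not library API.

```java
/** Hypothetical helper, not part of the library: mirrors the documented argument checks. */
final class HypergeometricParametersSketch {

    /** Throws IllegalArgumentException for the conditions listed in the Throws clause. */
    static void validate(int populationSize, int numberOfSuccesses, int sampleSize) {
        if (populationSize <= 0) {
            throw new IllegalArgumentException("populationSize <= 0: " + populationSize);
        }
        if (numberOfSuccesses < 0) {
            throw new IllegalArgumentException("numberOfSuccesses < 0: " + numberOfSuccesses);
        }
        if (numberOfSuccesses > populationSize) {
            throw new IllegalArgumentException("numberOfSuccesses > populationSize");
        }
        if (sampleSize > populationSize) {
            throw new IllegalArgumentException("sampleSize > populationSize");
        }
    }

    /** Convenience wrapper that reports validity as a boolean instead of throwing. */
    static boolean isValid(int populationSize, int numberOfSuccesses, int sampleSize) {
        try {
            validate(populationSize, numberOfSuccesses, sampleSize);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```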
• getPopulationSize

public int getPopulationSize()
Gets the population size parameter of this distribution.
Returns:
the population size.
• getNumberOfSuccesses

public int getNumberOfSuccesses()
Gets the number of successes parameter of this distribution.
Returns:
the number of successes.
• getSampleSize

public int getSampleSize()
Gets the sample size parameter of this distribution.
Returns:
the sample size.
• probability

public double probability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns P(X = x). In other words, this method represents the probability mass function (PMF) for the distribution.
Parameters:
x - Point at which the PMF is evaluated.
Returns:
the value of the probability mass function at x.
• logProbability

public double logProbability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns log(P(X = x)), where log is the natural logarithm.
Parameters:
x - Point at which the PMF is evaluated.
Returns:
the logarithm of the value of the probability mass function at x.
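A log-scale PMF is useful when the probability itself underflows to zero in double precision. The following standalone sketch shows one way to obtain it, by summing logarithms of factorials; it is an assumption-laden illustration (simple summation, hypothetical class HypergeometricLogPmfSketch), not the algorithm the library uses.

```java
/** Hypothetical helper, not part of the library: natural log of the PMF. */
final class HypergeometricLogPmfSketch {

    /** log(m!) computed as a plain sum of logarithms. */
    static double logFactorial(int m) {
        double sum = 0;
        for (int i = 2; i <= m; i++) {
            sum += Math.log(i);
        }
        return sum;
    }

    /** log C(a, b); negative infinity when the coefficient is zero. */
    static double logBinomial(int a, int b) {
        if (b < 0 || b > a) {
            return Double.NEGATIVE_INFINITY;
        }
        return logFactorial(a) - logFactorial(b) - logFactorial(a - b);
    }

    /** log f(k; N, K, n) = log C(K, k) + log C(N - K, n - k) - log C(N, n). */
    static double logPmf(int k, int populationSize, int numberOfSuccesses, int sampleSize) {
        return logBinomial(numberOfSuccesses, k)
            + logBinomial(populationSize - numberOfSuccesses, sampleSize - k)
            - logBinomial(populationSize, sampleSize);
    }
}
```

With a population of 100000, 50000 successes and a sample of 50000, the probability of drawing no successes is far below the smallest positive double, yet its logarithm remains a finite negative number.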
• cumulativeProbability

public double cumulativeProbability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns P(X <= x). In other words, this method represents the (cumulative) distribution function (CDF) for this distribution.
Parameters:
x - Point at which the CDF is evaluated.
Returns:
the probability that a random variable with this distribution takes a value less than or equal to x.
• survivalProbability

public double survivalProbability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns P(X > x). In other words, this method represents the complementary cumulative distribution function.

By default, this is defined as 1 - cumulativeProbability(x), but the specific implementation may be more accurate.

Parameters:
x - Point at which the survival function is evaluated.
Returns:
the probability that a random variable with this distribution takes a value greater than x.
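To illustrate why a dedicated survival function can be more accurate than 1 - cumulativeProbability(x), the sketch below sums the upper tail directly, which avoids the cancellation that occurs when the CDF is very close to 1. It is a minimal standalone example under simplifying assumptions (double-precision binomial coefficients that may overflow for large parameters); the class HypergeometricTailSketch is hypothetical.

```java
/** Hypothetical helper, not part of the library: CDF and survival function by summation. */
final class HypergeometricTailSketch {

    /** C(a, b) in double precision; zero when b is outside [0, a]. */
    static double binomial(int a, int b) {
        if (b < 0 || b > a) {
            return 0;
        }
        double result = 1;
        for (int i = 1; i <= b; i++) {
            result = result * (a - b + i) / i;
        }
        return result;
    }

    static double pmf(int k, int populationSize, int numberOfSuccesses, int sampleSize) {
        return binomial(numberOfSuccesses, k)
            * binomial(populationSize - numberOfSuccesses, sampleSize - k)
            / binomial(populationSize, sampleSize);
    }

    /** P(X <= x): sum of the PMF from the lower support bound up to x. */
    static double cdf(int x, int populationSize, int numberOfSuccesses, int sampleSize) {
        int lower = Math.max(0, sampleSize - (populationSize - numberOfSuccesses));
        int upper = Math.min(sampleSize, numberOfSuccesses);
        double sum = 0;
        for (int k = lower; k <= Math.min(x, upper); k++) {
            sum += pmf(k, populationSize, numberOfSuccesses, sampleSize);
        }
        return Math.min(sum, 1.0);
    }

    /** P(X > x): sum of the PMF from the upper support bound down to x + 1. */
    static double survival(int x, int populationSize, int numberOfSuccesses, int sampleSize) {
        int lower = Math.max(0, sampleSize - (populationSize - numberOfSuccesses));
        int upper = Math.min(sampleSize, numberOfSuccesses);
        double sum = 0;
        for (int k = upper; k > x && k >= lower; k--) {
            sum += pmf(k, populationSize, numberOfSuccesses, sampleSize);
        }
        return Math.min(sum, 1.0);
    }
}
```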
• getMean

public double getMean()
Gets the mean of this distribution.

For population size $$N$$, number of successes $$K$$, and sample size $$n$$, the mean is:

$n \frac{K}{N}$

Returns:
the mean.
• getVariance

public double getVariance()
Gets the variance of this distribution.

For population size $$N$$, number of successes $$K$$, and sample size $$n$$, the variance is:

$n \frac{K}{N} \frac{N-K}{N} \frac{N-n}{N-1}$

Returns:
the variance.
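The closed forms for the mean and the variance translate directly into code. The following standalone sketch evaluates both; the class HypergeometricMomentsSketch is hypothetical, and returning zero for a population of size 1 (where the factor $$N - 1$$ vanishes) is a choice made for this example, not documented library behaviour.

```java
/** Hypothetical helper, not part of the library: mean and variance from the closed forms. */
final class HypergeometricMomentsSketch {

    /** Mean n K / N. */
    static double mean(int populationSize, int numberOfSuccesses, int sampleSize) {
        return sampleSize * ((double) numberOfSuccesses / populationSize);
    }

    /** Variance n (K / N) ((N - K) / N) ((N - n) / (N - 1)). */
    static double variance(int populationSize, int numberOfSuccesses, int sampleSize) {
        if (populationSize == 1) {
            // Assumption of this sketch: a single-item population has no spread.
            return 0;
        }
        double p = (double) numberOfSuccesses / populationSize;
        double q = (double) (populationSize - numberOfSuccesses) / populationSize;
        double correction = (double) (populationSize - sampleSize) / (populationSize - 1);
        return sampleSize * p * q * correction;
    }
}
```

For $$N = 10$$, $$K = 4$$ and $$n = 3$$ the formulas give a mean of $$3 \cdot 4 / 10 = 1.2$$ and a variance of $$3 \cdot 0.4 \cdot 0.6 \cdot 7/9 = 0.56$$.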
• getSupportLowerBound

public int getSupportLowerBound()
Gets the lower bound of the support. This method must return the same value as inverseCumulativeProbability(0), i.e. $$\inf \{ x \in \mathbb Z : P(X \le x) \gt 0 \}$$. By convention, Integer.MIN_VALUE should be substituted for negative infinity.

For population size $$N$$, number of successes $$K$$, and sample size $$n$$, the lower bound of the support is $$\max \{ 0, n + K - N \}$$.

Returns:
lower bound of the support
• getSupportUpperBound

public int getSupportUpperBound()
Gets the upper bound of the support. This method must return the same value as inverseCumulativeProbability(1), i.e. $$\inf \{ x \in \mathbb Z : P(X \le x) = 1 \}$$. By convention, Integer.MAX_VALUE should be substituted for positive infinity.

For number of successes $$K$$, and sample size $$n$$, the upper bound of the support is $$\min \{ n, K \}$$.

Returns:
upper bound of the support
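The two support bounds are simple integer expressions. The sketch below computes them in a standalone class (HypergeometricSupportSketch, a hypothetical name); the lower bound is written as n - (N - K) rather than n + K - N so that the intermediate value cannot overflow an int.

```java
/** Hypothetical helper, not part of the library: support bounds of the distribution. */
final class HypergeometricSupportSketch {

    /** max(0, n + K - N), rearranged to avoid int overflow in n + K. */
    static int lowerBound(int populationSize, int numberOfSuccesses, int sampleSize) {
        return Math.max(0, sampleSize - (populationSize - numberOfSuccesses));
    }

    /** min(n, K). */
    static int upperBound(int numberOfSuccesses, int sampleSize) {
        return Math.min(sampleSize, numberOfSuccesses);
    }
}
```

For example, drawing 5 items from a population of 10 that contains 7 successes must produce at least 2 and at most 5 successes, because only 3 failures are available.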
• probability

public double probability(int x0,
int x1)
For a random variable X whose values are distributed according to this distribution, this method returns P(x0 < X <= x1). The default implementation uses the identity P(x0 < X <= x1) = P(X <= x1) - P(X <= x0).

Special cases:

• returns 0.0 if x0 == x1;
• returns probability(x1) if x0 + 1 == x1.
Specified by:
probability in interface DiscreteDistribution
Parameters:
x0 - Lower bound (exclusive).
x1 - Upper bound (inclusive).
Returns:
the probability that a random variable with this distribution takes a value between x0 and x1, excluding the lower and including the upper endpoint.
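The identity and the two special cases can be sketched as follows. This is a standalone illustration with a hypothetical class HypergeometricIntervalSketch; rejecting x0 > x1 with an exception is an assumption of the sketch, since that case is not described here.

```java
/** Hypothetical helper, not part of the library: P(x0 < X <= x1) via the CDF identity. */
final class HypergeometricIntervalSketch {

    /** C(a, b) in double precision; zero when b is outside [0, a]. */
    static double binomial(int a, int b) {
        if (b < 0 || b > a) {
            return 0;
        }
        double result = 1;
        for (int i = 1; i <= b; i++) {
            result = result * (a - b + i) / i;
        }
        return result;
    }

    static double pmf(int k, int populationSize, int numberOfSuccesses, int sampleSize) {
        return binomial(numberOfSuccesses, k)
            * binomial(populationSize - numberOfSuccesses, sampleSize - k)
            / binomial(populationSize, sampleSize);
    }

    static double cdf(int x, int populationSize, int numberOfSuccesses, int sampleSize) {
        int lower = Math.max(0, sampleSize - (populationSize - numberOfSuccesses));
        int upper = Math.min(sampleSize, numberOfSuccesses);
        double sum = 0;
        for (int k = lower; k <= Math.min(x, upper); k++) {
            sum += pmf(k, populationSize, numberOfSuccesses, sampleSize);
        }
        return Math.min(sum, 1.0);
    }

    static double probability(int x0, int x1,
                              int populationSize, int numberOfSuccesses, int sampleSize) {
        if (x0 > x1) {
            // Assumption of this sketch: an inverted interval is treated as an error.
            throw new IllegalArgumentException("x0 > x1: " + x0 + " > " + x1);
        }
        if (x0 == x1) {
            return 0.0;                      // empty interval
        }
        if ((long) x0 + 1 == x1) {
            return pmf(x1, populationSize, numberOfSuccesses, sampleSize); // single point
        }
        return cdf(x1, populationSize, numberOfSuccesses, sampleSize)
            - cdf(x0, populationSize, numberOfSuccesses, sampleSize);
    }
}
```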
• inverseCumulativeProbability

public int inverseCumulativeProbability(double p)
Computes the quantile function of this distribution. For a random variable X distributed according to this distribution, the returned value is:

$x = \begin{cases} \inf \{ x \in \mathbb Z : P(X \le x) \ge p\} & \text{for } 0 \lt p \le 1 \\ \inf \{ x \in \mathbb Z : P(X \le x) \gt 0 \} & \text{for } p = 0 \end{cases}$

If the result exceeds the range of the data type int, then Integer.MIN_VALUE or Integer.MAX_VALUE is returned. In this case the result of cumulativeProbability(x) called using the returned p-quantile may not compute the original p.

The default implementation returns:

Specified by:
inverseCumulativeProbability in interface DiscreteDistribution
Parameters:
p - Cumulative probability.
Returns:
the smallest p-quantile of this distribution (largest 0-quantile for p = 0).
Throws:
IllegalArgumentException - if p < 0 or p > 1
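The quantile definition can be made concrete with a linear scan over the support that accumulates the CDF until it reaches p. This is a minimal standalone sketch (hypothetical class HypergeometricQuantileSketch, double-precision binomial coefficients); it illustrates the definition and is not necessarily how the library searches for the quantile.

```java
/** Hypothetical helper, not part of the library: quantile function by linear scan. */
final class HypergeometricQuantileSketch {

    /** C(a, b) in double precision; zero when b is outside [0, a]. */
    static double binomial(int a, int b) {
        if (b < 0 || b > a) {
            return 0;
        }
        double result = 1;
        for (int i = 1; i <= b; i++) {
            result = result * (a - b + i) / i;
        }
        return result;
    }

    static double pmf(int k, int populationSize, int numberOfSuccesses, int sampleSize) {
        return binomial(numberOfSuccesses, k)
            * binomial(populationSize - numberOfSuccesses, sampleSize - k)
            / binomial(populationSize, sampleSize);
    }

    /** Smallest x in the support with P(X <= x) >= p; the lower bound for p = 0. */
    static int inverseCdf(double p, int populationSize, int numberOfSuccesses, int sampleSize) {
        if (!(p >= 0 && p <= 1)) {
            throw new IllegalArgumentException("p is outside [0, 1]: " + p);
        }
        int lower = Math.max(0, sampleSize - (populationSize - numberOfSuccesses));
        int upper = Math.min(sampleSize, numberOfSuccesses);
        if (p == 0) {
            return lower;
        }
        double cumulative = 0;
        for (int x = lower; x < upper; x++) {
            cumulative += pmf(x, populationSize, numberOfSuccesses, sampleSize);
            if (cumulative >= p) {
                return x;
            }
        }
        // Rounding may leave the running sum slightly below 1, so fall back to the upper bound.
        return upper;
    }
}
```

For $$N = 10$$, $$K = 4$$, $$n = 3$$ the CDF at 0, 1, 2, 3 is 1/6, 2/3, 29/30 and 1, so the 0.5-quantile is 1.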
• inverseSurvivalProbability

public int inverseSurvivalProbability(double p)
Computes the inverse survival probability function of this distribution. For a random variable X distributed according to this distribution, the returned value is:

$x = \begin{cases} \inf \{ x \in \mathbb Z : P(X \gt x) \le p\} & \text{for } 0 \le p \lt 1 \\ \inf \{ x \in \mathbb Z : P(X \gt x) \lt 1 \} & \text{for } p = 1 \end{cases}$

If the result exceeds the range of the data type int, then Integer.MIN_VALUE or Integer.MAX_VALUE is returned. In this case the result of survivalProbability(x) called using the returned (1-p)-quantile may not compute the original p.

By default, this is defined as inverseCumulativeProbability(1 - p), but the specific implementation may be more accurate.

The default implementation returns:

Specified by:
inverseSurvivalProbability in interface DiscreteDistribution
Parameters:
p - Survival probability.
Returns:
the smallest (1-p)-quantile of this distribution (largest 0-quantile for p = 1).
Throws:
IllegalArgumentException - if p < 0 or p > 1
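Since the survival function is P(X > x) and the inverse is stated to agree with inverseCumulativeProbability(1 - p), the result is the smallest x whose survival probability does not exceed p. The standalone sketch below finds it by walking down from the upper support bound while accumulating the upper tail, which avoids forming 1 - p. The class HypergeometricInverseSurvivalSketch is hypothetical and the approach is an illustration, not the library's algorithm.

```java
/** Hypothetical helper, not part of the library: inverse survival function by tail scan. */
final class HypergeometricInverseSurvivalSketch {

    /** C(a, b) in double precision; zero when b is outside [0, a]. */
    static double binomial(int a, int b) {
        if (b < 0 || b > a) {
            return 0;
        }
        double result = 1;
        for (int i = 1; i <= b; i++) {
            result = result * (a - b + i) / i;
        }
        return result;
    }

    static double pmf(int k, int populationSize, int numberOfSuccesses, int sampleSize) {
        return binomial(numberOfSuccesses, k)
            * binomial(populationSize - numberOfSuccesses, sampleSize - k)
            / binomial(populationSize, sampleSize);
    }

    /** Smallest x in the support with P(X > x) <= p; the lower bound for p = 1. */
    static int inverseSurvival(double p, int populationSize, int numberOfSuccesses,
                               int sampleSize) {
        if (!(p >= 0 && p <= 1)) {
            throw new IllegalArgumentException("p is outside [0, 1]: " + p);
        }
        int lower = Math.max(0, sampleSize - (populationSize - numberOfSuccesses));
        int upper = Math.min(sampleSize, numberOfSuccesses);
        if (p == 1) {
            return lower;
        }
        int x = upper;
        double tail = 0; // invariant: tail == P(X > x) and tail <= p
        while (x > lower) {
            double next = tail + pmf(x, populationSize, numberOfSuccesses, sampleSize);
            if (next > p) {
                break;   // P(X > x - 1) would exceed p, so x is the answer
            }
            tail = next;
            x--;
        }
        return x;
    }
}
```

For $$N = 10$$, $$K = 4$$, $$n = 3$$ the survival function at 0, 1, 2, 3 is 5/6, 1/3, 1/30 and 0, so a survival probability of 0.5 maps to 1.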
• createSampler

public DiscreteDistribution.Sampler createSampler(UniformRandomProvider rng)
Creates a sampler.
Specified by:
createSampler in interface DiscreteDistribution
Parameters:
rng - Generator of uniformly distributed numbers.
Returns:
a sampler that produces random numbers according to this distribution.
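To show what a hypergeometric sample represents, the following standalone sketch simulates drawing without replacement from an urn. It uses java.util.Random in place of UniformRandomProvider so that it runs with the JDK alone; the class HypergeometricSamplerSketch is hypothetical and the urn simulation is an illustrative technique, not the sampling algorithm used by createSampler.

```java
import java.util.Random;

/** Hypothetical helper, not part of the library: urn simulation without replacement. */
final class HypergeometricSamplerSketch {

    /** Draws sampleSize items one at a time and counts how many are successes. */
    static int sample(Random rng, int populationSize, int numberOfSuccesses, int sampleSize) {
        int successesLeft = numberOfSuccesses;
        int populationLeft = populationSize;
        int drawnSuccesses = 0;
        for (int i = 0; i < sampleSize; i++) {
            // The next item is a success with probability successesLeft / populationLeft.
            if (rng.nextInt(populationLeft) < successesLeft) {
                drawnSuccesses++;
                successesLeft--;
            }
            populationLeft--;
        }
        return drawnSuccesses;
    }
}
```

Each call costs time proportional to the sample size, which is acceptable for illustration but is one reason a production sampler may use a different method.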