Math – The Commons Math User Guide

8 Probability Distributions

8.1 Overview

Standard distributions are now available in the Commons Statistics component.

Commons Math provides

an EnumeratedDistribution class that represents discrete distributions of a finite, enumerated set of values.
a MultivariateNormalDistribution class that represents multivariate Gaussian distributions.
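
To make the idea of a discrete distribution over a finite, enumerated set of values concrete, here is a minimal sketch that uses only the Java standard library. EnumeratedSketch and its methods are hypothetical names chosen for illustration; this is not the Commons Math EnumeratedDistribution class and does not claim to mirror its API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.Random;

/** Hypothetical stand-in for illustration only; not the Commons Math class. */
final class EnumeratedSketch<T> {
    private final List<T> values = new ArrayList<>();
    private final double[] probabilities;

    /** Weights must be non-negative with a finite, positive sum; they are normalized to sum to 1. */
    EnumeratedSketch(List<T> values, double[] weights) {
        if (values.size() != weights.length) {
            throw new IllegalArgumentException("values and weights differ in length");
        }
        double sum = 0.0;
        for (double w : weights) {
            if (!(w >= 0.0)) {
                throw new IllegalArgumentException("weight must be non-negative: " + w);
            }
            sum += w;
        }
        if (!(sum > 0.0) || Double.isInfinite(sum)) {
            throw new IllegalArgumentException("weights must have a finite, positive sum");
        }
        this.values.addAll(values);
        probabilities = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            probabilities[i] = weights[i] / sum;
        }
    }

    /** P(X = x): adds up the mass of every entry equal to x (0 if x is not listed). */
    double probability(T x) {
        double p = 0.0;
        for (int i = 0; i < values.size(); i++) {
            if (Objects.equals(values.get(i), x)) {
                p += probabilities[i];
            }
        }
        return p;
    }

    /** Draws one value by walking the cumulative probabilities. */
    T sample(Random rng) {
        double u = rng.nextDouble();
        double cumulative = 0.0;
        for (int i = 0; i < probabilities.length; i++) {
            cumulative += probabilities[i];
            if (u < cumulative) {
                return values.get(i);
            }
        }
        return values.get(values.size() - 1); // only reachable through rounding in the sum
    }
}
```

With values "a", "b", "c" and weights 1, 1, 2, the sketch assigns probabilities 0.25, 0.25 and 0.5, and sample returns "c" about half of the time.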

Inverse distribution functions can be computed using the inverseCumulativeProbability methods. For a continuous distribution f and a probability p, f.inverseCumulativeProbability(p) returns

inf{x in R | P(X≤x) ≥ p} for 0 < p < 1,
inf{x in R | P(X≤x) > 0} for p = 0,

where X is distributed as f.
For discrete f, the definition is the same, with Z (the integers) in place of R. Note that in the discrete case, the choice of ≥ rather than > in the definition makes a difference when p is exactly equal to a value attained by the cumulative probability P(X≤x).
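
The discrete definition can be illustrated with a small, self-contained sketch. It is not the library's implementation: DiscreteInverseCdfSketch, its method names, and the representation of the distribution as an array of point masses starting at an integer "lowest" are all assumptions made for this example. It covers only the range 0 ≤ p < 1 that the definition above spells out.

```java
final class DiscreteInverseCdfSketch {
    private DiscreteInverseCdfSketch() {}

    /**
     * pmf[i] is P(X = lowest + i). Returns inf{x in Z | P(X<=x) >= p} for 0 < p < 1
     * and inf{x in Z | P(X<=x) > 0} for p = 0.
     */
    static int inverseCumulativeProbability(int lowest, double[] pmf, double p) {
        if (pmf.length == 0) {
            throw new IllegalArgumentException("pmf must not be empty");
        }
        if (!(p >= 0.0 && p < 1.0)) {
            throw new IllegalArgumentException("this sketch only handles 0 <= p < 1: " + p);
        }
        double cumulative = 0.0;
        for (int i = 0; i < pmf.length; i++) {
            cumulative += pmf[i];
            boolean reached = (p == 0.0) ? cumulative > 0.0 : cumulative >= p;
            if (reached) {
                return lowest + i;
            }
        }
        return lowest + pmf.length - 1; // only reachable through rounding in the sum
    }

    /** Same search with a strict inequality, included only to show where the conventions differ. */
    static int strictVariant(int lowest, double[] pmf, double p) {
        double cumulative = 0.0;
        for (int i = 0; i < pmf.length; i++) {
            cumulative += pmf[i];
            if (cumulative > p) {
                return lowest + i;
            }
        }
        return lowest + pmf.length - 1;
    }

    public static void main(String[] args) {
        double[] uniform = {0.25, 0.25, 0.25, 0.25}; // X uniform on {0, 1, 2, 3}
        System.out.println(inverseCumulativeProbability(0, uniform, 0.5)); // 1, since P(X<=1) = 0.5
        System.out.println(strictVariant(0, uniform, 0.5));                // 2
        System.out.println(inverseCumulativeProbability(0, uniform, 0.6)); // 2
        System.out.println(strictVariant(0, uniform, 0.6));                // 2
    }
}
```

For p = 0.5 the cumulative probability attains p exactly at x = 1, so the two conventions disagree; for p = 0.6, which is not an attained value, they agree.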

8.2 Generating data like an input file

Using the EmpiricalDistribution class, you can generate data based on a given set of values:

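import org.apache.commons.math4.legacy.distribution.EmpiricalDistribution;
import org.apache.commons.rng.simple.RandomSource;
import org.apache.commons.statistics.distribution.ContinuousDistribution;

double[] input = load("data.txt"); // Get some data; load(...) stands for your own file-reading code.
int binCount = 500; // Number of bins used to estimate the density.
EmpiricalDistribution empDist = EmpiricalDistribution.from(binCount, input);
ContinuousDistribution.Sampler sampler = empDist.createSampler(RandomSource.MT.create());
double value = sampler.sample(); // Draw one random value.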

The probability density function is estimated from the data passed as input. The estimation method is essentially the Variable Kernel Method with Gaussian smoothing. The created sampler will return random values whose probability distribution matches the empirical distribution (i.e. if you generate a large number of such values, their distribution should "look like" the distribution of the values in the input file). The input values are not stored in memory.
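
The following standard-library sketch illustrates the general idea of binning the data and smoothing each bin with a Gaussian. It is a simplified illustration under assumptions of my own, not the EmpiricalDistribution implementation: the class name BinnedKernelSketch, the equal-width binning, and the use of an untruncated Gaussian per bin are all choices made for the example. Like the description above, it keeps only per-bin summary statistics and not the input values themselves.

```java
import java.util.Random;

/** Hypothetical illustration only; assumes finite input values. */
final class BinnedKernelSketch {
    private final double[] binMean;
    private final double[] binStd;
    private final double[] upperCumulative; // fraction of the data in bins 0..b

    BinnedKernelSketch(int binCount, double[] data) {
        if (binCount < 1 || data.length == 0) {
            throw new IllegalArgumentException("need at least one bin and one value");
        }
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : data) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double width = (max - min) / binCount;
        long[] count = new long[binCount];
        double[] sum = new double[binCount];
        double[] sumSq = new double[binCount];
        for (double v : data) {
            // Equal-width bins; the maximum is assigned to the last bin.
            int b = width > 0 ? (int) Math.min(binCount - 1, Math.floor((v - min) / width)) : 0;
            count[b]++;
            sum[b] += v;
            sumSq[b] += v * v;
        }
        binMean = new double[binCount];
        binStd = new double[binCount];
        upperCumulative = new double[binCount];
        long seen = 0;
        for (int b = 0; b < binCount; b++) {
            if (count[b] > 0) {
                binMean[b] = sum[b] / count[b];
                double variance = sumSq[b] / count[b] - binMean[b] * binMean[b];
                binStd[b] = Math.sqrt(Math.max(0.0, variance)); // clamp rounding noise
            }
            seen += count[b];
            upperCumulative[b] = (double) seen / data.length;
        }
    }

    /** Picks a bin with probability proportional to its count, then draws from that bin's Gaussian. */
    double sample(Random rng) {
        double u = rng.nextDouble();
        int b = 0;
        while (b < upperCumulative.length - 1 && u >= upperCumulative[b]) {
            b++; // empty bins add nothing to the cumulative fraction and are skipped
        }
        return binMean[b] + binStd[b] * rng.nextGaussian();
    }
}
```

Drawing many values from such a sampler and plotting their histogram next to a histogram of the input is a simple way to check that the two distributions "look alike"; the more bins are used, the more closely local features of the input are reproduced.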