org.apache.commons.math3.stat.descriptive.rank
Class Percentile

java.lang.Object
  extended by org.apache.commons.math3.stat.descriptive.AbstractUnivariateStatistic
      extended by org.apache.commons.math3.stat.descriptive.rank.Percentile
All Implemented Interfaces:
Serializable, UnivariateStatistic, MathArrays.Function
Direct Known Subclasses:
Median

public class Percentile
extends AbstractUnivariateStatistic
implements Serializable

Provides percentile computation.

There are several commonly used methods for estimating percentiles (a.k.a. quantiles) based on sample data. For large samples, the different methods agree closely, but when sample sizes are small, different methods will give significantly different results. The algorithm implemented here works as follows:

  1. Let n be the length of the (sorted) array and 0 < p <= 100 be the desired percentile.
  2. If n = 1 return the unique array element (regardless of the value of p); otherwise
  3. Compute the estimated percentile position pos = p * (n + 1) / 100 and the difference d between pos and floor(pos) (i.e. the fractional part of pos).
  4. If pos < 1 return the smallest element in the array.
  5. Else if pos >= n return the largest element in the array.
  6. Else let lower be the element in position floor(pos) in the array and let upper be the next element in the array. Return lower + d * (upper - lower).
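
The six steps above can be sketched in plain Java as follows. This is an illustrative sketch only, not the class's actual implementation: the name PercentileSketch and its estimate method are hypothetical, a full sort is used for clarity where the class itself uses selection, and returning Double.NaN for an empty array is an assumption borrowed from the documentation of evaluate(double[], double).

```java
import java.util.Arrays;

public class PercentileSketch {

    /** Estimates the p-th percentile (0 < p <= 100) by following the six steps. */
    public static double estimate(double[] values, double p) {
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must satisfy 0 < p <= 100: " + p);
        }
        int n = values.length;
        if (n == 0) {
            return Double.NaN;                 // assumption: same as evaluate(double[], double)
        }
        if (n == 1) {
            return values[0];                  // step 2: single element, p is irrelevant
        }
        double[] sorted = values.clone();      // work on a copy, leave the input untouched
        Arrays.sort(sorted);                   // full sort for clarity (NaN sorts last)

        double pos = p * (n + 1) / 100;        // step 3: estimated 1-based position
        double fpos = Math.floor(pos);
        double d = pos - fpos;                 // fractional part of pos

        if (pos < 1) {
            return sorted[0];                  // step 4: smallest element
        }
        if (pos >= n) {
            return sorted[n - 1];              // step 5: largest element
        }
        int i = (int) fpos;                    // 1-based position of lower
        double lower = sorted[i - 1];
        double upper = sorted[i];
        return lower + d * (upper - lower);    // step 6: linear interpolation
    }

    public static void main(String[] args) {
        // n = 6, p = 50: pos = 3.5, lower = 2, upper = 3, d = 0.5
        System.out.println(estimate(new double[] {0, 1, 2, 3, 4, Double.NaN}, 50)); // prints 2.5
    }
}
```

Because positions in the steps are 1-based while Java arrays are 0-based, the element "in position floor(pos)" is read at index floor(pos) - 1.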

To compute percentiles, the data must be at least partially ordered. Input arrays are copied and recursively partitioned according to an ordering. The ordering used is that of Arrays.sort(double[]), which is the one determined by Double.compareTo(Double). This ordering makes Double.NaN larger than any other value (including Double.POSITIVE_INFINITY). Therefore, for example, the median (50th percentile) of {0, 1, 2, 3, 4, Double.NaN} evaluates to 2.5.
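
The NaN ordering described above can be checked with the JDK alone. The following is a small sketch; the class name NaNOrdering and the helper sortedCopy are made up for illustration and are not part of the library.

```java
import java.util.Arrays;

public class NaNOrdering {

    /** Returns a sorted copy, leaving the caller's array untouched. */
    public static double[] sortedCopy(double[] values) {
        double[] copy = values.clone();
        Arrays.sort(copy);   // uses the total order of Double.compareTo
        return copy;
    }

    public static void main(String[] args) {
        double[] sorted = sortedCopy(
                new double[] {Double.NaN, 4, Double.POSITIVE_INFINITY, 0});
        // NaN ends up after positive infinity
        System.out.println(Arrays.toString(sorted)); // prints [0.0, 4.0, Infinity, NaN]
        // Double.compare agrees: NaN is "greater" than +Infinity
        System.out.println(Double.compare(Double.NaN, Double.POSITIVE_INFINITY) > 0); // prints true
    }
}
```

With NaN sorted last, the six-element example has 2 and 3 in positions 3 and 4, which is why its median interpolates to 2.5 rather than involving NaN.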

Since percentile estimation usually involves interpolation between array elements, arrays containing NaN or infinite values will often result in NaN or infinite values returned.

Since 2.2, Percentile uses only selection instead of complete sorting and caches selection algorithm state between calls to the various evaluate methods. This greatly improves efficiency, both for a single percentile and for multiple percentile computations. To maximize performance when multiple percentiles are computed based on the same data, users should set the data array once using either the evaluate(double[], double) or the setData(double[]) method and thereafter call evaluate(double) with just the percentile provided.
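
A minimal usage sketch of the "set the data once, evaluate many times" pattern, assuming the commons-math3 jar is on the classpath. Only the documented constructor, setData(double[]) and evaluate(double) are used; the class name PercentileReuse and its quartiles helper are hypothetical.

```java
import java.util.Arrays;

import org.apache.commons.math3.stat.descriptive.rank.Percentile;

public class PercentileReuse {

    /** Computes the 25th, 50th and 75th percentiles from one stored data set. */
    public static double[] quartiles(double[] data) {
        Percentile percentile = new Percentile();
        percentile.setData(data);          // stored once (as a copy of data)
        return new double[] {
            percentile.evaluate(25.0),     // each call reuses the stored data
            percentile.evaluate(50.0),
            percentile.evaluate(75.0)
        };
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        // n = 10: positions are 2.75, 5.5 and 8.25
        System.out.println(Arrays.toString(quartiles(data))); // prints [2.75, 5.5, 8.25]
    }
}
```

Passing the array to every call through evaluate(double[], double) would give the same values; the point of the pattern is only to let the instance reuse its cached state.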

Note that this implementation is not synchronized. If multiple threads access an instance of this class concurrently, and at least one of the threads modifies its state (for example through setData, setQuantile, or an evaluate call that updates the cached selection state), it must be synchronized externally.

Version:
$Id: Percentile.java 1416643 2012-12-03 19:37:14Z tn $
See Also:
Serialized Form

Constructor Summary
Percentile()
          Constructs a Percentile with a default quantile value of 50.0.
Percentile(double p)
          Constructs a Percentile with the specific quantile value.
Percentile(Percentile original)
          Copy constructor, creates a new Percentile identical to the original.
 
Method Summary
 Percentile copy()
          Returns a copy of the statistic with the same internal state.
static void copy(Percentile source, Percentile dest)
          Copies source to dest.
 double evaluate(double p)
          Returns the result of evaluating the statistic over the stored data.
 double evaluate(double[] values, double p)
          Returns an estimate of the pth percentile of the values in the values array.
 double evaluate(double[] values, int start, int length)
          Returns an estimate of the percentile selected by the quantile field, computed over the designated values in the values array.
 double evaluate(double[] values, int begin, int length, double p)
          Returns an estimate of the pth percentile of the values in the values array, starting with the element in (0-based) position begin in the array and including length values.
 double getQuantile()
          Returns the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument).
 void setData(double[] values)
          Set the data array.
 void setData(double[] values, int begin, int length)
          Set the data array.
 void setQuantile(double p)
          Sets the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument).
 
Methods inherited from class org.apache.commons.math3.stat.descriptive.AbstractUnivariateStatistic
evaluate, evaluate, getData, getDataRef, test, test, test, test
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Percentile

public Percentile()
Constructs a Percentile with a default quantile value of 50.0.


Percentile

public Percentile(double p)
           throws MathIllegalArgumentException
Constructs a Percentile with the specific quantile value.

Parameters:
p - the quantile
Throws:
MathIllegalArgumentException - if p does not satisfy 0 < p <= 100
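
The range condition on p can be sketched as a stand-alone check. The helper below is hypothetical and throws the JDK's IllegalArgumentException instead of the library's exception type; rejecting Double.NaN is an assumption of this sketch, since the documentation does not say how NaN is handled.

```java
public class QuantileCheck {

    /** Hypothetical helper: returns p unchanged if 0 < p <= 100, otherwise throws. */
    public static double requireValidQuantile(double p) {
        // Written as a negated conjunction so that NaN fails the check too
        // (assumption of this sketch, not documented library behaviour).
        if (!(p > 0 && p <= 100)) {
            throw new IllegalArgumentException(
                    "quantile must satisfy 0 < p <= 100, got " + p);
        }
        return p;
    }

    public static void main(String[] args) {
        System.out.println(requireValidQuantile(100.0)); // prints 100.0 (upper bound is inclusive)
        try {
            requireValidQuantile(0.0);                   // lower bound is exclusive
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```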

Percentile

public Percentile(Percentile original)
           throws NullArgumentException
Copy constructor, creates a new Percentile identical to the original.

Parameters:
original - the Percentile instance to copy
Throws:
NullArgumentException - if original is null
Method Detail

setData

public void setData(double[] values)
Set the data array.

The stored value is a copy of the parameter array, not the array itself.

Overrides:
setData in class AbstractUnivariateStatistic
Parameters:
values - data array to store (may be null to remove stored data)
See Also:
AbstractUnivariateStatistic.evaluate()

setData

public void setData(double[] values,
                    int begin,
                    int length)
             throws MathIllegalArgumentException
Set the data array. The input array is copied, not referenced.

Overrides:
setData in class AbstractUnivariateStatistic
Parameters:
values - data array to store
begin - the index of the first element to include
length - the number of elements to include
Throws:
MathIllegalArgumentException - if values is null or the indices are not valid
See Also:
AbstractUnivariateStatistic.evaluate()
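
What "copied, not referenced" means for a sub-range can be illustrated without the library. The class StoredCopySketch below is a hypothetical stand-in, not the real implementation; its validation rules and its use of IllegalArgumentException are assumptions made for the sketch.

```java
import java.util.Arrays;

public class StoredCopySketch {

    private double[] stored;

    /** Stores a copy of values[begin .. begin + length - 1]. */
    public void setData(double[] values, int begin, int length) {
        if (values == null) {
            throw new IllegalArgumentException("values must not be null");
        }
        // length > values.length - begin avoids overflow in begin + length
        if (begin < 0 || length < 0 || length > values.length - begin) {
            throw new IllegalArgumentException("invalid begin/length");
        }
        stored = Arrays.copyOfRange(values, begin, begin + length);
    }

    /** Returns a copy of the stored data. */
    public double[] getData() {
        return stored.clone();
    }

    public static void main(String[] args) {
        double[] input = {1, 2, 3, 4, 5};
        StoredCopySketch holder = new StoredCopySketch();
        holder.setData(input, 1, 3);   // stores {2, 3, 4}
        input[2] = 99;                 // later changes to the input are not seen
        System.out.println(Arrays.toString(holder.getData())); // prints [2.0, 3.0, 4.0]
    }
}
```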

evaluate

public double evaluate(double p)
                throws MathIllegalArgumentException
Returns the result of evaluating the statistic over the stored data.

The stored array is the one which was set by previous calls to setData(double[]).

Parameters:
p - the percentile value to compute
Returns:
the value of the statistic applied to the stored data
Throws:
MathIllegalArgumentException - if p is not a valid quantile value (p must be greater than 0 and less than or equal to 100)

evaluate

public double evaluate(double[] values,
                       double p)
                throws MathIllegalArgumentException
Returns an estimate of the pth percentile of the values in the values array.

Calls to this method do not modify the internal quantile state of this statistic.

See Percentile for a description of the percentile estimation algorithm used.

Parameters:
values - input array of values
p - the percentile value to compute
Returns:
the percentile value or Double.NaN if the array is empty
Throws:
MathIllegalArgumentException - if values is null or p is invalid

evaluate

public double evaluate(double[] values,
                       int start,
                       int length)
                throws MathIllegalArgumentException
Returns an estimate of the percentile selected by the quantile property, computed over the designated values in the values array.

See Percentile for a description of the percentile estimation algorithm used.

Specified by:
evaluate in interface UnivariateStatistic
Specified by:
evaluate in interface MathArrays.Function
Specified by:
evaluate in class AbstractUnivariateStatistic
Parameters:
values - the input array
start - index of the first array element to include
length - the number of elements to include
Returns:
the percentile value
Throws:
MathIllegalArgumentException - if the parameters are not valid

evaluate

public double evaluate(double[] values,
                       int begin,
                       int length,
                       double p)
                throws MathIllegalArgumentException
Returns an estimate of the pth percentile of the values in the values array, starting with the element in (0-based) position begin in the array and including length values.

Calls to this method do not modify the internal quantile state of this statistic.

See Percentile for a description of the percentile estimation algorithm used.

Parameters:
values - array of input values
p - the percentile to compute
begin - the first (0-based) element to include in the computation
length - the number of array elements to include
Returns:
the percentile value
Throws:
MathIllegalArgumentException - if the parameters are not valid or the input array is null

getQuantile

public double getQuantile()
Returns the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument).

Returns:
quantile

setQuantile

public void setQuantile(double p)
                 throws MathIllegalArgumentException
Sets the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument).

Parameters:
p - the quantile value, which must satisfy 0 < p <= 100
Throws:
MathIllegalArgumentException - if p does not satisfy 0 < p <= 100

copy

public Percentile copy()
Returns a copy of the statistic with the same internal state.

Specified by:
copy in interface UnivariateStatistic
Specified by:
copy in class AbstractUnivariateStatistic
Returns:
a copy of the statistic

copy

public static void copy(Percentile source,
                        Percentile dest)
                 throws NullArgumentException
Copies source to dest.

Neither source nor dest can be null.

Parameters:
source - Percentile to copy
dest - Percentile to copy to
Throws:
NullArgumentException - if either source or dest is null


Copyright © 2003-2013 The Apache Software Foundation. All Rights Reserved.