Class DescriptiveStatistics
- java.lang.Object
-
- org.apache.commons.math4.legacy.stat.descriptive.DescriptiveStatistics
-
- All Implemented Interfaces:
StatisticalSummary
- Direct Known Subclasses:
SynchronizedDescriptiveStatistics
public class DescriptiveStatistics extends Object implements StatisticalSummary
Maintains a dataset of values of a single variable and computes descriptive statistics based on stored data.
The windowSize property sets a limit on the number of values that can be stored in the dataset. The default value, INFINITE_WINDOW, puts no limit on the size of the dataset. This value should be used with caution, as the backing store will grow without bound in this case. For very large datasets, SummaryStatistics, which does not store the dataset, should be used instead of this class. If windowSize is not INFINITE_WINDOW and more values are added than can be stored in the dataset, new values are added in a "rolling" manner, with new values replacing the "oldest" values in the dataset.
Note: this class is not threadsafe. Use SynchronizedDescriptiveStatistics if concurrent access from multiple threads is required.
-
-
Field Summary
static int INFINITE_WINDOW
Represents an infinite window size.
-
Constructor Summary
DescriptiveStatistics()
Construct a DescriptiveStatistics instance with an infinite window.
DescriptiveStatistics(double[] initialDoubleArray)
Construct a DescriptiveStatistics instance with an infinite window and the initial data values in initialDoubleArray.
DescriptiveStatistics(int window)
Construct a DescriptiveStatistics instance with the specified window.
DescriptiveStatistics(Double[] initialDoubleArray)
Construct a DescriptiveStatistics instance with an infinite window and the initial data values in initialDoubleArray.
DescriptiveStatistics(DescriptiveStatistics original)
Copy constructor.
-
Method Summary
void addValue(double v)
Adds the value to the dataset.
double apply(UnivariateStatistic stat)
Apply the given statistic to the data associated with this set of statistics.
void clear()
Resets all statistics and storage.
DescriptiveStatistics copy()
Returns a copy of this DescriptiveStatistics instance with the same internal state.
static void copy(DescriptiveStatistics source, DescriptiveStatistics dest)
Copies source to dest.
double getElement(int index)
Returns the element at the specified index.
double getGeometricMean()
Returns the geometric mean of the available values.
UnivariateStatistic getGeometricMeanImpl()
Returns the currently configured geometric mean implementation.
double getKurtosis()
Returns the kurtosis of the available values.
UnivariateStatistic getKurtosisImpl()
Returns the currently configured kurtosis implementation.
double getMax()
Returns the maximum of the available values.
UnivariateStatistic getMaxImpl()
Returns the currently configured maximum implementation.
double getMean()
Returns the arithmetic mean of the available values.
UnivariateStatistic getMeanImpl()
Returns the currently configured mean implementation.
double getMin()
Returns the minimum of the available values.
UnivariateStatistic getMinImpl()
Returns the currently configured minimum implementation.
long getN()
Returns the number of available values.
double getPercentile(double p)
Returns an estimate for the pth percentile of the stored values.
UnivariateStatistic getPercentileImpl()
Returns the currently configured percentile implementation.
double getPopulationVariance()
Returns the population variance of the available values.
double getQuadraticMean()
Returns the quadratic mean, a.k.a. root-mean-square, of the available values.
double getSkewness()
Returns the skewness of the available values.
UnivariateStatistic getSkewnessImpl()
Returns the currently configured skewness implementation.
double[] getSortedValues()
Returns the current set of values in an array of double primitives, sorted in ascending order.
double getStandardDeviation()
Returns the standard deviation of the available values.
double getSum()
Returns the sum of the values that have been added.
UnivariateStatistic getSumImpl()
Returns the currently configured sum implementation.
double getSumsq()
Returns the sum of the squares of the available values.
UnivariateStatistic getSumsqImpl()
Returns the currently configured sum of squares implementation.
double[] getValues()
Returns the current set of values in an array of double primitives.
double getVariance()
Returns the (sample) variance of the available values.
UnivariateStatistic getVarianceImpl()
Returns the currently configured variance implementation.
int getWindowSize()
Returns the maximum number of values that can be stored in the dataset, or INFINITE_WINDOW (-1) if there is no limit.
void removeMostRecentValue()
Removes the most recent value from the dataset.
double replaceMostRecentValue(double v)
Replaces the most recently stored value with the given value.
void setGeometricMeanImpl(UnivariateStatistic geometricMeanImpl)
Sets the implementation for the geometric mean.
void setKurtosisImpl(UnivariateStatistic kurtosisImpl)
Sets the implementation for the kurtosis.
void setMaxImpl(UnivariateStatistic maxImpl)
Sets the implementation for the maximum.
void setMeanImpl(UnivariateStatistic meanImpl)
Sets the implementation for the mean.
void setMinImpl(UnivariateStatistic minImpl)
Sets the implementation for the minimum.
void setPercentileImpl(UnivariateStatistic percentileImpl)
Sets the implementation to be used by getPercentile(double).
void setSkewnessImpl(UnivariateStatistic skewnessImpl)
Sets the implementation for the skewness.
void setSumImpl(UnivariateStatistic sumImpl)
Sets the implementation for the sum.
void setSumsqImpl(UnivariateStatistic sumsqImpl)
Sets the implementation for the sum of squares.
void setVarianceImpl(UnivariateStatistic varianceImpl)
Sets the implementation for the variance.
void setWindowSize(int windowSize)
WindowSize controls the number of values that contribute to the reported statistics.
String toString()
Generates a text report displaying univariate statistics from values that have been added.
-
-
-
Field Detail
-
INFINITE_WINDOW
public static final int INFINITE_WINDOW
Represents an infinite window size. When getWindowSize() returns this value, there is no limit to the number of data values that can be stored in the dataset.
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
DescriptiveStatistics
public DescriptiveStatistics()
Construct a DescriptiveStatistics instance with an infinite window.
-
DescriptiveStatistics
public DescriptiveStatistics(int window) throws MathIllegalArgumentException
Construct a DescriptiveStatistics instance with the specified window.
- Parameters:
window - the window size.
- Throws:
MathIllegalArgumentException - if window size is less than 1 but not equal to INFINITE_WINDOW
-
DescriptiveStatistics
public DescriptiveStatistics(double[] initialDoubleArray)
Construct a DescriptiveStatistics instance with an infinite window and the initial data values in initialDoubleArray. If initialDoubleArray is null, then this constructor corresponds to the default constructor.
- Parameters:
initialDoubleArray - the initial double[].
-
DescriptiveStatistics
public DescriptiveStatistics(Double[] initialDoubleArray)
Construct a DescriptiveStatistics instance with an infinite window and the initial data values in initialDoubleArray. If initialDoubleArray is null, then this constructor corresponds to DescriptiveStatistics().
- Parameters:
initialDoubleArray - the initial Double[].
-
DescriptiveStatistics
public DescriptiveStatistics(DescriptiveStatistics original) throws NullArgumentException
Copy constructor. Construct a new DescriptiveStatistics instance that is a copy of original.
- Parameters:
original - DescriptiveStatistics instance to copy
- Throws:
NullArgumentException - if original is null
-
-
Method Detail
-
addValue
public void addValue(double v)
Adds the value to the dataset. If the dataset is at the maximum size (i.e., the number of stored elements equals the currently configured windowSize), the first (oldest) element in the dataset is discarded to make room for the new value.
- Parameters:
v - the value to be added
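The eviction rule described for addValue can be illustrated with a small stand-alone sketch. The class RollingWindowSketch and its methods are hypothetical names written in plain Java for illustration; they mimic the documented behaviour and are not the library's implementation.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

/** Hypothetical stand-in for a fixed-size rolling window; not the library class. */
class RollingWindowSketch {
    private final int windowSize;                       // maximum number of stored values
    private final Deque<Double> values = new ArrayDeque<>();

    RollingWindowSketch(int windowSize) {
        if (windowSize < 1) {
            throw new IllegalArgumentException("window size must be at least 1");
        }
        this.windowSize = windowSize;
    }

    /** Adds v, discarding the oldest stored value first when the window is full. */
    void addValue(double v) {
        if (values.size() == windowSize) {
            values.removeFirst();
        }
        values.addLast(v);
    }

    /** Returns the stored values in insertion order as a fresh array. */
    double[] getValues() {
        return values.stream().mapToDouble(Double::doubleValue).toArray();
    }

    public static void main(String[] args) {
        RollingWindowSketch w = new RollingWindowSketch(3);
        for (double v : new double[] {1, 2, 3, 4, 5}) {
            w.addValue(v);
        }
        System.out.println(Arrays.toString(w.getValues())); // prints [3.0, 4.0, 5.0]
    }
}
```

The sketch leaves out the unlimited-window case (INFINITE_WINDOW), in which nothing is ever evicted.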
-
removeMostRecentValue
public void removeMostRecentValue() throws MathIllegalStateException
Removes the most recent value from the dataset.
- Throws:
MathIllegalStateException - if there are no elements stored
-
replaceMostRecentValue
public double replaceMostRecentValue(double v) throws MathIllegalStateException
Replaces the most recently stored value with the given value. There must be at least one element stored to call this method.
- Parameters:
v - the value to replace the most recent stored value
- Returns:
- replaced value
- Throws:
MathIllegalStateException - if there are no elements stored
-
getMean
public double getMean()
Returns the arithmetic mean of the available values.
- Specified by:
getMean in interface StatisticalSummary
- Returns:
- The mean or Double.NaN if no values have been added.
-
getGeometricMean
public double getGeometricMean()
Returns the geometric mean of the available values.
See GeometricMean for details on the computing algorithm.
- Returns:
- The geometric mean, or Double.NaN if no values have been added or if any negative values have been added.
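As a rough illustration of the return values listed above, the sketch below computes a geometric mean as the exponential of the mean of natural logarithms. This formulation is an assumption chosen because it reproduces the documented NaN cases; GeometricMean remains the reference for the actual algorithm. The class and method names are hypothetical.

```java
/** Hypothetical plain-Java sketch; not the library's GeometricMean. */
class GeometricMeanSketch {

    /** Returns exp(mean(log(x))), or NaN for empty input or any negative value. */
    static double geometricMean(double[] values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        double sumOfLogs = 0.0;
        for (double v : values) {
            sumOfLogs += Math.log(v);   // Math.log of a negative number is NaN
        }
        return Math.exp(sumOfLogs / values.length);
    }

    public static void main(String[] args) {
        System.out.println(geometricMean(new double[] {2, 8}));    // close to 4.0
        System.out.println(geometricMean(new double[] {2, -8}));   // NaN
    }
}
```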
-
getVariance
public double getVariance()
Returns the (sample) variance of the available values.
This method returns the bias-corrected sample variance (using n - 1 in the denominator). Use getPopulationVariance() for the non-bias-corrected population variance.
- Specified by:
getVariance in interface StatisticalSummary
- Returns:
- The variance, Double.NaN if no values have been added or 0.0 for a single value set.
-
getPopulationVariance
public double getPopulationVariance()
Returns the population variance of the available values.
- Returns:
- The population variance, Double.NaN if no values have been added, or 0.0 for a single value set.
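The difference between the two variances is only the denominator: n - 1 for the bias-corrected sample variance and n for the population variance. The following plain-Java sketch (hypothetical class and method names, using a simple two-pass computation that is not necessarily what the library does) shows both, including the documented results for empty and single-value datasets.

```java
/** Hypothetical plain-Java sketch of sample vs. population variance. */
class VarianceSketch {

    /** Sum of squared deviations from the arithmetic mean (two-pass). */
    static double sumOfSquaredDeviations(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        double mean = sum / x.length;
        double ss = 0.0;
        for (double v : x) {
            ss += (v - mean) * (v - mean);
        }
        return ss;
    }

    /** Bias-corrected variance: divides by n - 1. */
    static double sampleVariance(double[] x) {
        if (x.length == 0) {
            return Double.NaN;
        }
        if (x.length == 1) {
            return 0.0;
        }
        return sumOfSquaredDeviations(x) / (x.length - 1);
    }

    /** Non-bias-corrected variance: divides by n. */
    static double populationVariance(double[] x) {
        if (x.length == 0) {
            return Double.NaN;
        }
        return sumOfSquaredDeviations(x) / x.length;
    }

    public static void main(String[] args) {
        double[] data = {2, 4, 4, 4, 5, 5, 7, 9};          // mean 5, squared deviations sum to 32
        System.out.println(populationVariance(data));       // 32 / 8
        System.out.println(sampleVariance(data));           // 32 / 7
    }
}
```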
-
getStandardDeviation
public double getStandardDeviation()
Returns the standard deviation of the available values.
- Specified by:
getStandardDeviation in interface StatisticalSummary
- Returns:
- The standard deviation, Double.NaN if no values have been added or 0.0 for a single value set.
-
getQuadraticMean
public double getQuadraticMean()
Returns the quadratic mean, a.k.a. root-mean-square, of the available values.
- Returns:
- The quadratic mean or Double.NaN if no values have been added.
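The root-mean-square is the square root of the sum of squares divided by the number of values. A minimal plain-Java sketch of that definition follows; the class and method names are hypothetical and the code does not call the library.

```java
/** Hypothetical plain-Java sketch of the quadratic mean (root-mean-square). */
class QuadraticMeanSketch {

    /** Returns sqrt(sum(x_i^2) / n), or NaN when there are no values. */
    static double quadraticMean(double[] values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        double sumsq = 0.0;
        for (double v : values) {
            sumsq += v * v;
        }
        return Math.sqrt(sumsq / values.length);
    }

    public static void main(String[] args) {
        // (1 + 49) / 2 = 25, and sqrt(25) = 5
        System.out.println(quadraticMean(new double[] {1, 7})); // prints 5.0
    }
}
```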
-
getSkewness
public double getSkewness()
Returns the skewness of the available values. Skewness is a measure of the asymmetry of a given distribution.
- Returns:
- The skewness, Double.NaN if fewer than 3 values have been added.
-
getKurtosis
public double getKurtosis()
Returns the kurtosis of the available values. Kurtosis is a measure of the "peakedness" of a distribution.
- Returns:
- The kurtosis, Double.NaN if fewer than 4 values have been added.
-
getMax
public double getMax()
Returns the maximum of the available values.
- Specified by:
getMax in interface StatisticalSummary
- Returns:
- The max or Double.NaN if no values have been added.
-
getMin
public double getMin()
Returns the minimum of the available values.
- Specified by:
getMin in interface StatisticalSummary
- Returns:
- The min or Double.NaN if no values have been added.
-
getN
public long getN()
Returns the number of available values.
- Specified by:
getN in interface StatisticalSummary
- Returns:
- The number of available values
-
getSum
public double getSum()
Returns the sum of the values that have been added.
- Specified by:
getSum in interface StatisticalSummary
- Returns:
- The sum or Double.NaN if no values have been added
-
getSumsq
public double getSumsq()
Returns the sum of the squares of the available values.
- Returns:
- The sum of the squares or Double.NaN if no values have been added.
-
clear
public void clear()
Resets all statistics and storage.
-
getWindowSize
public int getWindowSize()
Returns the maximum number of values that can be stored in the dataset, or INFINITE_WINDOW (-1) if there is no limit.
- Returns:
- The current window size, or -1 if it is infinite.
-
setWindowSize
public void setWindowSize(int windowSize) throws MathIllegalArgumentException
WindowSize controls the number of values that contribute to the reported statistics. For example, if windowSize is set to 3 and the values {1,2,3,4,5} have been added in that order then the available values are {3,4,5} and all reported statistics will be based on these values. If windowSize is decreased as a result of this call and the current dataset holds more elements than the new window size, values from the front of the array are discarded to reduce the dataset to windowSize elements.
- Parameters:
windowSize - sets the size of the window.
- Throws:
MathIllegalArgumentException - if window size is less than 1 but not equal to INFINITE_WINDOW
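The effect of shrinking the window on already-stored data can be sketched on a plain array: only the newest windowSize elements survive. The helper below is a hypothetical illustration, not library code, and it deliberately omits the INFINITE_WINDOW case.

```java
import java.util.Arrays;

/** Hypothetical plain-Java sketch of trimming stored values to a smaller window. */
class WindowShrinkSketch {

    /** Returns a copy holding at most newWindowSize of the most recently added values. */
    static double[] shrink(double[] stored, int newWindowSize) {
        if (newWindowSize < 1) {
            throw new IllegalArgumentException("window size must be at least 1");
        }
        if (stored.length <= newWindowSize) {
            return stored.clone();                  // nothing to discard
        }
        // Drop elements from the front (the oldest values).
        return Arrays.copyOfRange(stored, stored.length - newWindowSize, stored.length);
    }

    public static void main(String[] args) {
        double[] kept = shrink(new double[] {1, 2, 3, 4, 5}, 3);
        System.out.println(Arrays.toString(kept)); // prints [3.0, 4.0, 5.0]
    }
}
```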
-
getValues
public double[] getValues()
Returns the current set of values in an array of double primitives. The order of addition is preserved. The returned array is a fresh copy of the underlying data, i.e., it is not a reference to the stored data.
- Returns:
- the current set of numbers in the order in which they were added to this set
-
getSortedValues
public double[] getSortedValues()
Returns the current set of values in an array of double primitives, sorted in ascending order. The returned array is a fresh copy of the underlying data, i.e., it is not a reference to the stored data.
- Returns:
- the current set of numbers sorted in ascending order
-
getElement
public double getElement(int index)
Returns the element at the specified index.
- Parameters:
index - the index of the element
- Returns:
- the element at the specified index
-
getPercentile
public double getPercentile(double p) throws MathIllegalStateException, MathIllegalArgumentException
Returns an estimate for the pth percentile of the stored values.
The implementation provided here follows the first estimation procedure presented here.
Preconditions:
- 0 < p ≤ 100 (otherwise a MathIllegalArgumentException is thrown)
- at least one value must be stored (returns Double.NaN otherwise)
- Parameters:
p - the requested percentile (scaled from 0 - 100)
- Returns:
- An estimate for the pth percentile of the stored data
- Throws:
MathIllegalStateException - if percentile implementation has been overridden and the supplied implementation does not support setQuantile
MathIllegalArgumentException - if p is not a valid quantile
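The reference for the estimation procedure is not reproduced on this page. The sketch below assumes it is the common rule that places the pth percentile at position p(n + 1)/100 in the sorted data and interpolates linearly between the two neighbouring values. That assumption, the class name, and the use of IllegalArgumentException in place of the library's exception type are all illustrative choices, so results may differ from the configured percentile implementation.

```java
import java.util.Arrays;

/** Hypothetical plain-Java sketch of a (n + 1)p/100 percentile estimate. */
class PercentileSketch {

    static double percentile(double[] values, double p) {
        if (!(p > 0 && p <= 100)) {                 // also rejects NaN
            throw new IllegalArgumentException("p must satisfy 0 < p <= 100");
        }
        int n = values.length;
        if (n == 0) {
            return Double.NaN;
        }
        if (n == 1) {
            return values[0];
        }
        double[] sorted = values.clone();           // leave the caller's array untouched
        Arrays.sort(sorted);
        double pos = p * (n + 1) / 100.0;           // 1-based position in the sorted data
        if (pos < 1) {
            return sorted[0];
        }
        if (pos >= n) {
            return sorted[n - 1];
        }
        int k = (int) Math.floor(pos);
        double d = pos - k;                         // fractional part used for interpolation
        double lower = sorted[k - 1];
        double upper = sorted[k];
        return lower + d * (upper - lower);
    }

    public static void main(String[] args) {
        double[] data = {5, 1, 4, 2, 3};
        System.out.println(percentile(data, 50));   // position 3.0, the median 3.0
        System.out.println(percentile(data, 25));   // position 1.5, halfway between 1 and 2
    }
}
```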
-
toString
public String toString()
Generates a text report displaying univariate statistics from values that have been added. Each statistic is displayed on a separate line.
-
apply
public double apply(UnivariateStatistic stat)
Apply the given statistic to the data associated with this set of statistics.
- Parameters:
stat - the statistic to apply
- Returns:
- the computed value of the statistic.
-
getMeanImpl
public UnivariateStatistic getMeanImpl()
Returns the currently configured mean implementation.
- Returns:
- the UnivariateStatistic implementing the mean
- Since:
- 1.2
-
setMeanImpl
public void setMeanImpl(UnivariateStatistic meanImpl)
Sets the implementation for the mean.
- Parameters:
meanImpl - the UnivariateStatistic instance to use for computing the mean
- Since:
- 1.2
-
getGeometricMeanImpl
public UnivariateStatistic getGeometricMeanImpl()
Returns the currently configured geometric mean implementation.
- Returns:
- the UnivariateStatistic implementing the geometric mean
- Since:
- 1.2
-
setGeometricMeanImpl
public void setGeometricMeanImpl(UnivariateStatistic geometricMeanImpl)
Sets the implementation for the geometric mean.
- Parameters:
geometricMeanImpl - the UnivariateStatistic instance to use for computing the geometric mean
- Since:
- 1.2
-
getKurtosisImpl
public UnivariateStatistic getKurtosisImpl()
Returns the currently configured kurtosis implementation.
- Returns:
- the UnivariateStatistic implementing the kurtosis
- Since:
- 1.2
-
setKurtosisImpl
public void setKurtosisImpl(UnivariateStatistic kurtosisImpl)
Sets the implementation for the kurtosis.
- Parameters:
kurtosisImpl - the UnivariateStatistic instance to use for computing the kurtosis
- Since:
- 1.2
-
getMaxImpl
public UnivariateStatistic getMaxImpl()
Returns the currently configured maximum implementation.
- Returns:
- the UnivariateStatistic implementing the maximum
- Since:
- 1.2
-
setMaxImpl
public void setMaxImpl(UnivariateStatistic maxImpl)
Sets the implementation for the maximum.
- Parameters:
maxImpl - the UnivariateStatistic instance to use for computing the maximum
- Since:
- 1.2
-
getMinImpl
public UnivariateStatistic getMinImpl()
Returns the currently configured minimum implementation.
- Returns:
- the UnivariateStatistic implementing the minimum
- Since:
- 1.2
-
setMinImpl
public void setMinImpl(UnivariateStatistic minImpl)
Sets the implementation for the minimum.
- Parameters:
minImpl - the UnivariateStatistic instance to use for computing the minimum
- Since:
- 1.2
-
getPercentileImpl
public UnivariateStatistic getPercentileImpl()
Returns the currently configured percentile implementation.
- Returns:
- the UnivariateStatistic implementing the percentile
- Since:
- 1.2
-
setPercentileImpl
public void setPercentileImpl(UnivariateStatistic percentileImpl) throws MathIllegalArgumentException
Sets the implementation to be used by getPercentile(double). The supplied UnivariateStatistic must provide a setQuantile(double) method; otherwise a MathIllegalArgumentException is thrown.
- Parameters:
percentileImpl - the percentileImpl to set
- Throws:
MathIllegalArgumentException - if the supplied implementation does not provide a setQuantile method
- Since:
- 1.2
-
getSkewnessImpl
public UnivariateStatistic getSkewnessImpl()
Returns the currently configured skewness implementation.
- Returns:
- the UnivariateStatistic implementing the skewness
- Since:
- 1.2
-
setSkewnessImpl
public void setSkewnessImpl(UnivariateStatistic skewnessImpl)
Sets the implementation for the skewness.
- Parameters:
skewnessImpl - the UnivariateStatistic instance to use for computing the skewness
- Since:
- 1.2
-
getVarianceImpl
public UnivariateStatistic getVarianceImpl()
Returns the currently configured variance implementation.
- Returns:
- the UnivariateStatistic implementing the variance
- Since:
- 1.2
-
setVarianceImpl
public void setVarianceImpl(UnivariateStatistic varianceImpl)
Sets the implementation for the variance.
- Parameters:
varianceImpl - the UnivariateStatistic instance to use for computing the variance
- Since:
- 1.2
-
getSumsqImpl
public UnivariateStatistic getSumsqImpl()
Returns the currently configured sum of squares implementation.
- Returns:
- the UnivariateStatistic implementing the sum of squares
- Since:
- 1.2
-
setSumsqImpl
public void setSumsqImpl(UnivariateStatistic sumsqImpl)
Sets the implementation for the sum of squares.
- Parameters:
sumsqImpl - the UnivariateStatistic instance to use for computing the sum of squares
- Since:
- 1.2
-
getSumImpl
public UnivariateStatistic getSumImpl()
Returns the currently configured sum implementation.
- Returns:
- the UnivariateStatistic implementing the sum
- Since:
- 1.2
-
setSumImpl
public void setSumImpl(UnivariateStatistic sumImpl)
Sets the implementation for the sum.
- Parameters:
sumImpl - the UnivariateStatistic instance to use for computing the sum
- Since:
- 1.2
-
copy
public DescriptiveStatistics copy()
Returns a copy of this DescriptiveStatistics instance with the same internal state.
- Returns:
- a copy of this
-
copy
public static void copy(DescriptiveStatistics source, DescriptiveStatistics dest) throws NullArgumentException
Copies source to dest.
Neither source nor dest can be null.
- Parameters:
source - DescriptiveStatistics to copy
dest - DescriptiveStatistics to copy to
- Throws:
NullArgumentException - if either source or dest is null
-
-