Class AggregateSummaryStatistics

  • All Implemented Interfaces:
    StatisticalSummary

    public class AggregateSummaryStatistics
    extends Object
    implements StatisticalSummary

    An aggregator for SummaryStatistics from several data sets or data set partitions. In its simplest usage mode, the client creates an instance via the zero-argument constructor, then uses createContributingStatistics() to obtain a SummaryStatistics for each individual data set / partition. The per-set statistics objects are used as normal, and at any time the aggregate statistics for all the contributors can be obtained from this object.
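
    The simplest usage mode described above can be sketched as follows. This is a minimal illustration, not a canonical recipe: the package name org.apache.commons.math3.stat.descriptive is an assumption (this page does not state it), and the class name ContributingStatisticsSketch and the method aggregateTwoPartitions are hypothetical names introduced for the example.

```java
// Assumption: the classes live in the Commons Math 3.x package below.
import org.apache.commons.math3.stat.descriptive.AggregateSummaryStatistics;
import org.apache.commons.math3.stat.descriptive.SummaryStatistics;

final class ContributingStatisticsSketch {

    /** Hypothetical helper: feeds two partitions through contributing statistics. */
    static AggregateSummaryStatistics aggregateTwoPartitions(double[] first, double[] second) {
        // Zero-argument constructor: default statistics implementations.
        AggregateSummaryStatistics aggregate = new AggregateSummaryStatistics();

        // One contributing SummaryStatistics per data set / partition.
        SummaryStatistics firstStats = aggregate.createContributingStatistics();
        SummaryStatistics secondStats = aggregate.createContributingStatistics();

        // The per-partition objects are used as normal.
        for (double value : first) {
            firstStats.addValue(value);
        }
        for (double value : second) {
            secondStats.addValue(value);
        }

        // The aggregate can be queried at any time through StatisticalSummary.
        return aggregate;
    }
}
```

    Each contributing object still reports statistics for its own partition only, while the returned aggregate reports values such as getN(), getSum(), getMin() and getMax() over everything added so far.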

    Clients with specialized requirements can use alternative constructors to control the statistics implementations and initial values used by the contributing and the internal aggregate SummaryStatistics objects.

    A static aggregate(Collection) method is also included that computes aggregate statistics directly from a Collection of SummaryStatistics instances.

    When createContributingStatistics() is used to create SummaryStatistics instances to be aggregated concurrently, the created instances' SummaryStatistics.addValue(double) methods must synchronize on the aggregating instance maintained by this class. In multithreaded environments, if the functionality provided by aggregate(Collection) is adequate, that method should be used to avoid unnecessary computation and synchronization delays.
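
    For the multithreaded case, one way to follow this advice is to give each worker its own independent SummaryStatistics and combine them once with aggregate(Collection) after all workers have finished. The sketch below assumes the Commons Math 3.x package name (not stated on this page); PerThreadAggregationSketch and summarize are hypothetical names, and plain threads are used only to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

// Assumption: the classes live in the Commons Math 3.x package below.
import org.apache.commons.math3.stat.descriptive.AggregateSummaryStatistics;
import org.apache.commons.math3.stat.descriptive.StatisticalSummaryValues;
import org.apache.commons.math3.stat.descriptive.SummaryStatistics;

final class PerThreadAggregationSketch {

    /** Hypothetical helper: summarizes each partition on its own thread, then aggregates once. */
    static StatisticalSummaryValues summarize(double[][] partitions) {
        List<SummaryStatistics> perThread = new ArrayList<>();
        List<Thread> workers = new ArrayList<>();

        for (double[] partition : partitions) {
            // Each instance is written by exactly one worker, so addValue needs no shared lock.
            SummaryStatistics stats = new SummaryStatistics();
            perThread.add(stats);
            Thread worker = new Thread(() -> {
                for (double value : partition) {
                    stats.addValue(value);
                }
            });
            workers.add(worker);
            worker.start();
        }

        try {
            // join() makes each worker's writes visible to this thread before aggregating.
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }

        // A single aggregation step after all partitions are complete.
        return AggregateSummaryStatistics.aggregate(perThread);
    }
}
```

    The trade-off is that aggregate values are available only after the final call, whereas contributing statistics obtained from createContributingStatistics() keep the aggregate current after every addValue(double) at the cost of synchronization.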

    Since:
    2.0
    • Constructor Detail

      • AggregateSummaryStatistics

        public AggregateSummaryStatistics()
        Initializes a new AggregateSummaryStatistics with default statistics implementations.
      • AggregateSummaryStatistics

        public AggregateSummaryStatistics(SummaryStatistics prototypeStatistics)
                                   throws NullArgumentException
        Initializes a new AggregateSummaryStatistics with the specified statistics object as a prototype for contributing statistics and for the internal aggregate statistics. This provides for customized statistics implementations to be used by contributing and aggregate statistics.
        Parameters:
        prototypeStatistics - a SummaryStatistics serving as a prototype both for the internal aggregate statistics and for contributing statistics obtained via the createContributingStatistics() method. Being a prototype means that other objects are initialized by copying this object's state. If null, a new, default statistics object is used. Any statistic values in the prototype are propagated to contributing statistics objects and (once) into these aggregate statistics.
        Throws:
        NullArgumentException - if prototypeStatistics is null
        See Also:
        createContributingStatistics()
      • AggregateSummaryStatistics

        public AggregateSummaryStatistics(SummaryStatistics prototypeStatistics,
                                          SummaryStatistics initialStatistics)
        Initializes a new AggregateSummaryStatistics with the specified statistics object as a prototype for contributing statistics and for the internal aggregate statistics. This provides for different statistics implementations to be used by contributing and aggregate statistics and for an initial state to be supplied for the aggregate statistics.
        Parameters:
        prototypeStatistics - a SummaryStatistics serving as a prototype both for the internal aggregate statistics and for contributing statistics obtained via the createContributingStatistics() method. Being a prototype means that other objects are initialized by copying this object's state. If null, a new, default statistics object is used. Any statistic values in the prototype are propagated to contributing statistics objects, but not into these aggregate statistics.
        initialStatistics - a SummaryStatistics to serve as the internal aggregate statistics object. If null, a new, default statistics object is used.
        See Also:
        createContributingStatistics()
    • Method Detail

      • getMax

        public double getMax()
        Returns the maximum of the available values. This version returns the maximum over all the aggregated data.
        Specified by:
        getMax in interface StatisticalSummary
        Returns:
        The max or Double.NaN if no values have been added.
        See Also:
        StatisticalSummary.getMax()
      • getMin

        public double getMin()
        Returns the minimum of the available values. This version returns the minimum over all the aggregated data.
        Specified by:
        getMin in interface StatisticalSummary
        Returns:
        The min or Double.NaN if no values have been added.
        See Also:
        StatisticalSummary.getMin()
      • getSum

        public double getSum()
        Returns the sum of the values that have been added to Univariate. This version returns the sum of all the aggregated data.
        Specified by:
        getSum in interface StatisticalSummary
        Returns:
        The sum or Double.NaN if no values have been added.
        See Also:
        StatisticalSummary.getSum()
      • getSecondMoment

        public double getSecondMoment()
        Returns a statistic related to the Second Central Moment. Specifically, what is returned is the sum of squared deviations from the sample mean among all of the aggregated data.
        Returns:
        second central moment statistic
        See Also:
        SummaryStatistics.getSecondMoment()
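        To make "sum of squared deviations from the sample mean" concrete, the quantity can be computed directly from raw values as in the sketch below. This is a plain two-pass illustration written for this page, not the library's implementation; SecondMomentSketch and its methods are hypothetical names, and returning Double.NaN for an empty sample is a choice made for the example.

```java
final class SecondMomentSketch {

    /** Sum of squared deviations from the sample mean (NaN for an empty sample, by choice here). */
    static double secondMoment(double[] values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        // First pass: sample mean.
        double sum = 0.0;
        for (double value : values) {
            sum += value;
        }
        double mean = sum / values.length;

        // Second pass: accumulate squared deviations from that mean.
        double m2 = 0.0;
        for (double value : values) {
            double deviation = value - mean;
            m2 += deviation * deviation;
        }
        return m2;
    }

    /** Bias-corrected sample variance: second moment divided by (n - 1). */
    static double sampleVariance(double[] values) {
        int n = values.length;
        if (n == 0) {
            return Double.NaN;
        }
        if (n == 1) {
            return 0.0;
        }
        return secondMoment(values) / (n - 1);
    }
}
```

        For the sample {2, 4, 4, 4, 5, 5, 7, 9} the mean is 5, the sum of squared deviations is 32, and the bias-corrected variance is therefore 32 / 7.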
      • createContributingStatistics

        public SummaryStatistics createContributingStatistics()
        Creates and returns a SummaryStatistics whose data will be aggregated with those of this AggregateSummaryStatistics.
        Returns:
        a SummaryStatistics whose data will be aggregated with those of this AggregateSummaryStatistics. The initial state is a copy of the configured prototype statistics.
      • aggregate

        public static StatisticalSummaryValues aggregate(Collection<? extends StatisticalSummary> statistics)
        Computes aggregate summary statistics. This method can be used to combine statistics computed over partitions or subsamples - i.e., the StatisticalSummaryValues returned should contain the same values that would have been obtained by computing a single StatisticalSummary over the combined dataset.

        Returns null if the collection is empty or null.

        Parameters:
        statistics - collection of StatisticalSummary instances to aggregate
        Returns:
        summary statistics for the combined dataset
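
        The property described above (combined partition statistics equal the statistics of the combined dataset) can be illustrated without the library. The sketch below merges two partition summaries using a standard pairwise update for the mean and the sum of squared deviations. It is an assumption-laden illustration, not a statement of how aggregate(Collection) is implemented: PartitionMergeSketch, Summary, of and merge are hypothetical names, and empty partitions are rejected for brevity.

```java
final class PartitionMergeSketch {

    /** Hypothetical value holder for one partition; not a library type. */
    static final class Summary {
        final long n;
        final double sum;
        final double mean;
        final double m2; // sum of squared deviations from the mean
        final double min;
        final double max;

        Summary(long n, double sum, double mean, double m2, double min, double max) {
            this.n = n;
            this.sum = sum;
            this.mean = mean;
            this.m2 = m2;
            this.min = min;
            this.max = max;
        }
    }

    /** Summarizes one non-empty partition directly from its values. */
    static Summary of(double... values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("empty partition");
        }
        double sum = 0.0;
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double value : values) {
            sum += value;
            min = Math.min(min, value);
            max = Math.max(max, value);
        }
        double mean = sum / values.length;
        double m2 = 0.0;
        for (double value : values) {
            m2 += (value - mean) * (value - mean);
        }
        return new Summary(values.length, sum, mean, m2, min, max);
    }

    /** Combines two partition summaries without revisiting the raw values. */
    static Summary merge(Summary a, Summary b) {
        long n = a.n + b.n;
        double sum = a.sum + b.sum;
        double delta = b.mean - a.mean;
        // Within-partition deviations plus a correction for the gap between the partition means.
        double m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n;
        return new Summary(n, sum, sum / n, m2, Math.min(a.min, b.min), Math.max(a.max, b.max));
    }
}
```

        For the partitions {1, 2, 3} and {4, 5, 6, 7}, merging the two summaries and summarizing the seven values directly both give n = 7, a mean of 4 and a sum of squared deviations of 28.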