Class KMeansPlusPlusClusterer<T extends Clusterable>

    • Constructor Detail

      • KMeansPlusPlusClusterer

        public KMeansPlusPlusClusterer(int k)
        Build a clusterer.

        The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.

        The Euclidean distance is used as the default distance measure.

        Parameters:
        k - the number of clusters to split the data into
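
        The default empty-cluster strategy named above (split the cluster with the largest distance variance) first has to pick that cluster. The following is a minimal, standalone sketch of such a selection, not the library's implementation: the class and method names are hypothetical, and the use of population variance of point-to-center Euclidean distances is an assumption.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; class and method names are hypothetical. */
final class LargestVarianceSketch {

    /** Euclidean distance between two points of equal dimension. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Population variance of the point-to-center distances (assumed definition). */
    static double distanceVariance(List<double[]> points, double[] center) {
        if (points.isEmpty()) {
            return 0.0;
        }
        double mean = 0.0;
        for (double[] p : points) {
            mean += distance(p, center);
        }
        mean /= points.size();
        double sumSq = 0.0;
        for (double[] p : points) {
            double dev = distance(p, center) - mean;
            sumSq += dev * dev;
        }
        return sumSq / points.size();
    }

    /** Index of the cluster whose distance variance is largest (first one wins ties). */
    static int largestVarianceIndex(List<List<double[]>> clusters, List<double[]> centers) {
        int best = -1;
        double bestVariance = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < clusters.size(); i++) {
            double v = distanceVariance(clusters.get(i), centers.get(i));
            if (v > bestVariance) {
                bestVariance = v;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<double[]> tight = Arrays.asList(new double[] {0, 0}, new double[] {1, 0});
        List<double[]> spread =
            Arrays.asList(new double[] {0, 0}, new double[] {2, 0}, new double[] {10, 0});
        List<double[]> centers = Arrays.asList(new double[] {0.5, 0}, new double[] {4, 0});
        // The spread-out cluster is the candidate for splitting.
        System.out.println(largestVarianceIndex(Arrays.asList(tight, spread), centers));
    }
}
```

        A cluster whose points all lie at the same distance from its center has zero distance variance, so it is the least likely candidate for splitting under this rule.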
      • KMeansPlusPlusClusterer

        public KMeansPlusPlusClusterer(int k,
                                       int maxIterations)
        Build a clusterer.

        The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.

        The Euclidean distance is used as the default distance measure.

        Parameters:
        k - the number of clusters to split the data into
        maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
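
        The maxIterations convention above (a negative value means no upper bound) can be pictured as a simple mapping onto a loop limit. This is a minimal sketch under that reading, not the library's code; IterationLimitSketch, effectiveLimit, and iterate are hypothetical names, and the convergence check is a placeholder.

```java
import java.util.function.IntPredicate;

/** Illustrative sketch only; class and method names are hypothetical. */
final class IterationLimitSketch {

    /** Maps the documented convention (negative = no maximum) to a loop bound. */
    static int effectiveLimit(int maxIterations) {
        return maxIterations < 0 ? Integer.MAX_VALUE : maxIterations;
    }

    /**
     * Runs a placeholder refinement loop and returns the number of iterations performed.
     * The loop stops when convergedAfter reports convergence or the limit is reached.
     */
    static int iterate(int maxIterations, IntPredicate convergedAfter) {
        int limit = effectiveLimit(maxIterations);
        int count = 0;
        while (count < limit) {
            count++;
            if (convergedAfter.test(count)) {
                break;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Pretend the algorithm converges after 7 iterations.
        IntPredicate converged = i -> i >= 7;
        System.out.println(iterate(-1, converged)); // unbounded: runs until convergence
        System.out.println(iterate(3, converged));  // bounded: stops at the limit
    }
}
```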
      • KMeansPlusPlusClusterer

        public KMeansPlusPlusClusterer(int k,
                                       int maxIterations,
                                       DistanceMeasure measure)
        Build a clusterer.

        The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.

        Parameters:
        k - the number of clusters to split the data into
        maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
        measure - the distance measure to use
        Throws:
        NotStrictlyPositiveException - if k <= 0.
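
        The choice of distance measure changes which center a point is assigned to. The sketch below illustrates this with a hypothetical Measure interface standing in for DistanceMeasure (its exact signature is not shown on this page, so the stand-in is an assumption), comparing Euclidean and Manhattan distance.

```java
/** Illustrative sketch only; Measure is a hypothetical stand-in for a distance measure. */
final class DistanceSketch {

    /** Maps two points of equal dimension to a non-negative distance. */
    interface Measure {
        double compute(double[] a, double[] b);
    }

    /** Square root of the sum of squared coordinate differences. */
    static final Measure EUCLIDEAN = (a, b) -> {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    };

    /** Sum of absolute coordinate differences. */
    static final Measure MANHATTAN = (a, b) -> {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    };

    /** Index of the center closest to the point under the given measure. */
    static int nearest(double[] point, double[][] centers, Measure measure) {
        int best = -1;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int i = 0; i < centers.length; i++) {
            double d = measure.compute(point, centers[i]);
            if (d < bestDistance) {
                bestDistance = d;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] point = {0, 0};
        double[][] centers = {{3, 3}, {0, 5}};
        // The same point can be assigned to different centers under different measures.
        System.out.println(nearest(point, centers, EUCLIDEAN));
        System.out.println(nearest(point, centers, MANHATTAN));
    }
}
```

        Here the center (3, 3) is closer in the Euclidean sense (about 4.24 versus 5), while (0, 5) is closer in the Manhattan sense (5 versus 6).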
      • KMeansPlusPlusClusterer

        public KMeansPlusPlusClusterer(int k,
                                       int maxIterations,
                                       DistanceMeasure measure,
                                       org.apache.commons.rng.UniformRandomProvider random)
        Build a clusterer.

        The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.

        Parameters:
        k - the number of clusters to split the data into
        maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
        measure - the distance measure to use
        random - random generator to use for choosing initial centers
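
        Supplying the random generator makes the choice of initial centers reproducible. The sketch below shows the general idea with the standard k-means++ seeding rule, which is assumed here from the class name rather than confirmed by this page. java.util.Random stands in for UniformRandomProvider, and SeedingSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Illustrative sketch only; not the library's implementation. */
final class SeedingSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double distSq(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /**
     * Picks k distinct point indices: the first uniformly at random, each later one
     * with probability proportional to its squared distance to the nearest chosen point.
     */
    static List<Integer> chooseInitialCenters(double[][] points, int k, Random rng) {
        int n = points.length;
        if (k <= 0 || k > n) {
            throw new IllegalArgumentException("k must be between 1 and " + n);
        }
        List<Integer> chosen = new ArrayList<>();
        boolean[] taken = new boolean[n];
        double[] minDistSq = new double[n];
        Arrays.fill(minDistSq, Double.POSITIVE_INFINITY);

        int next = rng.nextInt(n);
        while (true) {
            chosen.add(next);
            taken[next] = true;
            if (chosen.size() == k) {
                return chosen;
            }
            // Refresh each remaining point's distance to its nearest chosen center.
            double total = 0.0;
            for (int i = 0; i < n; i++) {
                if (!taken[i]) {
                    minDistSq[i] = Math.min(minDistSq[i], distSq(points[i], points[next]));
                    total += minDistSq[i];
                }
            }
            // Weighted draw: walk the cumulative sum until it reaches the random threshold.
            double threshold = rng.nextDouble() * total;
            double cumulative = 0.0;
            for (int i = 0; i < n; i++) {
                if (taken[i]) {
                    continue;
                }
                next = i; // falls back to the last free point if rounding leaves a gap
                cumulative += minDistSq[i];
                if (cumulative >= threshold) {
                    break;
                }
            }
        }
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {0, 1}, {10, 10}, {10, 11}, {-5, 7}, {-5, 8}};
        // The same seed yields the same initial centers on every run.
        System.out.println(chooseInitialCenters(points, 3, new Random(42L)));
        System.out.println(chooseInitialCenters(points, 3, new Random(42L)));
    }
}
```

        With a fixed seed, repeated runs start from the same centers, which is useful for tests and for comparing parameter settings on equal footing.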
      • KMeansPlusPlusClusterer

        public KMeansPlusPlusClusterer(int k,
                                       int maxIterations,
                                       DistanceMeasure measure,
                                       org.apache.commons.rng.UniformRandomProvider random,
                                       KMeansPlusPlusClusterer.EmptyClusterStrategy emptyStrategy)
        Build a clusterer.

        Parameters:
        k - the number of clusters to split the data into
        maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
        measure - the distance measure to use
        random - random generator to use for choosing initial centers
        emptyStrategy - strategy to use for handling empty clusters that may appear during algorithm iterations
        Throws:
        NotStrictlyPositiveException - if k <= 0.