java.lang.Object
- org.apache.commons.math4.legacy.ml.clustering.Clusterer<T>
- - org.apache.commons.math4.legacy.ml.clustering.KMeansPlusPlusClusterer<T>

Type Parameters:

T - type of the points to cluster

Direct Known Subclasses:

ElkanKMeansPlusPlusClusterer, MiniBatchKMeansClusterer
```
public class KMeansPlusPlusClusterer<T extends Clusterable>
extends Clusterer<T>
```
Clustering algorithm based on the k-means++ algorithm of David Arthur and Sergei Vassilvitskii.

Since:

3.2

See Also:

K-means++ (wikipedia)
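
The seeding step that distinguishes k-means++ from plain k-means can be sketched as follows. This is a minimal JDK-only illustration, not this class's implementation: the class name `KMeansPlusPlusSeeding`, the method `chooseInitialCenters`, the use of `double[]` points, and `java.util.Random` in place of `UniformRandomProvider` are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical helper class, for illustration only.
class KMeansPlusPlusSeeding {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Chooses k initial centers among the points by D^2-weighted sampling. */
    static List<double[]> chooseInitialCenters(List<double[]> points, int k, Random rng) {
        List<double[]> centers = new ArrayList<>();
        // The first center is drawn uniformly at random.
        centers.add(points.get(rng.nextInt(points.size())));

        // minDistSq[i]: squared distance from point i to its nearest chosen center.
        double[] minDistSq = new double[points.size()];
        for (int i = 0; i < minDistSq.length; i++) {
            minDistSq[i] = squaredDistance(points.get(i), centers.get(0));
        }

        while (centers.size() < k) {
            double total = 0;
            for (double d : minDistSq) {
                total += d;
            }
            // Uniform pick, kept only if every point coincides with a center.
            int chosen = rng.nextInt(points.size());
            if (total > 0) {
                // Draw a point with probability proportional to minDistSq[i].
                double r = rng.nextDouble() * total;
                double cumulative = 0;
                for (int i = 0; i < minDistSq.length; i++) {
                    cumulative += minDistSq[i];
                    if (minDistSq[i] > 0 && cumulative >= r) {
                        chosen = i;
                        break;
                    }
                }
            }
            double[] next = points.get(chosen);
            centers.add(next);
            // Refresh the nearest-center distances with the new center.
            for (int i = 0; i < minDistSq.length; i++) {
                minDistSq[i] = Math.min(minDistSq[i], squaredDistance(points.get(i), next));
            }
        }
        return centers;
    }

    public static void main(String[] args) {
        List<double[]> points = Arrays.asList(
            new double[] {0, 0}, new double[] {0, 1},
            new double[] {10, 10}, new double[] {10, 11},
            new double[] {20, 0}, new double[] {21, 0});
        for (double[] c : chooseInitialCenters(points, 3, new Random(42L))) {
            System.out.println(Arrays.toString(c));
        }
    }
}
```

Because a point already chosen as a center has distance zero, it cannot be drawn again while any other point remains at a positive distance, so the sketch spreads the initial centers across the data.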

Nested Class Summary

Nested Classes
Modifier and Type	Class	Description
`static class`	`KMeansPlusPlusClusterer.EmptyClusterStrategy`	Strategies to use for replacing an empty cluster.

Constructor Summary

Constructors
Constructor	Description
`KMeansPlusPlusClusterer(int k)`	Build a clusterer.
`KMeansPlusPlusClusterer(int k, int maxIterations)`	Build a clusterer.
`KMeansPlusPlusClusterer(int k, int maxIterations, DistanceMeasure measure)`	Build a clusterer.
`KMeansPlusPlusClusterer(int k, int maxIterations, DistanceMeasure measure, org.apache.commons.rng.UniformRandomProvider random)`	Build a clusterer.
`KMeansPlusPlusClusterer(int k, int maxIterations, DistanceMeasure measure, org.apache.commons.rng.UniformRandomProvider random, KMeansPlusPlusClusterer.EmptyClusterStrategy emptyStrategy)`	Build a clusterer.

Method Summary

Modifier and Type	Method	Description
`List<CentroidCluster<T>>`	`cluster(Collection<T> points)`	Runs the K-means++ clustering algorithm.
`int`	`getMaxIterations()`	Returns the maximum number of iterations this instance will use.
`int`	`getNumberOfClusters()`	Returns the number of clusters this instance will use.

Methods inherited from class org.apache.commons.math4.legacy.ml.clustering.Clusterer
distance, getDistanceMeasure

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - KMeansPlusPlusClusterer
```
public KMeansPlusPlusClusterer(int k)
```
    Build a clusterer.
    The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.
    Euclidean distance is used as the default distance measure.
    
    Parameters:
    
    k - the number of clusters to split the data into
  - KMeansPlusPlusClusterer
```
public KMeansPlusPlusClusterer(int k,
                               int maxIterations)
```
    Build a clusterer.
    The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.
    Euclidean distance is used as the default distance measure.
    
    Parameters:
    
    k - the number of clusters to split the data into
    
    maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
  - KMeansPlusPlusClusterer
```
public KMeansPlusPlusClusterer(int k,
                               int maxIterations,
                               DistanceMeasure measure)
```
    Build a clusterer.
    The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.
    
    Parameters:
    
    k - the number of clusters to split the data into
    
    maxIterations - the maximum number of iterations to run the algorithm for.
    
    measure - the distance measure to use
    
    Throws:
    
    NotStrictlyPositiveException - if k <= 0.
  - KMeansPlusPlusClusterer
```
public KMeansPlusPlusClusterer(int k,
                               int maxIterations,
                               DistanceMeasure measure,
                               org.apache.commons.rng.UniformRandomProvider random)
```
    Build a clusterer.
    The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.
    
    Parameters:
    
    k - the number of clusters to split the data into
    
    maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
    
    measure - the distance measure to use
    
    random - random generator to use for choosing initial centers
  - KMeansPlusPlusClusterer
```
public KMeansPlusPlusClusterer(int k,
                               int maxIterations,
                               DistanceMeasure measure,
                               org.apache.commons.rng.UniformRandomProvider random,
                               KMeansPlusPlusClusterer.EmptyClusterStrategy emptyStrategy)
```
    Build a clusterer.
    
    Parameters:
    
    k - the number of clusters to split the data into
    
    maxIterations - the maximum number of iterations to run the algorithm for.
    
    measure - the distance measure to use
    
    random - random generator to use for choosing initial centers
    
    emptyStrategy - strategy to use for handling empty clusters that may appear during algorithm iterations
    
    Throws:
    
    NotStrictlyPositiveException - if k <= 0 or maxIterations <= 0.
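
    The "largest distance variance" strategy named in the constructors above can be sketched as follows. This is a JDK-only illustration under stated assumptions, not the library's code: the class `LargestVarianceSplit` and its methods are hypothetical, the variance is the population variance of point-to-centroid distances, and drawing the replacement point uniformly at random from the selected cluster is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical helper class, for illustration only.
class LargestVarianceSplit {

    /** Euclidean distance between two points of equal dimension. */
    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Component-wise mean of a non-empty list of points. */
    static double[] centroid(List<double[]> points) {
        double[] c = new double[points.get(0).length];
        for (double[] p : points) {
            for (int i = 0; i < c.length; i++) {
                c[i] += p[i] / points.size();
            }
        }
        return c;
    }

    /** Population variance of the distances from each point to the centroid. */
    static double distanceVariance(List<double[]> cluster) {
        double[] center = centroid(cluster);
        double[] dist = new double[cluster.size()];
        double mean = 0;
        for (int i = 0; i < dist.length; i++) {
            dist[i] = distance(cluster.get(i), center);
            mean += dist[i] / dist.length;
        }
        double variance = 0;
        for (double d : dist) {
            variance += (d - mean) * (d - mean) / dist.length;
        }
        return variance;
    }

    /** Removes and returns a point from the non-empty cluster with the largest variance. */
    static double[] takePointFromLargestVarianceCluster(List<List<double[]>> clusters, Random rng) {
        List<double[]> selected = null;
        double maxVariance = Double.NEGATIVE_INFINITY;
        for (List<double[]> cluster : clusters) {
            if (cluster.isEmpty()) {
                continue;
            }
            double v = distanceVariance(cluster);
            if (v > maxVariance) {
                maxVariance = v;
                selected = cluster;
            }
        }
        if (selected == null) {
            throw new IllegalStateException("all clusters are empty");
        }
        // Assumption of this sketch: the new center is a uniformly random member.
        return selected.remove(rng.nextInt(selected.size()));
    }

    public static void main(String[] args) {
        List<double[]> tight = new ArrayList<>(Arrays.asList(
            new double[] {0, 0}, new double[] {0, 1}));
        List<double[]> spread = new ArrayList<>(Arrays.asList(
            new double[] {10, 10}, new double[] {10, 11}, new double[] {20, 20}));
        List<double[]> empty = new ArrayList<>();
        double[] seed = takePointFromLargestVarianceCluster(
            Arrays.asList(tight, spread, empty), new Random(1L));
        empty.add(seed); // the empty cluster is re-seeded with the extracted point
        System.out.println(Arrays.toString(seed) + " moved out of the spread cluster");
    }
}
```

    In the example the two points of `tight` are equidistant from their centroid, so its distance variance is zero and the point is always taken from `spread`.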
- Method Detail
  - getNumberOfClusters
```
public int getNumberOfClusters()
```
    Returns the number of clusters this instance will use.
    
    Returns:
    
    the number of clusters
  - getMaxIterations
```
public int getMaxIterations()
```
    Returns the maximum number of iterations this instance will use.
    
    Returns:
    
    the maximum number of iterations, or -1 if no maximum is set
  - cluster
```
public List<CentroidCluster<T>> cluster(Collection<T> points)
```
    Runs the K-means++ clustering algorithm.
    
    Specified by:
    
    cluster in class Clusterer<T extends Clusterable>
    
    Parameters:
    
    points - the points to cluster
    
    Returns:
    
    a list of clusters containing the points
    
    Throws:
    
    MathIllegalArgumentException - if the data points are null or the number of clusters is larger than the number of data points
    
    ConvergenceException - if an empty cluster is encountered and the empty cluster strategy is set to KMeansPlusPlusClusterer.EmptyClusterStrategy.ERROR
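
    The iteration that follows the initial choice of centers can be sketched as below: assign each point to its nearest center, recompute the centers, and stop when no assignment changes or the iteration limit is reached. This is a JDK-only illustration, not this method's implementation. The class `LloydIterationSketch`, the `double[]` point type, the caller-supplied initial centers, and the use of `IllegalArgumentException` as a stand-in for `MathIllegalArgumentException` are assumptions; a negative `maxIterations` is treated as "no maximum", following the parameter description above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only.
class LloydIterationSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Index of the center nearest to p. */
    static int nearest(double[] p, double[][] centers) {
        int best = 0;
        for (int c = 1; c < centers.length; c++) {
            if (squaredDistance(p, centers[c]) < squaredDistance(p, centers[best])) {
                best = c;
            }
        }
        return best;
    }

    /**
     * Returns the cluster index of each point (-1 if never assigned).
     * The centers array is updated in place.
     */
    static int[] cluster(List<double[]> points, double[][] centers, int maxIterations) {
        if (points == null || centers.length == 0 || centers.length > points.size()) {
            throw new IllegalArgumentException("need 0 < k <= number of points");
        }
        // A negative limit means "no maximum" in this sketch.
        int max = maxIterations < 0 ? Integer.MAX_VALUE : maxIterations;
        int dim = centers[0].length;
        int[] assignment = new int[points.size()];
        Arrays.fill(assignment, -1);

        for (int iter = 0; iter < max; iter++) {
            // Assignment step: attach every point to its nearest center.
            boolean changed = false;
            for (int i = 0; i < assignment.length; i++) {
                int c = nearest(points.get(i), centers);
                if (c != assignment[i]) {
                    assignment[i] = c;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged
            }
            // Update step: move each center to the mean of its points.
            double[][] sums = new double[centers.length][dim];
            int[] counts = new int[centers.length];
            for (int i = 0; i < assignment.length; i++) {
                counts[assignment[i]]++;
                for (int d = 0; d < dim; d++) {
                    sums[assignment[i]][d] += points.get(i)[d];
                }
            }
            for (int c = 0; c < centers.length; c++) {
                // An empty cluster keeps its previous center in this sketch.
                for (int d = 0; d < dim && counts[c] > 0; d++) {
                    centers[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        List<double[]> points = Arrays.asList(
            new double[] {0, 0}, new double[] {0, 1},
            new double[] {10, 10}, new double[] {10, 11});
        double[][] centers = {{0, 0}, {10, 10}};
        System.out.println(Arrays.toString(cluster(points, centers, 100))); // prints [0, 0, 1, 1]
    }
}
```

    Unlike this sketch, which leaves an empty cluster's center where it was, the class documented here applies the configured `EmptyClusterStrategy` when a cluster loses all its points.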
