Class MiniBatchKMeansClusterer<T extends Clusterable>
- java.lang.Object
-
- org.apache.commons.math4.legacy.ml.clustering.Clusterer<T>
-
- org.apache.commons.math4.legacy.ml.clustering.KMeansPlusPlusClusterer<T>
-
- org.apache.commons.math4.legacy.ml.clustering.MiniBatchKMeansClusterer<T>
-
- Type Parameters:
T
- Type of the points to cluster.
public class MiniBatchKMeansClusterer<T extends Clusterable> extends KMeansPlusPlusClusterer<T>
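Clustering algorithm based on k-means in which each training iteration processes a batch of batchSize points rather than the full data set.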
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.commons.math4.legacy.ml.clustering.KMeansPlusPlusClusterer
KMeansPlusPlusClusterer.EmptyClusterStrategy
-
-
Constructor Summary
Constructors
Constructor: MiniBatchKMeansClusterer(int k, int maxIterations, int batchSize, int initIterations, int initBatchSize, int maxNoImprovementTimes, DistanceMeasure measure, org.apache.commons.rng.UniformRandomProvider random, KMeansPlusPlusClusterer.EmptyClusterStrategy emptyStrategy)
Description: Build a clusterer.
-
Method Summary
Modifier and Type: List<CentroidCluster<T>>
Method: cluster(Collection<T> points)
Description: Runs the MiniBatch K-means clustering algorithm.
Methods inherited from class org.apache.commons.math4.legacy.ml.clustering.KMeansPlusPlusClusterer
getMaxIterations, getNumberOfClusters
-
Methods inherited from class org.apache.commons.math4.legacy.ml.clustering.Clusterer
distance, getDistanceMeasure
-
-
-
-
Constructor Detail
-
MiniBatchKMeansClusterer
public MiniBatchKMeansClusterer(int k, int maxIterations, int batchSize, int initIterations, int initBatchSize, int maxNoImprovementTimes, DistanceMeasure measure, org.apache.commons.rng.UniformRandomProvider random, KMeansPlusPlusClusterer.EmptyClusterStrategy emptyStrategy)
Build a clusterer.
- Parameters:
k
- Number of clusters to split the data into.
maxIterations
- Maximum number of iterations to run the algorithm over all the points. The actual number of iterations will be smaller than maxIterations * size / batchSize, where size is the number of points to cluster. The limit is disabled if negative.
batchSize
- Batch size for training iterations.
initIterations
- Number of iterations allowed for finding the best initial centers.
initBatchSize
- Batch size for initializing the cluster centers. A value of 3 * batchSize should be suitable in most cases.
maxNoImprovementTimes
- Maximum number of consecutive iterations during which no improvement occurs. A value of 10 is suitable in most cases.
measure
- Distance measure.
random
- Random generator.
emptyStrategy
- Strategy for handling empty clusters that may appear during algorithm iterations.
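The relationship between maxIterations, batchSize and the number of points can be made concrete with a little arithmetic. The following is a minimal sketch, not part of the library: the class MiniBatchBudget and its methods are hypothetical helpers, the use of integer division is an assumption, and returning Long.MAX_VALUE to represent a disabled limit is a convention chosen only for this example.

```java
/** Hypothetical helper for illustration only; not part of Commons Math. */
final class MiniBatchBudget {
    private MiniBatchBudget() {}

    /**
     * Upper bound on the number of batch iterations, following the documented
     * expression maxIterations * size / batchSize.
     * Assumption: integer division; a negative maxIterations means "no limit",
     * which this sketch represents as Long.MAX_VALUE.
     */
    static long maxBatchIterations(int maxIterations, int size, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        if (maxIterations < 0) {
            return Long.MAX_VALUE;
        }
        // Widen to long before multiplying to avoid int overflow.
        return (long) maxIterations * size / batchSize;
    }

    /** The documented rule of thumb for initBatchSize: 3 * batchSize. */
    static int suggestedInitBatchSize(int batchSize) {
        return 3 * batchSize;
    }

    public static void main(String[] args) {
        // 1000 points, batches of 100, at most 10 iterations over all the points.
        System.out.println(maxBatchIterations(10, 1000, 100)); // prints 100
        System.out.println(suggestedInitBatchSize(100));       // prints 300
    }
}
```

With 1000 points and batchSize = 100, a maxIterations of 10 therefore allows at most 100 batch iterations, and the suggested initBatchSize is 300.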
-
-
Method Detail
-
cluster
public List<CentroidCluster<T>> cluster(Collection<T> points)
Runs the MiniBatch K-means clustering algorithm.
- Overrides:
cluster in class KMeansPlusPlusClusterer<T extends Clusterable>
- Parameters:
points
- Points to cluster (cannot be null).
- Returns:
- the clusters.
- Throws:
MathIllegalArgumentException
- if the number of points is smaller than the number of clusters.
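To illustrate the general idea behind a mini-batch k-means run, here is a self-contained sketch in plain Java. It is not the library's implementation. The class MiniBatchSketch is hypothetical, points are plain double[] arrays, the distance is fixed to squared Euclidean, the first k points serve as initial centers (the library instead has a dedicated initialization phase governed by initIterations and initBatchSize), and IllegalArgumentException stands in for MathIllegalArgumentException. Stopping on lack of improvement and empty-cluster handling are omitted.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Illustrative mini-batch k-means; NOT the Commons Math implementation. */
final class MiniBatchSketch {
    private MiniBatchSketch() {}

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Index of the center closest to point p. */
    static int nearest(double[][] centers, double[] p) {
        int best = 0;
        for (int c = 1; c < centers.length; c++) {
            if (squaredDistance(centers[c], p) < squaredDistance(centers[best], p)) {
                best = c;
            }
        }
        return best;
    }

    /** Runs a fixed number of mini-batch updates and returns the k centers. */
    static double[][] fit(List<double[]> points, int k, int batchSize,
                          int iterations, Random rng) {
        // Same preconditions as documented for cluster(...).
        if (points == null) {
            throw new IllegalArgumentException("points must not be null");
        }
        if (points.size() < k) {
            throw new IllegalArgumentException("fewer points than clusters");
        }
        // Assumption: the first k points are used as initial centers.
        double[][] centers = new double[k][];
        for (int c = 0; c < k; c++) {
            centers[c] = points.get(c).clone();
        }
        int[] counts = new int[k]; // points absorbed by each center so far
        for (int it = 0; it < iterations; it++) {
            // 1. Draw a random batch and assign each point to its nearest center.
            double[][] batch = new double[batchSize][];
            int[] assigned = new int[batchSize];
            for (int b = 0; b < batchSize; b++) {
                batch[b] = points.get(rng.nextInt(points.size()));
                assigned[b] = nearest(centers, batch[b]);
            }
            // 2. Move each center toward its batch points with a per-center
            //    learning rate of 1 / count (a running mean).
            for (int b = 0; b < batchSize; b++) {
                int c = assigned[b];
                counts[c]++;
                double rate = 1.0 / counts[c];
                for (int i = 0; i < batch[b].length; i++) {
                    centers[c][i] += rate * (batch[b][i] - centers[c][i]);
                }
            }
        }
        return centers;
    }

    public static void main(String[] args) {
        List<double[]> pts = Arrays.asList(
            new double[] {0, 0}, new double[] {10, 10}, new double[] {0, 1},
            new double[] {10, 11}, new double[] {1, 0}, new double[] {11, 10});
        for (double[] center : fit(pts, 2, 4, 20, new Random(42))) {
            System.out.println(Arrays.toString(center));
        }
    }
}
```

Because each update is a running mean over the points assigned to a center, every center stays within the spread of the points it has absorbed. In the example, one center remains near the group around (0, 0) and the other near the group around (10, 10).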
-
-