org.apache.commons.math3.ml.clustering

## Class FuzzyKMeansClusterer<T extends Clusterable>

• Type Parameters:
T - type of the points to cluster

public class FuzzyKMeansClusterer<T extends Clusterable>
extends Clusterer<T>
Fuzzy K-Means clustering algorithm.

The Fuzzy K-Means algorithm is a variation of the classical K-Means algorithm, with the major difference that a single data point is not uniquely assigned to a single cluster. Instead, each point i has a set of weights $u_{ij}$ that indicate its degree of membership in cluster j.

The algorithm then tries to minimize the objective function:

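$$J = \sum_{i=1}^{N} \sum_{k=1}^{C} u_{ik}^{m} \, d_{ik}^{2}$$

where N is the number of data points, C is the number of clusters, m is the fuzziness factor, and $d_{ik}$ is the distance between data point i and cluster center k.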
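As a rough illustration of the objective function above, the following sketch evaluates J for a given set of membership weights and point-to-center distances. The class and method names (`FuzzyObjectiveSketch`, `objective`) are hypothetical and not part of Commons Math, and the exponent m is assumed to be the fuzziness factor.

```java
public class FuzzyObjectiveSketch {

    /**
     * Evaluates J as the sum over all points i and clusters k of u[i][k]^m * d[i][k]^2.
     *
     * @param u membership weights, u[i][k] for data point i and cluster k
     * @param d distances, d[i][k] between data point i and cluster center k
     * @param m exponent applied to the weights (assumed to be the fuzziness factor)
     * @return the value of the objective function
     */
    public static double objective(double[][] u, double[][] d, double m) {
        double j = 0.0;
        for (int i = 0; i < u.length; i++) {
            for (int k = 0; k < u[i].length; k++) {
                j += Math.pow(u[i][k], m) * d[i][k] * d[i][k];
            }
        }
        return j;
    }

    public static void main(String[] args) {
        // Two points, two clusters; the values are made up for illustration.
        double[][] u = {{0.9, 0.1}, {0.2, 0.8}};
        double[][] d = {{1.0, 3.0}, {4.0, 0.5}};
        System.out.println(objective(u, d, 2.0));
    }
}
```

For these made-up values the four terms are 0.81, 0.09, 0.64 and 0.16, so the result is about 1.7.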

The algorithm takes four parameters, of which only k and fuzziness are required:

• k: the number of clusters
• fuzziness: determines the level of cluster fuzziness and must be > 1.0; larger values lead to fuzzier clusters
• maxIterations: the maximum number of iterations; a negative value means no maximum
• epsilon: the convergence criterion, default is 1e-3
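To make the effect of the fuzziness parameter concrete, the sketch below computes the membership weights of a single point from its distances to the cluster centers, using the textbook fuzzy c-means update $u_j = 1 / \sum_l (d_j / d_l)^{2/(m-1)}$. That update rule is an assumption here, since this page does not state which formula the class uses internally, and the names `FuzzinessSketch` and `memberships` are hypothetical.

```java
public class FuzzinessSketch {

    /**
     * Membership weights of one point, given its distances to each cluster center.
     * Uses the textbook fuzzy c-means update (an assumption, not taken from this page).
     *
     * @param distances strictly positive distances to each cluster center
     * @param fuzziness the fuzziness factor, must be > 1.0
     * @return one weight per cluster; the weights sum to 1
     */
    public static double[] memberships(double[] distances, double fuzziness) {
        if (fuzziness <= 1.0) {
            throw new IllegalArgumentException("fuzziness must be > 1.0");
        }
        double exponent = 2.0 / (fuzziness - 1.0);
        double[] u = new double[distances.length];
        for (int j = 0; j < distances.length; j++) {
            double sum = 0.0;
            for (int l = 0; l < distances.length; l++) {
                sum += Math.pow(distances[j] / distances[l], exponent);
            }
            u[j] = 1.0 / sum;
        }
        return u;
    }

    public static void main(String[] args) {
        // The point is twice as far from center 1 as from center 0.
        double[] distances = {1.0, 2.0};
        for (double m : new double[] {1.5, 2.0, 4.0}) {
            double[] u = memberships(distances, m);
            System.out.printf("fuzziness=%.1f -> u0=%.3f, u1=%.3f%n", m, u[0], u[1]);
        }
    }
}
```

With distances 1 and 2, a fuzziness of 2.0 gives weights of 0.8 and 0.2. Lower values push the weights toward a hard 1/0 assignment, while higher values pull both weights toward 0.5.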

The fuzzy variant of the K-Means algorithm is more robust with regard to the selection of the initial cluster centers.

Since:
3.3
Version:
$Id: FuzzyKMeansClusterer.html 885258 2013-11-03 02:46:49Z tn$
• ### Constructor Detail

• #### FuzzyKMeansClusterer

public FuzzyKMeansClusterer(int k,
double fuzziness)
throws NumberIsTooSmallException
Creates a new instance of a FuzzyKMeansClusterer.

The Euclidean distance will be used as the default distance measure.

Parameters:
k - the number of clusters to split the data into
fuzziness - the fuzziness factor, must be > 1.0
Throws:
NumberIsTooSmallException - if fuzziness <= 1.0
• #### FuzzyKMeansClusterer

public FuzzyKMeansClusterer(int k,
double fuzziness,
int maxIterations,
DistanceMeasure measure)
throws NumberIsTooSmallException
Creates a new instance of a FuzzyKMeansClusterer.
Parameters:
k - the number of clusters to split the data into
fuzziness - the fuzziness factor, must be > 1.0
maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
measure - the distance measure to use
Throws:
NumberIsTooSmallException - if fuzziness <= 1.0
• #### FuzzyKMeansClusterer

public FuzzyKMeansClusterer(int k,
double fuzziness,
int maxIterations,
DistanceMeasure measure,
double epsilon,
RandomGenerator random)
throws NumberIsTooSmallException
Creates a new instance of a FuzzyKMeansClusterer.
Parameters:
k - the number of clusters to split the data into
fuzziness - the fuzziness factor, must be > 1.0
maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
measure - the distance measure to use
epsilon - the convergence criterion (default is 1e-3)
random - random generator to use for choosing initial centers
Throws:
NumberIsTooSmallException - if fuzziness <= 1.0
• ### Method Detail

• #### getK

public int getK()
Returns the number of clusters this instance will use.
Returns:
the number of clusters
• #### getFuzziness

public double getFuzziness()
Returns the fuzziness factor used by this instance.
Returns:
the fuzziness factor
• #### getMaxIterations

public int getMaxIterations()
Returns the maximum number of iterations this instance will use.
Returns:
the maximum number of iterations, or -1 if no maximum is set
• #### getEpsilon

public double getEpsilon()
Returns the convergence criterion used by this instance.
Returns:
the convergence criterion
• #### getRandomGenerator

public RandomGenerator getRandomGenerator()
Returns the random generator this instance will use.
Returns:
the random generator
• #### getMembershipMatrix

public RealMatrix getMembershipMatrix()
Returns the n × k membership matrix, where n is the number of data points and k is the number of clusters.

The element $U_{i,j}$ represents the membership value of data point i in cluster j.

Returns:
the membership matrix
Throws:
MathIllegalStateException - if cluster(Collection) has not been called before
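The following is a minimal usage sketch, assuming Commons Math 3.3 or later is on the classpath. It builds a clusterer with the six-argument constructor, calls cluster(Collection), and then reads the membership matrix. The wrapper class `FuzzyKMeansUsageSketch`, the sample points, and the seed are illustrative choices and not part of the library.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.commons.math3.ml.clustering.CentroidCluster;
import org.apache.commons.math3.ml.clustering.DoublePoint;
import org.apache.commons.math3.ml.clustering.FuzzyKMeansClusterer;
import org.apache.commons.math3.ml.distance.EuclideanDistance;
import org.apache.commons.math3.random.JDKRandomGenerator;

public class FuzzyKMeansUsageSketch {

    /** Four illustrative 2-D points forming two well-separated groups. */
    public static List<DoublePoint> demoPoints() {
        return Arrays.asList(
            new DoublePoint(new double[] {0.0, 0.0}),
            new DoublePoint(new double[] {0.1, 0.0}),
            new DoublePoint(new double[] {10.0, 10.0}),
            new DoublePoint(new double[] {10.1, 10.0}));
    }

    /** Clusters the points into k = 2 fuzzy clusters and returns the n x k membership values. */
    public static double[][] memberships(List<DoublePoint> points) {
        JDKRandomGenerator random = new JDKRandomGenerator();
        random.setSeed(42L); // fixed seed (arbitrary choice) so runs are repeatable

        FuzzyKMeansClusterer<DoublePoint> clusterer = new FuzzyKMeansClusterer<>(
            2,                        // k
            2.0,                      // fuzziness, must be > 1.0
            100,                      // maxIterations
            new EuclideanDistance(),  // measure
            1e-3,                     // epsilon
            random);                  // random generator for the initial centers

        // cluster(...) has to run before getMembershipMatrix(); otherwise the
        // getter throws MathIllegalStateException.
        List<CentroidCluster<DoublePoint>> clusters = clusterer.cluster(points);
        for (CentroidCluster<DoublePoint> c : clusters) {
            System.out.println("center: " + Arrays.toString(c.getCenter().getPoint()));
        }
        return clusterer.getMembershipMatrix().getData();
    }

    public static void main(String[] args) {
        // One row per data point, one column per cluster.
        for (double[] row : memberships(demoPoints())) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Each printed row holds the membership values of one data point across the two clusters.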