org.apache.commons.math4.ml.clustering

Class DBSCANClusterer<T extends Clusterable>

• Type Parameters:
T - type of the points to cluster

public class DBSCANClusterer<T extends Clusterable>
extends Clusterer<T>
DBSCAN (density-based spatial clustering of applications with noise) algorithm.

The DBSCAN algorithm forms clusters based on the idea of density connectivity, i.e. a point p is density-connected to another point q if there exists a chain of points p_i, with i = 1 .. n, p_1 = p and p_n = q, such that each pair <p_i, p_(i+1)> is directly density-reachable. A point q is directly density-reachable from point p if it is in the ε-neighborhood of this point.

Any point that is not density-reachable from a formed cluster is treated as noise, and will thus not be present in the result.

The algorithm requires two parameters:

• eps: the distance that defines the ε-neighborhood of a point
• minPts: the minimum number of density-connected points required to form a cluster
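
The density-connectivity idea and the two parameters above can be illustrated with a minimal, self-contained sketch in plain Java. It is not the code of this class: the name DbscanSketch, its helper methods and the sample data are hypothetical, the distance is hard-coded as Euclidean, and a point counts as dense when it has at least minPts neighbors other than itself, which is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative DBSCAN sketch; hypothetical class, not the library implementation. */
public class DbscanSketch {
    private static final int UNVISITED = -2;
    private static final int NOISE = -1;

    /** Euclidean distance between two points of equal dimension. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Indices of all points within eps of point p, excluding p itself (sketch assumption). */
    static List<Integer> neighbors(List<double[]> pts, int p, double eps) {
        List<Integer> result = new ArrayList<>();
        for (int q = 0; q < pts.size(); q++) {
            if (q != p && distance(pts.get(p), pts.get(q)) <= eps) {
                result.add(q);
            }
        }
        return result;
    }

    /** Groups points into clusters; points left labelled NOISE are not returned. */
    static List<List<double[]>> cluster(List<double[]> pts, double eps, int minPts) {
        int[] label = new int[pts.size()];
        Arrays.fill(label, UNVISITED);
        List<List<double[]>> clusters = new ArrayList<>();
        for (int p = 0; p < pts.size(); p++) {
            if (label[p] != UNVISITED) {
                continue;
            }
            List<Integer> seeds = neighbors(pts, p, eps);
            if (seeds.size() < minPts) {
                label[p] = NOISE; // may still be claimed later as a border point
                continue;
            }
            int id = clusters.size();
            List<double[]> members = new ArrayList<>();
            clusters.add(members);
            label[p] = id;
            members.add(pts.get(p));
            // Follow the chain of directly density-reachable points.
            for (int i = 0; i < seeds.size(); i++) {
                int q = seeds.get(i);
                if (label[q] == NOISE) {
                    label[q] = id; // border point: joins the cluster, is not expanded
                    members.add(pts.get(q));
                } else if (label[q] == UNVISITED) {
                    label[q] = id;
                    members.add(pts.get(q));
                    List<Integer> more = neighbors(pts, q, eps);
                    if (more.size() >= minPts) {
                        seeds.addAll(more); // q is dense too, so keep expanding from it
                    }
                }
            }
        }
        return clusters;
    }

    /** Two tight 2x2 squares plus one isolated point (made-up data). */
    static List<double[]> samplePoints() {
        return Arrays.asList(
            new double[] {0, 0}, new double[] {0, 1}, new double[] {1, 0}, new double[] {1, 1},
            new double[] {10, 10}, new double[] {10, 11}, new double[] {11, 10}, new double[] {11, 11},
            new double[] {50, 50});
    }

    public static void main(String[] args) {
        List<List<double[]>> clusters = cluster(samplePoints(), 1.5, 3);
        for (List<double[]> c : clusters) {
            System.out.println("cluster of size " + c.size());
        }
    }
}
```

With eps = 1.5 and minPts = 3, each square forms one cluster of four points, and the isolated point (50, 50) has no neighbors, so it is treated as noise and left out of the result.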
Since:
3.2
See Also:
DBSCAN (wikipedia), A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise
• Constructor Summary

Constructors
Constructor and Description
DBSCANClusterer(double eps, int minPts)
Creates a new instance of a DBSCANClusterer.
DBSCANClusterer(double eps, int minPts, DistanceMeasure measure)
Creates a new instance of a DBSCANClusterer.
• Method Summary

All Methods
Modifier and Type Method and Description
List<Cluster<T>> cluster(Collection<T> points)
Performs DBSCAN cluster analysis.
double getEps()
Returns the maximum radius of the neighborhood to be considered.
int getMinPts()
Returns the minimum number of points needed for a cluster.
• Methods inherited from class org.apache.commons.math4.ml.clustering.Clusterer

distance, getDistanceMeasure
• Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
• Constructor Detail

• DBSCANClusterer

public DBSCANClusterer(double eps,
int minPts)
throws NotPositiveException
Creates a new instance of a DBSCANClusterer.

The Euclidean distance will be used as the default distance measure.

Parameters:
eps - maximum radius of the neighborhood to be considered
minPts - minimum number of points needed for a cluster
Throws:
NotPositiveException - if eps < 0.0 or minPts < 0
• DBSCANClusterer

public DBSCANClusterer(double eps,
int minPts,
DistanceMeasure measure)
throws NotPositiveException
Creates a new instance of a DBSCANClusterer.
Parameters:
eps - maximum radius of the neighborhood to be considered
minPts - minimum number of points needed for a cluster
measure - the distance measure to use
Throws:
NotPositiveException - if eps < 0.0 or minPts < 0
• Method Detail

• getEps

public double getEps()
Returns the maximum radius of the neighborhood to be considered.
Returns:
maximum radius of the neighborhood to be considered
• getMinPts

public int getMinPts()
Returns the minimum number of points needed for a cluster.
Returns:
minimum number of points needed for a cluster
• cluster

public List<Cluster<T>> cluster(Collection<T> points)
throws NullArgumentException
Performs DBSCAN cluster analysis.
Specified by:
cluster in class Clusterer<T extends Clusterable>
Parameters:
points - the points to cluster
Returns:
the list of clusters
Throws:
NullArgumentException - if the data points are null