org.apache.commons.math3.ml.clustering
Class DBSCANClusterer<T extends Clusterable>

java.lang.Object
  extended by org.apache.commons.math3.ml.clustering.Clusterer<T>
      extended by org.apache.commons.math3.ml.clustering.DBSCANClusterer<T>
Type Parameters:
T - type of the points to cluster

public class DBSCANClusterer<T extends Clusterable>
extends Clusterer<T>

DBSCAN (density-based spatial clustering of applications with noise) algorithm.

The DBSCAN algorithm forms clusters based on the idea of density connectivity: a point p is density-connected to another point q if there exists a chain of points p_1, ..., p_n, with p_1 = p and p_n = q, such that each pair <p_i, p_(i+1)> is directly density-reachable. A point q is directly density-reachable from a point p if q lies in the ε-neighborhood of p.

Any point that is not density-reachable from a formed cluster is treated as noise, and will thus not be present in the result.
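
The definitions above can be made concrete with a minimal, library-independent sketch. It is not the Commons Math implementation: the class DbscanSketch and its methods are hypothetical names, points are plain double[] arrays, and a point is treated as dense when at least minPts other points lie within eps of it (whether the point itself counts toward minPts is an assumption made here).

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names and conventions are assumptions, not library code.
class DbscanSketch {

    // Euclidean distance between two points of equal dimension.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Indices of all other points within eps of points[index] (its eps-neighborhood).
    static List<Integer> neighbors(double[][] points, int index, double eps) {
        List<Integer> result = new ArrayList<>();
        for (int j = 0; j < points.length; j++) {
            if (j != index && distance(points[index], points[j]) <= eps) {
                result.add(j);
            }
        }
        return result;
    }

    // Returns clusters as lists of point indices; noise points appear in no cluster.
    static List<List<Integer>> cluster(double[][] points, double eps, int minPts) {
        List<List<Integer>> clusters = new ArrayList<>();
        boolean[] visited = new boolean[points.length];
        boolean[] assigned = new boolean[points.length];
        for (int i = 0; i < points.length; i++) {
            if (visited[i]) {
                continue;
            }
            visited[i] = true;
            List<Integer> seeds = neighbors(points, i, eps);
            if (seeds.size() < minPts) {
                continue; // not dense: stays noise unless a later cluster reaches it
            }
            List<Integer> current = new ArrayList<>();
            current.add(i);
            assigned[i] = true;
            // Follow the chain of directly density-reachable points.
            for (int k = 0; k < seeds.size(); k++) {
                int q = seeds.get(k);
                if (!visited[q]) {
                    visited[q] = true;
                    List<Integer> qNeighbors = neighbors(points, q, eps);
                    if (qNeighbors.size() >= minPts) {
                        seeds.addAll(qNeighbors); // q is dense, so the chain continues
                    }
                }
                if (!assigned[q]) {
                    assigned[q] = true;
                    current.add(q);
                }
            }
            clusters.add(current);
        }
        return clusters;
    }

    public static void main(String[] args) {
        double[][] points = {
            {0, 0}, {0, 1}, {1, 0}, {1, 1},   // first dense group
            {10, 10}, {10, 11}, {11, 10},     // second dense group
            {50, 50}                          // isolated point
        };
        System.out.println(cluster(points, 1.5, 2));
    }
}
```

In this sketch the isolated point is never added to any list, which mirrors the statement that noise is not present in the result.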

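The algorithm requires two parameters:

eps - maximum radius of the neighborhood to be considered (the ε of the ε-neighborhood)
minPts - minimum number of points needed for a cluster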

Since:
3.2
Version:
$Id: DBSCANClusterer.html 857555 2013-04-06 23:30:25Z luc $
See Also:
DBSCAN (Wikipedia), A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise

Constructor Summary
DBSCANClusterer(double eps, int minPts)
          Creates a new instance of a DBSCANClusterer.
DBSCANClusterer(double eps, int minPts, DistanceMeasure measure)
          Creates a new instance of a DBSCANClusterer.
 
Method Summary
 List<Cluster<T>> cluster(Collection<T> points)
          Performs DBSCAN cluster analysis.
 double getEps()
          Returns the maximum radius of the neighborhood to be considered.
 int getMinPts()
          Returns the minimum number of points needed for a cluster.
 
Methods inherited from class org.apache.commons.math3.ml.clustering.Clusterer
distance, getDistanceMeasure
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DBSCANClusterer

public DBSCANClusterer(double eps,
                       int minPts)
                throws NotPositiveException
Creates a new instance of a DBSCANClusterer.

The Euclidean distance will be used as the default distance measure.

Parameters:
eps - maximum radius of the neighborhood to be considered
minPts - minimum number of points needed for a cluster
Throws:
NotPositiveException - if eps < 0.0 or minPts < 0
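
A short usage sketch for this constructor follows. It assumes Commons Math 3.2 or later is on the classpath and uses DoublePoint as the Clusterable type; the class DbscanConstructorExample and its helper methods are hypothetical names chosen for illustration.

```java
import org.apache.commons.math3.exception.NotPositiveException;
import org.apache.commons.math3.ml.clustering.DBSCANClusterer;
import org.apache.commons.math3.ml.clustering.DoublePoint;

// Illustrative sketch; class and helper names are not part of the library.
class DbscanConstructorExample {

    // Builds a clusterer that uses the default distance measure.
    static DBSCANClusterer<DoublePoint> create(double eps, int minPts) {
        return new DBSCANClusterer<DoublePoint>(eps, minPts);
    }

    // Returns true if the constructor rejects the given arguments.
    static boolean isRejected(double eps, int minPts) {
        try {
            create(eps, minPts);
            return false;
        } catch (NotPositiveException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        DBSCANClusterer<DoublePoint> clusterer = create(0.5, 3);
        System.out.println("eps = " + clusterer.getEps()
            + ", minPts = " + clusterer.getMinPts());
        System.out.println("negative eps rejected: " + isRejected(-1.0, 3));
    }
}
```

NotPositiveException is unchecked, so the try/catch is only needed when the arguments come from untrusted input.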

DBSCANClusterer

public DBSCANClusterer(double eps,
                       int minPts,
                       DistanceMeasure measure)
                throws NotPositiveException
Creates a new instance of a DBSCANClusterer.

Parameters:
eps - maximum radius of the neighborhood to be considered
minPts - minimum number of points needed for a cluster
measure - the distance measure to use
Throws:
NotPositiveException - if eps < 0.0 or minPts < 0
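
As a hedged illustration of supplying a distance measure, the sketch below passes a ManhattanDistance from org.apache.commons.math3.ml.distance. It assumes Commons Math 3.2 or later on the classpath; the class ManhattanDbscanExample, its run method, and the sample data are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.commons.math3.ml.clustering.Cluster;
import org.apache.commons.math3.ml.clustering.DBSCANClusterer;
import org.apache.commons.math3.ml.clustering.DoublePoint;
import org.apache.commons.math3.ml.distance.ManhattanDistance;

// Illustrative sketch; class name and data are not part of the library.
class ManhattanDbscanExample {

    static List<Cluster<DoublePoint>> run() {
        List<DoublePoint> points = Arrays.asList(
            new DoublePoint(new double[] {0, 0}),
            new DoublePoint(new double[] {0, 1}),
            new DoublePoint(new double[] {1, 0}),
            new DoublePoint(new double[] {1, 1}),
            new DoublePoint(new double[] {10, 10}));  // far from every other point

        // Neighborhoods are measured with the Manhattan (L1) distance.
        DBSCANClusterer<DoublePoint> clusterer =
            new DBSCANClusterer<DoublePoint>(1.5, 2, new ManhattanDistance());
        return clusterer.cluster(points);
    }

    public static void main(String[] args) {
        for (Cluster<DoublePoint> c : run()) {
            System.out.println(c.getPoints());
        }
    }
}
```

Any implementation of the DistanceMeasure interface can be passed the same way; the choice of measure changes which points fall inside an ε-neighborhood.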

Method Detail

getEps

public double getEps()
Returns the maximum radius of the neighborhood to be considered.

Returns:
maximum radius of the neighborhood

getMinPts

public int getMinPts()
Returns the minimum number of points needed for a cluster.

Returns:
minimum number of points needed for a cluster

cluster

public List<Cluster<T>> cluster(Collection<T> points)
                                             throws NullArgumentException
Performs DBSCAN cluster analysis.

Specified by:
cluster in class Clusterer<T extends Clusterable>
Parameters:
points - the points to cluster
Returns:
the list of clusters
Throws:
NullArgumentException - if the data points are null
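
A minimal end-to-end sketch of calling cluster is shown below. It assumes Commons Math 3.2 or later on the classpath; the class DbscanClusterExample, its methods, and the sample data are hypothetical and chosen so that one point has no neighbor within eps.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.commons.math3.exception.NullArgumentException;
import org.apache.commons.math3.ml.clustering.Cluster;
import org.apache.commons.math3.ml.clustering.DBSCANClusterer;
import org.apache.commons.math3.ml.clustering.DoublePoint;

// Illustrative sketch; class name and data are not part of the library.
class DbscanClusterExample {

    static List<Cluster<DoublePoint>> run() {
        List<DoublePoint> points = Arrays.asList(
            new DoublePoint(new double[] {0, 0}),
            new DoublePoint(new double[] {0, 1}),
            new DoublePoint(new double[] {1, 0}),
            new DoublePoint(new double[] {1, 1}),
            new DoublePoint(new double[] {10, 10}),
            new DoublePoint(new double[] {10, 11}),
            new DoublePoint(new double[] {11, 10}),
            new DoublePoint(new double[] {50, 50}));  // no neighbor within eps

        DBSCANClusterer<DoublePoint> clusterer =
            new DBSCANClusterer<DoublePoint>(1.5, 2);
        return clusterer.cluster(points);
    }

    // Returns true if a null collection is rejected as documented.
    static boolean nullIsRejected() {
        try {
            new DBSCANClusterer<DoublePoint>(1.5, 2).cluster(null);
            return false;
        } catch (NullArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Cluster<DoublePoint>> clusters = run();
        for (int i = 0; i < clusters.size(); i++) {
            System.out.println("cluster " + i + ": " + clusters.get(i).getPoints());
        }
    }
}
```

The returned list contains only clustered points, so the number of noise points can be derived by subtracting the total size of all clusters from the size of the input collection.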


Copyright © 2003-2013 The Apache Software Foundation. All Rights Reserved.