net.sf.jdmf.algorithms.clustering
Class KMeansAlgorithm

java.lang.Object
  extended by net.sf.jdmf.algorithms.AbstractDataMiningAlgorithm
      extended by net.sf.jdmf.algorithms.clustering.KMeansAlgorithm
All Implemented Interfaces:
DataMiningAlgorithm

public class KMeansAlgorithm
extends AbstractDataMiningAlgorithm

Implements the k-means clustering algorithm as described in Data Mining: Practical Machine Learning Tools and Techniques (Second Edition) by Ian H. Witten and Eibe Frank (Morgan Kaufmann, 2005).

 1. Convert attributes to points (values are expected to be of type Double).
 2. Choose initial centroids based on the current choice strategy.
 3. Assign points to clusters. For each point, calculate its distance from 
 the current centroids and choose the cluster with the nearest centroid.
 4. Find new cluster centroids (this implementation uses the mean of all
 coordinates of points in each cluster).
 5. Calculate the sum of distances between old and new cluster centroids. If
 the sum is greater than or equal to the minimum distance sum specified in
 the algorithm's configuration, repeat steps 3-5.
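
The steps above can be sketched in plain Java as follows. This is a minimal illustration, not the library's implementation: the class KMeansSketch and its methods are hypothetical, Euclidean distance is assumed, the first k points stand in for the initial centroid choice strategy, and an empty cluster simply keeps its previous centroid. The threshold must be positive, because the loop repeats while the centroid movement is greater than or equal to it.

```java
class KMeansSketch {

    /** Euclidean distance between two points of equal dimension (assumed metric). */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Step 3: for each point, the index of the nearest centroid. */
    static int[] assign(double[][] points, double[][] centroids) {
        int[] assignment = new int[points.length];
        for (int p = 0; p < points.length; p++) {
            int best = 0;
            for (int c = 1; c < centroids.length; c++) {
                if (distance(points[p], centroids[c]) < distance(points[p], centroids[best])) {
                    best = c;
                }
            }
            assignment[p] = best;
        }
        return assignment;
    }

    /** Step 4: each new centroid is the coordinate-wise mean of its cluster's points. */
    static double[][] recompute(double[][] points, int[] assignment, double[][] old) {
        int k = old.length;
        int dim = old[0].length;
        double[][] sums = new double[k][dim];
        int[] counts = new int[k];
        for (int p = 0; p < points.length; p++) {
            counts[assignment[p]]++;
            for (int d = 0; d < dim; d++) {
                sums[assignment[p]][d] += points[p][d];
            }
        }
        double[][] result = new double[k][];
        for (int c = 0; c < k; c++) {
            if (counts[c] == 0) {
                // Assumption of this sketch: an empty cluster keeps its old centroid.
                result[c] = old[c].clone();
                continue;
            }
            result[c] = new double[dim];
            for (int d = 0; d < dim; d++) {
                result[c][d] = sums[c][d] / counts[c];
            }
        }
        return result;
    }

    /** Steps 2-5: returns the final centroids for k clusters. */
    static double[][] cluster(double[][] points, int k, double minDistanceSum) {
        if (k < 1 || k > points.length || minDistanceSum <= 0.0) {
            throw new IllegalArgumentException("need 1 <= k <= points.length and a positive threshold");
        }
        // Step 2 (assumed strategy): use the first k points as initial centroids.
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[c].clone();
        }
        double moved;
        do {
            int[] assignment = assign(points, centroids);
            double[][] updated = recompute(points, assignment, centroids);
            // Step 5: sum of distances between old and new centroids.
            moved = 0.0;
            for (int c = 0; c < k; c++) {
                moved += distance(centroids[c], updated[c]);
            }
            centroids = updated;
        } while (moved >= minDistanceSum);
        return centroids;
    }
}
```

For example, calling KMeansSketch.cluster with the points (1,1), (1,2), (10,10), (10,11), k = 2 and a threshold of 0.001 yields one centroid per visible group, near (1, 1.5) and (10, 10.5).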
 
The number of clusters to produce (k) is specified in the input data.

Author:
quorthon
See Also:
ClusteringExample, ClusteringInputData, Cluster, ClusteringDataMiningModel

Constructor Summary
KMeansAlgorithm()
           
 
Method Summary
 DataMiningModel analyze(InputData inputData)
          Analyzes input data (attributes and decisions) and produces output data (rules, decision trees, clusters, ...).
 InitialCentroidChoiceStrategy getInitialCentroidChoiceStrategy()
           
 java.lang.Double getMinimumDistanceSumBetweenOldAndNewClusterCentroids()
           
 void setInitialCentroidChoiceStrategy(InitialCentroidChoiceStrategy initialCentroidChoiceStrategy)
           
 void setMinimumDistanceSumBetweenOldAndNewClusterCentroids(java.lang.Double minimumDistanceSumBetweenOldAndNewClusterCentroids)
           
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

KMeansAlgorithm

public KMeansAlgorithm()
Method Detail

analyze

public DataMiningModel analyze(InputData inputData)
Description copied from interface: DataMiningAlgorithm
Analyzes input data (attributes and decisions) and produces output data (rules, decision trees, clusters, ...).

See Also:
DataMiningAlgorithm.analyze(net.sf.jdmf.data.input.InputData)

getInitialCentroidChoiceStrategy

public InitialCentroidChoiceStrategy getInitialCentroidChoiceStrategy()

setInitialCentroidChoiceStrategy

public void setInitialCentroidChoiceStrategy(InitialCentroidChoiceStrategy initialCentroidChoiceStrategy)

getMinimumDistanceSumBetweenOldAndNewClusterCentroids

public java.lang.Double getMinimumDistanceSumBetweenOldAndNewClusterCentroids()

setMinimumDistanceSumBetweenOldAndNewClusterCentroids

public void setMinimumDistanceSumBetweenOldAndNewClusterCentroids(java.lang.Double minimumDistanceSumBetweenOldAndNewClusterCentroids)