net.sf.jdmf.algorithms.clustering
Class KMeansAlgorithm
java.lang.Object
net.sf.jdmf.algorithms.AbstractDataMiningAlgorithm
net.sf.jdmf.algorithms.clustering.KMeansAlgorithm
- All Implemented Interfaces:
- DataMiningAlgorithm
public class KMeansAlgorithm
- extends AbstractDataMiningAlgorithm
Implements the k-means clustering algorithm as described in Data Mining:
Practical Machine Learning Tools and Techniques (Second Edition)
by Ian H. Witten and Eibe Frank (Morgan Kaufmann, 2005).
1. Convert attributes to points (values are expected to be of type Double).
2. Choose initial centroids based on the current choice strategy.
3. Assign points to clusters. For each point, calculate its distance from
the current centroids and choose the cluster with the nearest centroid.
4. Find new cluster centroids (this implementation uses the mean of all
coordinates of points in each cluster).
5. Calculate the sum of distances between the old and new cluster centroids.
If the sum is greater than or equal to the minimum distance sum specified in
the algorithm's configuration, repeat steps 3-5.
The predicted number of clusters is specified in the input data.
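The loop in steps 3-5 can be illustrated with a minimal, self-contained sketch. This is not the JDMF source: the class KMeansSketch and its methods distance, nearest and cluster are hypothetical names, the Euclidean metric is an assumption (the description above does not name a distance function), and so are the handling of empty clusters and the requirement that the threshold be positive. Initial centroids are passed in directly rather than chosen by an InitialCentroidChoiceStrategy.

```java
/** Illustrative sketch of steps 3-5 only; not the JDMF implementation. */
class KMeansSketch {

    /** Euclidean distance between two points of equal dimension (assumed metric). */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Step 3: index of the centroid nearest to the given point (ties go to the lowest index). */
    static int nearest(double[] point, double[][] centroids) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (distance(point, centroids[c]) < distance(point, centroids[best])) {
                best = c;
            }
        }
        return best;
    }

    /**
     * Repeats steps 3-5 while the summed centroid movement is greater than or
     * equal to minDistanceSum, then returns the final centroids.
     */
    static double[][] cluster(double[][] points, double[][] initialCentroids,
                              double minDistanceSum) {
        // Assumption: a threshold of zero would never satisfy "sum < threshold",
        // so this sketch rejects non-positive values.
        if (minDistanceSum <= 0.0) {
            throw new IllegalArgumentException("minDistanceSum must be positive");
        }
        int k = initialCentroids.length;
        int dim = initialCentroids[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = initialCentroids[c].clone();
        }

        double distanceSum;
        do {
            // Step 3: assign every point to the cluster with the nearest centroid,
            // accumulating coordinate sums and member counts per cluster.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (double[] p : points) {
                int c = nearest(p, centroids);
                counts[c]++;
                for (int i = 0; i < dim; i++) {
                    sums[c][i] += p[i];
                }
            }

            // Steps 4-5: the new centroid is the mean of the cluster's points;
            // add up how far each centroid moved.
            distanceSum = 0.0;
            for (int c = 0; c < k; c++) {
                double[] updated = centroids[c].clone();
                if (counts[c] > 0) { // assumption: an empty cluster keeps its old centroid
                    for (int i = 0; i < dim; i++) {
                        updated[i] = sums[c][i] / counts[c];
                    }
                }
                distanceSum += distance(centroids[c], updated);
                centroids[c] = updated;
            }
        } while (distanceSum >= minDistanceSum);

        return centroids;
    }
}
```

In this sketch, calling cluster with the points (0,0), (0,2), (10,0), (10,2), the initial centroids (0,0) and (10,0), and a threshold of 1e-9 runs two iterations: the first moves the centroids to (0,1) and (10,1), and the second moves them by a total distance of 0, which is below the threshold and ends the loop.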
- Author:
- quorthon
- See Also:
ClusteringExample, ClusteringInputData, Cluster, ClusteringDataMiningModel
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
KMeansAlgorithm
public KMeansAlgorithm()
analyze
public DataMiningModel analyze(InputData inputData)
- Description copied from interface: DataMiningAlgorithm
- Analyzes input data (attributes and decisions) and produces output data
(rules, decision trees, clusters, ...).
- See Also:
DataMiningAlgorithm.analyze(net.sf.jdmf.data.input.InputData)
getInitialCentroidChoiceStrategy
public InitialCentroidChoiceStrategy getInitialCentroidChoiceStrategy()
setInitialCentroidChoiceStrategy
public void setInitialCentroidChoiceStrategy(InitialCentroidChoiceStrategy initialCentroidChoiceStrategy)
getMinimumDistanceSumBetweenOldAndNewClusterCentroids
public java.lang.Double getMinimumDistanceSumBetweenOldAndNewClusterCentroids()
setMinimumDistanceSumBetweenOldAndNewClusterCentroids
public void setMinimumDistanceSumBetweenOldAndNewClusterCentroids(java.lang.Double minimumDistanceSumBetweenOldAndNewClusterCentroids)