Comparing k-means, k-medoids and h-k-means. For a uni project.
Clustering is a widely used technique in data mining. Once a data set has been clustered, further analysis of it becomes easier, because general rules can be derived from the clusters. This paper compares three clustering algorithms from the k-means family: k-means itself, k-medoids (PAM), and h-k-means, a hybrid of the former two. The algorithms are implemented following the descriptions given in [1] and compared with each other on speed and accuracy. The implemented algorithms were found to be flawed in some way.
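To make the difference between the first two algorithms concrete, the following minimal sketch shows how each one picks the representative of a single cluster: k-means uses the coordinate-wise mean, which need not be a data point, while k-medoids picks the data point with the smallest total distance to the other members. This is an illustration only and not the implementation described in [1]; the function names `mean_point` and `medoid_point`, the Euclidean distance, and the sample points are assumptions made for this example. h-k-means is left out because its details depend on the description in [1].

```python
import math


def mean_point(points):
    """k-means style representative: the coordinate-wise mean of the cluster."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))


def medoid_point(points):
    """k-medoids style representative: the member with the smallest
    total Euclidean distance to all members of the cluster."""
    return min(points, key=lambda c: sum(math.dist(c, p) for p in points))


# Hypothetical cluster with one far-away outlier at (20, 20).
cluster = [(1.0, 1.0), (2.0, 1.0), (3.0, 2.0), (20.0, 20.0)]

centroid = mean_point(cluster)   # pulled towards the outlier
medoid = medoid_point(cluster)   # always one of the input points

print("centroid:", centroid)
print("medoid:", medoid)
```

In this example the mean is dragged towards the outlier, whereas the medoid stays inside the dense part of the cluster. The medoid search shown here compares every pair of points, so it costs more per cluster than computing a mean, which is one reason to expect a speed difference between the two algorithms.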