Python 3 implementations of normalized and unnormalized spectral clustering algorithms
- Python >= 3.6 (earlier versions may also work)
- NumPy, Matplotlib, scikit-learn (used for KMeans clustering and for generating the "moon" data), SciPy (only the function scipy.spatial.distance.pdist is invoked, to compute the pairwise distances between points)
We implement three versions of spectral clustering, following the paper "A Tutorial on Spectral Clustering" by Ulrike von Luxburg. The dataset or adjacency matrix is stored in a NumPy array. To use the function:
from spectral_clustering import Spectral_Clustering
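For readers who want to see the idea behind the unnormalized variant, here is a minimal, self-contained sketch. It is not the code in spectral_clustering.py and does not reflect the signature of Spectral_Clustering; the helper name unnormalized_spectral_clustering, the Gaussian similarity, and the parameter sigma are assumptions made for illustration only.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.cluster import KMeans
from sklearn.datasets import make_moons


def unnormalized_spectral_clustering(X, n_clusters=2, sigma=0.1, seed=0):
    """Hypothetical helper: cluster the rows of X using the Laplacian L = D - W."""
    # Fully connected similarity graph with a Gaussian kernel (an assumption).
    sq_dists = squareform(pdist(X, metric="sqeuclidean"))
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops

    # Unnormalized graph Laplacian.
    D = np.diag(W.sum(axis=1))
    L = D - W

    # eigh returns eigenvalues in ascending order, so the first columns
    # are the eigenvectors of the n_clusters smallest eigenvalues.
    _, eigvecs = np.linalg.eigh(L)
    U = eigvecs[:, :n_clusters]

    # Cluster the rows of the spectral embedding with k-means.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return km.fit_predict(U)


# "Moon" data, as generated by scikit-learn.
X, y = make_moons(n_samples=200, noise=0.05, random_state=0)
labels = unnormalized_spectral_clustering(X, n_clusters=2, sigma=0.1)
```

The dense eigendecomposition costs O(n^3) time and the similarity matrix needs O(n^2) memory, which is why an approach of this kind becomes impractical for larger datasets.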
Depending on the available RAM, our naive implementation of spectral clustering may not scale to datasets with more than 5000 instances. However, we also provide a PySpark implementation of Power Iteration Clustering, which is expected to scale to graphs with millions of nodes and edges. To use the function for Power Iteration Clustering:
from PIC import Power_Iteration_Clustering
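To illustrate what Power Iteration Clustering computes, here is a small single-machine NumPy sketch. It is not the PySpark code in PIC.py and does not reflect the signature of Power_Iteration_Clustering; the helper name power_iteration_embedding, the stopping rule, and the toy graph are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans


def power_iteration_embedding(A, max_iter=50, tol=1e-5):
    """Hypothetical helper: 1-D embedding from truncated power iteration."""
    A = np.asarray(A, dtype=float)
    degrees = A.sum(axis=1)
    W = A / degrees[:, None]       # row-normalized affinity matrix D^-1 A
    v = degrees / degrees.sum()    # degree-based starting vector
    prev_delta = np.inf
    for _ in range(max_iter):
        v_new = W @ v
        v_new /= np.abs(v_new).sum()       # L1 normalization
        delta = np.abs(v_new - v).max()
        v = v_new
        # Stop early, once the change per step has nearly stopped changing,
        # so that cluster structure is still visible in v.
        if abs(prev_delta - delta) < tol:
            break
        prev_delta = delta
    return v


# Toy graph: a 3-node block and a 5-node block, weakly connected.
n1, n2 = 3, 5
A = np.full((n1 + n2, n1 + n2), 0.01)
A[:n1, :n1] = 1.0
A[n1:, n1:] = 1.0
np.fill_diagonal(A, 0.0)

v = power_iteration_embedding(A)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(v.reshape(-1, 1))
```

Each iteration needs only a matrix-vector product, which is what makes the method amenable to a distributed implementation over a sparse graph.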
If you have any questions or find any bugs, please feel free to email me. Thank you in advance!