v3.1.0

@LTLA LTLA released this 13 Sep 21:01
  • Replace PCA-partitioning initialization with the simpler variance partitioning. The new InitializeVariancePartition class is faster and fully deterministic (it needs no random initialization for approximate PCA), and its results are close enough to those of InitializePcaPartition. We also add our own improvement, where the partition boundary is chosen to minimize the residual sum of squares of the child partitions.
  • Do not quit Hartigan-Wong prematurely when quick transfers do not converge. This gives the algorithm a chance to perform further optimal-transfer steps and reach a better solution, possibly allowing the quick transfers to converge in subsequent iterations. We add options to set the number of quick transfer iterations and to re-enable the premature quitting upon convergence failure.
  • Add a parallelize() function to handle the parallelization choice instead of defining the KMEANS_CUSTOM_PARALLEL macro by default. This avoids issues with nested macros in arbitrary user code.
  • Mitigate numerical instability of Hartigan-Wong by recomputing the centroids and WCSS loss before every optimal transfer iteration. This avoids accumulation of errors in the losses/centroids after many transfers.
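
The boundary selection described for variance partitioning can be illustrated in one dimension. The following is a minimal sketch, not the library's implementation: `best_split` is a hypothetical helper that sorts the values along the chosen dimension and scans every possible boundary, keeping the one that minimizes the summed residual sum of squares (RSS) of the two child partitions. It assumes the split is evaluated along a single dimension and uses running sums for brevity rather than a numerically robust update.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper for illustration only; not part of the library's API.
// Returns how many of the sorted values fall into the left child when the
// boundary is placed to minimize RSS(left) + RSS(right).
std::size_t best_split(std::vector<double> values) {
    std::sort(values.begin(), values.end());
    const std::size_t n = values.size();
    if (n < 2) {
        return n; // nothing to split
    }

    // Totals let us derive the right child's sums from the left child's.
    double total_sum = 0, total_sq = 0;
    for (double v : values) {
        total_sum += v;
        total_sq += v * v;
    }

    double left_sum = 0, left_sq = 0;
    double best_rss = std::numeric_limits<double>::infinity();
    std::size_t best_left = 1;

    // Try every boundary that leaves both children non-empty.
    for (std::size_t i = 1; i < n; ++i) {
        const double v = values[i - 1];
        left_sum += v;
        left_sq += v * v;
        const double right_sum = total_sum - left_sum;
        const double right_sq = total_sq - left_sq;

        // RSS of a group = sum(x^2) - (sum(x))^2 / count.
        const double rss = (left_sq - left_sum * left_sum / i) +
                           (right_sq - right_sum * right_sum / (n - i));
        if (rss < best_rss) {
            best_rss = rss;
            best_left = i;
        }
    }
    return best_left;
}
```

For example, the values 1, 2, 3, 10, 11, 12 are split into two groups of three, since that boundary leaves the tightest pair of children.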
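
The recomputation mentioned for Hartigan-Wong can be pictured as rebuilding the per-cluster state from the current assignments instead of relying on values that were updated incrementally after each transfer. The sketch below is an illustration under assumptions, not the library's code: `ClusterState` and `recompute_state` are hypothetical names, and the data is assumed to be stored with each observation's `ndim` coordinates contiguous in memory.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container for illustration; not part of the library's API.
struct ClusterState {
    std::vector<double> centroids;   // ncenters * ndim, one centroid after another
    std::vector<double> wcss;        // within-cluster sum of squares per cluster
    std::vector<std::size_t> sizes;  // number of observations per cluster
};

// Rebuild centroids and WCSS from scratch, discarding any drift that built up
// from incremental updates. Assumes every assignment is < ncenters.
ClusterState recompute_state(const std::vector<double>& data,
                             std::size_t ndim,
                             const std::vector<std::size_t>& assignments,
                             std::size_t ncenters) {
    ClusterState state;
    state.centroids.assign(ncenters * ndim, 0.0);
    state.wcss.assign(ncenters, 0.0);
    state.sizes.assign(ncenters, 0);

    const std::size_t nobs = assignments.size();

    // First pass: accumulate coordinate sums and counts per cluster.
    for (std::size_t o = 0; o < nobs; ++o) {
        const std::size_t c = assignments[o];
        ++state.sizes[c];
        for (std::size_t d = 0; d < ndim; ++d) {
            state.centroids[c * ndim + d] += data[o * ndim + d];
        }
    }

    // Convert sums to means; empty clusters keep a zero centroid.
    for (std::size_t c = 0; c < ncenters; ++c) {
        if (state.sizes[c] > 0) {
            for (std::size_t d = 0; d < ndim; ++d) {
                state.centroids[c * ndim + d] /= state.sizes[c];
            }
        }
    }

    // Second pass: sum squared distances to the freshly computed centroids.
    for (std::size_t o = 0; o < nobs; ++o) {
        const std::size_t c = assignments[o];
        for (std::size_t d = 0; d < ndim; ++d) {
            const double diff = data[o * ndim + d] - state.centroids[c * ndim + d];
            state.wcss[c] += diff * diff;
        }
    }
    return state;
}
```

The two-pass form costs a full sweep over the data, which is presumably why it would be done once per optimal transfer iteration rather than after every individual transfer.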