Releases: LTLA/CppKmeans

v3.1.2

23 Dec 23:33
  • Refactored RefineHartiganWong to greatly simplify the update logic.
  • Fixed InitializeVariancePartition to work with non-int types for Matrix_::index_type.

v3.1.1

28 Nov 07:55
f5b41de
  • Minor bugfix for Hartigan-Wong's update history.

v3.1.0

13 Sep 21:01
  • Replace PCA-partitioning initialization with the simpler variance-partitioning. The new InitializeVariancePartition class is faster and actually deterministic (no random initialization for approximate PCA), and is close enough to InitializePcaPartition. We also add our own improvement where the partition boundary is chosen to minimize the residual sum of squares of the child partitions.
  • Do not quit Hartigan-Wong prematurely when quick transfers do not converge. This gives the algorithm a chance to perform further optimal-transfer steps and reach a better solution, possibly allowing the quick transfers to converge in subsequent iterations. We add options to set the number of quick transfer iterations and to re-enable the premature quitting upon convergence failure.
  • Add a parallelize() function to handle the parallelization choice instead of defining the KMEANS_CUSTOM_PARALLEL macro by default. This avoids issues with nested macros in arbitrary user code.
  • Mitigate numerical instability of Hartigan-Wong by recomputing the centroids and WCSS loss before every optimal transfer iteration. This avoids accumulation of errors in the losses/centroids after many transfers.
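
The boundary selection mentioned for InitializeVariancePartition can be illustrated with a minimal one-dimensional sketch. This is not the library's code: `best_split` is a hypothetical helper, and it assumes the values along the chosen dimension have already been extracted into a vector. It sorts the values and scans every possible boundary, keeping the one whose two child groups have the smallest combined residual sum of squares.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper, not part of the kmeans library.
// Returns how many of the sorted values go to the left child when the data
// are split into two contiguous groups with minimal combined RSS.
inline std::size_t best_split(std::vector<double> values) {
    assert(values.size() >= 2);
    std::sort(values.begin(), values.end());
    const std::size_t n = values.size();

    // Center on the overall mean so that the sum-of-squares shortcut
    // below loses less precision.
    double total = 0;
    for (double v : values) {
        total += v;
    }
    const double mean = total / static_cast<double>(n);

    double sum_all = 0, sq_all = 0;
    for (double& v : values) {
        v -= mean;
        sum_all += v;
        sq_all += v * v;
    }

    std::size_t best = 1;
    double best_rss = std::numeric_limits<double>::infinity();
    double sum_left = 0, sq_left = 0;

    // Boundary i puts values[0..i) on the left and values[i..n) on the right.
    for (std::size_t i = 1; i < n; ++i) {
        sum_left += values[i - 1];
        sq_left += values[i - 1] * values[i - 1];
        const double sum_right = sum_all - sum_left;
        const double sq_right = sq_all - sq_left;

        // RSS of a group = sum(x^2) - (sum x)^2 / count.
        const double rss =
            (sq_left - sum_left * sum_left / static_cast<double>(i)) +
            (sq_right - sum_right * sum_right / static_cast<double>(n - i));

        if (rss < best_rss) {
            best_rss = rss;
            best = i;
        }
    }
    return best;
}
```

For example, calling `best_split` on two well-separated groups of values places the boundary between the groups rather than at the mean, which can matter when the groups have unequal sizes.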

v3.0.2

28 Aug 05:38
  • Added getters to easily edit options in existing instances of Initialize and Refine subclasses.
  • Use the subpar library for a centralized parallelization scheme. This now sets the KMEANS_CUSTOM_PARALLEL macro by default to subpar::parallelize(), with all the associated improvements.

v3.0.1

11 Jun 14:55
  • Templated the SimpleMatrix's dimension index type, for completeness.
  • Bugfix for the <random> include.
  • Ensure all distance calculations are performed in Float_ for consistent precision.
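
As a rough illustration of what performing distance calculations in Float_ means, the sketch below converts every coordinate to the floating-point type before subtracting and accumulating. The function name `squared_distance` and the template parameters `Data_` and `Center_` are assumptions made for this example and are not taken from the library.

```cpp
#include <cstddef>

// Hypothetical helper, not part of the kmeans library.
// Computes the squared Euclidean distance between an observation and a
// center, doing all arithmetic in Float_ regardless of the input types.
template<typename Float_, typename Data_, typename Center_>
Float_ squared_distance(std::size_t ndim, const Data_* x, const Center_* center) {
    Float_ total = 0;
    for (std::size_t d = 0; d < ndim; ++d) {
        // Cast before subtracting so that integer or lower-precision inputs
        // do not determine the precision of the difference.
        const Float_ delta = static_cast<Float_>(x[d]) - static_cast<Float_>(center[d]);
        total += delta * delta;
    }
    return total;
}
```

With integer data and single-precision centers, `squared_distance<double>(ndim, x, center)` would still accumulate in double precision.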

v3.0.0

07 Jun 14:48
  • All Initialize and Refine constructors accept a corresponding Options class that parameterizes its run() calls. This is more intuitive than the previous parameter setters and the Defaults nested structs.
  • The central Kmeans class has been removed in favor of a compute() function that accepts Initialize and Refine instances. This is easier to understand and the user is forced to explicitly choose algorithms.
  • All functions have been generalized to accept any input matrix that satisfies a MockMatrix (compile-time) contract. This allows us to support, e.g., sparse or file-backed matrices in the future.
  • Empty clusters do not cause status to be set to 1, as these are now just ignored. Similarly, status is not set to 3 when there are more requested clusters than observations, as the extra clusters are just ignored.
  • Added a version file for downstream projects and set minimum required versions for all dependencies.

v2.0.0

05 Sep 20:55
  • Implemented Lloyd's algorithm for refinement.
  • Implemented a simple mini-batch algorithm for refinement.
  • All refinement algorithms now inherit from a base class.
  • Added PCA partitioning as an initialization method.
  • All initialization algorithms now inherit from a base class.
  • Switch to the aarand library for random distributions.
  • Add CMake install targets.
  • Add LICENSE.

v1.0.0

21 Jul 07:41
6ee4c9d

Now that we have code coverage, it's time for our first release 🚀 🎉