Releases: LTLA/CppKmeans
v3.1.2
v3.1.1
v3.1.0
- Replace PCA-partitioning initialization with the simpler variance-partitioning. The new `InitializeVariancePartition` class is faster and actually deterministic (no random initialization for approximate PCA), and gives results close enough to `InitializePcaPartition`. We also add our own improvement where the partition boundary is chosen to minimize the residual sum of squares of the child partitions.
- Do not quit Hartigan-Wong prematurely when quick transfers do not converge. This gives the algorithm a chance to perform more optimal transfers to get to a better solution, possibly allowing convergence of quick transfers in subsequent iterations. We add options to set the number of quick transfer iterations as well as to re-enable the premature quitting upon convergence failure.
- Add a `parallelize()` function to handle the parallelization choice instead of defining the `KMEANS_CUSTOM_PARALLEL` macro by default. This avoids issues with nested macros in arbitrary user code.
- Mitigate numerical instability of Hartigan-Wong by recomputing the centroids and WCSS loss before every optimal transfer iteration. This avoids accumulation of errors in the losses/centroids after many transfers.
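
The partition-boundary improvement described for v3.1.0 can be illustrated along a single dimension. The sketch below is not the library's implementation: `best_split` is a hypothetical helper, and it assumes the values along the chosen dimension have already been extracted into a plain vector. It scans every possible boundary in the sorted values and keeps the one that minimizes the summed residual sum of squares (RSS) of the two children, using running sums so that the scan is linear after sorting.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper for illustration; not part of the kmeans library's API.
// Sorts the values and returns how many of them go into the left child,
// chosen so that RSS(left) + RSS(right) is as small as possible.
inline std::size_t best_split(std::vector<double> values) {
    assert(values.size() >= 2); // need at least one value per child
    std::sort(values.begin(), values.end());
    const std::size_t n = values.size();

    // Totals over all values, so the right child's sums follow by subtraction.
    double total = 0, total_sq = 0;
    for (double v : values) {
        total += v;
        total_sq += v * v;
    }

    double left = 0, left_sq = 0;
    double best_rss = std::numeric_limits<double>::infinity();
    std::size_t best_k = 1;

    for (std::size_t k = 1; k < n; ++k) {
        // Move the k-th smallest value into the left child.
        const double v = values[k - 1];
        left += v;
        left_sq += v * v;

        const double right = total - left;
        const double right_sq = total_sq - left_sq;

        // RSS of a group = sum(x^2) - (sum(x))^2 / count.
        const double rss =
            (left_sq - left * left / static_cast<double>(k)) +
            (right_sq - right * right / static_cast<double>(n - k));

        if (rss < best_rss) {
            best_rss = rss;
            best_k = k;
        }
    }
    return best_k;
}
```

For example, `best_split({1, 2, 3, 10, 11, 12})` places the boundary after the third sorted value. The sum-of-squares shortcut used here can lose precision for large or tightly clustered values, so a production implementation would probably prefer a more stable running-variance update.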
v3.0.2
v3.0.1
v3.0.0
- All `Initialize` and `Refine` constructors accept a corresponding `Options` class that parameterizes its `run()` calls. This is more intuitive than the previous parameter setters and the `Defaults` nested structs.
- The central `Kmeans` class has been removed in favor of a `compute()` function that accepts `Initialize` and `Refine` instances. This is easier to understand and the user is forced to explicitly choose algorithms.
- All functions have been generalized to accept any input matrix that satisfies a `MockMatrix` (compile-time) contract. This allows us to support, e.g., sparse or file-backed matrices in the future.
- Empty clusters do not cause `status` to be set to 1, as these are now just ignored. Similarly, `status` is not set to 3 when there are more requested clusters than observations, as the extra clusters are just ignored.
- Added a version file for downstream projects and set minimum required versions for all dependencies.
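
The design described for v3.0.0 (options passed to constructors, a free `compute()` function, and a compile-time matrix contract) might look roughly like the sketch below. Every name here (`DenseMatrix`, `InitializeStride`, `RefineNearest`, their options structs, and the signature of `compute()`) is a hypothetical stand-in chosen for illustration and should not be read as the library's actual interface. The initializer and refiner are deliberately trivial so that the structure stays visible.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// All names below are hypothetical stand-ins, not the library's real API.

// Any type exposing these three member functions satisfies the contract
// that the templates below rely on; no inheritance is needed.
struct DenseMatrix {
    std::size_t num_dim, num_obs;
    std::vector<double> values; // column-major: one column per observation
    std::size_t num_dimensions() const { return num_dim; }
    std::size_t num_observations() const { return num_obs; }
    const double* observation(std::size_t i) const { return values.data() + i * num_dim; }
};

struct InitializeStrideOptions { std::size_t stride = 1; };

// Uses every `stride`-th observation as a starting center.
class InitializeStride {
public:
    explicit InitializeStride(InitializeStrideOptions options) : my_options(options) {}

    template<class Matrix>
    std::vector<double> run(const Matrix& mat, std::size_t ncenters) const {
        assert(my_options.stride > 0);
        std::vector<double> centers;
        for (std::size_t c = 0; c < ncenters; ++c) {
            const std::size_t i = c * my_options.stride;
            if (i >= mat.num_observations()) break; // surplus centers are ignored
            const double* ptr = mat.observation(i);
            centers.insert(centers.end(), ptr, ptr + mat.num_dimensions());
        }
        return centers;
    }
private:
    InitializeStrideOptions my_options;
};

struct RefineNearestOptions { bool manhattan = false; };

// Assigns each observation to its nearest center (a single pass, no updates).
class RefineNearest {
public:
    explicit RefineNearest(RefineNearestOptions options) : my_options(options) {}

    template<class Matrix>
    std::vector<std::size_t> run(const Matrix& mat, const std::vector<double>& centers) const {
        const std::size_t nd = mat.num_dimensions();
        assert(nd > 0 && !centers.empty());
        const std::size_t nc = centers.size() / nd;
        std::vector<std::size_t> clusters(mat.num_observations(), 0);
        for (std::size_t i = 0; i < mat.num_observations(); ++i) {
            const double* obs = mat.observation(i);
            double best = std::numeric_limits<double>::infinity();
            for (std::size_t c = 0; c < nc; ++c) {
                double dist = 0;
                for (std::size_t d = 0; d < nd; ++d) {
                    const double diff = obs[d] - centers[c * nd + d];
                    dist += my_options.manhattan ? std::abs(diff) : diff * diff;
                }
                if (dist < best) { best = dist; clusters[i] = c; }
            }
        }
        return clusters;
    }
private:
    RefineNearestOptions my_options;
};

// The caller must name both algorithms explicitly; there is no default.
template<class Matrix, class Init, class Refine>
std::vector<std::size_t> compute(const Matrix& mat, const Init& init, const Refine& refine, std::size_t ncenters) {
    const std::vector<double> centers = init.run(mat, ncenters);
    return refine.run(mat, centers);
}
```

Because `compute()` and both `run()` methods are templated on the matrix type, a sparse or file-backed matrix class could be substituted for `DenseMatrix` as long as it offers the same member functions. Requesting more centers than the initializer can supply simply yields fewer centers, in the same spirit as the relaxed `status` handling noted above.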
v2.0.0
- Implemented Lloyd's algorithm for refinement.
- Implemented a simple mini-batch algorithm for refinement.
- All refinement algorithms now inherit from a base class.
- Added PCA partitioning as an initialization method.
- All initialization algorithms now inherit from a base class.
- Switch to the aarand library for random distributions.
- Add CMake install targets.
- Add LICENSE.
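
For readers unfamiliar with Lloyd's algorithm, which v2.0.0 adds as a refinement method, the following is a minimal one-dimensional sketch. It is not the library's code: `lloyd_1d` is a hypothetical helper, and it assumes the starting centers are supplied by the caller. Each iteration assigns every point to its nearest center and then moves each center to the mean of its assigned points, stopping once no assignment changes or the iteration limit is reached.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical one-dimensional helper for illustration only.
// Updates `centers` in place and returns the cluster index of each point.
inline std::vector<std::size_t> lloyd_1d(
    const std::vector<double>& points,
    std::vector<double>& centers,
    int max_iterations = 10)
{
    assert(!centers.empty());
    std::vector<std::size_t> assignment(points.size(), 0);

    for (int iter = 0; iter < max_iterations; ++iter) {
        // Always run the update step on the first pass.
        bool changed = (iter == 0);

        // Assignment step: find the nearest center for every point.
        for (std::size_t i = 0; i < points.size(); ++i) {
            std::size_t best = 0;
            double best_dist = std::numeric_limits<double>::infinity();
            for (std::size_t c = 0; c < centers.size(); ++c) {
                const double dist = std::abs(points[i] - centers[c]);
                if (dist < best_dist) {
                    best_dist = dist;
                    best = c;
                }
            }
            if (best != assignment[i]) {
                assignment[i] = best;
                changed = true;
            }
        }
        if (!changed) break; // converged: assignments are stable

        // Update step: move each center to the mean of its points.
        std::vector<double> sums(centers.size(), 0.0);
        std::vector<std::size_t> counts(centers.size(), 0);
        for (std::size_t i = 0; i < points.size(); ++i) {
            sums[assignment[i]] += points[i];
            ++counts[assignment[i]];
        }
        for (std::size_t c = 0; c < centers.size(); ++c) {
            if (counts[c] > 0) { // an empty cluster keeps its previous center
                centers[c] = sums[c] / static_cast<double>(counts[c]);
            }
        }
    }
    return assignment;
}
```

Starting from centers `{0, 5}` on the points `{1, 2, 3, 10, 11, 12}`, this sketch settles on centers 2 and 11 after two update steps. Leaving empty clusters untouched is a simplification made here; it is only one of several possible policies.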