Clustering & Minor Patches in JS, Rust, & Java SDKs #503

Merged: 39 commits, Oct 29, 2024

@ashvardanian ashvardanian commented Oct 10, 2024

Many data scientists embark on their journey by implementing K-Means clustering, much like app developers starting with a calculator. But despite K-Means’ popularity, most implementations overlook the power of SIMD on modern CPUs. Efficient vector math, especially with single- and double-precision floating-point vectors, is challenging because accuracy is computationally expensive. Meanwhile, float16, bfloat16, and smaller types can fail under uneven distributions or when computing centroids for large clusters. So, what’s Unum’s solution? Mixed precision!

Thanks to strong community support and sponsorship from @sutoiku (LinkedIn, Website), we're introducing a high-performance K-Means implementation! It accepts any numeric type for distance calculations and switches to float64 for centroid updates, a technique that boosts performance and enables billion-scale clustering on a single machine.
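To illustrate the mixed-precision idea, here is a minimal sketch of one K-Means iteration in plain JavaScript. It is not the implementation from this PR: the function names `squaredDistance` and `kmeansStep` are hypothetical, points and centroids are stored as `Float32Array`, and only the centroid accumulators are `Float64Array`. JavaScript arithmetic is always double precision, so the sketch shows the structure of the technique rather than any SIMD speedup.

```javascript
// Squared Euclidean distance between one stored point and one stored centroid.
function squaredDistance(points, pointOffset, centroids, centroidOffset, dims) {
  let sum = 0;
  for (let d = 0; d < dims; d++) {
    const diff = points[pointOffset + d] - centroids[centroidOffset + d];
    sum += diff * diff;
  }
  return sum;
}

// One iteration: assign each point to its nearest centroid, then recompute
// centroids from float64 sums so large clusters do not lose precision.
function kmeansStep(points, centroids, dims) {
  const count = points.length / dims;
  const k = centroids.length / dims;
  const sums = new Float64Array(k * dims); // high-precision accumulators
  const sizes = new Uint32Array(k);
  const assignments = new Uint32Array(count);

  for (let i = 0; i < count; i++) {
    let best = 0;
    let bestDistance = Infinity;
    for (let c = 0; c < k; c++) {
      const distance = squaredDistance(points, i * dims, centroids, c * dims, dims);
      if (distance < bestDistance) {
        bestDistance = distance;
        best = c;
      }
    }
    assignments[i] = best;
    sizes[best]++;
    for (let d = 0; d < dims; d++) sums[best * dims + d] += points[i * dims + d];
  }

  // Copy the old centroids so that empty clusters keep their previous position.
  const updated = new Float32Array(centroids);
  for (let c = 0; c < k; c++) {
    if (sizes[c] === 0) continue;
    for (let d = 0; d < dims; d++) updated[c * dims + d] = sums[c * dims + d] / sizes[c];
  }
  return { assignments, centroids: updated };
}

// Usage: four 2D points forming two obvious clusters.
const samplePoints = new Float32Array([0, 0, 0, 2, 10, 10, 10, 12]);
const initialCentroids = new Float32Array([0, 0, 10, 10]);
const step = kmeansStep(samplePoints, initialCentroids, 2);
console.log(Array.from(step.assignments)); // [0, 0, 1, 1]
console.log(Array.from(step.centroids)); // [0, 1, 10, 11]
```

In a native implementation the distance loop is where a narrower type such as float16 would pay off, while the accumulation loop is where float64 protects the result.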

@ashvardanian ashvardanian changed the title JS, Rust, and Java Patches Clustering & Minor Patches in JS, Rust, & Java SDKs Oct 10, 2024
I got an error when I loaded and searched with load() or view().

Code Example:

```js
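// Assumed setup: metric and connectivity are illustrative; dimensions match the query below.
const usearch = require('usearch');
const index = new usearch.Index({ metric: 'l2sq', connectivity: 16, dimensions: 3 });
// Saved with `index.save('index.usearch');` in another script.
index.load('index.usearch');
const results = index.search(new Float32Array([0.2, 0.6, 0.4]), 10);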
```
@ashvardanian

@abetomo, the last PR seems to break the CI. Any idea why?

@abetomo
abetomo commented Oct 12, 2024

abetomo and others added 7 commits October 14, 2024 03:19
The test itself succeeds, but deleting the index file created by save() in afterEach() fails with the following error.

```
error: "EBUSY: resource busy or locked, unlink 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\usearch.test.index'"
```

Since it fails only on Windows, we will skip it on Windows for now.
We will continue to investigate the solution.
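
The workaround described above could look roughly like the following sketch. It is not the code from this commit: `skipOnWindows`, `removeIndexFile`, and the temporary file name are hypothetical, and the use of the `node:test` `skip` option is an assumption about how the test suite is wired.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: true when the cleanup-sensitive test should be skipped.
function skipOnWindows(platform = process.platform) {
  return platform === 'win32';
}

// Hypothetical cleanup helper for an afterEach() hook.
// `force: true` ignores a missing file; other errors, such as EBUSY, still propagate.
function removeIndexFile(filePath) {
  fs.rmSync(filePath, { force: true });
}

// Hypothetical temp path, modeled on the one in the error message above.
const indexPath = path.join(os.tmpdir(), 'usearch.sketch.index');
fs.writeFileSync(indexPath, 'placeholder');
removeIndexFile(indexPath);

// With node:test, the flag could be passed as a test option:
//   test('save and load', { skip: skipOnWindows() }, () => { /* ... */ });
```

Skipping hides the failure rather than fixing it, which matches the stated intent to keep investigating.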
Add Rust and Android CI build
The index read by `view()` is read-only.
When I did a `remove()` on that index, it crashed.
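
One way to turn such a crash into a catchable error is to guard mutating calls, as in the sketch below. This is an illustration only, not the fix from this commit: `GuardedIndex`, `ReadOnlyIndexError`, and the stand-in `fakeIndex` object are all hypothetical and do not use the USearch API.

```javascript
// Hypothetical error type for rejected mutations.
class ReadOnlyIndexError extends Error {}

// Hypothetical wrapper: `inner` is any object exposing search() and remove().
class GuardedIndex {
  constructor(inner, { readOnly = false } = {}) {
    this.inner = inner;
    this.readOnly = readOnly;
  }

  // Reads are always allowed.
  search(vector, count) {
    return this.inner.search(vector, count);
  }

  // Mutations are rejected with an error when the index is a read-only view.
  remove(key) {
    if (this.readOnly) {
      throw new ReadOnlyIndexError('cannot remove() from a read-only index');
    }
    return this.inner.remove(key);
  }
}

// Stand-in for a real index, used only to exercise the guard.
const fakeIndex = {
  keys: new Set([1, 2, 3]),
  search() { return Array.from(this.keys); },
  remove(key) { return this.keys.delete(key); },
};

const viewed = new GuardedIndex(fakeIndex, { readOnly: true });
try {
  viewed.remove(1);
} catch (error) {
  console.log(error instanceof ReadOnlyIndexError); // true
}
```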

---------

Co-authored-by: Ash Vardanian
---------

Co-authored-by: Mikhail Bautin
@ashvardanian ashvardanian merged commit 7986d8e into main Oct 29, 2024
33 checks passed
4 participants