Allowing large data in kmeans #5228

Merged on Feb 14, 2023 (15 commits)

Conversation

cjnolet (Member) commented Feb 10, 2023

This is an intermediate fix for an issue with kmeans that currently prevents the algorithm from working when n_rows * n_cols > 2^32-1.

For now, using 64-bit indexing allows the algorithm to work. We still need to dig further on the raft side to find out where the numerical issue is happening. So far, part of it appears to be related to the new APIs we created and how they handle indexing.
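To illustrate the kind of failure described above, here is a minimal host-side sketch, not the code changed in this PR. It assumes a row-major layout and an unsigned 32-bit index type; `flat_offset` and `fits_in_index` are hypothetical helper names introduced only for this example and are not part of cuML or raft.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not a cuML/raft API): flat offset of element
// (row, col) in a row-major matrix, computed entirely in index type IdxT.
// With an unsigned 32-bit IdxT the arithmetic wraps modulo 2^32.
template <typename IdxT>
constexpr IdxT flat_offset(IdxT row, IdxT col, IdxT n_cols) {
  return row * n_cols + col;
}

// Hypothetical helper: true when n_rows * n_cols does not exceed the
// largest value IdxT can hold. The comparison is rearranged as a division
// so the check itself cannot overflow.
template <typename IdxT>
constexpr bool fits_in_index(std::uint64_t n_rows, std::uint64_t n_cols) {
  if (n_rows == 0 || n_cols == 0) { return true; }
  const auto max_idx =
      static_cast<std::uint64_t>(std::numeric_limits<IdxT>::max());
  return n_rows <= max_idx / n_cols;
}
```

For a 70000 x 70000 input, the last element sits at offset 4899999999. `flat_offset<std::uint64_t>(69999, 69999, 70000)` returns that value, while `flat_offset<std::uint32_t>(69999, 69999, 70000)` wraps around to 605032703 and would address the wrong element. `fits_in_index<std::uint32_t>(70000, 70000)` returns false, which is the situation where switching to a 64-bit index type helps.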

cjnolet added the improvement and non-breaking labels Feb 10, 2023
cjnolet requested a review from a team as a code owner February 10, 2023 22:11
cjnolet self-assigned this Feb 10, 2023
cjnolet requested review from a team as code owners February 10, 2023 22:11
github-actions bot added the CMake, CUDA/C++, and Cython / Python labels Feb 10, 2023
cpp/src/kmeans/kmeans_mg_impl.cuh: review thread outdated and resolved
ajschmidt8 requested a review from a team as a code owner February 13, 2023 18:57
cjnolet force-pushed the bug-2304-kmeans_64bit_int branch from 10ece34 to 74431dd February 13, 2023 22:11
ajschmidt8 removed the request for review from a team February 13, 2023 22:14
cjnolet added the bug label and removed the improvement label Feb 14, 2023
dantegd (Member) commented Feb 14, 2023

/merge

rapids-bot bot merged commit dc38afc into rapidsai:branch-23.04 Feb 14, 2023
Labels: bug, CMake, CUDA/C++, Cython / Python, non-breaking
4 participants