Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix indexing of PCA to use safer types #4255

Merged
merged 15 commits into from
Nov 5, 2021

Conversation

lowener
Copy link
Contributor

@lowener lowener commented Sep 30, 2021

Related to #4105.

This PR fixes the types used for indexing and sizes in the PCA/TSVD C++ code.

@lowener lowener requested review from a team as code owners September 30, 2021 19:52
@github-actions github-actions bot added CMake CUDA/C++ Cython / Python Cython or Python issue labels Sep 30, 2021
@lowener lowener added breaking Breaking change improvement Improvement / enhancement to an existing function labels Sep 30, 2021
@github-actions github-actions bot removed the CMake label Oct 1, 2021
@cjnolet
Copy link
Member

cjnolet commented Oct 12, 2021

@lowener, it would be helpful to also change the raft branch in this PR to your branch from the corresponding raft pr so we can verify the code works before merging the raft side.

@github-actions github-actions bot added the CMake label Oct 12, 2021
@lowener
Copy link
Contributor Author

lowener commented Oct 13, 2021

rerun tests

1 similar comment
@lowener
Copy link
Contributor Author

lowener commented Oct 14, 2021

rerun tests

@lowener
Copy link
Contributor Author

lowener commented Oct 27, 2021

rerun tests

@lowener
Copy link
Contributor Author

lowener commented Oct 27, 2021

rerun tests

@lowener
Copy link
Contributor Author

lowener commented Oct 27, 2021

As a note, the CI failed (after 5 hours) on a timeout on cuml.test.test_nearest_neighbors.test_ivfpq_pred[8-512-4000-True-4-32-8]: Failed: Timeout >300.0s

And a build fail with

01:14:55   conda.CondaMultiError: Downloaded bytes did not match Content-Length
01:14:55   url: https://conda.anaconda.org/nvidia/linux-64/cudatoolkit-11.0.221-h6bb024c_0.tar.bz2
01:14:55   target_path: /opt/conda/envs/rapids/pkgs/cudatoolkit-11.0.221-h6bb024c_0.tar.bz2
01:14:55   Content-Length: 999268491
01:14:55   downloaded bytes: 79332612```

@lowener
Copy link
Contributor Author

lowener commented Oct 27, 2021

rerun tests

1 similar comment
@lowener
Copy link
Contributor Author

lowener commented Oct 28, 2021

rerun tests

Copy link
Member

@cjnolet cjnolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be careful to keep the precision relevant to the upper bounds of the individual arguments. For example, arguments like n_parts and n_stream will never go beyond 32-bit so we should use an int or uint32_t there. I'm also inclined to keep the precision of loop counters relevant to the upper bounds of their types, but that's more of a nitpick on host (there's a performance impact on device).

Overall it looks great, the changes I've requested are mostly repetitive and mechanical.

I think we can keep n_components as 64-bit since its upper-bound is related to n_cols/n_rows.

cpp/cmake/thirdparty/get_raft.cmake Outdated Show resolved Hide resolved
cpp/include/cuml/decomposition/sign_flip_mg.hpp Outdated Show resolved Hide resolved
cpp/include/cuml/decomposition/tsvd_mg.hpp Outdated Show resolved Hide resolved
cpp/include/cuml/decomposition/params.hpp Show resolved Hide resolved
cpp/include/cuml/decomposition/params.hpp Show resolved Hide resolved
cpp/include/cuml/decomposition/sign_flip_mg.hpp Outdated Show resolved Hide resolved
cpp/include/cuml/decomposition/tsvd_mg.hpp Outdated Show resolved Hide resolved
cpp/src/pca/pca_mg.cu Outdated Show resolved Hide resolved
cpp/src/pca/pca_mg.cu Outdated Show resolved Hide resolved
cpp/src/pca/pca_mg.cu Outdated Show resolved Hide resolved
@codecov-commenter
Copy link

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.12@4775124). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@               Coverage Diff               @@
##             branch-21.12    #4255   +/-   ##
===============================================
  Coverage                ?   86.04%           
===============================================
  Files                   ?      231           
  Lines                   ?    18714           
  Branches                ?        0           
===============================================
  Hits                    ?    16103           
  Misses                  ?     2611           
  Partials                ?        0           
Flag Coverage Δ
dask 47.00% <0.00%> (?)
non-dask 78.74% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.


Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4775124...fdf4254. Read the comment docs.

Copy link
Member

@cjnolet cjnolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cjnolet
Copy link
Member

cjnolet commented Nov 5, 2021

@gpucibot merge

1 similar comment
@cjnolet
Copy link
Member

cjnolet commented Nov 5, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 0342103 into rapidsai:branch-21.12 Nov 5, 2021
@lowener lowener deleted the 21.12-indexing-pca branch November 6, 2021 16:54
vimarsh6739 pushed a commit to vimarsh6739/cuml that referenced this pull request Oct 9, 2023
Related to rapidsai#4105.

This PR fixes the types used for indexing and sizes in the PCA/TSVD C++ code.

Authors:
  - Micka (https://github.com/lowener)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#4255
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Breaking change CMake CUDA/C++ Cython / Python Cython or Python issue improvement Improvement / enhancement to an existing function
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants