Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use grid stride pairwise distance and fusedL2NN kernels #3891

Closed
wants to merge 10 commits into from

Conversation

mdoijade
Copy link
Contributor

No description provided.

@mdoijade mdoijade requested a review from a team as a code owner May 24, 2021 14:12
@dantegd
Copy link
Member

dantegd commented May 24, 2021

The errors in CI seem to be related to the updates in the RAFT PR if I’m not mistaken (just checking that CI is working correctly here)?:

cuml.test.test_metrics.test_pairwise_distances_sklearn_comparison[matrix_size0-cityblock]
cuml.test.test_metrics.test_pairwise_distances_sklearn_comparison[matrix_size0-l1]
cuml.test.test_metrics.test_pairwise_distances_sklearn_comparison[matrix_size0-manhattan]

@mdoijade
Copy link
Contributor Author

@dantegd not sure if this is issue in the RAFT PR as the all pytest passes on "gpuCI/cuml/gpu/cuda/11.2/python/3.8/centos7" but those 3 tests are failing on "gpuCI/cuml/gpu/cuda/11.2/python/3.8/ubuntu18.04" "gpuCI/cuml/gpu/cuda/11.0/python/3.7/ubuntu16.04".
Trying to repro locally, will update soon.

@teju85
Copy link
Member

teju85 commented May 25, 2021

rerun tests

@mdoijade
Copy link
Contributor Author

@teju85 @dantegd
I cannot reproduce this issue locally, tried on 2 ubuntu machines.
but I see that when I run --run_quality L1 and sqrt-L2 are having precision issue but that is for fp32 and not fp64. and this is not related to this PR I can repro this on a older commit as well.
I think I can fix that for now by reducing the precision in pytest test_metrics.py?

@mdoijade mdoijade requested a review from a team as a code owner May 26, 2021 15:16
@github-actions github-actions bot added the Cython / Python Cython or Python issue label May 26, 2021
@dantegd
Copy link
Member

dantegd commented May 28, 2021

rerun tests

@codecov-commenter
Copy link

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.06@95efa25). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@               Coverage Diff               @@
##             branch-21.06    #3891   +/-   ##
===============================================
  Coverage                ?   85.07%           
===============================================
  Files                   ?      228           
  Lines                   ?    17775           
  Branches                ?        0           
===============================================
  Hits                    ?    15122           
  Misses                  ?     2653           
  Partials                ?        0           
Flag Coverage Δ
dask 48.18% <0.00%> (?)
non-dask 77.26% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.


Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 95efa25...ef1012d. Read the comment docs.

@teju85 teju85 added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change Perf Related to runtime performance of the underlying code labels Jun 2, 2021
Copy link
Member

@teju85 teju85 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM.

Just a note for other reviewers: Before merging, we just need to revert the get_raft.cmake.

@teju85
Copy link
Member

teju85 commented Jun 2, 2021

@dantegd can we get python-side approval for this PR, please?

@mdoijade please revert the cmake changes as described above.

@@ -940,7 +940,7 @@ def test_pairwise_distances_sklearn_comparison(metric: str, matrix_size):
Y = rng.random_sample(matrix_size)

# For fp64, compare at 10 decimals, (5 places less than the ~15 max)
compare_precision = 10
compare_precision = 7
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we need this changes asap that rapidsai/raft#232 has been merged, @mdoijade can you update the PR to remove the change in get_raft so we can merge?

@dantegd
Copy link
Member

dantegd commented Jun 2, 2021

@teju85 didn't see your comment before leaving the review saying the same. Will approve immediately as the get_raft change is rolled back, just wanted to avoid an accidental merge

@github-actions github-actions bot removed the CUDA/C++ label Jun 2, 2021
@github-actions github-actions bot removed the CMake label Jun 2, 2021
@mdoijade
Copy link
Contributor Author

mdoijade commented Jun 2, 2021

@teju85 didn't see your comment before leaving the review saying the same. Will approve immediately as the get_raft change is rolled back, just wanted to avoid an accidental merge

@dantegd I've reverted the get_raft change from this PR.

@mdoijade mdoijade changed the title [WIP] use mdoijade raft fork for grid stride pairwise distance and fusedL2NN kernels use grid stride pairwise distance and fusedL2NN kernels Jun 2, 2021
@dantegd
Copy link
Member

dantegd commented Jun 2, 2021

@gpucibot merge

@dantegd
Copy link
Member

dantegd commented Jun 2, 2021

@teju85 @mdoijade various distance tests failed:

cuml.test.test_metrics.test_pairwise_distances_sklearn_comparison[matrix_size0-cityblock]
cuml.test.test_metrics.test_pairwise_distances_sklearn_comparison[matrix_size0-manhattan]

Given timing, we should revert the changes in the RAFT side and do the distances for 21.08

rapids-bot bot pushed a commit to rapidsai/raft that referenced this pull request Jun 2, 2021
After the merge of #232, a few different tests failed in rapidsai/cuml#3891, given the timing I think it'd be best to target 232 (again) to 21.08 after triaging the issues.

Authors:
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)
  - Divye Gala (https://github.com/divyegala)
  - Brad Rees (https://github.com/BradReesWork)

URL: #246
@caryr35 caryr35 changed the base branch from branch-21.06 to branch-21.08 June 3, 2021 21:23
@mdoijade mdoijade closed this Jun 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Cython / Python Cython or Python issue improvement Improvement / enhancement to an existing function non-breaking Non-breaking change Perf Related to runtime performance of the underlying code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants