[REVIEW]Optimize cugraph-DGL csc codepath #3977
/ok to test
# Note: We transfer tensors to CPU here to avoid the overhead of
# transferring them in each iteration of the for loop below.
major_offsets_cpu = major_offsets.to("cpu").numpy()
label_hop_offsets_cpu = label_hop_offsets.to("cpu").numpy()
@tingyu66, this is the main optimization: transferring between CPU and GPU one tensor at a time was slow.
Thanks for resolving the .item() overhead. 👍
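For readers following the thread, here is a minimal sketch of the pattern discussed above. It is not the PR's actual code: the helper name `split_by_offsets` and the sample data are hypothetical, and only the `.to("cpu").numpy()` call mirrors the diff. The idea is to copy the offsets to host memory once, then index with plain integers inside the loop instead of calling `.item()` on a GPU tensor in every iteration.

```python
import torch


def split_by_offsets(values: torch.Tensor, offsets: torch.Tensor) -> list:
    """Slice `values` into consecutive segments delimited by `offsets`.

    Hypothetical helper for illustration only.
    """
    # One bulk device-to-host copy, done before the loop.
    offsets_cpu = offsets.to("cpu").numpy()
    segments = []
    for i in range(len(offsets_cpu) - 1):
        # Host-side integers: no per-iteration device synchronization,
        # unlike calling offsets[i].item() on a GPU tensor here.
        start, end = int(offsets_cpu[i]), int(offsets_cpu[i + 1])
        segments.append(values[start:end])
    return segments


# Falls back to CPU so the sketch also runs on machines without a GPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
values = torch.arange(10, device=device)
offsets = torch.tensor([0, 3, 7, 10], device=device)
segments = split_by_offsets(values, offsets)
```

The segments themselves stay on the original device; only the small offsets array is moved to the host.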
@@ -22,7 +22,6 @@
get_allocation_counts_dask_lazy,
@oorliu, the only DGL-specific args for our benchmarking efforts are:
--reverse_edges \
--sampling_target_framework cugraph_dgl_csr
/ok to test
benchmarks/cugraph/standalone/bulk_sampling/cugraph_bulk_sampling.py
…ing.py Co-authored-by: Alex Barghi <[email protected]>
👍
else:
    # FIXME: Update these arguments when CSC mode is fixed in cuGraph-PyG (release 24.02)
    sampling_kwargs = {
        "deduplicate_sources": True,
        "prior_sources_behavior": "exclude",
        "renumber": True,
        "compression": "COO",
        "compress_per_hop": False,
        "use_legacy_names": False,
        "include_hop_column": True
    }
Does this setting also work for the cugraph-dgl COO code path?
I will have to test. I focused on the CSC code path to make sure it works (given that it is the fastest one), but I don't see why it would not work.
But prior_sources_behavior needs to be True for dgl, right? I think the COO path would be useful in the future for debugging purposes.
@tingyu66, I have not spent time exploring the COO code path. Do you think I should focus on it? Do we expect it to have equivalent speed (maybe after optimizations)?
I will have to add something like sampling_target_framework=='cugraph_dgl_coo' with prior_sources_behavior=True. The above path is for cugraph-pyG.
I don't think getting COO up to speed is a priority, but it would be useful to include or document the parameter combinations needed for the COO path, unless we no longer support it.
I will add that as a follow-up. Filed #3981 to track.
/merge
/ok to test
/ok to test
This PR optimizes the cugraph-DGL CSC code path and adds an end-to-end benchmark using cugraph-dgl.
With PR:
MAIN:
E2E Benchmarks:
PR:
MAIN: