-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix MG PLC algos intermittent hang #2607
Fix MG PLC algos intermittent hang #2607
Conversation
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## branch-22.10 #2607 +/- ##
===============================================
Coverage ? 60.07%
===============================================
Files ? 112
Lines ? 6154
Branches ? 0
===============================================
Hits ? 3697
Misses ? 2457
Partials ? 0 Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am requesting to delete cudf_result
, looks good otherwise.
# the same PLC graph, the current iteration might try to cache | ||
# the past iteration's futures and this can cause a hang if some | ||
# of those futures get released midway | ||
del result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think removing result
is still a good practice as when cudf_result
is completed we will still have result
in memory so from a memory relief standpoint it makes sense.
|
||
ddf = dask_cudf.from_delayed(cudf_result).persist() | ||
wait(ddf) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest we also del cudf_result
here to save on memory . We have a renumbered call after this so result
does not go out of scope immediately, hence its important .
rerun tests |
1 similar comment
rerun tests |
rerun tests |
rerun tests |
@gpucibot merge |
Dask doesn't always release some of the inactive futures fast enough. This can be problematic when running the same
algo
several times with the samePLC graph
because those futures can be cache in the next iteration causing a hang if some get released midway.This PR manually delete inactive futures.
closes #2568