JIT-unspill: fix potential deadlock #501

Merged
merged 1 commit into from
Jan 27, 2021

madsbk
Member

@madsbk madsbk commented Jan 27, 2021

Fixes a deadlock in which multiple threads call ProxifyHostFile.maybe_evict() but none of them can acquire both the ProxifyHostFile lock and the ProxyObject lock at the same time.

Should fix rapidsai/gpu-bdb#162 (comment)
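
For readers unfamiliar with this class of bug, the sketch below illustrates the general pattern: two locks that must both be held, and a single acquisition order that every code path follows. It is not the dask-cuda implementation. The class names `HostFile` and `Proxy`, the `unspill` method, and the `on_device` flag are hypothetical stand-ins, and the fixed ordering shown is one common remedy rather than a description of what this PR changes.

```python
import threading


class HostFile:
    """Hypothetical stand-in for a host file that tracks proxies."""

    def __init__(self):
        self.lock = threading.RLock()
        self.proxies = []

    def maybe_evict(self):
        # Lock order: host-file lock first, then each proxy lock.
        with self.lock:
            for proxy in self.proxies:
                with proxy.lock:
                    proxy.on_device = False


class Proxy:
    """Hypothetical stand-in for a proxied object."""

    def __init__(self, hostfile):
        self.lock = threading.RLock()
        self.hostfile = hostfile
        self.on_device = False

    def unspill(self):
        # Same order as maybe_evict(). Taking self.lock first and the
        # host-file lock second would invert the order, so two threads
        # could each hold one lock while waiting for the other.
        with self.hostfile.lock:
            with self.lock:
                self.on_device = True


def run_demo(n_threads=4, n_iters=200):
    """Return True if every worker thread finished (no deadlock)."""
    hostfile = HostFile()
    hostfile.proxies = [Proxy(hostfile) for _ in range(n_threads)]

    def work(proxy):
        for _ in range(n_iters):
            proxy.unspill()
            hostfile.maybe_evict()

    threads = [
        threading.Thread(target=work, args=(p,), daemon=True)
        for p in hostfile.proxies
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=30)
    return not any(t.is_alive() for t in threads)


if __name__ == "__main__":
    print("all threads finished:", run_demo())
```

Because no thread ever holds a proxy lock while waiting for the host-file lock, a circular wait between the two locks cannot form in this sketch.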

@madsbk madsbk requested a review from a team as a code owner January 27, 2021 16:37
@madsbk added the "2 - In Progress", "improvement", and "non-breaking" labels Jan 27, 2021
@codecov-io

codecov-io commented Jan 27, 2021

Codecov Report

Merging #501 (a755803) into branch-0.18 (32d9d33) will increase coverage by 0.93%.
The diff coverage is 95.01%.

@@               Coverage Diff               @@
##           branch-0.18     #501      +/-   ##
===============================================
+ Coverage        90.42%   91.35%   +0.93%     
===============================================
  Files               15       18       +3     
  Lines             1128     1435     +307     
===============================================
+ Hits              1020     1311     +291     
- Misses             108      124      +16     
Impacted Files Coverage Δ
dask_cuda/cli/dask_cuda_worker.py 96.92% <ø> (ø)
dask_cuda/cuda_worker.py 78.75% <75.00%> (+1.73%) ⬆️
dask_cuda/device_host_file.py 90.90% <80.00%> (-7.96%) ⬇️
dask_cuda/get_device_memory_objects.py 89.04% <89.04%> (ø)
dask_cuda/proxify_device_objects.py 91.83% <91.83%> (ø)
dask_cuda/proxy_object.py 89.80% <95.04%> (+2.00%) ⬆️
dask_cuda/explicit_comms/comms.py 99.02% <100.00%> (+0.01%) ⬆️
dask_cuda/explicit_comms/utils.py 100.00% <100.00%> (ø)
dask_cuda/local_cuda_cluster.py 81.17% <100.00%> (+0.68%) ⬆️
dask_cuda/proxify_host_file.py 100.00% <100.00%> (ø)
... and 5 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update f2739aa...a755803.

@madsbk added the "3 - Ready for Review" label and removed the "2 - In Progress" label Jan 27, 2021
try:
    hostfile.maybe_evict(self.__sizeof__())
finally:
    hostfile.lock.release()

Note for future reading: hostfile.lock is an RLock, so this block is ok.
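
A minimal sketch of why the lock type matters here, assuming (the snippet does not show it) that the lock was acquired just before the `try` and that `maybe_evict()` takes the same lock again internally. The module-level `lock` and the `maybe_evict` and `caller` functions are hypothetical names for illustration only; `threading.RLock` and `threading.Lock` are the standard-library primitives.

```python
import threading

lock = threading.RLock()


def maybe_evict():
    # Re-acquired by the thread that already holds it; an RLock
    # counts nested acquisitions instead of blocking.
    with lock:
        return "evicted"


def caller():
    lock.acquire()
    try:
        return maybe_evict()
    finally:
        lock.release()


result = caller()

# For contrast: a plain Lock is not re-entrant, so a second acquire
# from the same thread does not succeed (non-blocking here to avoid
# hanging the example).
plain = threading.Lock()
plain.acquire()
second_try = plain.acquire(blocking=False)
plain.release()
```

With a plain `Lock` and a blocking acquire, the nested call in `maybe_evict()` would hang the calling thread on its own lock.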

@pentschev pentschev left a comment

LGTM, thanks @madsbk !

@pentschev added the "5 - Ready to Merge" label and removed the "3 - Ready for Review" label Jan 27, 2021
@pentschev

@gpucibot merge

@rapids-bot rapids-bot bot merged commit a215714 into rapidsai:branch-0.18 Jan 27, 2021
@madsbk madsbk deleted the jit-unspill-deadlock-fix branch January 28, 2021 08:35
Labels
5 - Ready to Merge, improvement, non-breaking