Interruptible execution #4463

Merged
merged 13 commits into from
Feb 8, 2022

achirkin
Contributor

@achirkin achirkin commented Dec 22, 2021

Cooperative-style interruptible C++ threads.

This proposal aims to make cuML more responsive by providing an easier way to interrupt or cancel long-running cuML tasks. It replaces calls to cudaStreamSynchronize with raft::interruptible::synchronize, which serves as a cancellation point in the algorithms. With a small extra hook on the Python side, Ctrl+C requests can now interrupt execution (almost) immediately. So far, I have adapted just a few models as a proof of concept.

Example:

import sklearn.datasets
import cuml.svm

X, y = sklearn.datasets.fetch_olivetti_faces(return_X_y=True)
model = cuml.svm.SVC()
print("Data loaded; fitting... (try Ctrl+C now)")
try:
    model.fit(X, y)
    print("Done! Score:", model.score(X, y))
except Exception as e:
    print("Canceled!")
    print(e)

Implementation details

rapidsai/raft#433

Adoption costs

From the changeset in this PR you can see that I introduce two types of changes:

  1. Replace cudaStreamSynchronize with either handle.sync_stream or raft::interruptible::synchronize
  2. Wrap the Cython calls with cuda_interruptible and nogil

Change (1) is straightforward and can mostly be automated.

Change (2) is a bit more involved. You definitely have to wrap a C++ call with cuda_interruptible to make Ctrl+C work, but that is also rather simple. The tricky part is adding nogil, because you have to make sure there are no Python objects inside the with nogil block. However, nogil does not seem to be strictly required for the signal handler to interrupt the C++ thread successfully; it worked in my tests without nogil as well. Still, I chose to add nogil wherever possible, because in theory it should reduce the interrupt latency and allow more multithreading.

Motivation

In general, this proposal makes executing threads (and thus algos/models) more controllable. The main use cases I see:

  1. Being able to interrupt a running model with Ctrl+C using signal handlers.
  2. Stopping the thread programmatically, e.g., we can write tests of the sort "if running for more than n seconds, stop and fail".
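
Use case 2 could look roughly like the following pure-Python sketch. It does not use the raft or cuML API: `Cancelled`, `long_job`, and `run_with_timeout` are hypothetical names introduced only for illustration, and a `threading.Event` stands in for the per-thread cancellation flag.

```python
import threading
import time


class Cancelled(Exception):
    """Raised inside the worker when a cancellation request is observed."""


def long_job(cancel_requested: threading.Event, steps: int, step_seconds: float) -> int:
    """Hypothetical long-running task that checks for cancellation between steps."""
    for done in range(steps):
        if cancel_requested.is_set():  # cancellation point
            raise Cancelled(f"cancelled after {done} steps")
        time.sleep(step_seconds)  # stands in for waiting on the GPU
    return steps


def run_with_timeout(steps: int, step_seconds: float, timeout: float) -> dict:
    """Run long_job in a thread; request cancellation if it exceeds `timeout`."""
    cancel_requested = threading.Event()
    outcome = {}

    def target():
        try:
            outcome["result"] = long_job(cancel_requested, steps, step_seconds)
        except Cancelled as exc:
            outcome["error"] = exc

    worker = threading.Thread(target=target)
    worker.start()
    worker.join(timeout)
    if worker.is_alive():
        cancel_requested.set()  # ask the worker to stop at its next check
        worker.join()
    return outcome
```

A test would then fail when `"error"` is present in the returned dictionary. The worker stops cooperatively at its next cancellation point, so the process itself keeps running.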

Resolves #4384

@github-actions github-actions bot added CMake CUDA/C++ Cython / Python Cython or Python issue labels Dec 22, 2021
@achirkin achirkin added Experimental Used to denote experimental features feature request New feature or request non-breaking Non-breaking change labels Dec 22, 2021
@achirkin achirkin changed the title [POC] Interruptible execution Interruptible execution Jan 12, 2022
@achirkin
Contributor Author

rerun tests

@achirkin achirkin marked this pull request as ready for review January 12, 2022 14:01
@achirkin achirkin requested review from a team as code owners January 12, 2022 14:01
@achirkin achirkin added the 3 - Ready for Review Ready for review by team label Jan 12, 2022
@achirkin
Contributor Author

rerun tests

@achirkin achirkin changed the base branch from branch-22.02 to branch-22.04 January 26, 2022 08:04
Member

@cjnolet cjnolet left a comment

Very nice, and glad to see the docs were updated as well!

@achirkin achirkin removed the Experimental Used to denote experimental features label Feb 2, 2022
rapids-bot bot pushed a commit to rapidsai/raft that referenced this pull request Feb 8, 2022
### Cooperative-style interruptible C++ threads.

This proposal introduces `raft::interruptible`, which provides three functions:
```C++
static void synchronize(rmm::cuda_stream_view stream);
static void yield();
static void cancel(std::thread::id thread_id);
```
`synchronize` and `yield` serve as cancellation points for the executing CPU thread. `cancel` raises an asynchronous exception in a target CPU thread, which is observed at the next cancellation point. Together, these make it possible to cancel a long-running job without killing the OS process.

The key to making this work is the observation that the CPU spends most of its time waiting on `cudaStreamSynchronize`. By replacing that call with `interruptible::synchronize`, we introduce cancellation points in all critical places in the code. If that is not enough in some edge cases (when the cancellation points are too far apart), a developer can use `yield` to ensure that a cancellation request is received sooner rather than later.

#### Implementation

##### C++

`raft::interruptible` keeps a `std::atomic_flag` in thread-local storage for each thread; the flag indicates whether the thread may continue executing (that is, whether it is in the non-cancelled state). [`cancel`](https://github.com/rapidsai/raft/blob/6948cab96483ddc7047b1ae0a162574e32bcd8f0/cpp/include/raft/interruptible.hpp#L122) clears this flag, and [`yield`](https://github.com/rapidsai/raft/blob/6948cab96483ddc7047b1ae0a162574e32bcd8f0/cpp/include/raft/interruptible.hpp#L194-L204) checks it and resets it to the signalled state, throwing a `raft::interrupted_exception` if the thread has been cancelled. [`synchronize`](https://github.com/rapidsai/raft/blob/6948cab96483ddc7047b1ae0a162574e32bcd8f0/cpp/include/raft/interruptible.hpp#L206-L217) implements a spin loop that queries the state of the stream and calls `yield` on each iteration. I also add an overload [`sync_stream`](https://github.com/rapidsai/raft/blob/ee99523ff6a8257ec213e5ad15292f2132a2a687/cpp/include/raft/handle.hpp#L133) to the raft handle type to make it easier to modify the behavior of all synchronization calls in raft and cuML.
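
The interplay of the three functions can be illustrated with a small pure-Python analogy. This is not the raft implementation: the registry `_flags`, the name `yield_now`, and the `is_done` callback (a stand-in for querying the stream) are assumptions made for the sketch, and the flag polarity is simplified so that a set flag means "cancellation requested".

```python
import threading
import time

# Hypothetical registry: one "cancel requested" flag per thread id.
_flags = {}
_flags_lock = threading.Lock()


def _flag_for(thread_id: int) -> threading.Event:
    """Return the flag of the given thread, creating it on first use."""
    with _flags_lock:
        return _flags.setdefault(thread_id, threading.Event())


def cancel(thread_id: int) -> None:
    """Request cancellation of the thread with the given id."""
    _flag_for(thread_id).set()


def yield_now() -> None:
    """Cancellation point: raise if this thread was cancelled, then reset the flag."""
    flag = _flag_for(threading.get_ident())
    if flag.is_set():
        flag.clear()  # back to the non-cancelled state, so the thread stays usable
        raise InterruptedError("cancellation requested")


def synchronize(is_done, poll_seconds: float = 0.001) -> None:
    """Poll `is_done` (stand-in for a stream query), yielding on each iteration."""
    while not is_done():
        yield_now()
        time.sleep(poll_seconds)
```

Because the waiting thread polls instead of blocking, a `cancel` issued from any other thread is observed within roughly one polling interval.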

##### Python
This proposal adds a context manager [`cuda_interruptible`](https://github.com/rapidsai/raft/blob/36e8de5f73e9ec7e604b38a4290ac82bc35be4b7/python/raft/common/interruptible.pyx#L28) to handle Ctrl+C requests during C++ calls (using POSIX signals). On Ctrl+C, `cuda_interruptible` simply calls `raft::interruptible::cancel` on the target C++ thread.
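
The general shape of such a hook can be sketched in plain Python. The context manager below is a hypothetical stand-in named `interruptible`, not the actual `cuda_interruptible` code; it sets a `threading.Event` where the real hook would cancel the target C++ thread.

```python
import signal
import threading
from contextlib import contextmanager


@contextmanager
def interruptible(cancel_requested: threading.Event):
    """While active, turn SIGINT (Ctrl+C) into a cancellation request.

    Must be entered from the main thread, because Python only allows
    installing signal handlers there.
    """

    def on_sigint(signum, frame):
        cancel_requested.set()  # stand-in for cancelling the target C++ thread

    previous = signal.signal(signal.SIGINT, on_sigint)
    try:
        yield cancel_requested
    finally:
        signal.signal(signal.SIGINT, previous)  # restore the old handler
```

Restoring the previous handler on exit keeps the normal `KeyboardInterrupt` behavior outside the wrapped call.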

#### Motivation
See rapidsai/cuml#4463

Resolves rapidsai/cuml#4384

Authors:
  - Artem M. Chirkin (https://github.com/achirkin)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)
  - Tamas Bela Feher (https://github.com/tfeher)

URL: #433
@github-actions github-actions bot removed the CMake label Feb 8, 2022
@cjnolet
Member

cjnolet commented Feb 8, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 7e7832d into rapidsai:branch-22.04 Feb 8, 2022
@codecov-commenter

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.04@f95e93b).
The diff coverage is n/a.

@@               Coverage Diff               @@
##             branch-22.04    #4463   +/-   ##
===============================================
  Coverage                ?   85.71%           
===============================================
  Files                   ?      236           
  Lines                   ?    19364           
  Branches                ?        0           
===============================================
  Hits                    ?    16597           
  Misses                  ?     2767           
  Partials                ?        0           
Flag Coverage Δ
dask 46.46% <0.00%> (?)
non-dask 78.62% <0.00%> (?)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update f95e93b...6f4543e.

vimarsh6739 pushed a commit to vimarsh6739/cuml that referenced this pull request Oct 9, 2023
Authors:
  - Artem M. Chirkin (https://github.com/achirkin)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#4463
Labels
3 - Ready for Review Ready for review by team CUDA/C++ Cython / Python Cython or Python issue feature request New feature or request non-breaking Non-breaking change
Development

Successfully merging this pull request may close these issues.

[FEA] Interruptible execution
3 participants