Manually open and close the cuFile driver #160

Merged

Conversation

madsbk
Member

@madsbk madsbk commented Jan 26, 2023

cuFile is supposed to open and close the driver automatically, but because of a bug in CUDA 11.8 it sometimes segfaults. This PR opens and closes the driver manually instead.
Closes #159

cc. @vuule

@madsbk madsbk added the bug (Something isn't working), improvement (Improves an existing functionality), and non-breaking (Introduces a non-breaking change) labels and removed the improvement label Jan 26, 2023
@madsbk madsbk changed the title manually open and close the cuFile driver Manually open and close the cuFile driver Jan 26, 2023
@madsbk madsbk marked this pull request as ready for review January 26, 2023 10:22
Contributor

@vuule vuule left a comment

looks good except for the suspicious cerr output

Comment on lines +80 to +81
std::cerr << "Unable to close GDS file driver: " << cufileop_status_error(error.err)
<< std::endl;
Contributor

is this a temporary change?

Member Author

No. We print a warning instead so that the destructor never throws an exception.

Contributor

Right, no exceptions in the dtor.
I posted the comment because libcudf does not write anything to the standard streams, and I thought the same might hold for KvikIO.
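
For readers following the thread, the pattern under discussion can be sketched roughly as below: open the driver in a constructor, close it in the destructor, and report a close failure on `std::cerr` rather than throwing. This is only an illustration, not the code in this PR. `FakeStatus`, `FakeDriver`, and `DriverGuard` are hypothetical stand-ins invented here so the example compiles without `cufile.h`; the real code would call the cuFile driver functions instead.

```cpp
#include <cassert>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a cuFile status value; NOT the real cufile.h type.
struct FakeStatus {
  bool ok;
  std::string message;
};

// Hypothetical stand-in for the driver. It tracks whether it is open and can
// be told to fail on close so the warning path can be exercised.
struct FakeDriver {
  bool is_open = false;
  bool fail_on_close = false;

  FakeStatus open() {
    is_open = true;
    return {true, ""};
  }

  FakeStatus close() {
    if (fail_on_close) { return {false, "simulated close failure"}; }
    is_open = false;
    return {true, ""};
  }
};

// RAII guard (hypothetical name): the constructor may throw, the destructor
// never does. A failed close is reported as a warning on std::cerr.
class DriverGuard {
 public:
  explicit DriverGuard(FakeDriver& driver) : driver_(driver) {
    FakeStatus status = driver_.open();
    if (!status.ok) {
      throw std::runtime_error("Unable to open GDS file driver: " + status.message);
    }
  }

  ~DriverGuard() noexcept {
    FakeStatus status = driver_.close();
    if (!status.ok) {
      // Throwing here could terminate the program, so only warn.
      std::cerr << "Unable to close GDS file driver: " << status.message << std::endl;
    }
  }

  DriverGuard(const DriverGuard&) = delete;
  DriverGuard& operator=(const DriverGuard&) = delete;

 private:
  FakeDriver& driver_;
};
```

The trade-off raised in the review still applies: a library that otherwise never writes to the standard streams may prefer a logging hook or a silent failure over `std::cerr`.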

@madsbk
Member Author

madsbk commented Jan 26, 2023

Note: the CI error is unrelated and is fixed in #156.

@quasiben
Member

rerun tests

@quasiben
Member

/merge

@rapids-bot rapids-bot bot merged commit 3fb9556 into rapidsai:branch-23.02 Jan 26, 2023
@madsbk madsbk deleted the explicit_init_cufile_driver branch January 30, 2023 09:51
vuule pushed a commit to vuule/kvikio that referenced this pull request Nov 8, 2023
…dules-defining-getattr

Always find the fast types/function corresponding to slow types/functions
rapids-bot bot pushed a commit that referenced this pull request Nov 1, 2024
Changes:

- Adding Python bindings for `cuFileDriverOpen()` and `cuFileDriverClose()`.
- We now [only open the cuFile driver explicitly](#160) in CUDA versions older than v12.2.
- Introducing `kvikio.cufile_driver.initialize()`, which opens the cuFile driver and closes it again at module exit.
- Let CI fail if KvikIO wasn't built with cuFile support.
  * Except on cuda11.8+arm64; cuFile didn't support Arm until CUDA v12.4.
- Some refactoring and cleanup.
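
As a rough illustration of two of the changes above (the version check and the open-now, close-at-exit behaviour), here is a minimal C++ sketch. It is not KvikIO's implementation: `needs_explicit_driver_open`, the `sketch` namespace, and its globals are hypothetical names, and the driver is simulated with a flag. The only outside fact relied on is CUDA's integer version encoding (1000 * major + 10 * minor, so 12.2 is 12020).

```cpp
#include <cassert>
#include <cstdlib>

// CUDA encodes versions as 1000 * major + 10 * minor, e.g. 11.8 -> 11080.
// Threshold taken from the change note above (explicit open only before 12.2).
constexpr int kFirstVersionWithoutExplicitOpen = 12020;

// Hypothetical helper: should the driver be opened explicitly for this version?
constexpr bool needs_explicit_driver_open(int cuda_version) {
  return cuda_version < kFirstVersionWithoutExplicitOpen;
}

namespace sketch {

// Simulated driver state; a real implementation would call the cuFile API.
bool g_driver_open = false;
bool g_close_registered = false;

void close_driver() { g_driver_open = false; }

// Opens the (simulated) driver once and registers a close at process exit.
// Returns true if this call performed the open, false if it was already open.
bool initialize() {
  if (g_driver_open) { return false; }
  g_driver_open = true;
  if (!g_close_registered) {
    std::atexit(close_driver);  // close again when the process exits
    g_close_registered = true;
  }
  return true;
}

}  // namespace sketch
```

With this sketch, `needs_explicit_driver_open(11080)` is true and `needs_explicit_driver_open(12020)` is false, and a second call to `sketch::initialize()` is a no-op.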

Authors:
  - Mads R. B. Kristensen (https://github.com/madsbk)

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #514
Labels
bug Something isn't working non-breaking Introduces a non-breaking change
Development

Successfully merging this pull request may close these issues.

Crash/segfault on exit when running libcudf tests with kvikIO and CUDA 11.8
3 participants