[QST] How to check if GDS is enabled #8838

Closed
a3634 opened this issue Jul 23, 2021 · 4 comments
Labels
question Further information is requested

Comments

a3634 commented Jul 23, 2021

Hi. I read the documentation about GPUDirect Storage (GDS). Since it does not include an example, I want to ask here.

First, is this the right code to enable GDS?
```python
import os

os.environ['LIBCUDF_CUFILE_POLICY'] = 'ALWAYS'
```
Second, the documentation says that when the variable is set to 'ALWAYS', an exception is thrown if GDS fails. So if cudf.read_csv() works, does that mean GDS was used successfully?

a3634 added the "Needs Triage" (Need team to review and classify) and "question" (Further information is requested) labels Jul 23, 2021
quasiben (Member) commented

  1. Yes, either by setting os.environ in the script or by setting the variable on the command line:

LIBCUDF_CUFILE_POLICY=ALWAYS python script.py

  2. I believe that is correct: GDS successfully read in the file. But it's worth pointing out the note below from that doc:

NOTE: current GDS integration is not fully optimized and enabling GDS will not lead to performance improvements in all cases.

@vuule has a PR to update the docs with all the GDS-enabled formats: #8805
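
The two ways of setting the variable mentioned above can be sketched in Python as follows. This is only an illustration: the helper names `set_policy_in_process` and `run_script_with_policy` are hypothetical, and the sketch only passes the variable along; it does not import cudf or confirm that GDS is actually used.

```python
import os
import subprocess
import sys

POLICY_VAR = "LIBCUDF_CUFILE_POLICY"


def set_policy_in_process(policy="ALWAYS"):
    """Option 1: set the variable for the current Python process."""
    os.environ[POLICY_VAR] = policy
    return os.environ[POLICY_VAR]


def run_script_with_policy(args, policy="ALWAYS"):
    """Option 2: start a child Python process with the variable set.

    Equivalent in spirit to `LIBCUDF_CUFILE_POLICY=ALWAYS python script.py`.
    """
    env = dict(os.environ)
    env[POLICY_VAR] = policy
    return subprocess.run(
        [sys.executable, *args],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    set_policy_in_process()
    # A real script would now `import cudf` and call cudf.read_csv(...).
    # Here the child process just echoes the variable it received.
    child = run_script_with_policy(
        ["-c", "import os; print(os.environ['LIBCUDF_CUFILE_POLICY'])"]
    )
    print(child.stdout.strip())
```

Setting the variable at the very top of the script, before cudf does any I/O, is the cautious choice; whether it is also picked up later is an assumption not confirmed in this thread.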

beckernick removed the "Needs Triage" (Need team to review and classify) label Jul 23, 2021
vuule (Contributor) commented Jul 23, 2021

If you set the variable to 'ALWAYS', the cuFile library will always be used for I/O, and cuFile's compatibility mode will be enabled. This means that cuFile falls back to host reads/writes plus a memory copy if GDS is not available on the given drive/system.
So there is no setting that guarantees direct GDS reads/writes are being performed. If that is what you want to check, currently the only way is to enable logging in cufile.json and look at the output after running the test.
We are planning to add logging on the cudf side for exactly this reason: to make it easier to check the GDS behavior.
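
A minimal sketch of the logging approach described above, under several assumptions that are not stated in this thread and should be checked against NVIDIA's cuFile documentation: that cufile.json has a `logging` section with `dir` and `level` keys, that a config containing only that section is accepted, and that the `CUFILE_ENV_PATH_JSON` environment variable points cuFile at a custom config file. The helper names are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def write_cufile_logging_config(config_dir, level="DEBUG"):
    """Write a cufile.json that only raises the log level.

    Assumption: the "logging" section with "dir" and "level" keys matches
    the cuFile config schema; verify against the installed /etc/cufile.json.
    """
    config_dir = Path(config_dir)
    config_dir.mkdir(parents=True, exist_ok=True)
    config = {"logging": {"dir": str(config_dir), "level": level}}
    config_path = config_dir / "cufile.json"
    config_path.write_text(json.dumps(config, indent=2))
    return config_path


def use_cufile_config(config_path):
    """Point cuFile at the custom config (assumed variable name)."""
    os.environ["CUFILE_ENV_PATH_JSON"] = str(config_path)
    return os.environ["CUFILE_ENV_PATH_JSON"]


if __name__ == "__main__":
    path = write_cufile_logging_config(tempfile.mkdtemp(prefix="cufile_cfg_"))
    use_cufile_config(path)
    print(path.read_text())
    # A real check would now run the cudf read and inspect the log file
    # that cuFile writes into the configured directory.
```

Writing a fresh config file avoids editing the system-wide one, and the log can then be inspected after the cudf read to see whether the direct path or compatibility mode was used.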

github-actions commented

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

quasiben (Member) commented

I think we can close this. We have also moved to having GDS on by default: #9722
