-
Notifications
You must be signed in to change notification settings - Fork 933
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial draft of policies and guidelines for libcudf usage. #11853
Initial draft of policies and guidelines for libcudf usage. #11853
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small clarification request.
|
||
Users are therefore responsible for passing valid data into such APIs. | ||
_Note that this policy does not mean that libcudf performs no validation whatsoever_. | ||
libcudf APIs should still perform any validation that does not require introspection, such as checking whether input column/table sizes or dtypes are appropriate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This extends to output sizes and results as well.
In general there is no validation that an operation will produce a column that fits in the 2GB limit. There are a few minor exceptions where the validation does not require an extra kernel launch.
Also, operations that may produce overflow (integer or floating-point arithmetic) are also not checked. Overflow produces undefined results. For example, integer overflow to negative values is not guaranteed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My intent here was not to provide exhaustive lists of things we will and will not check, but rather to provide some examples of cases to illustrate the difference between introspective and non-introspective operations. I am wary of trying to maintain a complete list that will inevitably go out of date the day after it is published. Do you think it's worth trying to keep a larger list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose I would like to keep a list of things that keep being brought up (like these two here) so that I can point to this document to say this is by design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree this does not need to be exhaustive, but it would be nice if this policy guide can serve as a mini-FAQ of "why is libcudf designed this way?" I think that the concrete examples @davidwendt mentioned help illustrate the philosophy of the library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added some, and I've reworded this section to reflect that we want to keep a list but make no promises that it's exhaustive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to see us writing down some things that have been tribal knowledge for a while now.
Co-authored-by: Jake Hemstad <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only minor comments. Thanks!
Co-authored-by: Bradley Dice <[email protected]>
@davidwendt if you are satisfied with how I've addressed your comments feel free to merge this once you approve. |
@gpucibot merge |
Description
This PR adds a section to the developer documentation about various libcudf design decisions that affect users. These policies are important for us to document and communicate consistently. I am not sure what the best place for this information is, but I think the developer docs are a good place to start since until we address #11481 we don't have a great way to publish any non-API user-facing libcudf documentation. I've created this draft PR to solicit feedback from other libcudf devs about other policies that we should be documenting in a similar manner. Once everyone is happy with the contents, I would suggest that we merge this into the dev docs for now and then revisit a better place once we've tackled #11481.
Partly addresses #5505, #1781.
Resolves #4511.
Checklist