Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assert for non-empty nulls #13071

Merged
merged 20 commits into from
Apr 24, 2023
Merged

Assert for non-empty nulls #13071

merged 20 commits into from
Apr 24, 2023

Conversation

razajafri
Copy link
Contributor

@razajafri razajafri commented Apr 5, 2023

Added assert check to ColumnView creation to ensure there are no non-empty nulls at the time of creation. This also contains changes to bypass this check by setting -da:ai.rapids.cudf.AssertEmptyNulls.

Description

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

Added assert check to ColumnView creation and
appropriate changes to bypass the check for tests
@razajafri razajafri added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Apr 5, 2023
@razajafri razajafri self-assigned this Apr 5, 2023
@razajafri razajafri requested a review from a team as a code owner April 5, 2023 19:04
@github-actions github-actions bot added the Java Affects Java cuDF API. label Apr 5, 2023
java/pom.xml Outdated Show resolved Hide resolved
@razajafri razajafri requested a review from ttnghia April 6, 2023 17:47
@razajafri
Copy link
Contributor Author

@revans2

Would you mind taking a look at this when you have time. This addresses the issue of imported data having non-empty nulls. I just had to add another system property for the unit-tests to pass. PTAL

@razajafri
Copy link
Contributor Author

There seem to be sync problems from the fork and the latest changes are not showing here.

@razajafri razajafri requested a review from revans2 April 18, 2023 18:18
@razajafri
Copy link
Contributor Author

@revans2 @ttnghia

I think I have addressed all your concerns, PTAL

java/pom.xml Outdated Show resolved Hide resolved
@razajafri
Copy link
Contributor Author

CI failure has nothing to do with the changes in this PR.

Copy link
Contributor

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a nit.

@razajafri
Copy link
Contributor Author

@revans2 I wanted to see what you think about this.

Considering this is something that CUDF requires and assertions in Java aren't enabled by default, should we transition this to a hard check and throw an Exception instead of using an assert? We can do this at a later time too, I just wanted to know if we need a follow-on issue for this.

@revans2
Copy link
Contributor

revans2 commented Apr 20, 2023

@revans2 I wanted to see what you think about this.

Considering this is something that CUDF requires and assertions in Java aren't enabled by default, should we transition this to a hard check and throw an Exception instead of using an assert? We can do this at a later time too, I just wanted to know if we need a follow-on issue for this.

We want the assertion off in production because checking if they are empty involves launching at least one kernel and is rather expensive. Having it be an assertion lets us verify that it is doing the right thing in tests, but we get the performance in production. It also lets us enable it in production if we think there is a problem. That said, in some cases where we know that we would get data corruption if we get a non-empty null, it is probably worth doing a check and then fixing the problem, at least until we feel comfortable enough that we have fixed all of the issues.

@razajafri
Copy link
Contributor Author

rebuild

@razajafri
Copy link
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit 777c1f4 into rapidsai:branch-23.06 Apr 24, 2023
@razajafri razajafri deleted the asserts branch April 24, 2023 17:55
rapids-bot bot pushed a commit that referenced this pull request Apr 25, 2023
…FileTest and CudaFatalTest (#13213)

As part of #13071 changes were made to add tests that will fail unless assertions are off for those tests. This PR adds `ColumnViewNonEmptyNullsTest` to the list of skipped tests

Authors:
  - Raza Jafri (https://github.com/razajafri)

Approvers:
  - Jason Lowe (https://github.com/jlowe)

URL: #13213
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improvement / enhancement to an existing function Java Affects Java cuDF API. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants