Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix intermediate type checking in expression parsing #14445

Merged
merged 4 commits into from
Nov 18, 2023

Conversation

vyasr
Copy link
Contributor

@vyasr vyasr commented Nov 17, 2023

Description

When parsing expressions, device data references are reused if there are multiple that are identical. Equality is determined by comparing the fields of the reference, but previously the data type was omitted. For column and literal references, this is OK because the data_index uniquely identifies the reference. For intermediates, however, the index is not sufficient to disambiguate because an expression could reuse a given location even if the operation produces a different data type. Therefore, the data type must be part of the equality operator.

Resolves #14409

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@vyasr vyasr added bug Something isn't working 3 - Ready for Review Ready for review by team libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change labels Nov 17, 2023
@vyasr vyasr self-assigned this Nov 17, 2023
@vyasr vyasr requested a review from a team as a code owner November 17, 2023 20:28
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This totally makes sense. I agree with this fix and would support it going into 23.12. Thanks @vyasr.

@vyasr
Copy link
Contributor Author

vyasr commented Nov 17, 2023

/merge

@aocsa
Copy link
Contributor

aocsa commented Nov 17, 2023

Regarding the issue at #14409, do you think it might be a good idea to consider implementing unit tests that involve scenarios like having 100 or 1000 elements in a tree, reaching 100 levels of depth, and so on? I believe it would be beneficial to introduce these pathological tests to conduct comprehensive testing and stress the AST, ultimately aiding in the identification and resolution of any potential issues. cc @felipeblazing

@vyasr
Copy link
Contributor Author

vyasr commented Nov 17, 2023

Regarding the issue at #14409, do you think it might be a good idea to consider implementing unit tests that involve scenarios like having 100 or 1000 elements in a tree, reaching 100 levels of depth, and so on? I believe it would be beneficial to introduce these pathological tests to conduct comprehensive testing and stress the AST, ultimately aiding in the identification and resolution of any potential issues. cc @felipeblazing

Yes absolutely. Would you be willing to contribute some new tests? You can follow the pattern in this PR. I'm afraid I won't have time to add them here, but if you opened a PR with such tests and it revealed issue I would be happy to help debug them later.

@aocsa
Copy link
Contributor

aocsa commented Nov 17, 2023

Regarding the issue at #14409, do you think it might be a good idea to consider implementing unit tests that involve scenarios like having 100 or 1000 elements in a tree, reaching 100 levels of depth, and so on? I believe it would be beneficial to introduce these pathological tests to conduct comprehensive testing and stress the AST, ultimately aiding in the identification and resolution of any potential issues. cc @felipeblazing

Yes absolutely. Would you be willing to contribute some new tests? You can follow the pattern in this PR. I'm afraid I won't have time to add them here, but if you opened a PR with such tests and it revealed issue I would be happy to help debug them later.

Sure I can do that.

@rapids-bot rapids-bot bot merged commit 723c565 into rapidsai:branch-23.12 Nov 18, 2023
65 checks passed
@vyasr vyasr deleted the fix/issue_14409 branch November 18, 2023 02:41
@aocsa
Copy link
Contributor

aocsa commented Nov 21, 2023

Regarding the issue at #14409, do you think it might be a good idea to consider implementing unit tests that involve scenarios like having 100 or 1000 elements in a tree, reaching 100 levels of depth, and so on? I believe it would be beneficial to introduce these pathological tests to conduct comprehensive testing and stress the AST, ultimately aiding in the identification and resolution of any potential issues. cc @felipeblazing

Yes absolutely. Would you be willing to contribute some new tests? You can follow the pattern in this PR. I'm afraid I won't have time to add them here, but if you opened a PR with such tests and it revealed issue I would be happy to help debug them later.

Sure I can do that.

@vyasr, I have created a PR that introduces unit test described as before. You can find it here. Good thing, I did not identify any new issues.

@vyasr
Copy link
Contributor Author

vyasr commented Nov 21, 2023

Thanks! I'll have a look.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team bug Something isn't working libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Inconsistency in cudfAST Evaluation for Complex Expressions
4 participants