Fix intermediate type checking in expression parsing #14445

vyasr · 2023-11-17T20:28:52Z

Description

When parsing expressions, identical device data references are deduplicated and reused. Equality is determined by comparing the fields of the reference, but previously the data type was omitted from that comparison. For column and literal references, this is OK because the data_index uniquely identifies the reference. For intermediates, however, the index is not sufficient to disambiguate, because an expression can reuse a given intermediate location even when the operation writing to it produces a different data type. Therefore, the data type must be part of the equality operator.
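As a rough illustration of the idea (not the actual libcudf code), the sketch below uses simplified, hypothetical stand-ins: `ref_kind`, `type_id`, `data_reference`, and `add_reference` are names invented for this example. It shows how including the data type in the equality comparison keeps two intermediates that share an index but differ in type from being collapsed into one reference.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <tuple>
#include <vector>

// Hypothetical, simplified stand-ins for illustration only;
// these are not the libcudf definitions.
enum class ref_kind { COLUMN, LITERAL, INTERMEDIATE };
enum class type_id { INT32, FLOAT64, BOOL8 };

struct data_reference {
  ref_kind kind;  // what the reference points at
  type_id type;   // data type produced or stored at that location
  int index;      // column, literal, or intermediate slot index

  // Compare every field, including the data type. Without `type`, two
  // intermediates sharing a slot index would compare equal even when the
  // operations writing to that slot produce different types.
  bool operator==(data_reference const& rhs) const
  {
    return std::tie(kind, type, index) == std::tie(rhs.kind, rhs.type, rhs.index);
  }
};

// Return the position of an existing identical reference, or append a new one.
std::size_t add_reference(std::vector<data_reference>& refs, data_reference const& ref)
{
  auto const it = std::find(refs.begin(), refs.end(), ref);
  if (it != refs.end()) {
    return static_cast<std::size_t>(std::distance(refs.begin(), it));
  }
  refs.push_back(ref);
  return refs.size() - 1;
}
```

In this sketch, adding an `INTERMEDIATE` reference at index 0 with type `INT32` and then another at index 0 with type `FLOAT64` yields two distinct entries, while adding the `INT32` one again reuses the first entry.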

Resolves #14409

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

bdice

This totally makes sense. I agree with this fix and would support it going into 23.12. Thanks @vyasr.

vyasr · 2023-11-17T21:40:03Z

/merge

aocsa · 2023-11-17T22:27:26Z

Regarding the issue at #14409, do you think it might be a good idea to consider implementing unit tests that involve scenarios like having 100 or 1000 elements in a tree, reaching 100 levels of depth, and so on? I believe it would be beneficial to introduce these pathological tests to conduct comprehensive testing and stress the AST, ultimately aiding in the identification and resolution of any potential issues. cc @felipeblazing

vyasr · 2023-11-17T22:37:43Z

Yes, absolutely. Would you be willing to contribute some new tests? You can follow the pattern in this PR. I'm afraid I won't have time to add them here, but if you opened a PR with such tests and it revealed issues, I would be happy to help debug them later.

aocsa · 2023-11-17T22:47:16Z

Sure I can do that.

aocsa · 2023-11-21T17:40:46Z

@vyasr, I have created a PR that introduces the unit tests described above. You can find it here. The good news is that I did not identify any new issues.

vyasr · 2023-11-21T21:15:05Z

Thanks! I'll have a look.

vyasr added 3 commits November 14, 2023 14:23

Add test demonstrating the issue

b90fa22

Fix underlying bug

a74fec1

Fix test to validate correctly and add some comments

fc49484

vyasr added the bug, 3 - Ready for Review, libcudf, and non-breaking labels Nov 17, 2023

vyasr self-assigned this Nov 17, 2023

vyasr requested a review from a team as a code owner November 17, 2023 20:28

vyasr requested review from robertmaynard and nvdbaranec November 17, 2023 20:28

vyasr mentioned this pull request Nov 17, 2023

[BUG] Inconsistency in cudfAST Evaluation for Complex Expressions #14409

Closed

davidwendt approved these changes Nov 17, 2023

bdice approved these changes Nov 17, 2023

Merge branch 'branch-23.12' into fix/issue_14409

56e34b2

rapids-bot bot merged commit 723c565 into rapidsai:branch-23.12 Nov 18, 2023
65 checks passed

vyasr deleted the fix/issue_14409 branch November 18, 2023 02:41
