Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-49683][SQL] Block trim collation #48336

Closed
wants to merge 11 commits into from

Conversation

jovanpavl-db
Copy link
Contributor

What changes were proposed in this pull request?

Trim collation is currently in implementation phase. These change blocks all paths from using it and afterwards trim collation gets enabled for different expressions it will be gradually whitelisted.

Why are the changes needed?

Trim collation is currently in implementation phase. These change blocks all paths from using it and afterwards trim collation gets enabled for different expressions it will be gradually whitelisted.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

No additional tests, just added field that's not used.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Oct 3, 2024
@HyukjinKwon HyukjinKwon changed the title [SPARK-49683][SQL]Block trim collation [SPARK-49683][SQL] Block trim collation Oct 4, 2024
Copy link
Member

@MaxGekk MaxGekk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@uros-db @mihailom-db Could you review this PR, please.

Copy link
Contributor

@uros-db uros-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@HyukjinKwon
Copy link
Member

Merged to master.

MaxGekk pushed a commit that referenced this pull request Oct 17, 2024
### What changes were proposed in this pull request?
Simplifying the AbstractStringType hierarchy.

### Why are the changes needed?
The addition of trim-sensitive collation (#48336) highlighted the complexity of extending the existing AbstractStringType structure. Besides adding a new parameter to all types inheriting from AbstractStringType, it caused changing the logic of every subclass as well as changing the name of a derived class StringTypeAnyCollation into StringTypeWithCaseAccentSensitivity which could again be subject to change if we keep adding new specifiers.

Looking ahead, the introduction of support for indeterminate collation would further complicate these types. To address this, the proposed changes simplify the design by consolidating common logic into a single base class. This base class will handle core functionality such as trim or indeterminate collation, while a derived class, StringTypeWithCollation (previously awkwardly called StringTypeWithCaseAccentSensitivity), will manage collation specifiers.

This approach allows for easier future extensions: fundamental checks can be handled in the base class, while any new specifiers can be added as optional fields in StringTypeWithCollation.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
With existing tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #48459 from stefankandic/refactorStringTypes.

Authored-by: Stefan Kandic <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
himadripal pushed a commit to himadripal/spark that referenced this pull request Oct 19, 2024
### What changes were proposed in this pull request?
Trim collation is currently in implementation phase. These change blocks all paths from using it and afterwards trim collation gets enabled for different expressions it will be gradually whitelisted.

### Why are the changes needed?
Trim collation is currently in implementation phase. These change blocks all paths from using it and afterwards trim collation gets enabled for different expressions it will be gradually whitelisted.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
No additional tests, just added field that's not used.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#48336 from jovanpavl-db/block-collation-trim.

Lead-authored-by: Jovan Pavlovic <[email protected]>
Co-authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
himadripal pushed a commit to himadripal/spark that referenced this pull request Oct 19, 2024
### What changes were proposed in this pull request?
Simplifying the AbstractStringType hierarchy.

### Why are the changes needed?
The addition of trim-sensitive collation (apache#48336) highlighted the complexity of extending the existing AbstractStringType structure. Besides adding a new parameter to all types inheriting from AbstractStringType, it caused changing the logic of every subclass as well as changing the name of a derived class StringTypeAnyCollation into StringTypeWithCaseAccentSensitivity which could again be subject to change if we keep adding new specifiers.

Looking ahead, the introduction of support for indeterminate collation would further complicate these types. To address this, the proposed changes simplify the design by consolidating common logic into a single base class. This base class will handle core functionality such as trim or indeterminate collation, while a derived class, StringTypeWithCollation (previously awkwardly called StringTypeWithCaseAccentSensitivity), will manage collation specifiers.

This approach allows for easier future extensions: fundamental checks can be handled in the base class, while any new specifiers can be added as optional fields in StringTypeWithCollation.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
With existing tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#48459 from stefankandic/refactorStringTypes.

Authored-by: Stefan Kandic <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants