Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-31503][SQL] fix the SQL string of the TRIM functions #28281

Closed
wants to merge 1 commit into from

Conversation

cloud-fan
Copy link
Contributor

What changes were proposed in this pull request?

override the sql method of StringTrim, StringTrimLeft and StringTrimRight, to use the standard SQL syntax.

Why are the changes needed?

The current implementation is wrong. It gives you a SQL string that returns different result.

Does this PR introduce any user-facing change?

No

How was this patch tested?

new tests

@cloud-fan
Copy link
Contributor Author

cc @dongjoon-hyun @HyukjinKwon

@dongjoon-hyun
Copy link
Member

Thank you for pinging me, @cloud-fan .
Then, does this affect branch-2.4, too?

-- !query
SELECT trim(BOTH 'xyz' FROM 'yxTomxx'), trim('xyz' FROM 'yxTomxx')
-- !query schema
struct<trim(yxTomxx, xyz):string,trim(yxTomxx, xyz):string>
struct<TRIM(BOTH xyz FROM yxTomxx):string,TRIM(BOTH xyz FROM yxTomxx):string>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it.

@SparkQA
Copy link

SparkQA commented Apr 21, 2020

Test build #121568 has finished for PR 28281 at commit e3e2f14.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Copy link
Member

maropu commented Apr 21, 2020

retest this please

@cloud-fan
Copy link
Contributor Author

cloud-fan commented Apr 21, 2020

@dongjoon-hyun I think this bug exists the first day we add TRIM or the sql method. We should fix it in 2.4 as well.

@SparkQA
Copy link

SparkQA commented Apr 21, 2020

Test build #121575 has finished for PR 28281 at commit e3e2f14.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 21, 2020

Test build #121578 has finished for PR 28281 at commit 47f9d03.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Thank you, @cloud-fan and @maropu .
Merged to master/3.0.

dongjoon-hyun pushed a commit that referenced this pull request Apr 21, 2020
### What changes were proposed in this pull request?

override the `sql` method of `StringTrim`, `StringTrimLeft` and `StringTrimRight`, to use the standard SQL syntax.

### Why are the changes needed?

The current implementation is wrong. It gives you a SQL string that returns different result.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

new tests

Closes #28281 from cloud-fan/sql.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit b209b5f)
Signed-off-by: Dongjoon Hyun <[email protected]>
@dongjoon-hyun
Copy link
Member

Could you make a backporting PR to branch-2.4?

@dongjoon-hyun
Copy link
Member

cc @holdenk since she is the release manager for 2.4.6.

cloud-fan added a commit to cloud-fan/spark that referenced this pull request Apr 22, 2020
override the `sql` method of `StringTrim`, `StringTrimLeft` and `StringTrimRight`, to use the standard SQL syntax.

The current implementation is wrong. It gives you a SQL string that returns different result.

No

new tests

Closes apache#28281 from cloud-fan/sql.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit b209b5f)
Signed-off-by: Dongjoon Hyun <[email protected]>
@cloud-fan
Copy link
Contributor Author

The backport PR: #28299

dongjoon-hyun pushed a commit that referenced this pull request Apr 22, 2020
backport #28281 to 2.4

This backport has one difference: there is no `EXTRACT(... FROM ...)` SQL syntax in 2.4, so this PR just uses the common function call syntax.

Closes #28299 from cloud-fan/pick.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants