SNOW-824475 Support Spark 3.4 #510

Merged
merged 1 commit into from
May 22, 2023

Conversation

sfc-gh-mrui
Contributor

@sfc-gh-mrui sfc-gh-mrui commented May 9, 2023

SNOW-824475 Support Spark 3.4

The main changes in Spark 3.4 that affect the Spark Connector (SC) are:

  1. The last argument of Cast has changed.
    • Old type: ansiEnabled: Boolean
    • New type: evalMode: EvalMode.Value
    • There are currently three modes: LEGACY, ANSI, and TRY.
    • SC pushes down Cast only when the mode is LEGACY. This keeps SC's behaviour consistent.
  2. ScalarSubquery has a new JoinHint argument.
    • SC only pushes down if joinCond is not empty, so the new parameter can be ignored.
  3. PromotePrecision is removed in Spark 3.4.
    • SC only needs to remove its handling of this plan node.
  4. Round has a new ansiEnabled argument.
    • SC pushes down Round only when ansiEnabled is false. This keeps SC's behaviour consistent.
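
The gating described in items 1 and 4 can be sketched roughly as follows. This is a minimal illustration, not the connector's actual code: `EvalMode`, `Expr`, `Column`, `Cast`, `Round` and `PushdownSketch` here are simplified, hypothetical stand-ins for the Spark Catalyst classes, and the generated SQL strings are assumed for the example.

```scala
// Hypothetical stand-ins for Spark Catalyst types; NOT the real Spark classes.
object EvalMode extends Enumeration {
  val LEGACY, ANSI, TRY = Value
}

sealed trait Expr
final case class Column(name: String) extends Expr
final case class Cast(child: Expr, dataType: String, evalMode: EvalMode.Value) extends Expr
final case class Round(child: Expr, scale: Int, ansiEnabled: Boolean) extends Expr

object PushdownSketch {
  // Returns Some(sql) when the expression may be pushed down, None otherwise.
  def toSql(e: Expr): Option[String] = e match {
    case Column(n) =>
      Some("\"" + n + "\"")
    // Cast is pushed down only in LEGACY mode.
    case Cast(child, dt, EvalMode.LEGACY) =>
      toSql(child).map(c => s"CAST($c AS $dt)")
    // Round is pushed down only when ansiEnabled is false.
    case Round(child, scale, false) =>
      toSql(child).map(c => s"ROUND($c, $scale)")
    // ANSI/TRY casts and ANSI rounds are left for Spark to evaluate.
    case _ =>
      None
  }
}
```

Returning `None` for the non-LEGACY cases means the expression is simply not pushed down, so Spark evaluates it itself with its own ANSI or TRY semantics.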

codecov bot commented May 9, 2023

Codecov Report

Merging #510 (c569caf) into master (78294d0) will decrease coverage by 0.07%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #510      +/-   ##
==========================================
- Coverage   89.59%   89.52%   -0.07%     
==========================================
  Files          52       52              
  Lines        4487     4487              
  Branches      744      741       -3     
==========================================
- Hits         4020     4017       -3     
- Misses        467      470       +3     

@sfc-gh-mrui sfc-gh-mrui force-pushed the mrui_spark34 branch 2 times, most recently from 400c86f to c1d7b56 Compare May 22, 2023 20:43
@sfc-gh-mrui sfc-gh-mrui changed the title Spark 3.4 PoC SNOW-732463 Support Spark 3.4 May 22, 2023
@sfc-gh-mrui sfc-gh-mrui marked this pull request as ready for review May 22, 2023 21:00
@sfc-gh-mrui sfc-gh-mrui changed the title SNOW-732463 Support Spark 3.4 SNOW-824475 Support Spark 3.4 May 22, 2023
Contributor

@sfc-gh-bli sfc-gh-bli left a comment

LGTM

@sfc-gh-mrui sfc-gh-mrui requested a review from sfc-gh-zli May 22, 2023 22:53
|( SELECT * FROM ( SELECT * FROM ( $test_table2 ) AS
|"SF_CONNECTOR_QUERY_ALIAS" ) AS "SUBQUERY_0" WHERE
|( "SUBQUERY_0"."O" IS NOT NULL ) ) AS "SUBQUERY_1"
""".stripMargin
Collaborator

If I understand correctly, we issue two SQLs now instead of one for this push down? Is that a performance issue?

Contributor Author

This is test-only and doesn't cause any performance problem in production.

  1. The generated query may differ between Spark versions.
  2. To make the test case work across Spark versions, expectedMultiplyQueries includes the query for both Spark 3.4 and previous versions. If the generated query matches either one, the test case succeeds.
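
The "matches either one" check can be illustrated with a small sketch. The names `QueryMatchSketch`, `normalize` and `matchesAny` are hypothetical and are not the helpers used in the connector's test suite; whitespace normalization is also an assumption made for the example.

```scala
object QueryMatchSketch {
  // Collapse runs of whitespace so formatting differences do not affect comparison.
  def normalize(sql: String): String = sql.trim.replaceAll("\\s+", " ")

  // True if the generated query equals any of the accepted queries after normalization.
  def matchesAny(actual: String, expected: Set[String]): Boolean =
    expected.map(normalize).contains(normalize(actual))
}
```

With this shape, a test would pass a set holding both the Spark 3.4 form and the pre-3.4 form of the query, and only one of them needs to match the query that was actually generated.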

@sfc-gh-mrui sfc-gh-mrui merged commit 463678d into master May 22, 2023
arthurli1126 pushed a commit to arthurli1126/spark-snowflake that referenced this pull request Jul 23, 2023