Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SNOW-1865904 fix query gen when nested cte node is partitioned #2816

Conversation

sfc-gh-aalam
Copy link
Contributor

  1. Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes SNOW-1865904

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
      • If this test skips Local Testing mode, I'm requesting review from @snowflakedb/local-testing
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am adding new credentials
    • I am adding a new dependency
    • If this is a new feature/behavior, I'm adding the Local Testing parity changes.
    • I acknowledge that I have ensured my changes to be thread-safe. Follow the link for more information: Thread-safe Developer Guidelines
  3. Please describe how your code solves the related issue.

    Update query generation for with query block where cte query is updated if the child plan is re-resolved during optimization process.

@sfc-gh-aalam sfc-gh-aalam added the NO-CHANGELOG-UPDATES This pull request does not need to update CHANGELOG.md label Dec 30, 2024
@sfc-gh-aalam sfc-gh-aalam marked this pull request as ready for review January 2, 2025 19:14
@sfc-gh-aalam sfc-gh-aalam requested review from a team as code owners January 2, 2025 19:14
Copy link
Contributor

@sfc-gh-helmeleegy sfc-gh-helmeleegy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a couple of nits. Thanks!

for parent in parents:
replace_child(parent, child, temp_table_selectable, self._query_generator)

nodes_to_reset = list(parents)
while nodes_to_reset:
node = nodes_to_reset.pop()
if node in updated_nodes:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you give some more detailed description about what is the problem if we skip the node update here ? the cte optimizaiton is doing a level to level update, so it can skip the node update if it is already updated, but for large query breakdown, i recall you were doing a dfs, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider a tree below

     node1
        |
     node2
   /       \
node3      node4

suppose we start by re-resolving node3 -> node2 -> node1. In this process, node2 and node1 are marked in updated_nodes. Now, when we go updating the ancestors of node4, re-resolution of node2 and node1 would be skipped. This is not ideal if node4 update can also trigger a re-update of node2 and node1.

For example, this could be problematic is when node3 and node4 before the update had referenced_cte map. After the first update node3 -> node2 -> node1, node2 will be resolved with an older version of node4. If after a re-resolved node4 there are no referenced_ctes, then node2 will not be updated with this information.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so it is because you are doing a dfs, if you are doing updated in the order of node3, node4 then node2, last node1 you shouldn't run into problems.

@@ -185,7 +185,6 @@ def compile(self) -> Dict[PlanQueryType, List[Query]]:
error_type=type(e).__name__,
error_message=str(e),
)
pass
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have a bug before?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no. I just noticed an unnecessary pass so I cleaned it up.

@sfc-gh-aalam sfc-gh-aalam enabled auto-merge (squash) January 3, 2025 18:33
@sfc-gh-aalam sfc-gh-aalam merged commit a79fe9f into main Jan 3, 2025
40 checks passed
@sfc-gh-aalam sfc-gh-aalam deleted the aalam-SNOW-1865904-fix-query-gen-when-nested-cte-node-is-partitioned branch January 3, 2025 19:20
@github-actions github-actions bot locked and limited conversation to collaborators Jan 3, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
NO-CHANGELOG-UPDATES This pull request does not need to update CHANGELOG.md
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants