SNOW-1887892: improve process pool + thread pool logic, add retry logic #2924

Open
wants to merge 4 commits into
base: dev/data-source

Conversation

sfc-gh-yuwang
Collaborator

@sfc-gh-yuwang sfc-gh-yuwang commented Jan 23, 2025

…not wait, add retry logic

  1. Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes SNOW-1887892

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
      • If this test skips Local Testing mode, I'm requesting review from @snowflakedb/local-testing
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am adding new credentials
    • I am adding a new dependency
    • If this is a new feature/behavior, I'm adding the Local Testing parity changes.
    • I acknowledge that I have ensured my changes to be thread-safe. Follow the link for more information: Thread-safe Developer Guidelines
  3. Please describe how your code solves the related issue.

    This PR is meant to do two things:
    i. Decouple the process pool download from the thread pool upload/copy into table.
    ii. Add retry logic for both the process pool and the thread pool.
    A test for partition download will be added in a separate PR.
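To make the two goals concrete, here is a minimal sketch of the general pattern, not the code in this PR. The names `retry`, `run_pipeline`, `download`, and `upload` are hypothetical and chosen only for illustration; the executors are passed in so that the download side can be a `ProcessPoolExecutor` and the upload side a `ThreadPoolExecutor`.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def retry(func, *args, max_attempts=3, delay=0.0):
    """Call func(*args), retrying on any exception (hypothetical helper)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func(*args)
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay)


def run_pipeline(queries, download, upload, download_pool, upload_pool):
    """Download each query in one pool and upload each result in another.

    An upload is submitted as soon as its download finishes, so the upload
    stage never waits for the whole download stage to complete.
    """
    download_futures = [
        download_pool.submit(retry, download, query) for query in queries
    ]
    upload_futures = []
    for future in as_completed(download_futures):
        # Hand each finished download to the upload pool immediately.
        upload_futures.append(upload_pool.submit(retry, upload, future.result()))
    # Collect results; any upload that exhausted its retries raises here.
    return [future.result() for future in upload_futures]
```

If `download_pool` is a `ProcessPoolExecutor`, both `retry` and `download` have to be picklable, which in practice means module-level functions. Results come back in download-completion order, not in query order.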

Seems like your changes contain some Local Testing changes, please request review from @snowflakedb/local-testing

@sfc-gh-yuwang sfc-gh-yuwang marked this pull request as ready for review January 24, 2025 18:55
@sfc-gh-yuwang sfc-gh-yuwang requested review from a team as code owners January 24, 2025 18:55
@sfc-gh-yuwang sfc-gh-yuwang requested review from sfc-gh-aalam and sfc-gh-jrose and removed request for a team January 24, 2025 18:55
src/snowflake/snowpark/dataframe_reader.py
)
for i, query in enumerate(partitioned_queries)
]
for future in as_completed(process_pool_futures):
Contributor

I'm curious how as_completed works here.

Let's say I have 10 futures and the 7th job finishes first. Would as_completed(process_pool_futures) yield the 7th job first?
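
For reference, `concurrent.futures.as_completed` yields futures in the order they finish, not the order they were submitted, so the job that finishes first is yielded first. A small sketch that demonstrates this follows. It uses a `ThreadPoolExecutor` and a hypothetical `completion_order` helper so that it is deterministic and easy to run; `as_completed` behaves the same way with futures from a `ProcessPoolExecutor`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed


def completion_order(num_jobs=10, fast_job=7):
    """Return job ids in the order as_completed yields them.

    Every job except `fast_job` blocks until the first result has been
    observed, so `fast_job` is guaranteed to finish first.
    """
    release = threading.Event()

    def job(i):
        if i != fast_job:
            release.wait()  # slow jobs wait until the fast one is seen
        return i

    order = []
    with ThreadPoolExecutor(max_workers=num_jobs) as pool:
        futures = [pool.submit(job, i) for i in range(num_jobs)]
        for future in as_completed(futures):
            order.append(future.result())
            release.set()  # let the remaining jobs finish
    return order


order = completion_order()
print(order[0])  # 7: the job that finished first, despite being submitted eighth
```

The rest of `order` contains the other nine ids in whatever order they happened to complete.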


2 participants