Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Scheduler panic routing errors #4097

Merged
merged 1 commit into from
Nov 4, 2022
Merged

Conversation

yukkit
Copy link
Contributor

@yukkit yukkit commented Nov 3, 2022

Which issue does this PR close?

Closes #4096 .

Rationale for this change

The query output partition number may not equal current pipeline's output partition number, so send error to the partition of the query output will cause panic.

What changes are included in this PR?

Because the query output has at least one partition,so we send error that thrown by non-output pipelines to the first partition of the query output.

Are there any user-facing changes?

@github-actions github-actions bot added the core Core DataFusion crate label Nov 3, 2022
@xudong963 xudong963 requested a review from tustvold November 3, 2022 12:33
@alamb
Copy link
Contributor

alamb commented Nov 3, 2022

Until the fix for #4100 is merged, clippy will likely fail on this PR as well

Copy link
Contributor

@tustvold tustvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thank you. Nicely done tracking this down 😅

Comment on lines 113 to 115
// The query output partition number may not equal current pipeline's output partition number,
// but the query output has at least one partition,
// so send error to the first partition of the query output.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// The query output partition number may not equal current pipeline's output partition number,
// but the query output has at least one partition
// so send error to the first partition of the query output.
// The query output partitioning may not match the current pipeline's
// but the query output has at least one partition
// so send error to the first partition of the query output.

Comment on lines 361 to 362
let scheduled: Vec<_> = stream.try_collect().await.unwrap_or(vec![]);
let expected = query.collect().await.unwrap_or(vec![]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let scheduled: Vec<_> = stream.try_collect().await.unwrap_or(vec![]);
let expected = query.collect().await.unwrap_or(vec![]);
let scheduled: Vec<_> = stream.try_collect().await.unwrap_or_default();
let expected = query.collect().await.unwrap_or_default();

@tustvold tustvold changed the title fix: panic when use Scheduler to execute plan fix: Scheduler panic routing errors Nov 3, 2022
@tustvold
Copy link
Contributor

tustvold commented Nov 4, 2022

I took the liberty of fixing the clippy lints and making a minor change to the docs so that we can get this in quickly for you, thanks again 👍

@yukkit
Copy link
Contributor Author

yukkit commented Nov 4, 2022

I took the liberty of fixing the clippy lints and making a minor change to the docs so that we can get this in quickly for you, thanks again 👍

Thank you, sorry, I didn't see your reply, I just resubmitted @tustvold

@tustvold
Copy link
Contributor

tustvold commented Nov 4, 2022

No worries at all 👍

FWIW it can help reviewers if you avoid rebasing reviewed PRs as reviewers can then easily see what has changed, everything then gets squashed on merge. It makes no difference on a PR this size, but thought I would mention it for future contributions 😄

@yukkit
Copy link
Contributor Author

yukkit commented Nov 4, 2022

No worries at all 👍

FWIW it can help reviewers if you avoid rebasing reviewed PRs as reviewers can then easily see what has changed, everything then gets squashed on merge. It makes no difference on a PR this size, but thought I would mention it for future contributions 😄

Thanks for your pointing👍

@tustvold tustvold merged commit 7e944ed into apache:master Nov 4, 2022
@ursabot
Copy link

ursabot commented Nov 4, 2022

Benchmark runs are scheduled for baseline = f61b43a and contender = 7e944ed. 7e944ed is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Dandandan pushed a commit to yuuch/arrow-datafusion that referenced this pull request Nov 5, 2022
Ted-Jiang pushed a commit to Ted-Jiang/arrow-datafusion that referenced this pull request Nov 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate
Projects
None yet
4 participants