Refactor QueryStageExec in preparation for implementing map-side shuffle #459

Merged on Jun 1, 2021 (6 commits)

@andygrove (Member) commented May 31, 2021

Which issue does this PR close?

Closes #458

Rationale for this change

What changes are included in this PR?

  • Moves logic from Ballista executor to QueryStageExec (which we can potentially move to DataFusion later on)
  • Puts some plumbing in place in preparation for supporting map-side shuffle
  • Implements a unit test

Are there any user-facing changes?

No

@andygrove self-assigned this May 31, 2021
@andygrove (Member, Author) commented:

@edrevo fyi

@codecov-commenter commented May 31, 2021

Codecov Report

Merging #459 (c6ad3de) into master (c8ab5a4) will increase coverage by 0.30%.
The diff coverage is 68.22%.

@@            Coverage Diff             @@
##           master     #459      +/-   ##
==========================================
+ Coverage   75.30%   75.60%   +0.30%     
==========================================
  Files         152      152              
  Lines       25275    25336      +61     
==========================================
+ Hits        19033    19156     +123     
+ Misses       6242     6180      -62     
Impacted Files                                          Coverage Δ
ballista/rust/core/src/utils.rs                         27.58% <0.00%> (+27.58%) ⬆️
ballista/rust/executor/src/executor.rs                  0.00% <0.00%> (ø)
ballista/rust/scheduler/src/lib.rs                      20.81% <0.00%> (ø)
ballista/rust/scheduler/src/planner.rs                  66.91% <61.53%> (-0.74%) ⬇️
...lista/rust/core/src/execution_plans/query_stage.rs   75.78% <82.27%> (+32.93%) ⬆️
ballista/rust/core/src/serde/scheduler/mod.rs           58.92% <0.00%> (+44.64%) ⬆️
ballista/rust/core/src/memory_stream.rs                 60.00% <0.00%> (+60.00%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

  }

  impl QueryStageExec {
      /// Create a new query stage
      pub fn try_new(
-         job_id: String,
+         job_id: &str,
Contributor commented:

FWIW I think for "constructor methods" it is OK to take an owned String as the &str will be copied anyway.

Contributor commented:

FWIW another pattern that I like is job_id: impl Into<String> so that the caller can pass in an owned String if they have one or a &str if they don't (or anything else that knows how to turn itself into a String)
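
A minimal sketch of the `impl Into<String>` pattern suggested above. The `Stage` struct and its fields are hypothetical stand-ins for illustration only; this is not the actual `QueryStageExec::try_new` signature from the PR.

```rust
/// Hypothetical stand-in for a query stage (not the real QueryStageExec).
#[derive(Debug)]
struct Stage {
    job_id: String,
    stage_id: usize,
}

impl Stage {
    /// Accepts anything that converts into a `String`, such as `String` or `&str`.
    fn new(job_id: impl Into<String>, stage_id: usize) -> Self {
        Self {
            // For an owned `String` this is a move; for `&str` it allocates a copy.
            job_id: job_id.into(),
            stage_id,
        }
    }
}

fn main() {
    // Caller already owns a String: it is moved in without another allocation.
    let owned = String::from("job-1");
    let a = Stage::new(owned, 0);

    // Caller only has a &str: the conversion happens inside the constructor.
    let b = Stage::new("job-2", 1);

    println!("{} / stage {}", a.job_id, a.stage_id);
    println!("{} / stage {}", b.job_id, b.stage_id);
}
```

With this signature the call site decides whether a copy happens, which addresses both review comments: callers holding a `String` avoid the extra allocation, and callers holding a `&str` do not need to write `.to_string()` themselves.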

//TODO re-use code from RepartitionExec to split each batch into
// partitions and write to one IPC file per partition
// See https://github.com/apache/arrow-datafusion/issues/456
unimplemented!()
Contributor commented:

Could use DataFusionError::NotImplemented
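
A rough sketch of what this suggestion could look like: returning an error value instead of panicking via `unimplemented!()`. To keep the example self-contained, `SketchError` is a local stand-in that mirrors the shape of `DataFusionError::NotImplemented(String)`. The function `write_shuffle_partitions`, its parameter, and its return value are hypothetical and do not come from the PR.

```rust
use std::fmt;

/// Local stand-in mirroring the shape of `DataFusionError::NotImplemented(String)`.
#[derive(Debug, PartialEq)]
enum SketchError {
    NotImplemented(String),
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::NotImplemented(msg) => {
                write!(f, "This feature is not implemented: {}", msg)
            }
        }
    }
}

impl std::error::Error for SketchError {}

/// Hypothetical entry point: `None` means no repartitioning is requested,
/// `Some(n)` means the output should be split into `n` shuffle partitions.
/// Returns the number of output files that would be written.
fn write_shuffle_partitions(output_partitions: Option<usize>) -> Result<usize, SketchError> {
    match output_partitions {
        // Assumed already-supported path: a single output file.
        None => Ok(1),
        // Unsupported path: report an error to the caller instead of panicking.
        Some(_) => Err(SketchError::NotImplemented(
            "map-side shuffle is not supported yet".to_string(),
        )),
    }
}

fn main() {
    match write_shuffle_partitions(Some(4)) {
        Ok(n) => println!("wrote {} file(s)", n),
        Err(e) => println!("error: {}", e),
    }
}
```

Returning an error lets the caller surface a failed task rather than having the process panic when it reaches the unfinished code path.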

@jorgecarleitao (Member) left a comment:

Looks great 👍

@alamb merged commit e5264f6 into apache:master Jun 1, 2021
@andygrove deleted the query-stage-refactor branch June 1, 2021 18:17
@houqp added the ballista label Jul 30, 2021
Successfully merging this pull request may close these issues.

Ballista refactor QueryStageExec in preparation for map-side shuffle
6 participants