Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SortExec No Longer Streams Correctly #1939

Closed
tustvold opened this issue Mar 7, 2022 · 4 comments · Fixed by #2312
Closed

SortExec No Longer Streams Correctly #1939

tustvold opened this issue Mar 7, 2022 · 4 comments · Fixed by #2312
Labels
bug Something isn't working

Comments

@tustvold
Copy link
Contributor

tustvold commented Mar 7, 2022

Describe the bug

https://github.com/apache/arrow-datafusion/pull/1596/files#diff-68811b72d27f9f5173223e0da1af2a467c2e4fff2f5f2237665fa29e1a6575c0L165 appears to have accidentally changed the behaviour of SortExec so that it no longer returns a stream that performs the sort operation, but instead performs the sort within ExecutionPlan::execute.

This effectively stalls out constructing the rest of the physical plan until the sort has completed, and prevents result streaming from working correctly.

To Reproduce

Run a query with a large SortExec, observe surprising amount of time spent in ExecutionPlan::execute

Expected behavior

ExecutionPlan::execute should return a stream of results, but should not block on those results being available

Additional context

This resulted in what looked like missing traces in IOx (https://github.com/influxdata/influxdb_iox/issues/3822) as it never actually finished constructing the physical plan from which to collect metrics 😅

FYI @yjshen

@tustvold tustvold added the bug Something isn't working label Mar 7, 2022
@yjshen
Copy link
Member

yjshen commented Mar 9, 2022

Thanks, @tustvold . I'll have time to work on this later in the week. 😅 Please let me know if it has a higher priority.

@tustvold
Copy link
Contributor Author

tustvold commented Mar 9, 2022

No rush, whenever you have time 😀

@tustvold
Copy link
Contributor Author

tustvold commented Apr 6, 2022

@yjshen Are you still planning to pick this one up, otherwise I can take a stab at it. It's causing me some excitement whilst experimenting with query scheduling, as the sort is taking place during the plan 😆

@yjshen
Copy link
Member

yjshen commented Apr 6, 2022

Please go ahead. I am occupied with enabling all TPC-DS queries in Blaze and getting performance gains these days. Sorry for the delay.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants