WIP: Patched DataFusion 45+ with unified execution plans #30

Draft
wants to merge 24 commits into
base: alamb/test_datasource_exec_base

alamb
Owner

@alamb alamb commented Feb 5, 2025

Patched DataFusion 45+ with unified execution plans

Based on influxdata#54

This branch contains the following PR:

The idea is to test what impact this will have on us upstream in influxdb_iox

alan910127 and others added 5 commits February 5, 2025 08:43
* fix: `List` of `FixedSizeList` coercion issue in SQL

* test: update sqllogictest result
…ng out of datafusion/core/datasource/listing (apache#14464)

* make datafusion_catalog_listing

* fix: this is a bit hacky

* fixes: prettier, taplo etc

* fixes: clippy

* minor: permalink commit hash -> main

* Tweak README

* fix:prettier + wasm

* prettier

* Put unit tests with code

---------

Co-authored-by: Andrew Lamb <[email protected]>
…ort in arrow instead (apache#14503)

* refactor: replace uses of arrow_buffer and arrow_array with reexport in arrow

* Remove arrow-buffer in common

* Remove dependency in core

* remove another ne

* remove from functions-nested

* remove from physical-expr

* remove from physical-expr-common

* Remove from physical-plan

* Remove from substrait

* fix datafusion-cli/Cargo.lock

---------

Co-authored-by: Ian Lai <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
* Accept any uncorrelated plan when checking subquery correlation

For the purpose of decorrelation, an uncorrelated plan is a unit. No
verification needs to be performed on it.

* Extract variable

Extract variable from a long if condition involving a match. Improves
readability.

* Simplify control flow

Handle the unhandled case by returning immediately. This adds an additional
return point to the function, but removes a subsequent if. At the point of
this additional return we know why we bail out (some unhandled
situation); later, the None filter could be construed as a true
condition.

* Add more EXISTS SLT tests

* Support uncorrelated EXISTS

* fixup! Support uncorrelated EXISTS

* fixup! Support uncorrelated EXISTS
* chore(deps): Update sqlparser to `0.54.0`

* Update for API changes

* Turn multi-object name into an error

* Add test for unsupported join

* Update datafusion/sql/src/planner.rs

Co-authored-by: Jax Liu <[email protected]>

---------

Co-authored-by: Jax Liu <[email protected]>
findepi and others added 18 commits February 5, 2025 14:01
* Validate and unpack function arguments tersely

Add a `take_function_args` helper that provides convenient unpacking of
function arguments along with validation that the provided argument
count matches the expected.  A few functions are updated to leverage the
new pattern to demonstrate its usefulness.
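
To make the pattern concrete, here is a minimal, self-contained sketch of what such a helper could look like. It is not the code from the commit: the signature, the `String` error type, the error message, and the `nvl2_sketch` caller are assumptions chosen so the example compiles without DataFusion.

```rust
/// Hypothetical stand-in for the `take_function_args` helper described above.
/// Converts the argument list into a fixed-size array, failing if the number
/// of arguments differs from the expected count `N`.
/// Assumption: errors are plain `String`s here, not DataFusion errors.
fn take_function_args<const N: usize, T>(
    function_name: &str,
    args: Vec<T>,
) -> Result<[T; N], String> {
    let got = args.len();
    // `Vec<T>` converts into `[T; N]` only when its length is exactly N.
    args.try_into().map_err(|_| {
        format!(
            "{} function requires {} arguments, got {}",
            function_name, N, got
        )
    })
}

/// Illustrative caller (hypothetical): an nvl2-like function over optional
/// integers. `N = 3` is inferred from the array pattern.
fn nvl2_sketch(args: Vec<Option<i64>>) -> Result<Option<i64>, String> {
    let [test, if_non_null, if_null] = take_function_args("nvl2", args)?;
    Ok(if test.is_some() { if_non_null } else { if_null })
}
```

Destructuring into an array pattern gives each argument a name and checks the count in a single step, which is the terseness the commit message refers to.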

* Add example in rust doc

Co-authored-by: Andrew Lamb <[email protected]>

* fix fmt

* Export function utils publicly

This exports only the newly added `take_function_args` function. All other
utils members are pub(crate).

* use compact format pattern

Co-authored-by: Matthijs Brobbel <[email protected]>

* fix example

* fixup! fix example

* fix license header

Co-authored-by: Oleks V <[email protected]>

* Name args in nvl2 and use take_function_args in execution too

---------

Co-authored-by: Andrew Lamb <[email protected]>
Co-authored-by: Matthijs Brobbel <[email protected]>
Co-authored-by: Oleks V <[email protected]>
This commit fixes the following edge cases in the array_slice function
so that its semantics match DuckDB:

  - When begin < 0 and -begin > length, begin is clamped to the
    beginning of the list.
  - When step < 0 and begin = end, then the result should be a list with
    the single element found at index begin/end.

Fixes apache#10548
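
The two edge cases above can be illustrated with a simplified model. This is a sketch, not DataFusion's `array_slice` implementation. It assumes 1-based inclusive indices where negative values count from the end of the list, and it clamps both bounds into range. The handling of `end`, of index 0, and of a zero step are assumptions made only to keep the example short.

```rust
/// Resolve a 1-based, possibly negative index into the range 1..=len.
/// Negative indices count from the end (-1 is the last element).
/// An index that points before the start is clamped to 1.
fn resolve_index(index: i64, len: i64) -> i64 {
    let one_based = if index < 0 { len + index + 1 } else { index };
    one_based.clamp(1, len)
}

/// Simplified slice over a list of integers (hypothetical helper).
/// Assumption: an empty list or a zero step yields an empty result.
fn array_slice_sketch(list: &[i64], begin: i64, end: i64, step: i64) -> Vec<i64> {
    let len = list.len() as i64;
    if len == 0 || step == 0 {
        return Vec::new();
    }
    let from = resolve_index(begin, len);
    let to = resolve_index(end, len);
    let mut out = Vec::new();
    let mut i = from;
    if step > 0 {
        // Walk forward from `from` up to and including `to`.
        while i <= to {
            out.push(list[(i - 1) as usize]);
            i += step;
        }
    } else {
        // Walk backward; when from == to this yields exactly one element.
        while i >= to {
            out.push(list[(i - 1) as usize]);
            i += step;
        }
    }
    out
}
```

With a five-element list, `begin = -10` resolves to the first element, and `begin = end = 3` with `step = -1` returns only the third element.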
* add fetch info to CoalescePartitionsExec

* use Statistics with_fetch API on CoalescePartitionsExec

* check limit_reached only if fetch is assigned
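
As a rough illustration of an optional fetch on a partition-coalescing step, the sketch below merges batches from several partitions and stops once `fetch` rows have been produced. It does not use DataFusion's types: batches are modelled as `Vec<i32>`, and `coalesce_with_fetch` is a hypothetical function. Note that the limit check runs only when a fetch is set, mirroring the last bullet above.

```rust
/// Hypothetical model of coalescing partitions with an optional row limit.
/// Each partition is a list of batches; each batch is a list of rows.
fn coalesce_with_fetch(
    partitions: Vec<Vec<Vec<i32>>>,
    fetch: Option<usize>,
) -> Vec<Vec<i32>> {
    let mut out = Vec::new();
    let mut produced = 0usize;
    for partition in partitions {
        for mut batch in partition {
            // Only consult the limit when a fetch value is assigned.
            if let Some(limit) = fetch {
                if produced >= limit {
                    return out;
                }
                // Trim the batch that crosses the limit.
                let remaining = limit - produced;
                if batch.len() > remaining {
                    batch.truncate(remaining);
                }
            }
            produced += batch.len();
            out.push(batch);
        }
    }
    out
}
```

When `fetch` is `None`, every batch passes through unchanged and the row counter is never compared against a limit.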
# Conflicts:
#	datafusion/sqllogictest/test_files/aggregate.slt
#	datafusion/sqllogictest/test_files/limit.slt
#	datafusion/sqllogictest/test_files/union.slt
8 participants