[ENH] Add CI check for query logical plan regressions #911

charlesbluca · 2022-11-09T16:58:17Z

Is your feature request related to a problem? Please describe.
Frequently, when bumping our DataFusion version, we run into failures in test_queries.py as a result of "regressions" introduced by upstream changes.

In practice, these "regressions" often turn out to be improvements to the generated logical plans that expose issues with the Python API. In some cases, however, the changes to the logical plans are suboptimal, and it would be good to identify them (even if they don't result in failures) before merging a version bump.

Describe the solution you'd like
It would be nice if some part of CI clearly identified how a PR changes the logical plan associated with each query.

This could be a CI check that fails if the plans differ between main and the PR branch, or a comment posted on the PR summarizing the diff between the old and new plans. I don't have a strong preference right now, though I would like something that doesn't cause too much confusion for newer contributors who are unfamiliar with the query testing.
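
As a rough illustration of the diffing step, here is a minimal sketch in Python. It assumes the plans for each query have already been captured as plain text on both main and the PR branch and loaded into dicts keyed by query name; the function names `diff_plans` and `format_summary` are hypothetical and not part of any existing API. How the plans get captured in CI is left open.

```python
import difflib


def diff_plans(old_plans, new_plans):
    """Return {query_name: unified diff text} for every query whose plan changed.

    Both arguments are assumed to map a query name to the logical plan
    rendered as a multi-line string.
    """
    changed = {}
    for name in sorted(set(old_plans) | set(new_plans)):
        # A query missing on one side is treated as having an empty plan.
        old = old_plans.get(name, "")
        new = new_plans.get(name, "")
        if old == new:
            continue
        diff = difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile=f"main/{name}",
            tofile=f"pr/{name}",
            lineterm="",
        )
        changed[name] = "\n".join(diff)
    return changed


def format_summary(changed):
    """Build a plain-text summary suitable for a CI log or a PR comment."""
    if not changed:
        return "No logical plan changes detected."
    parts = [f"{len(changed)} query plan(s) changed:"]
    for name, diff in changed.items():
        parts.append(f"\n{name}\n{diff}")
    return "\n".join(parts)
```

A failing check would exit non-zero whenever `diff_plans` returns a non-empty dict, while the comment-based variant would post the output of `format_summary` and leave the check green. The second option is probably less confusing for contributors who don't know the query tests.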

Describe alternatives you've considered
An alternative that should be pursued in parallel to this is improving the coverage of our Rust API testing; this would allow us to have a better idea of what more granular queries should produce, which in general will make it easier to identify why a change in the Rust code causes a regression in the query plans.

Additional context
Thinking about this because of #903, which caused several passing queries to start failing again.


charlesbluca added the enhancement and datafusion labels on Nov 9, 2022

charlesbluca mentioned this issue Nov 9, 2022

Upgrade to DataFusion 14.0.0 #903

Merged