Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Local Dynamic Filter test cases #11379

Closed

Conversation

manupatteri
Copy link
Member

@manupatteri manupatteri commented Mar 9, 2022

Description

Dynamic Filter(DF) is a common functionality in multiple connectors. Since functionality is similar, similar tests are present in different connectors. This is part of an effort to extract such common tests and make it available at a common layer. This will allow newer connectors or connectors where DF is getting introduced to write tests quickly and correctly.

Since the original issue was created multiple PRs were already merged which takes care of different aspects of this.(For eg: #9193).

Hence I am re-raising a new PR which takes care of common tests for local dynamic filters only. AFAIK it is available mainly in Memory Connector and Hive Connector.

Is this change a fix, improvement, new feature, refactoring, or other?
This is refactoring of dynamic filter test cases. Fix for #5776

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)
It's a refactoring of the test cases in testing package.

How would you describe this change to a non-technical end user or system administrator?
There is no functional change. Just re-organizing tests for Dynamic Filtering to manage it better.

Related issues, pull requests, and links

Fixes #5776

Documentation

(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

(x) No release notes entries required.
( ) Release notes entries required with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Mar 9, 2022
@manupatteri manupatteri force-pushed the refactor_dynamic_filter_new branch from 49e267a to 8b7bc18 Compare March 9, 2022 04:37
@hashhar hashhar requested review from raunaqmorarka and sopel39 March 9, 2022 07:45
Comment on lines 14 to 32
.addExtraProperty("dynamic-filtering.small-broadcast.max-distinct-values-per-driver", "100")
.addExtraProperty("dynamic-filtering.small-broadcast.range-row-limit-per-driver", "100")
.addExtraProperty("dynamic-filtering.large-broadcast.max-distinct-values-per-driver", "100")
.addExtraProperty("dynamic-filtering.large-broadcast.range-row-limit-per-driver", "100000")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these configs be contributed by AbstractLocalDynamicFilteringTest. That way all subclasses can use the constant defined there. I don't think these values would need to vary per connector.

@hashhar
Copy link
Member

hashhar commented Mar 9, 2022

CI failure is related - new files have missing license header. you can add those by running ./mvnw license:format.

@raunaqmorarka
Copy link
Member

I don't think the same test would make sense for both memory and hive connectors. In memory connector, MemoryPageSourceProvider implements dynamic filtering at row level for every value that it encounters. Memory connector is the only connector which performs dynamic filtering at a row level without requiring any special layout.
In hive connector, HivePageSourceProvider can perform dynamic filtering only if the probe side table is partitioned or bucketed or we're able to successfully prune row groups/stripes in parquet/orc files. So the tests here require appropriate layout for the probe table.
IMO we should just figure out what can be made common when the scenario of adding DF to some connector actually arises (e.g. we extracted BaseDynamicPartitionPruningTest when we added dynamic partition pruning to iceberg)

@manupatteri manupatteri force-pushed the refactor_dynamic_filter_new branch from 8b7bc18 to e7749c8 Compare March 9, 2022 10:04
@manupatteri
Copy link
Member Author

Ok. If that's the decision, then I can close this PR. Please let me know @sopel39 @hashhar. Thanks @raunaqmorarka.

@manupatteri
Copy link
Member Author

CI failure is related - new files have missing license header. you can add those by running ./mvnw license:format.

Yeah. I realized this later. I think it's fixed now. Now I am getting something which looks a system error. I will just do a git push again to trigger a re-run.

@manupatteri manupatteri force-pushed the refactor_dynamic_filter_new branch from e7749c8 to 40ad2ac Compare March 11, 2022 10:26
}

@Test(timeOut = 30_000, dataProvider = "joinDistributionTypes")
public void testJoinDynamicFilteringMultiJoin(OptimizerConfig.JoinDistributionType joinDistributionType)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually not about local dynamic filtering, but about how scheduler schedules stages (and it doesn't lock). So it's not local DF test. This maybe isn't the best place for it, but I couldn't find a better one

@sopel39
Copy link
Member

sopel39 commented Mar 11, 2022

I agree with @raunaqmorarka

@manupatteri
Copy link
Member Author

Ok. Then I will go ahead and close this PR.
@sopel39 I see that Hive (non local DF) tests are already somewhat refactored. Do you want to close #5776 or does it need any more change?

@sopel39
Copy link
Member

sopel39 commented Mar 14, 2022

Do you want to close #5776 or does it need any more change?

Yes, I've closed #5576

@manupatteri
Copy link
Member Author

Closing this PR as per discussion above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

Extract abstract dynamic filtering tests for connectors
4 participants