Avoid unnecessary repartitions on joins with SR-enabled key formats #6648

Open
vcrfxia opened this issue Nov 19, 2020 · 0 comments

vcrfxia commented Nov 19, 2020

When a join uses a Schema-Registry-enabled key format, logically equivalent keys may serialize to different bytes, because the user's schema and ksqlDB's schema can differ. To ensure such records still join successfully, ksqlDB forces a repartition on both sides of these joins so that the data are properly co-partitioned (see #6635 for context). However, there's room for improvement: the repartitions are currently forced even when they are not strictly necessary, e.g., when there is already another repartition upstream of the join. We should enhance the join logic to avoid the unnecessary repartitions in these cases. Doing so would require passing information from the leaves back towards the root of the join tree, so that each pre-join repartition node knows whether a repartition has already taken place upstream and can therefore be skipped.
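To make the idea concrete, here is a rough, language-agnostic sketch in Python of how the "already repartitioned" flag could flow from the leaves towards the root. It is not ksqlDB's planner code. `Node`, `already_repartitioned`, and `plan_join` are hypothetical names invented for illustration, and the sketch assumes that any upstream repartition already leaves the data keyed and serialized in a way that satisfies the join.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical plan node; not a real ksqlDB class."""
    kind: str                        # "source", "filter", "repartition", or "join"
    left: Optional["Node"] = None    # only child for unary nodes
    right: Optional["Node"] = None   # second child, used by joins only


def already_repartitioned(node: Node) -> bool:
    """Compute, bottom-up, whether a repartition exists upstream of `node`."""
    if node.kind == "source":
        return False                 # leaf: keys still carry the user's serialization
    if node.kind == "repartition":
        return True                  # assumption: any repartition re-serializes the key
    if node.kind == "join":
        # Conservative assumption: a join counts only if both inputs were repartitioned.
        return already_repartitioned(node.left) and already_repartitioned(node.right)
    return already_repartitioned(node.left)  # pass-through nodes such as filters


def with_repartition_if_needed(side: Node) -> Node:
    """Wrap a join input in a repartition only when none exists upstream."""
    return side if already_repartitioned(side) else Node("repartition", left=side)


def plan_join(left: Node, right: Node) -> Node:
    """Build a join node, forcing repartitions only where they are still needed."""
    return Node(
        "join",
        left=with_repartition_if_needed(left),
        right=with_repartition_if_needed(right),
    )


def count_repartitions(node: Optional[Node]) -> int:
    """Count repartition nodes in a plan (helper for inspecting the result)."""
    if node is None:
        return 0
    own = 1 if node.kind == "repartition" else 0
    return own + count_repartitions(node.left) + count_repartitions(node.right)
```

With this sketch, joining a plain source against a side that already contains a repartition adds one new repartition rather than two. A real implementation would presumably also need to check that the upstream repartition is on the join key.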
