Validation of join predicate should verify that the table operand is the key #965

Open
rodesai opened this issue Mar 15, 2018 · 5 comments

@rodesai
Contributor

rodesai commented Mar 15, 2018

We should validate that the table operand of the join predicate is the table's key column. KSQL cannot guarantee correctly computed joins if the table operand is a non-key column. Table records with matching values for the column may be assigned to different partitions from the stream. It also usually doesn't make semantic sense to do this. The only exception would be where the column is an alternate key for the table. Even then we can't repartition to handle the join because there is no way for KSQL to guarantee ordering, or for the user to specify that the other column is partitioned the same as the declared key (which would make ordering a non-issue).
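
To illustrate the partitioning concern above, here is a minimal sketch. It assumes a hypothetical `users` table keyed by `user_id` with a non-key `region` column, a partition count of 4, and a toy byte-sum partitioner that stands in for the real key hash (it is not Kafka's actual partitioner). It only shows that a stream keyed by the non-key column can be routed to a different partition than the table row it should join with.

```python
from dataclasses import dataclass

NUM_PARTITIONS = 4  # assumed partition count for both topics


def toy_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Toy stand-in for a key-hash partitioner (NOT Kafka's actual hash)."""
    return sum(key.encode("utf-8")) % num_partitions


@dataclass(frozen=True)
class UserRow:
    user_id: str  # declared key column of the hypothetical table
    region: str   # non-key column


# The table topic is partitioned by the declared key, user_id.
row = UserRow(user_id="u1", region="us")
table_partition = toy_partition(row.user_id)

# Join on the key: the stream side is keyed by user_id, so both sides agree.
stream_partition_on_key = toy_partition("u1")

# Join on a non-key column: the stream side is keyed by region instead.
stream_partition_on_region = toy_partition("us")

print(table_partition, stream_partition_on_key, stream_partition_on_region)
# prints: 2 2 0
```

With the join on `region`, the stream record lands on partition 0 while the matching table row sits on partition 2, so the task processing the stream record never sees that row.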

rodesai added the bug label Mar 16, 2018
@apurvam
Contributor

apurvam commented Mar 16, 2018

This is related to #749

@big-andy-coates
Contributor

This would be potentially expensive to check on each record. An alternative semantics for WITH (Key='foo') is discussed in #804 that does away with this issue.

@rodesai
Contributor Author

rodesai commented Apr 11, 2018

@big-andy-coates this is something we should check when building the topology. It's orthogonal to the semantics we have for expressing what the key column is. However that works, we should still fail requests to start queries that try to join on a column that's not the key.
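
A build-time check of this kind could look roughly like the sketch below. The names `InvalidJoinError` and `validate_table_join_operand` are hypothetical and are not taken from the KSQL codebase, and the case-insensitive comparison of column names is an assumption. The point is only that the check runs once per query, before it starts, rather than once per record.

```python
class InvalidJoinError(ValueError):
    """Hypothetical error for a stream-table join that is not on the table key."""


def validate_table_join_operand(
    table_name: str, table_key_column: str, join_column: str
) -> None:
    """Reject the query up front if the table operand is not the key column.

    Intended to run once while the query is planned, not for every record.
    Column names are compared case-insensitively (an assumption).
    """
    if join_column.casefold() != table_key_column.casefold():
        raise InvalidJoinError(
            f"Cannot join on {table_name}.{join_column}: the table operand "
            f"must be the table's key column ({table_key_column})."
        )


# Joining on the declared key passes silently.
validate_table_join_operand("users", "user_id", "USER_ID")

# Joining on a non-key column fails before the query is started.
try:
    validate_table_join_operand("users", "user_id", "region")
except InvalidJoinError as err:
    print(f"rejected: {err}")
```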

@rmoff
Member

rmoff commented Jun 21, 2018

@rodesai are you OK to close this and track it in #749? Or vice versa, I don't mind :)

@rodesai
Contributor Author

rodesai commented Jun 21, 2018

@rmoff this is a different issue. #749 covers the case where the Kafka message key has a different format than the key column. This issue covers raising an error if the join predicate references a column that is not the table key.
