More intuitive way to define Table KEY #804
Comments
I'm +1 for this - it follows expected SQL semantics - https://www.w3schools.com/sql/sql_primarykey.asp
It feels to me like there are two (potential) uses for the WITH KEY clause:
I think the second use (i.e. the current behaviour) has a lot of potential to introduce errors into users' applications and is less intuitive. We wouldn't want to add the overhead of checking that both copies of the key in the Kafka record match, but if they were to differ, the results would be 'undefined'. I'm a +1 on switching.
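To illustrate the mismatch risk described in this comment, here is a minimal Python sketch (not KSQL code). It assumes a table whose declared key is a column in the message value, while the actual Kafka message key holds a different value for one record. The `Record` type, the helper names and the sample data are all hypothetical.

```python
from typing import Any, Dict, NamedTuple


class Record(NamedTuple):
    """Hypothetical stand-in for a Kafka record: a key plus a decoded value."""
    key: Any
    value: Dict[str, Any]


def materialize_by_message_key(records):
    """Build a table keyed on the actual Kafka message key."""
    table = {}
    for rec in records:
        table[rec.key] = rec.value
    return table


def materialize_by_value_field(records, field):
    """Build a table keyed on the copy of the key stored inside the value."""
    table = {}
    for rec in records:
        table[rec.value[field]] = rec.value
    return table


records = [
    Record(key="u1", value={"userid": "u1", "city": "Oslo"}),
    # The two copies of the key disagree for this record.
    Record(key="u2", value={"userid": "u9", "city": "Lima"}),
]

by_key = materialize_by_message_key(records)
by_field = materialize_by_value_field(records, "userid")

# Which rows exist depends on which copy of the key is read.
print(sorted(by_key))    # ['u1', 'u2']
print(sorted(by_field))  # ['u1', 'u9']
```

Without a per-record check, nothing flags the second record, so a lookup for `u2` succeeds or fails depending on which copy of the key was used.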
@hjafarpour can you shed light on why the current implementation is the way it is?
The current 3-step process to rekey a topic for a table prevents people who don't understand the implications from unintentionally repartitioning. For usability, it makes sense to simplify this into 1 step and caveat repartitioning.
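As a rough illustration of what repartitioning implies, the following Python sketch (not KSQL code) rekeys records on a field of the value and recomputes their partition. The record layout, the helper names and the partition count are hypothetical, and `zlib.crc32` is only a stand-in for a stable hash; Kafka's default partitioner uses a different hash function.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Assign a key to a partition via a stable hash.

    Assumption: crc32 is only a stand-in; Kafka's default partitioner
    uses a different hash function (murmur2).
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def rekey(records, field):
    """Return (new_key, value) pairs keyed on a field of the value."""
    return [(value[field], value) for _, value in records]


NUM_PARTITIONS = 4  # hypothetical partition count

# Records originally keyed by order id.
records = [
    ("order-1", {"order": "order-1", "userid": "alice"}),
    ("order-2", {"order": "order-2", "userid": "bob"}),
    ("order-3", {"order": "order-3", "userid": "alice"}),
]

rekeyed = rekey(records, "userid")

# After rekeying, placement is driven by the new key, so every record
# is rewritten to whichever partition its userid hashes to.
placement = {}
for key, value in rekeyed:
    placement[value["order"]] = partition_for(key, NUM_PARTITIONS)

print(placement)
```

Records that share a `userid` end up in the same partition after the rekey, which is what a table keyed on `userid` needs, at the cost of rewriting every record.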
Another thing to think about here is what if you do a Any fixing of the key field specification for the table should also naturally translate to the keys for tables created by
See #2745, which drops the requirement for having a KEY defined on a table and makes it just an optimisation. The second part falls into work planned for structured keys. So... I'm going to close this. Feel free to reopen if you think it should still be open.
Currently, the way to specify which field of a Table's message value contains the same value as the message key is rather non-intuitive and confuses many users. Additionally, it may be impossible to specify this metadata in a way that KSQL will accept, given the recent tightening of the checks around the presence of the KEY='foo' stanza: for example, when no column in the message values happens to duplicate the contents of the message key!
There are 2 components to this request, which may make sense to consider together as an improvement to this situation:
CREATE TABLE foo { keycol int KEY, othercol varchar}.
"keycol is the alternative name by which I will refer to the ROWKEY. If there is a field inside of my message-value with this name (e.g. it's a JSON message with a 'keycol' entry) then the engine can read the value from either there or from the actual message key, whichever is most convenient/optimal".
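The proposed semantics can be sketched in a few lines of Python (not KSQL code). This is only one possible reading of the request: the declared key column acts as an alias for the message key, and a field of the same name inside the value is used when it happens to be present. The function name `project_row` and the sample data are hypothetical.

```python
from typing import Any, Dict


def project_row(message_key: Any, message_value: Dict[str, Any], keycol: str) -> Dict[str, Any]:
    """Build a row in which keycol is always available.

    If the value already carries a field named keycol, it is used as-is;
    otherwise the column is filled from the message key. Under the proposal
    the two sources are treated as interchangeable.
    """
    row = dict(message_value)  # copy so the input is not mutated
    row.setdefault(keycol, message_key)
    return row


# The value does not duplicate the key: keycol comes from the message key.
print(project_row(42, {"othercol": "a"}, "keycol"))
# {'othercol': 'a', 'keycol': 42}

# The value does duplicate the key: the existing field is used.
print(project_row(42, {"keycol": 42, "othercol": "b"}, "keycol"))
# {'keycol': 42, 'othercol': 'b'}
```

With this reading, a table can declare a key column even when no field in the message value duplicates the message key, which is the case the current KEY='foo' check cannot express.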