Handle partition spec evolutions gracefully #202
branch.ifPresent(appendOp::toBranch);
entry.getValue().forEach(appendOp::appendFile);
appendOp.set(COMMIT_ID_SNAPSHOT_PROP, commitState.currentCommitId().toString());
if (Objects.equals(entry.getKey(), maxSpecId)) {
We only include the `controlTopicOffsets` and `vtts` properties in the last snapshot to be appended as part of this transaction. That's because we can only really be sure that these values are valid for the last snapshot to be appended.
AppendFiles appendOp = transaction.newAppend();
branch.ifPresent(appendOp::toBranch);
entry.getValue().forEach(appendOp::appendFile);
appendOp.set(COMMIT_ID_SNAPSHOT_PROP, commitState.currentCommitId().toString());
Note, however, that we include the commit-id snapshot property on every snapshot we add as part of this transaction. We don't have to do this, but it could be helpful for debugging and it's easy enough to do.
kafka-connect/src/main/java/io/tabular/iceberg/connect/channel/Coordinator.java
Can you just include a patched version of Iceberg that includes your upstream fix, rather than working around the issue here?
That feels more hacky to me, I'd rather not resort to that just yet. If you have a particular concern with this workaround, please let me know! 😃
Currently the connector will get stuck if it ever encounters a batch of commit responses containing files with different partition specs.
This is because the append API does not currently support appending files with different partition specs in a single operation.
There is an open PR to potentially address this limitation.
In the meantime, we can "hack" around this limitation by appending files with different specs as separate append operations in a single transaction.
It's not ideal but on balance, the trade-off should be worth it; partition spec evolution should not be a frequent event for most use cases.
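To make the shape of the workaround concrete, here is a minimal, library-free sketch of the planning step: group files by partition spec id, plan one append per spec, tag every append with the commit id, and attach the offsets property only to the append for the highest spec id. Everything in it is hypothetical and for illustration only: `SpecGroupingSketch`, `FileRef`, `PlannedAppend`, `plan`, and the property keys are not names from Iceberg or the connector, and it assumes appends are issued in ascending spec-id order so that the highest spec id is also the last snapshot in the transaction.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-ins: none of these types come from Iceberg or the connector.
class SpecGroupingSketch {

  /** A data file tagged with the id of the partition spec it was written under. */
  static final class FileRef {
    final String path;
    final int specId;

    FileRef(String path, int specId) {
      this.path = path;
      this.specId = specId;
    }
  }

  /** One planned append: the files to add and the snapshot properties to set. */
  static final class PlannedAppend {
    final int specId;
    final List<String> paths;
    final Map<String, String> props;

    PlannedAppend(int specId, List<String> paths, Map<String, String> props) {
      this.specId = specId;
      this.paths = paths;
      this.props = props;
    }
  }

  /** Plans one append per partition spec, in ascending spec-id order. */
  static List<PlannedAppend> plan(List<FileRef> files, String commitId, String offsets) {
    // TreeMap keeps spec ids sorted, so the highest spec id is planned last.
    TreeMap<Integer, List<String>> bySpec = new TreeMap<>();
    for (FileRef f : files) {
      bySpec.computeIfAbsent(f.specId, k -> new ArrayList<>()).add(f.path);
    }

    List<PlannedAppend> appends = new ArrayList<>();
    if (bySpec.isEmpty()) {
      return appends;
    }

    int maxSpecId = bySpec.lastKey();
    for (Map.Entry<Integer, List<String>> entry : bySpec.entrySet()) {
      Map<String, String> props = new LinkedHashMap<>();
      // Every snapshot carries the commit id (assumed key name), which helps debugging.
      props.put("commit-id", commitId);
      // Only the final snapshot carries the offsets (assumed key name).
      if (entry.getKey() == maxSpecId) {
        props.put("control-topic-offsets", offsets);
      }
      appends.add(new PlannedAppend(entry.getKey(), entry.getValue(), props));
    }
    return appends;
  }
}
```

In the real change, each planned append would correspond to its own `transaction.newAppend()` call, with all of them committed together as a single transaction.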