
Add support for end-to-end acknowledgements to kafka source #7460

Closed
Tracked by #9856
bruceg opened this issue May 14, 2021 · 2 comments · Fixed by #7787
Assignees: bruceg
Labels: domain: data model · source: kafka · type: enhancement

Comments


bruceg commented May 14, 2021

The kafka source needs to wait for a delivery acknowledgement before marking a batch of events as committed. This is complicated by the fact that events may be acknowledged out of order, while commits must be made in order as a single "offset" value per partition. Some work will need to go into tracking this high water mark internally.

Ref #7336

bruceg added the type: enhancement, source: kafka, and domain: data model labels on May 14, 2021
bruceg self-assigned this on May 14, 2021

bruceg commented May 31, 2021

I am finding this difficult to achieve due to lifetime issues in the rdkafka programming interface. I have filed fede1024/rust-rdkafka#368.

I see @blt has also filed a related issue (maybe the same issue?) some time ago: fede1024/rust-rdkafka#89


bruceg commented Jun 4, 2021

@lukesteensen and I chatted about this today, and discussed two possible paths forward:

  1. We can continue to use the store_offset interface to acknowledge the read data. However, this requires changes to the interface exposed by the rdkafka crate. Although the maintainers of that crate have been responsive in the past, no comment has been made on the issue we raised with them, so we would likely have to fork the crate to make the changes ourselves and feed them back upstream as a PR.
  2. The commit and store_offsets functions take a TopicPartitionList structure, which we could maintain ourselves in order to commit acknowledged offsets in batches (see the sketch below the list). This would likely add a bit of code on our end, but requires no changes to the rdkafka crate.
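
A rough sketch of what option 2 could look like, assuming an already-configured `BaseConsumer` and a committable offset produced by the tracking logic above (in recent rdkafka versions `add_partition_offset` returns a `KafkaResult`, hence the `?`):

```rust
use rdkafka::consumer::{BaseConsumer, CommitMode, Consumer};
use rdkafka::error::KafkaResult;
use rdkafka::{Offset, TopicPartitionList};

/// Commit the acknowledged high water mark for one partition by building
/// the TopicPartitionList ourselves. `next_offset` is the offset of the
/// next message to consume, per Kafka's commit convention.
fn commit_acked(
    consumer: &BaseConsumer,
    topic: &str,
    partition: i32,
    next_offset: i64,
) -> KafkaResult<()> {
    let mut tpl = TopicPartitionList::new();
    tpl.add_partition_offset(topic, partition, Offset::Offset(next_offset))?;
    consumer.commit(&tpl, CommitMode::Async)
}
```

The appeal of this path is that it only uses APIs rdkafka already exposes, at the cost of Vector owning the bookkeeping for which offsets are safe to commit.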
