kgo: avoid pointer reuse in metadata across producers & consumers
Previously, on metadata update, we would construct a *topicPartitionData (translating the metadata to our internal types) and then use that in some fancy way to update our own state. A change a while back separated the producer and consumer maps, fixing some bugs and simplifying things. At the time, I considered it fine to reuse the *topicPartitionData; I didn't realize the ramifications of having things stored separately.

As it turns out, a specific sequence of events can result in a bug, and this bug requires producing to and consuming from the same topic in the same client:

* consume "foo", loading metadata successfully once
* produce to "foo", causing another metadata load because the maps are separate
* this second metadata load fails with partition errors and causes another metadata update
* third metadata update sees the leader has changed and transfers leadership

This will cause a panic.

The second metadata load would add the new recBuf to the proper sink (broker). Then the same *topicPartitionData from the metadata update is used for updating the consumer state (cursor & source). The *topicPartitionData has a load error, so we instead copy the old *topicPartitionData to the new one and save it. This copied the old records / recBufsIdx of -1. The producer was now using a recBuf that has a recBufsIdx of -1.

There are two eventual scenarios here:

1) the next metadata update will re-add the recBuf to the sink because the index is -1. This is forever wasteful, but not problematic: the recBuf is on the same sink twice, and removing it will only remove the first copy. The old sink will always attempt to drain and produce this recBuf, and producing this will just result in a bunch of wasted partition errors. The recBuf will also be drained by the proper sink, so nothing is problematic here.

2) the next metadata update (step three above) will move the partition to another broker. This will panic, because the recBufsIdx is -1, an invalid index.
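
To illustrate the second scenario, here is a minimal sketch of why a stale recBufsIdx of -1 panics on removal. It assumes the sink removes a recBuf by indexing its slice with the stored index; the types and the helpers addRecBuf, removeRecBuf, and tryRemove are simplified, hypothetical stand-ins, not the client's actual code.

```go
package main

import "fmt"

// recBuf and sink are hypothetical, stripped-down stand-ins that model
// only the index bookkeeping described in the commit message.
type recBuf struct {
	topic      string
	recBufsIdx int // position in the owning sink's slice; -1 means "not on a sink"
}

type sink struct {
	recBufs []*recBuf
}

// addRecBuf appends the recBuf and records where it landed.
func (s *sink) addRecBuf(rb *recBuf) {
	rb.recBufsIdx = len(s.recBufs)
	s.recBufs = append(s.recBufs, rb)
}

// removeRecBuf swap-removes using the stored index (an assumption about
// how removal works); it trusts recBufsIdx to be valid.
func (s *sink) removeRecBuf(rb *recBuf) {
	idx := rb.recBufsIdx
	last := len(s.recBufs) - 1
	s.recBufs[idx] = s.recBufs[last] // index out of range if idx is -1
	s.recBufs[idx].recBufsIdx = idx
	s.recBufs[last] = nil
	s.recBufs = s.recBufs[:last]
	rb.recBufsIdx = -1
}

// tryRemove reports whether removeRecBuf panicked.
func tryRemove(s *sink, rb *recBuf) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	s.removeRecBuf(rb)
	return false
}

func main() {
	s := &sink{}
	rb := &recBuf{topic: "foo", recBufsIdx: -1}
	s.addRecBuf(rb) // recBufsIdx is now 0

	// Simulate the bug: a copy of old state overwrites the index with -1
	// while the recBuf is still on the sink.
	rb.recBufsIdx = -1

	fmt.Println("removal panicked:", tryRemove(s, rb))
}
```

The point of the sketch is only that the index stored on the recBuf and the sink's slice must stay in sync; once a copy of old state resets the index, moving the partition to another sink indexes the slice with -1.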
We fix this by instead mapping the kmsg structs into our own structs, and then mapping that _again_ into producer-specific and consumer-specific structs. Notably, the other half (cursor or records) is always nil. If we ever have reuse problems in the future, we will get much more obvious panics.

Now, we absolutely ensure we do not share pointers between the consumer side and the producer side.

Fixes #190
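
The two-stage mapping can be sketched roughly as follows. This is an illustration under assumptions, not the actual change: partitionMeta, topicPartition, newProducerPartition, and newConsumerPartition are hypothetical names, and the real structs carry far more state.

```go
package main

import "fmt"

// partitionMeta is a hypothetical neutral struct: the first mapping step,
// from the wire response into our own plain values with no shared pointers.
type partitionMeta struct {
	leader  int32
	loadErr error
}

type records struct{ recBufsIdx int } // producer-side state (simplified)
type cursor struct{ offset int64 }    // consumer-side state (simplified)

// topicPartition holds either producer or consumer state, never both.
type topicPartition struct {
	leader  int32
	loadErr error
	records *records // nil on the consumer side
	cursor  *cursor  // nil on the producer side
}

// newProducerPartition is the second mapping step for producers.
func newProducerPartition(m partitionMeta) *topicPartition {
	return &topicPartition{
		leader:  m.leader,
		loadErr: m.loadErr,
		records: &records{recBufsIdx: -1}, // not yet on any sink
	}
}

// newConsumerPartition is the second mapping step for consumers.
func newConsumerPartition(m partitionMeta) *topicPartition {
	return &topicPartition{
		leader:  m.leader,
		loadErr: m.loadErr,
		cursor:  &cursor{},
	}
}

func main() {
	m := partitionMeta{leader: 1}
	p := newProducerPartition(m)
	c := newConsumerPartition(m)

	// Each side gets its own allocation; the unused half stays nil, so
	// accidental cross-use fails fast with a nil dereference.
	fmt.Println(p != c, p.cursor == nil, c.records == nil)
}
```

Because each side allocates its own struct from plain values, an update on the consumer side cannot overwrite producer bookkeeping such as recBufsIdx, and any accidental reuse touches a nil field immediately.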
Showing 2 changed files with 113 additions and 70 deletions.