I have a processor that uses persistence (a group table). It appears that it is not able to rebalance at all, and it puts my pods into a crash loop.
The topic used for persistence (`-table`) has `cleanup.policy=compact`. We have about 1.3 million messages in this topic.
I've even tried reducing the number of pods to 1 to rule out a concurrency problem. It stalled on the last log line `2019/12/09 13:33:23 Processor: dispatcher started` for about 5 minutes (raising the memory usage to 5 GB) and eventually started, with no extra logging.
I have 4 pods, and the topic has 4 partitions. I have 3 processors (2 of them without persistence, and they work fine). It looks like the persistent processor is not able to rebalance. The input rate is about 25 msg/s.
I believe the root issue is the same one already mentioned in other issues here: slow recovery of the huge `-table`. However, I am surprised by the constant rebalancing.
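For reference, here is a minimal sketch of the kind of stateful setup described above (the broker address, group name, input topic, codec, and callback body are illustrative assumptions, not taken from the actual deployment). The relevant point is that `goka.Persist` makes the processor maintain a compacted `<group>-table` topic, which it has to replay into local storage on every (re)start before it begins consuming:

```go
package main

import (
	"context"
	"log"

	"github.com/lovoo/goka"
	"github.com/lovoo/goka/codec"
)

func main() {
	// Illustrative values; the real deployment uses its own brokers, group and topic.
	brokers := []string{"kafka:9092"}
	group := goka.Group("example-group") // state is persisted to "example-group-table"

	g := goka.DefineGroup(group,
		goka.Input(goka.Stream("example-input"), new(codec.String), func(ctx goka.Context, msg interface{}) {
			// Writing a value per key is what fills the compacted -table topic;
			// on startup the processor replays that topic to rebuild its local state.
			ctx.SetValue(msg)
		}),
		goka.Persist(new(codec.String)),
	)

	p, err := goka.NewProcessor(brokers, g)
	if err != nil {
		log.Fatalf("error creating processor: %v", err)
	}
	if err := p.Run(context.Background()); err != nil {
		log.Fatalf("processor stopped with error: %v", err)
	}
}
```

With 1.3 million messages in the `-table` topic, that recovery phase is where the long startup and the 5 GB of memory show up before the dispatcher begins processing.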
Hi @cioboteacristian, sorry for the late response. Your issue looks like the component is being terminated by the container. The high memory usage could be a reason, or the long startup time before it actually starts consuming.
The huge memory consumption is an issue we have to improve at some point, but the recovery speed is actually limited by the network bandwidth and disk speed.
Anyway, as announced in #239, we are working on a refactored and improved version of goka. Although the recovery mechanism stayed more or less the same, it would be interesting to know whether you are still facing those issues. Just use the branch consumer-group or the vendor tag v0.9.0-beta2 to try it out.
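As a purely illustrative sketch, pinning that tag could look like this if Go modules are in use (the "vendor tag" wording suggests a vendoring setup would work just as well, pinning the same tag or the consumer-group branch there):

```
// go.mod snippet (illustrative only, assuming Go modules)
require github.com/lovoo/goka v0.9.0-beta2
```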
Let me know if there are any results or if you run into any problems.
Cheers!