WIFI-13597: fix: modified kafka manager to use poll in producer #360
https://telecominfraproject.atlassian.net/browse/WIFI-13597
Summary of changes:
- Modified code in KafkaManager to use poll instead of flush for every message sent. flush is now used only when the internal notification queue is empty, during idle times.
Signed-off-by: Ivan Chvets <[email protected]>
Have you tested this with very few devices, fewer than 5 for example? We tried this method in the past, but we found that messages were lingering in Kafka and not being received by other consumers on the bus. The fix looks good; I just want to make sure we are not trading one problem for another.
I ran experiments and the queue goes down to zero pretty often, which is why memory did not go above 700MB for the whole hour of the test. I also ran 1000 and 5000 AP tests for a short time, and all looked good. What is the indication of the Kafka messages issue?
What we saw was a long delay in Kafka between publish and consume: 30 seconds to 1 minute. If you are not seeing that, then this is a good fix.
LGTM
https://telecominfraproject.atlassian.net/browse/WIFI-13597
NOTE: This fix is a port of Telecominfraproject/wlan-cloud-ucentralgw#360
Summary of changes:
- Modified code in KafkaManager to use poll instead of flush for every message sent. flush is now used only when the internal notification queue is empty, during idle times.
Signed-off-by: Ivan Chvets <[email protected]>
Description
During the latest experiments with a large number of APs, we narrowed the memory consumption down to the Kafka internal queue on the GW (producer). With a large number of messages, the producer cannot keep up with emptying this queue.
One noticeable suspect was the flush() call made for every message sent. It looks like flushing on every message slows the producer down to 100 messages per second.
The solution was to use poll() instead, to allow for faster message transmission at peak times.
Related Jira: https://telecominfraproject.atlassian.net/browse/WIFI-13597
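As a rough illustration of the pattern described above (not the actual KafkaManager code), the sketch below polls after each message and flushes only once the internal queue has been drained. MockProducer and drain are hypothetical names introduced here; MockProducer is a stand-in that only counts calls and does not reproduce the API of the real Kafka client used by the gateway.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical stand-in for a Kafka producer handle (assumption: not the real client API).
struct MockProducer {
    std::size_t pending = 0;  // messages still held in the client-side queue
    std::size_t polls = 0;    // number of non-blocking poll() calls
    std::size_t flushes = 0;  // number of blocking flush() calls

    // Queue a message for delivery.
    void produce(const std::string &) { ++pending; }

    // Non-blocking: in a real client this serves delivery callbacks and returns quickly.
    void poll() { ++polls; }

    // Blocking: in a real client this waits until all queued messages are delivered.
    void flush() {
        ++flushes;
        pending = 0;
    }
};

// Send everything in the internal notification queue, polling after each message,
// then flush a single time once the queue is empty (the idle case).
inline void drain(std::deque<std::string> &queue, MockProducer &producer) {
    while (!queue.empty()) {
        producer.produce(queue.front());
        queue.pop_front();
        producer.poll();  // cheap per-message call instead of a blocking flush
    }
    producer.flush();  // only reached when the internal queue is empty
    assert(producer.pending == 0);
}
```

With three queued messages, drain calls poll() three times and flush() once, whereas flushing per message would block three times. The trade-off raised in the review still applies: if the queue rarely empties, delivery depends on the client's own batching, so publish-to-consume latency is worth measuring with a small number of devices.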
Summary of changes: