Added async producer mode for kafka #4150
Can you tell me a little bit about why the async producer is helpful? When would a user want to enable async? |
Hello,
We have a Telegraf fleet of 4 nodes that receives metrics from roughly 10K nodes or more. The write rate to Kafka with the sync producer is low compared with InfluxDB, and eventually we start seeing errors like “internal agent didn’t manage to gather metrics during 10s, skipping flush interval”. CPU, memory, and network usage on the fleet are all normal. We were desperate until I rewrote the code; now all the errors are gone and the TICK stack works through Kafka. A Telegraf consumer fleet reads the metrics from Kafka and writes them into InfluxDB.
I also saw other posts on the Internet where people run into a write rate limit of about 10 KB per second with the Kafka output plugin, if I’m not mistaken.
Could you please help us with my original request to merge it?
Thank you
|
I think the main difference between this method and the current one is that you are not waiting for success or errors. I'd like to stop using the sync producer (which just wraps the async producer) completely, but then we need to have a configurable number of messages in flight and we still need to monitor the errors channel. Another potential performance improvement would be to use SerializeBatch to place the full batch into a single Kafka message. |
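The pattern described above (do not block on each acknowledgement, cap the number of messages in flight, and keep draining the errors channel) can be sketched with plain Go channels. This is only a minimal illustration under stated assumptions, not the sarama API and not the Telegraf plugin code: `Message`, `sendBatch`, `errBrokerDown`, and the `send` callback are hypothetical names introduced for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Message is a hypothetical stand-in for a Kafka producer message.
type Message struct {
	Value string
}

// errBrokerDown is a hypothetical error used to simulate a failed send.
var errBrokerDown = errors.New("broker unavailable")

// sendBatch is a hypothetical sketch of an asynchronous send loop. It keeps at
// most maxInFlight sends outstanding and collects failures from an errors
// channel instead of waiting for each message to be acknowledged in turn.
func sendBatch(msgs []Message, maxInFlight int, send func(Message) error) []error {
	sem := make(chan struct{}, maxInFlight) // bounds the messages in flight
	errCh := make(chan error)               // plays the role of an errors channel
	var wg sync.WaitGroup

	// Monitor the errors channel so that failed sends are not silently dropped.
	var failed []error
	done := make(chan struct{})
	go func() {
		for err := range errCh {
			failed = append(failed, err)
		}
		close(done)
	}()

	for _, m := range msgs {
		sem <- struct{}{} // blocks only when maxInFlight sends are outstanding
		wg.Add(1)
		go func(m Message) {
			defer wg.Done()
			defer func() { <-sem }() // free the in-flight slot
			if err := send(m); err != nil {
				errCh <- fmt.Errorf("send %q: %w", m.Value, err)
			}
		}(m)
	}

	wg.Wait()    // every send has finished
	close(errCh) // no more errors can arrive
	<-done       // the monitor has drained the channel
	return failed
}

func main() {
	msgs := []Message{{"cpu"}, {"mem"}, {"bad"}, {"disk"}}
	errs := sendBatch(msgs, 2, func(m Message) error {
		if m.Value == "bad" {
			return errBrokerDown // simulate one failed write
		}
		return nil
	})
	fmt.Printf("%d of %d sends failed\n", len(errs), len(msgs))
}
```

In this sketch `maxInFlight` is the configurable knob mentioned above: a value of 1 behaves like a synchronous producer, while larger values let sends overlap at the cost of handling errors after the fact.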
Hello,
So does this mean that you are going to issue the fix?
|
I will try to work on this when I have some free time, but I can't guarantee when that will be. If you would like to address the issues, we may be able to add it sooner. To summarize, we need to:
* Remove SyncProducer, only use AsyncProducer
* Check errors
* Investigate whether we need to introduce options for configuring in-flight messages and message batch sizes
* Unit tests
* Bonus: benchmarks
|
Hello Daniel,
We ran into another quite annoying issue with the Kafka plugin.
I hoped you could tell me where to look for the problem.
When Kafka is not available, Telegraf can't write to Kafka, of course. But when it is back up and running, the Telegraf Kafka plugin doesn't try to reconnect.
I believe this logic is outside of the plugin itself. Could you please give me a hint, ideally a file name, where reconnection is handled?
Or maybe you could share what needs to be added or modified in order to force the output plugin to reconnect on failures.
Thank you
|
@apollo-3 In 1.8.0 we switched to using With this change I don't believe we need to switch to the AsyncProducer directly. Can you test out the new release? Sorry about missing your last comment about not reconnecting, it should be automatic. If you are still having trouble in 1.8.0 can you open a new issue? |