`buffer_chunk_limit` exceeded when using `detach_process` (#930)
Is this a fluent-plugin-kinesis issue or a fluentd core issue?
The Kinesis plugin extends BufferedOutput, and the file buffering is done by the core BufferedOutput class. That's why I reported the issue here. Please let me know if my understanding is incorrect and this is actually an issue specific to the Kinesis plugin.
Hm... so from your test,
That's correct, it only happens when using
Okay. That's clear to me.
@cbroglie Where is the fluent-bench code? Is this your test program?
Yes, that's my test program; I can create a gist.
Thx! I will try to fix the problem with your program 👍
Pushed a slightly redacted version to https://github.com/cbroglie/fluent-bench. It requires Go 1.6 to build. |
@repeatedly I spent a little time looking into this, and the difference when running with

I tracked this further to the forwarder logic in process.rb:

```ruby
def new_forwarder(w, interval)
  if interval < 0.2 # TODO interval
    Forwarder.new(w)
  else
    DelayedForwarder.new(w, interval)
  end
end
```
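To make the effect of that branch concrete, here is a minimal, self-contained sketch. These are not fluentd's actual classes; the names and the timer behavior are simplified assumptions. The point is that a delayed forwarder accumulates everything that arrives during the interval into one payload, so under heavy load the payload size is governed by the arrival rate rather than by `buffer_chunk_limit`:

```ruby
# Immediate strategy: each event is handed to the child process as it arrives.
class ImmediateForwarder
  def initialize(&sink)
    @sink = sink
  end

  def emit(data)
    @sink.call(data) # forwarded right away, one event per payload
  end
end

# Delayed strategy: events are held until the next timer tick.
class DelayedForwarder
  def initialize(&sink)
    @sink = sink
    @buffer = +""
  end

  def emit(data)
    @buffer << data # accumulated until flush
  end

  def flush # in the sketch, assume a timer calls this every `interval` seconds
    @sink.call(@buffer) unless @buffer.empty?
    @buffer = +""
  end
end

payloads = []
fwd = DelayedForwarder.new { |p| payloads << p }
1_000.times { fwd.emit("x" * 200) } # 1k events of ~200 bytes within one tick
fwd.flush
puts payloads.size           # 1
puts payloads.first.bytesize # 200000 -- a single oversized payload
```

At high event rates, each tick of the delayed path produces one large payload, which is consistent with the oversized chunks observed in the benchmark.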
I tested modifying
Seems to be performant enough; I was able to sustain 16k RPS after making the change with the following configuration:

```
<match kinesis.**>
  @type kinesis_producer
  log_level info
  region us-west-2
  stream_name cbroglie-fluentd-test-10
  aws_key_id xxx
  aws_sec_key xxx
  buffer_chunk_limit 500k
  buffer_queue_limit 40000
  flush_interval 1
  try_flush_interval 0.1
  queued_chunk_flush_interval 0.01
  num_threads 25
  buffer_type file
  buffer_path /var/log/td-agent/ztrack*.buffer
  detach_process 3
  disable_retry_limit true
</match>
```

And I can sustain higher rates by increasing the value of
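For context, a back-of-the-envelope estimate of what the 16k RPS figure above implies per detached process. The linear-scaling assumption is mine, not a measured fact:

```ruby
# Rough scaling numbers for the benchmark above, assuming throughput
# divides evenly across the detached worker processes.
total_rps   = 16_000              # sustained rate reported above
workers     = 3                   # detach_process 3
bytes_each  = 200                 # approximate record size from the benchmark
per_process = total_rps / workers
mb_per_sec  = total_rps * bytes_each / 1_000_000.0
puts per_process # 5333 records/sec per detached process
puts mb_per_sec  # 3.2 MB/s aggregate
```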
Good.
We can add an interval option to control the forwarding interval, but
Yeah, the reason I need
@repeatedly I don't have any good options for fixing the CPU bottleneck in the Kinesis output plugin at the moment, so I opened a PR (#982) which makes the interval in
Using a file buffer output plugin with `detach_process` results in chunk sizes >> `buffer_chunk_limit` when sending events at high speed. My settings specify 1MB chunk sizes, but it is easy to generate chunk sizes >50MB by writing 500k records (~200 bytes per record) in 4 seconds, i.e. roughly 25MB/s of ingest. My testing was done using a c4.2xl instance running Amazon Linux 2016.03 with td-agent2.

Config
Benchmark driver
Resulting buffer files
Full background
I'm trying to use fluentd with the Kinesis output plugin, and am currently trying to benchmark what throughput we can achieve. The output plugin uses the Amazon Kinesis Producer Library (KPL), which is a native C++ binary. I benchmarked the KPL native process as able to sustain ~60k RPS (~10MB/s), and thus planned on using `detach_process` to launch multiple KPL processes to support higher throughput through fluentd.

The Kinesis output plugin works by sending the entire chunk to the KPL, and then waiting for a response for each record in the chunk before returning from `write`. It's critical that the chunk sizes don't exceed the limit, since sending massive batches to the KPL at once will cause records to time out inside the KPL while waiting to be sent, eventually resulting in the entire batch being retried.
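A hedged sketch of that write path. The method shape follows the description above, but `kpl_put` and `wait_for_ack` are hypothetical stand-ins for the plugin's actual KPL IPC calls, stubbed out here so the sketch runs:

```ruby
# Hypothetical stand-in: sends one record to the KPL child process and
# returns a handle for its pending acknowledgement.
def kpl_put(record)
  record.object_id
end

# Hypothetical stand-in: the real code blocks until the KPL acknowledges
# the record or the timeout elapses.
def wait_for_ack(handle, timeout)
  true
end

# Sketch of the write path: the whole chunk is pushed to the KPL at once,
# then write blocks until every record is acknowledged. If the chunk is far
# over buffer_chunk_limit, records sit queued in the KPL long enough to
# time out, and the entire batch is retried.
def write(chunk, timeout_per_record: 30)
  handles = chunk.map { |record| kpl_put(record) }
  handles.each { |h| wait_for_ack(h, timeout_per_record) }
  chunk.size
end

puts write(["r1", "r2", "r3"]) # 3
```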