Feature Request: KPL aggregation format #16

Closed
jonathanviber opened this issue Feb 5, 2020 · 12 comments
Labels
enhancement New feature or request

Comments

@jonathanviber

Sending a large number of small logs (around 200 bytes each) using PUT RECORDS is less efficient than using the KPL (Kinesis Producer Library) format, because:

a) The KPL sends the data as Protobuf, which is much smaller.
b) Even if you include 500 records (the max) in your PUT RECORDS request, you will only be sending
a request of about 100 KB, while PUT RECORDS allows up to 5 MB per request:
500 records x 200 bytes = ~100 KB
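
The arithmetic in (b) can be checked with a short Go sketch. The two limits are the ones quoted above; the function name `requestUtilization` is made up for illustration, and treating 5 MB as 5 × 1024 × 1024 bytes is an assumption.

```go
package main

import "fmt"

// Limits quoted in this issue for a single PUT RECORDS request.
// Assumption: "5 MB" is interpreted as 5 * 1024 * 1024 bytes.
const (
	maxRecordsPerRequest = 500
	maxBytesPerRequest   = 5 * 1024 * 1024
)

// requestUtilization (hypothetical helper) returns the payload bytes carried
// by a request filled with maxRecordsPerRequest records of recordSize bytes,
// and the fraction of the request size limit that payload uses.
func requestUtilization(recordSize int) (payloadBytes int, fraction float64) {
	payloadBytes = maxRecordsPerRequest * recordSize
	if payloadBytes > maxBytesPerRequest {
		// The size limit is hit before the record-count limit.
		payloadBytes = maxBytesPerRequest
	}
	return payloadBytes, float64(payloadBytes) / float64(maxBytesPerRequest)
}

func main() {
	payload, frac := requestUtilization(200)
	fmt.Printf("%d bytes per request, %.1f%% of the limit\n", payload, frac*100)
}
```

With 200-byte records the request carries 100,000 bytes, which is roughly 2% of the allowed request size. Packing many small logs into each record is what closes that gap.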

The FluentD Kinesis plugin supports kinesis_streams_aggregated which uses the KPL format. Without this feature in fluent-bit, our company couldn't consider changing from FluentD.
Are there any plans to add this functionality to the fluent-bit plugin?

@PettitWesley
Contributor

@jonathanviber Supporting KPL makes sense. I can't say exactly when we'll implement it, but we know this is something the plugin should support.

@jonathanviber
Author

jonathanviber commented Feb 11, 2020

Thanks for the reply, @PettitWesley. I saw it immediately and have been acting on it; apologies for not responding until now. In the meantime, I've been thinking about how to deal with this.

One option is to create an agent in C++ using the KPL and then send messages from fluent-bit to that agent. The flow would be:
message => fluentBit => Custom KPL Agent => Kinesis

One of the things putting me off this approach is that it seems like an obvious solution, yet I have not read of anyone doing it, which makes me think I am missing an obvious problem :)

What do you think?

I'm surprised AWS does not offer a KPL agent. The AWS Kinesis agent does not do aggregation which is a deal breaker for us.

@PettitWesley
Contributor

@jonathanviber I lean towards saying the best solution is to write some custom Go code for the aggregation format. The Fluentd plugin uses a custom Ruby library: https://github.com/awslabs/aws-fluent-plugin-kinesis/blob/master/lib/fluent/plugin/kinesis_helper/aggregator.rb

It doesn't look that complicated; looks like you just need a protobuf library.

@jonathanviber
Author

I was looking at these 2 projects.
https://github.com/awslabs/kinesis-aggregation
https://github.com/a8m/kinesis-producer

We could possibly use them to aggregate the data prior to sending it to fluent-bit, but that adds other complexities. Currently we read from the tail of the log rather than waiting for a log file to be "full".

The best scenario would be to implement the Go aggregation library inside the fluent-bit kinesis plugin, in a similar way to how it is done in FluentD.

I was thinking about trying to write this code and contribute but I have never used the Go language before.

@PettitWesley
Contributor

https://github.com/a8m/kinesis-producer

I think this might work in this plugin. The examples show using it to write directly to Kinesis, but it looks like we could also just import their internal library for creating the aggregation format and use that.

@zackwine
Contributor

@PettitWesley @jonathanviber Was the work mentioned above ever started? Any branches/info missing from this issue?

My company (Synamedia) is interested in this feature since it equates to a big cost saving for a subset of our log shippers. This feature, in combination with compression (#26), is what blocks us from deprecating fluentd for our on-prem to Kinesis log shippers.

I'm willing to help implement/test/review this feature. My company would like to deliver this along with compression in fluentbit in the next 6 weeks to meet a customer deadline. If I contribute, is that timeline likely?

@hossain-rayhan
Contributor

hossain-rayhan commented Jun 23, 2020

Hi @zackwine,
This is a good feature request. Unfortunately, as far as I know, we haven't been able to start working on it.

However, it would be great if your team could contribute support for this (including the compression in #26). From my side, the 6-week timeline seems OK. I will update you immediately if I hear otherwise from my team. Thanks.

@jonathanviber
Author

@zackwine
Unfortunately, I haven't started on this. I'm a solutions architect, and I think this requires one or more dedicated developers to work on it. I keep hoping someone will start on it, but so far no one has. I'm happy to help in any way I can, so stay in touch.

@PettitWesley
Contributor

@zackwine Between this and the goroutine enhancement, I think we need to give you an award.

We'd love to have you contribute this feature, and will provide any help needed. A 6 week timeline would be awesome; we can get a release of AWS for Fluent Bit out within a few days of merging the change. We know this feature is key for allowing people to switch from Fluentd, and we have prioritized it, but realistically we weren't going to get to it until the end of the year. So you'd be significantly accelerating the timeline and would make a lot of folks happy.

@zackwine
Contributor

@hossain-rayhan @jonathanviber @PettitWesley Thanks for your responses. This is good news. This will be a fairly big cost saver for our company, as we are leveraging Kinesis at most of our customers. Adding this feature will allow us to standardize on fluentbit, further reducing our costs and support burden. I'll start development on this feature next and share any progress made.

@hencrice added the enhancement (New feature or request) label and removed the feature-request label on Jun 29, 2020
@zackwine
Contributor

zackwine commented Aug 7, 2020

@jonathanviber @PettitWesley @hossain-rayhan Please test/review the PR I opened. Feedback welcome.

#60

@PettitWesley
Contributor

This was released in AWS for Fluent Bit 2.7.0! Many thanks to @zackwine!

5 participants