Use receivertest package to test beats receivers #41888
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
While working on elastic#41888 I was benchmarking the filebeatreceiver CreateLogs factory and noticed that the asset decoding in libbeat dominates the CPU and memory profile of the receiver creation. This behavior is expected, since asset decoding is intended to occur at startup. However, it's still worthwhile to optimize it if possible.

Some time ago I worked on `iobuf.ReadAll` at elastic/elastic-agent-libs#229, an optimized version of `io.ReadAll` that has a better growth algorithm (based on bytes.Buffer) and benefits from the `io.ReaderFrom` optimization. The choice of when to use it is very picky, as using it with a reader that is not an `io.ReaderFrom` can be slower than the standard `io.ReadAll`. In this case we are certain of the reader implementation, so we can use it. Benchmark results show that it is 5% faster and uses 17% less memory.

On top of this, using klauspost/compress instead of compress/zlib shaves off an additional 11% of the CPU time. In summary, the cumulative effect of these changes is a 15% reduction in CPU time and 18% less memory usage in the asset decoding.

After these fixes the profiles are still dominated by the asset decoding, but I guess that is expected; at least it is a bit faster now.
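As a rough illustration of the pattern (not the actual libbeat code), the decode path can look something like the sketch below. The `iobuf` import path and the `decodeAsset` helper name are assumptions for illustration; klauspost's `zlib` package is API-compatible with the standard library's.

```go
package asset

import (
	"bytes"
	"encoding/base64"
	"fmt"

	// Assumed import path for the optimized ReadAll from elastic-agent-libs#229.
	"github.com/elastic/elastic-agent-libs/iobuf"
	// Drop-in replacement for the standard library's compress/zlib.
	"github.com/klauspost/compress/zlib"
)

// decodeAsset is a hypothetical helper mirroring the shape of libbeat's asset
// decoding: assets are stored as base64-encoded, zlib-compressed strings and
// decoded once at startup.
func decodeAsset(data string) (string, error) {
	compressed, err := base64.StdEncoding.DecodeString(data)
	if err != nil {
		return "", fmt.Errorf("base64 decode: %w", err)
	}

	r, err := zlib.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return "", fmt.Errorf("new zlib reader: %w", err)
	}
	defer r.Close()

	// iobuf.ReadAll grows its buffer the way bytes.Buffer does, which is what
	// makes it cheaper than io.ReadAll here, where the reader type is known.
	out, err := iobuf.ReadAll(r)
	if err != nil {
		return "", fmt.Errorf("decompress: %w", err)
	}
	return string(out), nil
}
```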
I investigated using receivertest for tests and have some findings. The two main challenges with this package are writing a generator implementation that creates unique events the receiver can process, and telling receivertest where it should look for this ID in the resulting data.

For the first problem, the way I found to adapt this generator concept to Beats receivers is to have a generator that writes ndjson lines, each with a specific ID, to a file and consume that file with a filestream input.

The second challenge is a bit more complicated. Unfortunately we cannot use receivertest as-is, since it looks for this ID in a hardcoded location. We would have to extend receivertest with the ability to use a user-defined function to look up this ID. I have a prototype ready at main...mauri870:beats:receivertest. I have opened open-telemetry/opentelemetry-collector#12003 upstream as well.
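A minimal sketch of that generator idea, assuming receivertest's `Generator` contract of `Start`/`Stop`/`Generate` and its `UniqueIDAttrVal` type; the attribute name, struct fields, and file handling here are illustrative assumptions, not the prototype from the linked branch:

```go
package fbreceiver_test

import (
	"encoding/json"
	"fmt"
	"os"
	"sync/atomic"

	"go.opentelemetry.io/collector/receiver/receivertest"
)

// uniqueIDAttrName is the attribute the contract checker looks for; the exact
// name is hardcoded inside receivertest today, which is the limitation the
// upstream issue mentioned above is about. The value here is an assumption.
const uniqueIDAttrName = "test_id"

// fileLogGenerator appends ndjson lines to a file that a filestream input is
// tailing. Each line carries a unique ID that the contract checker can later
// look up in the logs emitted by the receiver.
type fileLogGenerator struct {
	path    string
	f       *os.File
	counter atomic.Int64
}

func (g *fileLogGenerator) Start() {
	f, err := os.OpenFile(g.path, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		panic(fmt.Sprintf("open generator file: %v", err))
	}
	g.f = f
}

func (g *fileLogGenerator) Stop() {
	_ = g.f.Close()
}

// Generate writes one ndjson line and returns the unique ID embedded in it,
// following receivertest's Start/Stop/Generate generator contract (an
// assumption about its exact API).
func (g *fileLogGenerator) Generate() []receivertest.UniqueIDAttrVal {
	id := receivertest.UniqueIDAttrVal(fmt.Sprintf("id-%d", g.counter.Add(1)))
	line, _ := json.Marshal(map[string]string{
		"message":        "generated test event",
		uniqueIDAttrName: string(id),
	})
	if _, err := g.f.Write(append(line, '\n')); err != nil {
		panic(fmt.Sprintf("write generator file: %v", err))
	}
	return []receivertest.UniqueIDAttrVal{id}
}
```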
* libbeat: optimize asset data decoding (#42180) (cherry picked from commit 3d1bdcf)
* s/CreateLogsReceiver/CreateLogs/

Co-authored-by: Mauri de Souza Meneguzzo <[email protected]>
The OpenTelemetry project has a `receivertest` module that can be used to test a receiver's contract. I'm particularly interested in the `CheckConsumerContract` function, which is used to test the contract between the receiver and the next consumer in the pipeline. This test covers a couple of interesting scenarios.
This test is based on a `Generator` that is responsible for producing the data used during the test. For this to work properly the beats receiver must implement the `ReceiveLogs(data plog.Logs)` function to receive the logs from this external source. In the case of beats receivers, the beats themselves produce their own data, so we need to come up with a way to adapt this generator concept or implement said function. I believe the messages should be unique, so perhaps the `benchmark` input being able to output unique strings or an integer counter should be enough. This seems quite handy to set up once and use the same machinery to test every beats receiver.
Here is an example test:
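As a rough sketch (not the original snippet), wiring the pieces together with the file-backed generator from the earlier comment could look roughly like this. The `fbreceiver` import path and factory, the config wiring, and the exact receivertest function and parameter names (`CheckConsumeContract` and its params struct) are assumptions that may differ between collector versions.

```go
package fbreceiver_test

import (
	"path/filepath"
	"testing"

	"go.opentelemetry.io/collector/pipeline"
	"go.opentelemetry.io/collector/receiver/receivertest"

	// Hypothetical import path for the filebeat receiver factory.
	"github.com/elastic/beats/v7/x-pack/filebeat/fbreceiver"
)

func TestConsumeContract(t *testing.T) {
	logFile := filepath.Join(t.TempDir(), "input.ndjson")

	factory := fbreceiver.NewFactory()

	// In a real test this config would point a filestream input at logFile;
	// the exact configuration schema is omitted from this sketch.
	cfg := factory.CreateDefaultConfig()

	receivertest.CheckConsumeContract(receivertest.CheckConsumeContractParams{
		T:             t,
		Factory:       factory,
		Signal:        pipeline.SignalLogs,
		Config:        cfg,
		Generator:     &fileLogGenerator{path: logFile},
		GenerateCount: 100,
	})
}
```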
During my initial experimentation with this I found a bug with global state in libbeat that is now resolved by #41475.