Event normalization conversion optimization #36044
base: main
Conversation
This pull request does not have a backport label. To fix up this pull request, you need to add the backport labels for the needed branches.
💔 Tests Failed
💔 Build Failed
💔 Build Failed
@jeniawhite the test failures look legit, could you fix them? It's obviously a big performance gain; however, I'm concerned about mutating events with this new implementation. It's, unfortunately, almost impossible to know if it's completely safe. To make sure of it, we would need to review the rest of the codebase and conclude whether skipping the cloning step here is fine. However, since it's just normalisation, if shared maps referenced in the event get normalised there is not much impact for our customers. I don't see any possible data corruption in this particular case.
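The aliasing concern above can be illustrated with a small sketch. This is not the libbeat code; `normalizeInPlace` is a hypothetical stand-in for the PR's normalization, and the numeric widening it performs is just an example of a normalization step:

```go
package main

import "fmt"

// normalizeInPlace is a hypothetical stand-in for in-place event
// normalization: it rewrites values inside the map it is given
// instead of returning a cleaned copy.
func normalizeInPlace(m map[string]interface{}) {
	for k, v := range m {
		switch val := v.(type) {
		case int:
			m[k] = int64(val) // example normalization step: numeric widening
		case map[string]interface{}:
			normalizeInPlace(val) // recurse without allocating a copy
		}
	}
}

func main() {
	shared := map[string]interface{}{"bytes": 42}

	// Two events reference the same underlying map.
	event1 := map[string]interface{}{"metrics": shared}
	event2 := map[string]interface{}{"metrics": shared}

	normalizeInPlace(event1)

	// event2 was never normalized, yet its shared map changed too,
	// because Go maps are reference types.
	fmt.Printf("%T\n", event2["metrics"].(map[string]interface{})["bytes"]) // int64
}
```

With the old cloning behavior, `event2` would have been untouched; in-place normalization makes any sharing of maps between events observable.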
libbeat/common/event_test.go (Outdated)

-func TestConvertNestedMapStr(t *testing.T) {
+func BenchmarkTestConvertNestedMapStr(b *testing.B) {
Please don't convert regular tests to benchmarks. Our CI does not run benchmarks, so it will not fail when they fail.
We need a separate, simple benchmark.
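A standalone benchmark for the conversion could look like the sketch below. The `normalize` function here is a placeholder, not the repository's actual entry point; `testing.Benchmark` is used only so the sketch runs as a plain program, whereas in the repo this would be an ordinary `Benchmark*` function in a `_test.go` file:

```go
package main

import (
	"fmt"
	"testing"
)

// normalize is a placeholder for the conversion under test: it returns
// a normalized deep copy of the input (the pre-PR, allocating behavior).
func normalize(m map[string]interface{}) map[string]interface{} {
	out := make(map[string]interface{}, len(m))
	for k, v := range m {
		if nested, ok := v.(map[string]interface{}); ok {
			out[k] = normalize(nested)
			continue
		}
		out[k] = v
	}
	return out
}

func main() {
	input := map[string]interface{}{
		"outer": map[string]interface{}{
			"inner": map[string]interface{}{"key": 1},
		},
	}
	// testing.Benchmark runs a benchmark function outside `go test`,
	// which keeps correctness tests and benchmarks separate.
	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			normalize(input)
		}
	})
	fmt.Println(res.String(), res.MemString())
}
```

Running the copying and in-place variants through the same benchmark and comparing allocs/op would quantify the gain without touching the existing tests.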
Force-pushed from 14a2aba to 15dd5c1
💔 Build Failed
Force-pushed from c6a7426 to 6d06a1e
❕ Build Aborted
Now we can quickly compare performance metrics when we make changes to the filestream implementation without running the whole Filebeat.
Commits:
- revert benchmarks
- return errors and remove test case check for no errors
- fixing lint
- fixing lint and tests
- remove unused
- …/beats into evgb-NormalizeInplace
❕ Build Aborted
❕ Build Aborted
💚 Build Succeeded
@@ -0,0 +1,175 @@
// Licensed to Elasticsearch B.V. under one or more contributor |
I added this benchmark over these 3 PRs:
- Add benchmark for filestream input #37317
- Make filestream benchmark more precise and more re-usable #37334
- Add filestream benchmarks for many files case, fix data race #37345
It seems like you don't have the latest version here, which is why it's marked as a new file.
I think you should update this file to the latest version from main, then add a case for normalisation and compare the benchmark results before and after your changes.
This pull request is now in conflict. Could you fix it? 🙏
What does this PR do?
I've noticed that event normalization consumes a lot of heap memory because it duplicates the entire event.
We can avoid that by normalizing the event in-place.
The current benchmarks don't sufficiently emphasize the performance gain.
I will attach actual flow benchmarks later on; there is still room for improvement because not everything is done in-place yet, but product-wise I've improved the common flows.
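The idea can be sketched as follows. This is an illustrative simplification, not the actual libbeat implementation: `normalizeCopy`/`normalizeInPlace` are hypothetical names, and numeric widening stands in for the real conversion steps:

```go
package main

import "fmt"

// normalizeCopy returns a normalized deep copy: every nesting level
// allocates a new map (the allocating behavior the PR wants to avoid).
func normalizeCopy(m map[string]interface{}) map[string]interface{} {
	out := make(map[string]interface{}, len(m))
	for k, v := range m {
		switch val := v.(type) {
		case map[string]interface{}:
			out[k] = normalizeCopy(val)
		case int:
			out[k] = int64(val)
		default:
			out[k] = v
		}
	}
	return out
}

// normalizeInPlace rewrites values in the map it receives, so no new
// maps are allocated for already-nested structures.
func normalizeInPlace(m map[string]interface{}) {
	for k, v := range m {
		switch val := v.(type) {
		case map[string]interface{}:
			normalizeInPlace(val)
		case int:
			m[k] = int64(val)
		}
	}
}

func main() {
	event := map[string]interface{}{"http": map[string]interface{}{"status": 200}}
	normalizeInPlace(event)
	fmt.Printf("%T\n", event["http"].(map[string]interface{})["status"]) // int64
}
```

The copy path allocates one new map per nesting level on every event, which is what shows up as heap pressure on the hot path; the in-place path trades those allocations for the mutation concerns discussed above.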
Why is it important?
This is a hot path for event handling, and reducing memory allocations will improve event processing performance.
Checklist
- CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc