
Add benchmark for filestream input #37317

Merged: rdner merged 3 commits into elastic:main from the filestream-benchmark branch on Dec 7, 2023.

Conversation

@rdner (Member) commented on Dec 6, 2023

Now we can quickly compare performance metrics when we make changes to the filestream implementation without running the whole Filebeat.
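
As a purely illustrative, self-contained sketch of the overall shape such a benchmark takes (a parent Go benchmark with named sub-benchmarks reading a generated log file, driven by go test -bench), something like the following could live in any *_test.go file. Every identifier here (writeTestLog, BenchmarkFileReadThroughput, the plain line-scanning loop) is hypothetical and is not the code added by this PR, which drives the real filestream input instead.

package filestream_bench_test // hypothetical package name; place in a *_test.go file

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
	"testing"
)

// writeTestLog generates an input file with lineCount log lines in a
// temporary directory and returns its path.
func writeTestLog(b *testing.B, lineCount int) string {
	b.Helper()
	path := filepath.Join(b.TempDir(), "input.log")
	f, err := os.Create(path)
	if err != nil {
		b.Fatal(err)
	}
	defer f.Close()
	w := bufio.NewWriter(f)
	for i := 0; i < lineCount; i++ {
		fmt.Fprintf(w, "2023-12-06T18:00:00Z message number %d\n", i)
	}
	if err := w.Flush(); err != nil {
		b.Fatal(err)
	}
	return path
}

// BenchmarkFileReadThroughput mimics the shape of a parent benchmark with
// named sub-benchmarks, so individual cases can be selected with
// -bench='BenchmarkFileReadThroughput/plain_read_throughput'.
func BenchmarkFileReadThroughput(b *testing.B) {
	const lineCount = 10000
	path := writeTestLog(b, lineCount)

	b.Run("plain_read_throughput", func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			f, err := os.Open(path)
			if err != nil {
				b.Fatal(err)
			}
			scanner := bufio.NewScanner(f)
			lines := 0
			for scanner.Scan() {
				lines++
			}
			f.Close()
			if lines != lineCount {
				b.Fatalf("expected %d lines, got %d", lineCount, lines)
			}
		}
	})
}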

Checklist

  - [x] My code follows the style guidelines of this project
  - [x] I have commented my code, particularly in hard-to-understand areas
  - [ ] I have made corresponding changes to the documentation
  - [ ] I have made corresponding changes to the default configuration files
  - [ ] I have added tests that prove my fix is effective or that my feature works
  - [ ] I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

Default Filestream Configuration

cd ./filebeat/input/filestream
go test -run=none -bench='BenchmarkFilestream/filestream_default_throughput.*' -benchmem -benchtime=100x

On my machine I got the following results:

BenchmarkFilestream/filestream_default_throughput-10
100          10703028 ns/op        13576607 B/op     181225 allocs/op

[Attached screenshots: def-cpu-before, def-mem-before CPU and memory profiles]

Fingerprint File Identity

cd ./filebeat/input/filestream
go test -run=none -bench='BenchmarkFilestream/filestream_fingerprint_throughput.*' -benchmem -benchtime=100x
BenchmarkFilestream/filestream_fingerprint_throughput-10
100          11664752 ns/op        13746616 B/op     191417 allocs/op

[Attached screenshots: fp-cpu-before, fp-mem-before CPU and memory profiles]
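
An optional aside, not part of this PR: when the number of bytes processed per iteration is known, Go's testing package can additionally print a MB/s column via b.SetBytes, which can make throughput comparisons between runs easier to read than raw ns/op. A hypothetical sketch, belonging alongside and reusing the writeTestLog helper from the earlier sketch:

// BenchmarkReadThroughputMBps reports throughput in MB/s in addition to
// ns/op, B/op and allocs/op by declaring how many bytes one iteration reads.
func BenchmarkReadThroughputMBps(b *testing.B) {
	path := writeTestLog(b, 10000) // hypothetical helper from the sketch above

	info, err := os.Stat(path)
	if err != nil {
		b.Fatal(err)
	}
	b.SetBytes(info.Size()) // bytes processed per loop iteration

	b.ResetTimer() // exclude the file generation above from the measurement
	for i := 0; i < b.N; i++ {
		if _, err := os.ReadFile(path); err != nil {
			b.Fatal(err)
		}
	}
}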

Related issues

@rdner added the enhancement, Team:Elastic-Agent-Data-Plane, and backport-7.17 labels on Dec 6, 2023
@rdner self-assigned this on Dec 6, 2023
@rdner force-pushed the filestream-benchmark branch from 0c78985 to 0ea5a7c on December 6, 2023 17:33
@elasticmachine (Collaborator) commented:

❕ Build Aborted: either there was a build timeout or someone aborted the build. Duration: 31 min 32 sec.

@rdner force-pushed the filestream-benchmark branch from 0ea5a7c to 65ade18 on December 6, 2023 17:56
@elasticmachine (Collaborator) commented:

❕ Build Aborted: either there was a build timeout or someone aborted the build. Duration: 31 min 46 sec.

@rdner marked this pull request as ready for review on December 6, 2023 18:06
@rdner requested a review from a team as a code owner on December 6, 2023 18:06
@rdner force-pushed the filestream-benchmark branch from 65ade18 to db5e526 on December 6, 2023 18:11
@rdner enabled auto-merge (squash) on December 6, 2023 18:17
@elasticmachine (Collaborator) commented:

❕ Build Aborted: either there was a build timeout or someone aborted the build. Duration: 33 min 56 sec.

@elasticmachine (Collaborator) commented:

💚 Build Succeeded. Start time: 2023-12-06T18:11:54.687+0000, duration: 133 min 36 sec. Test results: 0 failed, 8315 passed, 755 skipped, 9070 total. Flaky test report: tests succeeded.

@elasticmachine (Collaborator) commented:

❕ Build Aborted: either there was a build timeout or someone aborted the build. Duration: 8 min 10 sec.

@elasticmachine (Collaborator) commented:

💚 Build Succeeded. Start time: 2023-12-06T21:45:05.332+0000, duration: 131 min 51 sec. Test results: 0 failed, 8315 passed, 755 skipped, 9070 total. Flaky test report: tests succeeded.

@leehinman (Contributor) left a comment:

LGTM, with one optional nit on the snippet below.

connector, eventsDone := newTestPipeline(expEventCount)
done := make(chan struct{})
go func() {
	err := input.Run(context, connector)
@leehinman (Contributor) commented:

Nit: right now the benchmark also captures the time spent parsing the config, setting up the manager, etc. It might be worth using StopTimer and StartTimer so that only the time spent in input.Run is reported. I'm sure the setup time is roughly constant, so it's not a huge deal, but it might make it easier to see small changes.
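
To illustrate the suggestion, here is a hypothetical sketch of the StopTimer/StartTimer pattern, again reusing the writeTestLog helper from the earlier sketch; opening the file and the scan loop merely stand in for the real benchmark's setup and input.Run phases.

// BenchmarkRunPhaseOnly excludes per-iteration setup and teardown from the
// reported ns/op, so only the "run" phase is measured.
func BenchmarkRunPhaseOnly(b *testing.B) {
	path := writeTestLog(b, 10000) // hypothetical helper from the earlier sketch

	for i := 0; i < b.N; i++ {
		b.StopTimer() // pause while setting up; stands in for config/manager setup
		f, err := os.Open(path)
		if err != nil {
			b.Fatal(err)
		}
		scanner := bufio.NewScanner(f)
		b.StartTimer() // measure only the read loop, the analogue of input.Run

		for scanner.Scan() {
		}

		b.StopTimer() // exclude teardown from the measurement as well
		f.Close()
		b.StartTimer()
	}
}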

@rdner merged commit f2cf95c into elastic:main on Dec 7, 2023 (29 checks passed).
mergify bot pushed a commit that referenced this pull request on Dec 7, 2023 (cherry picked from commit f2cf95c).
@rdner added the backport-skip label and removed the backport-7.17 label on Dec 7, 2023
@rdner deleted the filestream-benchmark branch on December 7, 2023 09:40
@alexsapran (Contributor) commented:

Awesome work! This will make a big impact, especially once we start running these benchmarks and reporting their results on a per-PR basis.

Scholar-Li pushed a commit to Scholar-Li/beats that referenced this pull request on Feb 5, 2024.