[EBPF] Generate complexity data in CI #26723

gjulianm · 2024-06-14T09:24:23Z

What does this PR do?

Enables the generation of complexity data files in KMT CI. It also fixes some errors in the verifier log parser detected during testing of this PR.

Motivation

A future PR will leverage these files to report complexity changes in PRs.

Additional Notes

Possible Drawbacks / Trade-offs

Describe how to test/QA your changes

pr-commenter · 2024-06-14T12:17:42Z

Regression Detector

Regression Detector Results

Run ID: f84fba87-a0f7-4ad3-9303-af459216b0ac Metrics dashboard Target profiles

Baseline: 9551500
Comparison: b14de82

Performance changes are noted in the perf column of each table:

✅ = significantly better comparison variant performance
❌ = significantly worse comparison variant performance
➖ = no significant change in performance

Significant changes in experiment optimization goals

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

perf	experiment	goal	Δ mean %	Δ mean % CI	links
❌	pycheck_1000_100byte_tags	% cpu utilization	+5.02	[+0.27, +9.78]	Logs

Fine details of change detection per experiment

perf	experiment	goal	Δ mean %	Δ mean % CI	links
❌	pycheck_1000_100byte_tags	% cpu utilization	+5.02	[+0.27, +9.78]	Logs
➖	uds_dogstatsd_to_api_cpu	% cpu utilization	+0.92	[+0.04, +1.80]	Logs
➖	otel_to_otel_logs	ingress throughput	+0.76	[-0.06, +1.57]	Logs
➖	tcp_syslog_to_blackhole	ingress throughput	+0.05	[-12.79, +12.89]	Logs
➖	tcp_dd_logs_filter_exclude	ingress throughput	-0.00	[-0.01, +0.01]	Logs
➖	uds_dogstatsd_to_api	ingress throughput	-0.00	[-0.00, +0.00]	Logs
➖	idle	memory utilization	-0.07	[-0.10, -0.03]	Logs
➖	file_tree	memory utilization	-0.45	[-0.50, -0.40]	Logs
➖	basic_py_check	% cpu utilization	-1.28	[-3.88, +1.31]	Logs

Explanation

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
Its configuration does not mark it "erratic".
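The three criteria above can be sketched as a small predicate. This is only an illustration of the decision rule as described in this comment, not the Regression Detector's actual implementation; the `Result` type, its field names, and `isSignificantChange` are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// Result is a hypothetical representation of one row of the table above.
type Result struct {
	Experiment    string
	DeltaMeanPct  float64 // estimated Δ mean %
	CILow, CIHigh float64 // bounds of the 90% confidence interval
	Erratic       bool    // true if the experiment's configuration marks it "erratic"
}

// effectSizeTolerance mirrors the "|Δ mean %| ≥ 5.00%" threshold quoted above.
const effectSizeTolerance = 5.00

// isSignificantChange applies the three criteria from the explanation:
// the effect is large enough, the interval excludes zero, and the
// experiment is not marked erratic.
func isSignificantChange(r Result) bool {
	bigEnough := math.Abs(r.DeltaMeanPct) >= effectSizeTolerance
	excludesZero := r.CILow > 0 || r.CIHigh < 0
	return bigEnough && excludesZero && !r.Erratic
}

func main() {
	// Values copied from two rows of the table in this comment.
	rows := []Result{
		{"pycheck_1000_100byte_tags", 5.02, 0.27, 9.78, false},
		{"uds_dogstatsd_to_api_cpu", 0.92, 0.04, 1.80, false},
	}
	for _, r := range rows {
		fmt.Printf("%s: %v\n", r.Experiment, isSignificantChange(r))
	}
}
```

With the rows reported here, only pycheck_1000_100byte_tags satisfies all three criteria; uds_dogstatsd_to_api_cpu has an interval that excludes zero but falls below the 5.00% effect size tolerance.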

gjulianm · 2024-06-26T10:15:30Z

/trigger-ci --variable RUN_ALL_BUILDS=true --variable RUN_KITCHEN_TESTS=true --variable RUN_E2E_TESTS=on --variable RUN_UNIT_TESTS=on --variable RUN_KMT_TESTS=on

dd-devflow · 2024-06-26T10:15:45Z

🚂 Gitlab pipeline started

Started pipeline #37571002

pr-commenter · 2024-06-26T17:11:15Z

Test changes on VM

Use this command from test-infra-definitions to manually test this PR's changes on a VM:

inv create-vm --pipeline-id=37957316 --os-family=ubuntu

Note: This applies to commit b14de82

val06 reviewed

.gitlab/kernel_matrix_testing/common.yml

val06 · 2024-06-28T16:04:00Z

pkg/ebpf/verifier/stats.go

+			// quite large before the garbage collector kicks in and releases memory to the OS.
+			// This causes out-of-memory errors especially in CI, which is an environment with tighter memory
+			// restrictions and multiple programs running in different VMs.
+			debug.FreeOSMemory()


Is the 1 GB accounted for by the VerifierLog field of the Program struct?

In that case, it may be better to explicitly set the string to an empty string, since the parent Program struct is held by pointer, which might cause the GC to skip freeing that memory (or delay the actual free operation).

You may need to do both: set the string to "" and call debug.FreeOSMemory()

I tested it both ways, and setting the string to "" was not enough to make the GC collect the memory; that's why I added debug.FreeOSMemory(). Just in case, I changed the code to set the verifier log to an empty string to ensure it's freed, although I think prog being scoped to just the loop is enough for the GC to mark that memory as unused.

The Go GC works a bit differently when it comes to pointers and heap memory.
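The combination discussed in this thread (clear the log, then ask the runtime to return memory to the OS) can be sketched as follows. This is a minimal illustration, not the code in pkg/ebpf/verifier/stats.go: the `Program` type here is a hypothetical stand-in with only the fields needed for the example, and `countLogLines` and `processPrograms` are made-up helpers. The one real API used is `debug.FreeOSMemory` from `runtime/debug`, which forces a garbage collection and then tries to return as much memory as possible to the operating system.

```go
package main

import (
	"fmt"
	"runtime/debug"
	"strings"
)

// Program is a hypothetical stand-in for a loaded program whose
// verifier log can be very large.
type Program struct {
	Name        string
	VerifierLog string
}

// countLogLines is a placeholder for whatever small summary is
// extracted from the log before it is discarded.
func countLogLines(log string) int {
	if log == "" {
		return 0
	}
	return strings.Count(log, "\n") + 1
}

// processPrograms keeps only the small summary for each program and
// releases the large log as soon as it has been processed.
func processPrograms(progs []*Program) map[string]int {
	stats := make(map[string]int, len(progs))
	for _, prog := range progs {
		stats[prog.Name] = countLogLines(prog.VerifierLog)

		// Drop the only reference to the log so it becomes collectable
		// even while the Program itself is still reachable.
		prog.VerifierLog = ""

		// Force a collection and return freed memory to the OS instead
		// of waiting for the runtime to do so on its own schedule.
		debug.FreeOSMemory()
	}
	return stats
}

func main() {
	progs := []*Program{
		{Name: "prog_a", VerifierLog: "line 1\nline 2\nline 3"},
		{Name: "prog_b", VerifierLog: "line 1"},
	}
	fmt.Println(processPrograms(progs))
}
```

Calling `debug.FreeOSMemory()` on every iteration trades CPU time for a lower peak footprint, which is a reasonable choice only when each iteration allocates a lot, as described above for the verifier log.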

gjulianm · 2024-07-02T17:22:31Z

/merge

dd-devflow · 2024-07-02T17:22:37Z

🚂 MergeQueue: pull request added to the queue

The median merge time in main is 26m.

Use /merge -c to cancel this operation!


gjulianm added this to the 7.56.0 milestone Jun 14, 2024

gjulianm self-assigned this Jun 14, 2024

github-actions bot added the team/ebpf-platform label Jun 14, 2024

gjulianm added the changelog/no-changelog and qa/no-code-change (No code change in Agent code requiring validation) labels Jun 14, 2024

gjulianm force-pushed the guillermo.julian/generate-complexity-reports branch from 3c0d510 to bf31624 Compare June 14, 2024 10:28

gjulianm mentioned this pull request Jun 19, 2024

[EBPF] Use Go internal ELF/DWARF library for complexity analysis #26824

Merged

gjulianm force-pushed the guillermo.julian/generate-complexity-reports branch from 72d3521 to 489bc71 Compare June 26, 2024 09:52

github-actions bot added the component/system-probe label Jun 26, 2024

gjulianm force-pushed the guillermo.julian/generate-complexity-reports branch from 489bc71 to 3766295 Compare June 26, 2024 09:56

github-actions bot removed the component/system-probe label Jun 26, 2024

gjulianm force-pushed the guillermo.julian/generate-complexity-reports branch from 3766295 to 5880070 Compare June 26, 2024 11:38

gjulianm added 14 commits June 27, 2024 10:32

Build verifier calculator and send it to VMs

881ca2d

Collect complexity in CI

a78ba4d

Fix syntax

9456652

Fix COLLECT_COMPLEXITY env variable

10754db

Use TEST_COMPONENT for output name

a12c710

Generate the complexity as an artifact

202c3ac

Fix log path

d2320d4

More logging

477eca1

Set BPF dir

2cb58f0

Fix eBPF dir

b7b5d6f

Fix test root

e55d906

Fix shellcheck

b502390

Tag complexity results

3e4b9dd

Fix copy command

db5a9e6

gjulianm added 3 commits June 27, 2024 10:33

Fix excessive memory footprint of the calculator

7b45b5d

Fix parsing of scalar registers

b22fc5b

Skip complexity report on unsupported platforms

facc5d5

gjulianm force-pushed the guillermo.julian/generate-complexity-reports branch from 9ac5f9c to facc5d5 Compare June 27, 2024 08:34

github-actions bot added the component/system-probe label Jun 27, 2024

Fix register parsing with precise values

c87a2d9

gjulianm marked this pull request as ready for review June 28, 2024 11:50

gjulianm requested a review from a team as a code owner June 28, 2024 11:50

val06 requested changes Jun 28, 2024


gjulianm added 2 commits July 1, 2024 10:32

Explain unsupported platforms

f7299e2

Set verifier log to empty string

b14de82

val06 approved these changes Jul 1, 2024


dd-mergequeue bot merged commit bb693d9 into main Jul 2, 2024
316 of 317 checks passed

dd-mergequeue bot deleted the guillermo.julian/generate-complexity-reports branch July 2, 2024 17:46

brycekahle added a commit that referenced this pull request Jul 2, 2024

Revert "[EBPF] Generate complexity data in CI (#26723)"

6a4200e

This reverts commit bb693d9.

brycekahle mentioned this pull request Jul 2, 2024

Revert "[EBPF] Generate complexity data in CI (#26723)" #27263

Merged

brycekahle added a commit that referenced this pull request Jul 2, 2024

Revert "[EBPF] Generate complexity data in CI (#26723)" (#27263)

3628de0

gjulianm mentioned this pull request Jul 3, 2024

[EBPF] Generate complexity data in CI #27274

Merged
