Simplify profile stack trace representation #615

felixge · 2025-01-09T11:29:23Z

Introduce a first-class Stack message type and lookup table.
Replace location index range based stack trace encoding on Sample with a single stack_index reference.
Remove the location_indices lookup table.

The primary motivation is laying the ground work for timestamp based profiling where the same stack trace needs to be referenced much more frequently compared to aggregation based on low
cardinality attributes.

Timestamp based profiling is also expected to be used with the upcoming Off-CPU profiling feature in the eBPF profiler. Off-CPU stack traces have a different distribution compared to CPU stack traces. In particular Off-CPU stack traces are much more repetitive because they typically occur at special leaf functions such as leaf (async preemption being a notable exception). For the same reason it is also uncommon to see a stack trace that are a root-prefix of a previously observed stack trace.

We might need to revisit the previous previous benchmarks to confirm these claims, as the previous analysis seems to have shown that the location range based encoding is always either comparable or better than the mechanism proposed here.

The secondary motivation is simplicitly. Arguably the proposed change here will make it easier to write exporters, processors as well as receivers.

It seems like we had rough consensus around this change in previous SIG meetings, and it seems like a good incremental step to make progress on the timestamp proposal.

- Introduce a first-class Stack message type and lookup table. - Replace location index range based stack trace encoding on Sample with a single stack_index reference. - Remove the location_indices lookup table. The primary motivation is laying the ground work for [timestamp based profiling][timestamp proposal] where the same stack trace needs to be referenced much more frequently compared to aggregation based on low cardinality attributes. Timestamp based profiling is also expected to be used with the the upcoming [Off-CPU profiling][off-cpu pr] feature in the eBPF profiler. Off-CPU stack traces have a different distribution compared to CPU samples. In particular stack traces are much more repetitive because they only occur at call sites such as syscalls. For the same reason it is also uncommon to see a stack trace are a root-prefix of a previously observed stack trace. We might need to revisit the previous [previous benchmarks][benchmarks] to confirm these claims. The secondary motivation is simplicitly. Arguably the proposed change here will make it easier to write exporters, processors as well as receivers. It seems like we had rough consensus around this change in previous SIG meetings, and it seems like a good incremental step to make progress on the timestamp proposal. [timestamp proposal]: #594 [off-cpu pr]: open-telemetry/opentelemetry-ebpf-profiler#196 [benchmarks]: https://docs.google.com/spreadsheets/d/1Q-6MlegV8xLYdz5WD5iPxQU2tsfodX1-CDV1WeGzyQ0/edit?gid=2069300294#gid=2069300294

beaubelgrave · 2025-01-10T18:33:48Z

opentelemetry/proto/profiles/v1development/profiles.proto

-  // Supersedes location_index.
-  int32 locations_length = 2;
+  // Reference to stack in Profile.stack_table.
+  int32 stack_index = 1;


It would be nice to have an optional int64 instruction_ip property here that would take the common stack at stack_index, but also add the instruction_ip to it. This would allow for common on-CPU callstacks to be respresented cheaply that are in the same function but hit different parts of it.

Shouldn't be just int64 since that won't play well with symbolization and inlines, would have to be a location ID.

Also, this optimization would only be most useful to profile types like CPU sampling where the leaf frame is indeed very random. For profiles like heap profiling it would not be as valuable since there even the leaf is a callsite, not a sampled IP.

And if we are looking at such "encode the tree more efficiently" optimizations then we should probably look at using the actual tree encoding as discussed a bit here.

Agree, it's only useful for on-CPU sampling with skids, etc. So I don't think it should be required, but definitely useful when needed as an option.

beaubelgrave reviewed Jan 10, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simplify profile stack trace representation #615

Simplify profile stack trace representation #615

felixge commented Jan 9, 2025 •

edited

Loading

beaubelgrave Jan 10, 2025

aalexand Jan 13, 2025

beaubelgrave Jan 13, 2025

Simplify profile stack trace representation #615

Are you sure you want to change the base?

Simplify profile stack trace representation #615

Conversation

felixge commented Jan 9, 2025 • edited Loading

beaubelgrave Jan 10, 2025

Choose a reason for hiding this comment

aalexand Jan 13, 2025

Choose a reason for hiding this comment

beaubelgrave Jan 13, 2025

Choose a reason for hiding this comment

felixge commented Jan 9, 2025 •

edited

Loading