Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tweaks for datadog compatibility #74

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

vilterp
Copy link
Collaborator

@vilterp vilterp commented Oct 21, 2022

Without these, the flame graph shows up empty in their viewer.

https://www.datadoghq.com/product/code-profiling/

(Accidentally closed #73; reopening)

@codecov-commenter
Copy link

codecov-commenter commented Oct 21, 2022

Codecov Report

Merging #74 (75cf270) into master (ac78823) will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #74      +/-   ##
==========================================
+ Coverage   96.61%   96.62%   +0.01%     
==========================================
  Files           3        3              
  Lines         295      296       +1     
==========================================
+ Hits          285      286       +1     
  Misses         10       10              
Impacted Files Coverage Δ
src/Allocs.jl 92.00% <100.00%> (ø)
src/PProf.jl 99.20% <100.00%> (+<0.01%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@@ -43,7 +43,7 @@ function pprof(alloc_profile::Profile.Allocs.AllocResults = Profile.Allocs.fetch
# Allocs-specific arguments:
frame_for_type::Bool = true,
)
period = UInt64(0x1)
period = sum(alloc.size for alloc in alloc_profile.allocs, init=0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 this seems weird.. Isn't this supposed to basically represent how much whatever does one sample represent?

So for a CPU profile, this would be like how much time in between samples? So for an allocation profile, it should be 1 allocation per sample, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm yeah I was misunderstanding this.

Looking at the proto, it says "The number of events between sampled occurrences": https://github.com/google/pprof/blob/131d412537eacd9973045c73835c3ac6ed696765/proto/profile.proto#L85-L86

So I guess it's basically supposed to be the sample interval? E.g. "every 1000th alloc" would be 1000?

]
period_type = ValueType!("heap", "bytes")
period_type = ValueType!("alloc_space", "bytes")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh maybe the period_type should actually be (alloc_objects", "count")? then maybe you could keep the period = 0x01?

src/PProf.jl Outdated
@@ -163,7 +163,7 @@ function pprof(data::Union{Nothing, Vector{UInt}} = nothing,
# End of sample
value = [
1, # events
length(location_id), # stack_depth
60000000 # CPU ns
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does this mean?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm yeah, I think I found I needed to put some number here to get it to show up in DD, but this is just a bogus value. Not sure what it's supposed to be… Maybe it's supposed to correspond to the sample period? i.e. if you're taking a stack sample every 100ms, this is supposed to be 100ms (in ns)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, yes, that's right. i think that is exactly what it's supposed to be. it's supposed to be the value matching the value type:

        ValueType!("cpu", "nanoseconds")

👍

🤔 i think you should be able to ask Profile what it's current sample rate is, so that this can be an accurate number.

help?> Profile.init()
  init(; n::Integer, delay::Real)

  Configure the delay between backtraces (measured in seconds), and the number n of instruction pointers that may be stored per thread. Each instruction
  pointer corresponds to a single line of code; backtraces generally consist of a long list of instruction pointers. Note that 6 spaces for instruction
  pointers per backtrace are used to store metadata and two NULL end markers. Current settings can be obtained by calling this function with no arguments,
  and each can be set independently using keywords or in the order (n, delay).

  │ Julia 1.8
  │
  │  As of Julia 1.8, this function allocates space for n instruction pointers per thread being profiled. Previously this was n total.

julia> Profile.init()
(10000000, 0.001)

julia> Profile.init()[2]
0.001

Comment on lines -143 to +144
ValueType!("events", "count"), # Mandatory
ValueType!("stack_depth", "count")
ValueType!("events", "count"), # Mandatory
ValueType!("cpu", "nanoseconds")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we still emit stack-depth values, or does even their presence break the datadog viewer?

I'm fine to drop them though; i don't think they really added much, and they're just there because Valentin and i were trying to understand how the sample_types worked, i think

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if they break the DD viewer… I'm not sure why you'd need them though, since each sample itself is a full stack, so the depth information is already there

Copy link
Member

@NHDaly NHDaly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool yeah, these mostly all seem fine 👍 couple questions tho

full_name_with_args = _escape_name_for_pprof(String(take!(io)))
call_str = String(take!(io))
# add module name as well
full_name_with_args = _escape_name_for_pprof("$(meth.module).$call_str")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding the module name here should be the only user-visible change in this PR — it's pretty essential to get frames to show up right in DD since it essentially does

package, func = split(name, "."; limit=2)

e.g. for this one:
image

Without the module name it would parse as

  • package: DatadogProfileUploader.profile_and_upload(DatadogProfileUploader
  • func: DDConfig, typeof(myfunc))

If people don't want the behavior to change though, I could put this behind an option.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants