Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add docs on UNMANAGED_CODE_TIME and CPU_TIME #1613

Merged
merged 2 commits into from
Apr 25, 2022

Conversation

josalem
Copy link
Contributor

@josalem josalem commented Apr 18, 2022

@josalem josalem self-assigned this Apr 18, 2022

The above code will generate CPU samples on EventPipe that look like `Main` or `Main->A->B->C->MyNativeFunction` but contain none of the native stack that happens after `MyNativeFunction`.

During a typical CPU usage investigation, you may want to ask about on-CPU vs off-CPU time to determine when your code is blocked waiting or actively doing work. Due to the limitation of only knowing about managed frames, CPU samples collected via EventPipe can't give exact on/off-CPU information. Instead, the TraceEvent library uses a heuristic to add pseudo-frames to the trave that indicate whether there are additional native frames on the stack or no. It uses relative time deltas between events and the state of the Garbage Collector at the time the sample was take to insert either `UNMANAGED_CODE_TIME` or `CPU_TIME` onto the stack in the "Thread Time" view or when exporting to SpeedScope or other formats. `UNMANAGED_CODE_TIME` represents stacks where there are one or more native frames after the last managed frame and that these frames may be blocked waiting _or_ actively on the CPU. `CPU_TIME` represents stacks where the last managed frame is the function currently on the CPU and doing work.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last two sentences seem to contradict the 3rd from last.

UNMANAGED_CODE_TIME represents stacks where there are one or more native frames after the last managed frame and that these frames may be blocked waiting or actively on the CPU. CPU_TIME represents stacks where the last managed frame is the function currently on the CPU and doing work.

I read these sentences saying that what matters is whether there are native frames at the leaf of the stack.

It uses relative time deltas between events and the state of the Garbage Collector at the time the sample was take to insert either UNMANAGED_CODE_TIME or CPU_TIME onto the stack in the "Thread Time" view or when exporting to SpeedScope or other formats

I read this sentence saying that what matters is the GC state and time deltas.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The phrasing is definitely confusing. The GC state is what informs the computer whether there were native leaf frames, and the relative time deltas tell us how long that was the case.

For example, if I have a function that is 50% managed work and 50% P/Invoked work, I'd end up with two sets of thread time stacks. One that is MyFunction->CPU_TIME corresponding to the aggregate of samples where the GC was in COOP mode and another that is MyFunction->UNMANAGED_CODE_TIME for when the GC was in PRE mode.

The duration of the CPU_TIME and UNMANAGED_CODE_TIME frames come from the relative deltas and which frame gets used comes from the GC mode during the sample.

Let me try to reword this so it's more obvious that all these things are describing the same set of information.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest deleting the part about GC state and time deltas in that case. I think the critical information for the reader is to understand what callstack produces what descriptive term. The techniques we use to determine the callstack are implementation details that appear unnecessary as long as they are accomplishing the job reliably.

@@ -936,3 +936,41 @@ Unfortunately when a provider registers with the operating system (the Win32 `Ev
## On the fly filtering AND Logging using `ETWReloggerTraceEventSource`

TODO

## `UNMANAGED_CODE_TIME` and `CPU_TIME` when analyzing CPU traces collected with EventPipe
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is useful for people targeting TraceEvent programmatically. For users using PerfView I'd expect them to look here instead: https://github.com/microsoft/perfview/blob/main/src/PerfView/SupportFiles/UsersGuide.htm. That page is the guide that is included in the PerfView tool where all the in-app help links resolve to.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that's a better spot for this.

@mikelle-rogers
Copy link

@josalem how many people look at this repo? In the meeting we were talking about how perfview is not the tool we really try to advertise. I think it is important to have this documented somewhere people can see it because UNMANAGED_CODE_TIME and CPU_TIME show up on any traces (not just when the traces are analyzed in perfview).

@brianrob
Copy link
Member

brianrob commented Apr 19, 2022

@josalem how many people look at this repo? In the meeting we were talking about how perfview is not the tool we really try to advertise. I think it is important to have this documented somewhere people can see it because UNMANAGED_CODE_TIME and CPU_TIME show up on any traces (not just when the traces are analyzed in perfview).

@mikelle-rogers, this is a good question. I do think that this belongs in the PerfView Users Guide as @noahfalk points out, but I also wonder if it belongs elsewhere. Do these strings show up in any of the formats that dotnet-trace can export to?

@josalem
Copy link
Contributor Author

josalem commented Apr 19, 2022

I think it is important to have this documented somewhere people can see it because UNMANAGED_CODE_TIME and CPU_TIME show up on any traces (not just when the traces are analyzed in perfview).

These pseudo frames show up in 2 places and only if the trace is a nettrace file collected via EventPipe:

  1. PerfView Thread Time view
  2. Exported speedscope and chromium trace files from PerfView or dotnet-trace

While dotnet-trace can export in those formats and generate these pseudo-frames, all the code/API for doing so lives in TraceEvent.

We could link to this blurb in the user guide (after I move it there) from the dotnet-trace docs, but I'm not sure it makes sense to put it directly on dotnet-trace.

For example: "dotnet-trace uses the TraceEvent library for converting nettrace files to other formats. For details on the conversion, see: ".

@mikelle-rogers
Copy link

@josalem

For example: "dotnet-trace uses the TraceEvent library for converting nettrace files to other formats. For details on the conversion, see: ".

That sounds great to me!

@brianrob
Copy link
Member

For example: "dotnet-trace uses the TraceEvent library for converting nettrace files to other formats. For details on the conversion, see: ".

Agreed, sounds great.

Copy link
Member

@noahfalk noahfalk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@josalem josalem merged commit 570cc03 into microsoft:main Apr 25, 2022
@josalem josalem deleted the dev/josalem/unmanaged-code-time-docs branch April 25, 2022 22:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Correctly Document UNMANAGED_CODE_TIME vs CPU_TIME
4 participants