Correctly Document UNMANAGED_CODE_TIME vs CPU_TIME #1166
Comments
@iliamosko, could you post a sample trace with this behavior and describe your environment, e.g., what version of .NET the target app is running, what version of dotnet-trace was used, what platform you were running on, what you were using to view the trace, and whether you were running inside a container? |
My application runs on .NET Core 3.1 and tries to simulate high CPU load. The dotnet-trace version I'm using is 3.1.120604. For the trace collected I used the default dotnet-trace collect command with a 1 minute duration:
I've used that command on both a Linux virtual machine (WSL) and a Windows 10 machine. I'm getting similar results for both traces, but I'm going to focus on the trace collected on Windows. I'm using PerfView version 2.0.52 to view the collected trace file. So my question is: what exactly is UNMANAGED_CODE_TIME? What is it collecting? When comparing that trace to one that uses this dotnet-trace collect command:
UNMANAGED_CODE_TIME gets split into CPU_TIME, which is understandable, but it still doesn't answer what UNMANAGED_CODE_TIME is collecting. |
Can you expand on what your test app is doing? |
So just taking a look at the DoSomeMath method, here is what I am doing: All it does is get the current time and run the DoSomeMath method on a different thread for a certain amount of time. Other methods in this app follow the same approach. Would it be possible to determine which actions in |
@josalem |
This is a test app to understand trace limitations. |
A stack sample that has
where those percentages are inclusive samples. This tells you what managed calls lead to spending CPU time in native code, as well as the % of stack samples that inclusively had that frame in that position. When you put the PerfView UI into Caller-Callee view and bucket on
I'm not quite sure I understand your statement here. The trace shows you which managed calls had the most inclusive samples and, by extension, spent the most time on the stack. In your images above, you can see that ~23% of stacks sampled on thread 14448 had the frame
I think what might be causing confusion here is the parsing of the data rather than the data itself. It sounds like you want to answer the question: "What managed calls took the most time?" If that is the question you want answered, looking at the unfiltered caller-callee view won't be the best way to answer it. I would recommend looking at a flamegraph (try
If you want to answer the question, "What percent of samples were collected while executing kernel code?" or "How much time was spent inside DoStuff() from MyNativeCode.so?" then you will need to use a different tracing tool. Specifically, you will need a CPU sampling tool that is capable of collecting native frames, e.g.,
Let me know if that helps/answered your question 😄 |
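To make the "inclusive samples" idea above concrete, here is a minimal sketch in Python (chosen for brevity; it is not PerfView code). It assumes each sample is simply a list of frame names, root first, and the sample data is made up for illustration. A frame's inclusive percentage is the share of samples whose stack contains that frame anywhere.

```python
from collections import Counter


def inclusive_percentages(stacks):
    """Return {frame: percent of samples whose stack contains that frame}."""
    total = len(stacks)
    if total == 0:
        return {}
    counts = Counter()
    for stack in stacks:
        # Count each frame once per sample, even if it appears recursively.
        for frame in set(stack):
            counts[frame] += 1
    return {frame: 100.0 * n / total for frame, n in counts.items()}


# Hypothetical samples (root frame first); not taken from a real trace.
samples = [
    ["Main", "DoSomeMath", "UNMANAGED_CODE_TIME"],
    ["Main", "DoSomeMath", "UNMANAGED_CODE_TIME"],
    ["Main", "DoSomeMath"],
    ["Main", "ReadKey", "UNMANAGED_CODE_TIME"],
]

percentages = inclusive_percentages(samples)
for frame, pct in sorted(percentages.items(), key=lambda kv: -kv[1]):
    print(f"{frame}: {pct:.0f}%")
```

Note that in this toy data the UNMANAGED_CODE_TIME leaf appears under both DoSomeMath and ReadKey, so its inclusive percentage alone says nothing about which managed caller was busy and which was waiting.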
@josalem
I think the sticking point is what "the most time on the stack" means in
Is the thread time CPU or BLOCKED? For example, what about this screenshot:
The ReadKey call is not CPU-bound as far as we know, yet it is lumped together with get_UtcNow under the same UNMANAGED_CODE_TIME bucket. It is hard to tell them apart without specialized knowledge of how the framework implements each API.
Thank you for looking into this! |
I took a bit of a deeper look into what causes PerfView to output
As for differentiating between "blocked" time and "working" time, that is unfortunately something that I don't think |
We are going to try https://www.speedscope.app/ then. Thanks. |
The code responsible for adding the
public void LogThreadStack(double timeRelativeMSec, StackSourceCallStackIndex stackIndex, TraceThread thread, SampleProfilerThreadTimeComputer computer, bool onCPU)
{
    if (onCPU)
    {
        if (ThreadRunning) // continue running
        {
            AddCPUSample(timeRelativeMSec, thread, computer);
        }
        else if (ThreadBlocked) // unblocked
        {
            AddBlockTimeSample(timeRelativeMSec, thread, computer);
            LastBlockStackRelativeMSec = -timeRelativeMSec;
        }
        LastCPUStackRelativeMSec = timeRelativeMSec;
        LastCPUCallStack = stackIndex;
    }
    else
    {
        if (ThreadBlocked) // continue blocking
        {
            AddBlockTimeSample(timeRelativeMSec, thread, computer);
        }
        else if (ThreadRunning) // blocked
        {
            AddCPUSample(timeRelativeMSec, thread, computer);
        }
        LastBlockStackRelativeMSec = timeRelativeMSec;
        LastBlockCallStack = stackIndex;
    }
}
|
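The four branches in the method above can be read as a small state machine. Below is a toy Python model of that reading, not PerfView's implementation. It assumes that ThreadRunning and ThreadBlocked reflect the state recorded by the previous sample, and that AddCPUSample and AddBlockTimeSample charge the time elapsed since that sample. Neither assumption is confirmed in this thread, and the class and method names are hypothetical.

```python
class ThreadTimeline:
    """Toy model of one thread: each sample closes the interval since the previous one."""

    def __init__(self):
        self.last_time = None    # time of the previous sample, in msec
        self.was_on_cpu = None   # state recorded by the previous sample
        self.cpu_msec = 0.0
        self.blocked_msec = 0.0

    def log_sample(self, time_msec, on_cpu):
        if self.last_time is not None:
            elapsed = time_msec - self.last_time
            # Assumption: the elapsed interval is charged to the state the
            # thread was in *before* this sample arrived.
            if self.was_on_cpu:
                self.cpu_msec += elapsed      # "continue running" / "blocked"
            else:
                self.blocked_msec += elapsed  # "continue blocking" / "unblocked"
        self.last_time = time_msec
        self.was_on_cpu = on_cpu


# Hypothetical sample stream: (time in msec, on_cpu flag).
timeline = ThreadTimeline()
for t, on_cpu in [(0, True), (1, True), (2, False), (5, False), (6, True)]:
    timeline.log_sample(t, on_cpu)

print(timeline.cpu_msec, timeline.blocked_msec)
```

Under this model the CPU versus blocked split depends entirely on the onCPU flag supplied with each sample, so a source that reports every sample the same way would put all of the thread's time into a single bucket.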
Then this would rather be a |
This issue is only tracking the documentation of what those values mean. The underlying issue is being tracked here: dotnet/runtime#45179. I'm not sure why they weren't linked. |
Thanks for the update @josalem, now I know this does not apply to our situation (on Linux). |
@josalem I see this got closed but was there a doc change that completed it? I wasn't able to find it. |
I added docs on the PerfView side here: microsoft/perfview#1613. I didn't add any docs in the dotnet docs, though. If we do, we should just link to the PerfView ones. |
Ah, that's right, I remember talking about the PerfView docs now. Thanks for reminding my forgetful ole self ;p |
Is there any documentation about what UNMANAGED_CODE_TIME is? When I look at the Thread Time stack, UNMANAGED_CODE_TIME always takes up 100% of the time. It is unclear what exactly it is collecting.
Originally posted by @iliamosko in #976 (comment)