Various fixes for generation aware analysis #70764
Tagging subscribers to this area: @dotnet/gc
|
@noahfalk, @davmason this is what I was seeing when it was right after
|
I'm suspicious that this reordering shouldn't have been necessary and that the root of the issue is still lurking. It should be possible to enable a new session before calling FinishInitialize(); the session is supposed to automatically defer starting until FinishInitialize() runs. Based on @Maoni0's callstack the deferral worked, but when we resumed, the session state wasn't completely initialized and it immediately triggered the AV. @davmason it's probably worthwhile to examine why the session state was bad. Based on the callstack I assume it was a null pointer for the stream_writer member. |
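To make the deferral described above concrete, here is a minimal sketch of the pattern, not the actual EventPipe code. All names (`runtime_t`, `session_t`, `runtime_enable`, `runtime_finish_initialize`) are hypothetical, and `stream_writer` is only a stand-in for the member mentioned in the comment. The idea is that a session enabled too early is queued, and that the resume step validates the session state instead of assuming it is complete.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEFERRED 8

/* Hypothetical session: stream_writer stands in for the member named above. */
typedef struct {
    int id;
    bool started;
    void *stream_writer;
} session_t;

/* Hypothetical runtime state: sessions enabled too early are queued here. */
typedef struct {
    bool can_start;
    session_t *deferred[MAX_DEFERRED];
    size_t deferred_count;
} runtime_t;

/* Start a session only if its writer exists; report failure otherwise. */
static bool session_start(session_t *s)
{
    if (s == NULL || s->stream_writer == NULL)
        return false;
    s->started = true;
    return true;
}

/* Enable a session; before initialization finishes, the start is deferred. */
bool runtime_enable(runtime_t *rt, session_t *s)
{
    if (!rt->can_start) {
        if (rt->deferred_count == MAX_DEFERRED)
            return false;
        rt->deferred[rt->deferred_count++] = s;
        return true;
    }
    return session_start(s);
}

/* Resume deferred sessions; returns how many actually started. */
size_t runtime_finish_initialize(runtime_t *rt)
{
    size_t started = 0;
    rt->can_start = true;
    for (size_t i = 0; i < rt->deferred_count; i++) {
        if (session_start(rt->deferred[i]))
            started++;
    }
    rt->deferred_count = 0;
    return started;
}
```

In this sketch a session whose writer was never created is skipped at resume time rather than dereferenced, which is the behavior one would want regardless of why the state was incomplete.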
Was there a repro for this? I'm happy to take a look if there is a repro or dump |
thanks! I'll share the dump/symbols. |
@Maoni0 do you have some steps to reproduce the AV by running an app? I think we'll want to step through the initialization code and figure out why the NULL valued field never got initialized. |
My blog post here has a basic example of using the generation aware analysis. Since the AV happened very early during initialization, it is probably unrelated to the actual app being analyzed. There are perhaps some tricky environment variable values that trigger the bug, but I am not sure. |
right, all I did was set a few env vars to enable genaware analysis -
and start a test with corerun and hit this on initialization. unfortunately I can no longer repro this even with the change reverted... it started seemingly randomly too. maybe something just took a bit longer/shorter to create, which caused something in that stream_writer to not be ready sometimes. |
I have found and pushed another potential fix for the case where there is an error creating the session (e.g. the file is not writable). This fix probably cannot explain the initialization AV, but it is good to have. |
This might be a completely different issue, but I have seen a similar crash with the same stack trace as @Maoni0 when I have used a |
Playing around with it I can repro the same thing @jakobbotsch reports. The comment for COMPlus_EventPipeOutputPath says "full path excluding file name" runtime/src/coreclr/inc/clrconfigvalues.h Line 691 in 3b2883b
But the code does the opposite, it expects the full path including file name, the only modification it does is it will replace runtime/src/native/eventpipe/ep.c Lines 872 to 893 in 3b2883b
The file path via environment variable path isn't hardened to a bad path, and will try to create the @noahfalk and @lateralusX, the environment variables are not considered a customer scenario, correct? We should update the comment in clrconfigvalues.h and do something to gracefully terminate the session, but I am not super concerned about making it a great experience if it's not a customer scenario. @cshung and @Maoni0 - let me know if this is not the case you are hitting. I am able to set the generation aware analysis as long as I either don't start another session or give a full path including file name. But happy to debug further if there's something else lurking here. |
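As an illustration of "gracefully terminate the session" for a bad output path, here is a minimal sketch that treats a failed open as a normal error instead of continuing with a half-built writer. It is an assumption-laden example, not the runtime's implementation: `file_writer_t`, `file_writer_open`, and `file_writer_close` are hypothetical names, and only standard C `fopen`/`fclose` are used.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical writer wrapping a plain C file handle. */
typedef struct {
    FILE *file;
} file_writer_t;

/*
 * Try to open the output file. The path must be a full path including
 * the file name; a directory or an unwritable location makes fopen fail,
 * and the caller is expected to abandon the session in that case.
 */
bool file_writer_open(file_writer_t *w, const char *output_path)
{
    w->file = NULL;
    if (output_path == NULL || output_path[0] == '\0')
        return false;
    w->file = fopen(output_path, "wb");
    return w->file != NULL;
}

/* Safe to call even when the open failed. */
void file_writer_close(file_writer_t *w)
{
    if (w->file != NULL) {
        fclose(w->file);
        w->file = NULL;
    }
}
```

With this shape, passing a directory instead of a file name results in `file_writer_open` returning `false`, so the session can be torn down before anything dereferences the writer.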
We do tell customers about these environment variables and they are documented: |
this is not what I hit. the sequence of things -
so unfortunately I don't have a repro currently. will let you know if I ever observe this again. |
I think we should at least do a similar NULL check on file_stream_writer, runtime/src/native/eventpipe/ep-session.c Line 182 in a052819
runtime/src/native/eventpipe/ep-session.c Line 191 in a052819
|
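The suggested NULL check could look roughly like the sketch below. This is a hypothetical illustration, not the code at the referenced lines of ep-session.c: the struct layouts and the `session_write` function are invented for the example, and only the member name `file_stream_writer` comes from the comment above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical writer that only counts bytes, to keep the sketch small. */
typedef struct {
    size_t bytes_written;
} stream_writer_t;

/* Hypothetical session holding an optional file stream writer. */
typedef struct {
    stream_writer_t *file_stream_writer;
} session_t;

/*
 * Write through the session's file stream writer. If the writer was
 * never created (for example because opening the file failed), report
 * failure instead of dereferencing a NULL pointer.
 */
bool session_write(session_t *session, const void *data, size_t len)
{
    if (session == NULL || session->file_stream_writer == NULL)
        return false;
    if (data == NULL && len > 0)
        return false;
    session->file_stream_writer->bytes_written += len;
    return true;
}
```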
@Maoni0, @noahfalk, @lateralusX Can we take a look at this PR again? On top of the initial change, I have made a few more changes to make generation-aware analysis more robust:
I think this PR is ready to go. |
LGTM modulo inline comments. We should also make sure the final merged commit gets an appropriate commit message. The original goal of 'initialize later' is only a minority of the total work now.