Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process.WaitForExit() deadlock if there are childprocess still running. #51277

Open
Kuinox opened this issue Apr 14, 2021 · 13 comments
Open

Process.WaitForExit() deadlock if there are childprocess still running. #51277

Kuinox opened this issue Apr 14, 2021 · 13 comments
Labels
area-System.Diagnostics.Process needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration
Milestone

Comments

@Kuinox
Copy link
Contributor

Kuinox commented Apr 14, 2021

Description

This bug was first reported in the msbuild repo but I believe it's a bug in the Process class itself and not in msbuild.
The issue is that WaitForExit() will synchronously wait for the stderr/stdout to complete, the summoned process has exited, but it has childs process still running (the nodereuse processes in msbuild case).
In this case, it looks like that the stderr/stdout pipes wont close.
So we are stuck in the waits there:
https://github.com/dotnet/runtime/blob/main/src/libraries/System.Diagnostics.Process/src/System/Diagnostics/Process.Windows.cs#L186

We hit this code path because the timeout is Timeout.Infinite.
There is hope for a beginning of a workaround:
We can avoid the deadlock if we call WaitForExit(int milliseconds).
But there is an issue in this case

When standard output has been redirected to asynchronous event handlers, it is possible that output processing will not have completed when this method returns. To ensure that asynchronous event handling has been completed, call the WaitForExit() overload that takes no parameter after receiving a true from this overload.

Now there is no way to retrieve the data in the stdout/stderr without a concurrency issue that would cause data to be lost in case of a race condition.

Other information

This bug happens on Windows but by looking at the *nix implementation of WaitForExitCore, it may be affected too (I have no idea how the lifecycle of the pipes are supposed to work on linux/windows, so it's probably wrong)

Edit: I tested on WSL with ubuntu and it doesn't have a problem, so it's a Windows only problem.

@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.Diagnostics.Process untriaged New issue has not been triaged by the area owner labels Apr 14, 2021
@ghost
Copy link

ghost commented Apr 14, 2021

Tagging subscribers to this area: @carlossanlop
See info in area-owners.md if you want to be subscribed.

Issue Details

Description

This bug was first reported in the msbuild repo but I believe it's a bug in the Process class itself and not in msbuild.
The issue is that WaitForExit() will synchronously wait for the stderr/stdout to complete, the summoned process has exited, but it has childs process still running (the nodereuse processes in msbuild case).
In this case, it looks like that the stderr/stdout pipes wont close.
So we are stuck in the waits there:
https://github.com/dotnet/runtime/blob/main/src/libraries/System.Diagnostics.Process/src/System/Diagnostics/Process.Windows.cs#L186

We hit this code path because the timeout is Timeout.Infinite.
There is hope for a beginning of a workaround:
We can avoid the deadlock if we call WaitForExit(int milliseconds).
But there is an issue in this case

When standard output has been redirected to asynchronous event handlers, it is possible that output processing will not have completed when this method returns. To ensure that asynchronous event handling has been completed, call the WaitForExit() overload that takes no parameter after receiving a true from this overload.

Now there is no way to retrieve the data in the stdout/stderr without a concurrency issue that would cause data to be lost in case of a race condition.

Other information

This bug happens on Windows but by looking at the *nix implementation of WaitForExitCore, it may be affected too (I have no idea how the lifecycle of the pipes are supposed to work on linux/windows, so it's probably wrong)

Author: Kuinox
Assignees: -
Labels:

area-System.Diagnostics.Process, untriaged

Milestone: -

@jeffhandley jeffhandley added this to the Future milestone Jul 13, 2021
@jeffhandley jeffhandley added needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration and removed untriaged New issue has not been triaged by the area owner labels Jul 13, 2021
@Kuinox
Copy link
Contributor Author

Kuinox commented Sep 9, 2021

I wrote a workaround thats allow to retrieve the stdout/stderr of a process that may cause a deadlock like this.
Only time will tell if it really fixed the deadlock without dataloss (I don't know what will happen if we receive a lot of data when the program exit , and the CTS of the AsyncStreamReader is triggered.).

process.WaitForExit( int.MaxValue ); //First we need for the program to exit.
// Here the program exited, but background may still pump messages
ReflectionHack( process ); // This will cancel the reads in the pipes background loop.
process.WaitForExit(); // This allow to wait for the 2 pipes async to finish looping and flushing the last messages.
// Here you shouldn't receive any message.

/// <summary>
/// This method shut down things in <see cref="Process"/>. Call it when you know that the process has exited.
/// Sometimes the <see cref="Process"/> class deadlock itself.
/// See this for more info: https://github.com/dotnet/runtime/issues/51277
/// </summary>
/// <param name="process"></param>
static void ReflectionHack( Process process )
{
    const BindingFlags bindingFlags = BindingFlags.NonPublic | BindingFlags.Instance;
    FieldInfo? outputField = typeof( Process ).GetField( "_output", bindingFlags );
    FieldInfo? errorField = typeof( Process ).GetField( "_error", bindingFlags );
    ((IDisposable)outputField!.GetValue( process )!).Dispose();
    ((IDisposable)errorField!.GetValue( process )!).Dispose();
}

Sadly the cleanest way I found to work around this issue is with this little reflection hack.

Edit: I now think that this reflection hack may lead to data loss at the end of the process.

@Kuinox
Copy link
Contributor Author

Kuinox commented Dec 5, 2021

Hello,
By lookin up for the code of the new WaitForExitAsync, I think it will be prone to the same issue.
Can this issue be looked up ?

@Kuinox
Copy link
Contributor Author

Kuinox commented Dec 7, 2021

I made a reproduction here:
https://github.com/Kuinox/DotnetProcessBugRepro (Run ProcessStarter)

@Kuinox
Copy link
Contributor Author

Kuinox commented Dec 17, 2021

@Evengard
Copy link

#79817
Did that solve the issue I wonder?

@Kuinox
Copy link
Contributor Author

Kuinox commented Jan 25, 2024

#79817 Did that solve the issue I wonder?

I just tested on WSL-ubuntu and it doesn't have the problem at all, so unix was just unaffected.
The patch is also Unix specific so I don't think it solved the issue.
I also upgraded my reproduction repo from 5.0 to 8.0 and it still deadlock itself in 8.0
https://github.com/Kuinox/DotnetProcessBugRepro

@MichalPavlik
Copy link
Member

MichalPavlik commented Jul 19, 2024

Could you please investigate this issue as it seems to be impacting our team (MSBuild)? Your attention to this matter would be greatly appreciated.

cc:@baronfel

@Kuinox
Copy link
Contributor Author

Kuinox commented Sep 16, 2024

@MichalPavlik hey, this issue is a bit in a limbo state due to the "need-further-triage" tag, none in the runtime team looked at this issue for the past 3 years.

@MichalPavlik
Copy link
Member

@Kuinox we are still trying to resurrect this issue :)
@baronfel was you able to find someone who would be willing to take a second look?

@philasmar
Copy link

I am running into this issue as well! Any updates?

@YuliiaKovalova
Copy link
Member

it seems to be closed by design: #29232 (comment)

@Kuinox
Copy link
Contributor Author

Kuinox commented Jan 17, 2025

The problem is that there is a simple use case and it looks like there is no solution to it:
I need to summon a process, like dotnet build, get it's full output without it being truncated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-System.Diagnostics.Process needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration
Projects
None yet
Development

No branches or pull requests

6 participants