-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify MainNodeSdkResolverService #7244
Simplify MainNodeSdkResolverService #7244
Conversation
I saw your commit is called WIP, but this isn't a draft, and it doesn't have the WIP label. Do you want reviews? |
@Forgind Yeah sorry my original commit was a work in progress and I ran the tests locally with some failures. So I pushed the commit and remembered that some tests just don't pass locally so I opened the PR and forgot my commit name was WIP. Should I update the commit message or will we just update it if/when this is merged? |
Not a problem; we can update it when it's merged. I just wasn't sure whether you'd wanted it reviewed 🙂 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this still react properly to shutdown events? I didn't see anything about that. Otherwise, looks great! Love taking out excess complication.
Good question, I think the packets will just stop coming in but existing requests will still be running. Let me see if I can simulate a shutdown an see what actually happens. If it doesn't respond well, I'll add cancellation support to the |
src/Build/BackEnd/Components/SdkResolution/MainNodeSdkResolverService.cs
Outdated
Show resolved
Hide resolved
I looked into this and shutdown events aren't really applicable anymore since the tasks that respond are fire-and-forget and there's nothing to shutdown. Cancellation doesn't work, but it never has before since cancellation tokens aren't passed into evaluator. |
src/Build/BackEnd/Components/SdkResolution/MainNodeSdkResolverService.cs
Outdated
Show resolved
Hide resolved
_requestReceivedEvent.Reset(); | ||
nodeManager.SendData(request.NodeId, response); | ||
} | ||
}).ConfigureAwait(continueOnCapturedContext: false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Does continueOnCapturedContext: false
make a difference when the task has no continuation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The more I look into it, it seems like this method is already running on a thread pool thread. The communication receives packets and runs these with a worker queue. Maybe I don't even need to use Task.Run()
at all. Let me look into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Latest version doesn't call Task.Run()
this this method is already running in a thread pool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, it is running in the thread pool but it doesn't mean that it can be synchronous. Here's the callstack I see with my test build:
Microsoft.Build.dll!Microsoft.Build.BackEnd.SdkResolution.MainNodeSdkResolverService.PacketReceived(int node, Microsoft.Build.BackEnd.INodePacket packet) Line 81 C#
Microsoft.Build.dll!Microsoft.Build.BackEnd.NodePacketFactory.PacketFactoryRecord.RoutePacket(int nodeId, Microsoft.Build.BackEnd.INodePacket packet) Line 110 C#
Microsoft.Build.dll!Microsoft.Build.BackEnd.NodePacketFactory.PacketFactoryRecord.DeserializeAndRoutePacket(int nodeId, Microsoft.Build.BackEnd.ITranslator translator) Line 102 C#
Microsoft.Build.dll!Microsoft.Build.BackEnd.NodePacketFactory.DeserializeAndRoutePacket(int nodeId, Microsoft.Build.BackEnd.NodePacketType packetType, Microsoft.Build.BackEnd.ITranslator translator) Line 58 C#
Microsoft.Build.dll!Microsoft.Build.BackEnd.NodeManager.DeserializeAndRoutePacket(int nodeId, Microsoft.Build.BackEnd.NodePacketType packetType, Microsoft.Build.BackEnd.ITranslator translator) Line 261 C#
Microsoft.Build.dll!Microsoft.Build.BackEnd.NodeProviderOutOfProcBase.NodeContext.ReadAndRoutePacket(Microsoft.Build.BackEnd.NodePacketType packetType, byte[] packetData, int packetLength) Line 1090 C#
> Microsoft.Build.dll!Microsoft.Build.BackEnd.NodeProviderOutOfProcBase.NodeContext.BodyReadComplete(System.IAsyncResult result) Line 1144 C#
The frame at the bottom has:
// Read the next packet.
BeginAsyncPacketRead();
as the next statement to execute. Can we really afford to block processing of further packets from the node until the SDK is resolved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was looked at it. With current state of code we can afford to block packet processing until SDK is resolved. Nodes are single threaded, and node is blocked and wait for response while it sends request to main node for SDK resolution. So it is guaranteed, if I understand it correctly, that there is actually no other packet to be processed - on that particular namedpipe.
Also, how do you assess risk here? Would it be possible to preserve the old behavior under a changewave Just In Case™️? |
I don't think there's much risk here. From what I can tell, fundamental flaws in this code would almost immediately show up for any build that's using out-of-proc nodes. There could be a performance impact but I expect things to be faster since the new code should be launching less threads. I think we're early enough in the release cycle to catch any big problems. I do want to try an experimental insertion and install the build just to test out the perf. My local builds aren't NGen'd and are always slower... |
ff1aab1
to
ab20718
Compare
@rainersigwald can I get an experimental insertion to install and try out please? |
(Resolved offline) |
Results are in, everything works! I cloned https://github.com/msbuild/msbuildsdks which uses a lot of MSBuild project SDKs to test Current version (17.0)
My Version
So my change doesn't make anything slower. I'd imagine I'm also benefiting from other unshipped performance gains so I won't take any credit there. But nothing is broken! |
Also forgot to mention, builds in Visual Studio work just fine as well! |
Very optional, but you can try downloading a recent main build (ideally from right before your change, but eh) to see if it changes anything, I wouldn't be surprised if that was entirely your change! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job. If I understand code correctly, this changes shall not cause regression in MSBuild command line builds. In VS I recommend to measure and compare at some high CPU cores count machine and large projects.
src/Build/BackEnd/Components/SdkResolution/MainNodeSdkResolverService.cs
Outdated
Show resolved
Hide resolved
0431f45
to
7ba2fe9
Compare
@rainersigwald anything we're waiting on for this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still a bit worried about not having a way to opt back into the old behavior, but if you're super confident that this is better in all cases let's get it in for preview 2.
I did a lot of testing and couldn't find any cases where it was causing issues. I loaded a large solution in Visual Studio, did a large command-line build, I added delays to a test SDK resolver, and DDRITs passed. The sooner this goes in the better for sure, I'm really hoping we don't need an escape hatch since this code path is used so much, we should find if there are any problems with it very soon. |
Fixes #7137
Context
The MainNodeSdkResolverService process incoming requests from out-of-proc nodes to handle SDK resolution. The main node has a queue of all incoming packets that are processed by appropriate handlers. The MainNodeSdkResolverService currently takes SdkResolverRequests and places them in a queue for processing. This seems unnecessary to have two queues and two threads draining the queues.
Changes Made
Remove queue and thread that process it in favor of launching a fire-and-forget task that resolves the SDK and sends the response.
Testing
Existing tests cover this area
Notes