Service with HTTPS endpoint and response compression locks up serving large files #2067
Comments
Does it happen when you change:
|
Our product's client is locked in to .NET and the server to .NET Core, so I'm not sure chasing different test conditions is practical in this case. FTR, the test case is already attached with this submission. |
@sergey-b-berezin chasing down where the bug is, however, is useful for routing to the right team - is it client side or server side? The "tests" above will answer that. |
Our client used to use HttpClient, but once problems cropped up we switched to the "older" WebClient, hoping it'd help. It didn't. Everything I tested points at the .NET Core server being the issue, but I've been wrong before. Sorry, but I don't have the bandwidth to iron out all the test permutations you mentioned. That's why I put together an isolated test case that demonstrates a specific problem, which the other tests you suggested may not. I've got to ask, is this the right project to submit this issue to? |
If you suspect the server, can you please at least try different client? It should be simple. |
Tried it with HttpClient last night with the same outcome. I built a test client similar to the .NET 4.6.1 one included in the test submission, but built as a .NET Core console app instead, with the same logic using HttpClient. So, just as I suspected, everything points at the server side. |
What's the next step? Thanks. |
Taking a look now... (Was on vacation last week, sorry for the delay.) |
Understood. Thanks. It's a delicate balance between annoying someone and asking for help. :) |
OK this is possibly a duplicate of aspnet/BasicMiddleware#247, which we plan to fix in ASP.NET Core 3.0. If you can get a dump of the ASP.NET Core process while it's hung, then you can open up the dump in VS and see where the thread stacks are. If the stacks look like the ones in aspnet/BasicMiddleware#247, then it is the same issue. If the stacks are different, we will investigate further. @sergey-m-pega - can you take a look at the threads to see if it's the same issue? |
Tagging @Tratcher here. |
Will try to collect a dump. Assuming it's the same issue, are there any workarounds for it in the meantime, short of disabling response compression entirely? Thanks. |
@Tratcher - what workarounds do you know of? |
Let's confirm the stack traces before getting deep into the mitigations. |
Is there a post that describes how to collect it so that it's in the format you expect? Thanks. |
Using task manager is the easiest, if you have direct access to the machine. |
Got it. Collected 2 dumps, 1 for dotnet.exe and 1 for test client executable via procdump.exe -ma <XXX.exe> <XXX.dmp> Let me know if you'd like me to re-run it with different parameters. Those dump files even zipped are quite large. Where would you suggest posting them for you? Thanks. |
You can open them yourself in Visual Studio and look at the Threads window to see what callstacks are blocked. Do you see any that resemble those at aspnet/BasicMiddleware#247? |
I'm not sure how to get that stack trace from the dump file, tbh, as I'm not seeing it there. I've tried loading the dump file in VS and selecting "Debug with Mixed" to see just a regular stack trace; it's not even remotely similar to the structure of the above referenced issue. |
What stack traces do you see in the dotnet.exe dump then? |
ntdll.dll!00007ffbbe38aa54() |
That's the Main thread, it's expected to be blocked. What about the others? |
Right, that's why I said earlier that it looks nothing like the other issue's stack trace. Dump file handling is something I'm not familiar with well enough, so I'm not sure how to get the other threads' info for you. Please advise. Thanks. |
It's under Debug -> Windows -> Threads. Look under the Location column to see the stack trace for each thread. |
Yeah, I totally forgot that I could look at a thread's stack trace there. Never mind... What I was saying earlier was largely related to having only the dump file on hand and getting a stack trace from that, as opposed to attaching to a running process. In any case, I suspect one of the ".NET Core ThreadPool" threads is the one you are interested in:
|
Yes, that's the issue...
How many threads did you have stuck in a callstack like that? It shouldn't be completely frozen, just slow while it waits for the client to drain data. If you have more threads stuck than the threadpool limit then you might have a problem. Raising the threadpool limit can help mitigate this but not eliminate it. |
The above stack trace you pointed out looks similar to the one from the other issue. It was the only one like that that I could see. It gets completely stuck with no CPU usage on the box, and the client side eventually times out. Yeah, I did read about raising the thread pool thread limit, but if it doesn't fix it permanently we just can't take a chance on it locking up in production. |
Back to my earlier question, please: do any workarounds exist until Core 3.0 is released? I assume the (future) patch for it is not in 2.2. Thanks. |
Raising the threadpool limit is the first thing to try. |
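For reference, a minimal sketch of that mitigation, assuming the service raises the pool minimums at startup (the class and method names below are illustrative, not from the attached repro, and the numbers are placeholders to tune per workload):

```csharp
using System;
using System.Threading;

static class ThreadPoolTuning
{
    // Raise the minimum worker-thread count so threads blocked draining
    // compressed responses are less likely to starve the pool.
    // This mitigates the hang but does not eliminate it.
    public static void RaiseMinWorkerThreads(int minWorkers)
    {
        ThreadPool.GetMinThreads(out int workers, out int ioThreads);
        ThreadPool.SetMinThreads(Math.Max(workers, minWorkers), ioThreads);
    }
}
```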
If raising that limit were guaranteed to fix it then I'd go for it. The last thing we want is to run into occasional production lock-ups, which I suspect could still happen since it depends on the server's load. So that leaves us with the same option we've already implemented - disabling response compression entirely. That's far from an ideal solution, though. |
The only guaranteed fix requires API and language changes in 3.0 (IAsyncDisposable). Your case is certainly the most severe we've seen, likely because HTTPS compression is off by default and most people use IIS's compression module instead. |
We'll wait until 3.0 to revisit it then. Thanks for your help. |
Closing as there is nothing actionable left here. Let me know if I overlooked something. |
For anybody still using ASP.NET Core 2.2 and hitting this issue, I want to note that it can be worked around by writing another middleware (e.g. ResponseCompressionFlusherMiddleware) that executes after ResponseCompressionMiddleware and calls FlushAsync on the BodyWrapperStream before ResponseCompressionMiddleware calls Dispose(). The new middleware's invoke method should look like the sketch below:
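The original snippet was not captured in this thread; the following is a hedged sketch of what such a middleware could look like, based on the description above (the class name and registration are assumptions, not the poster's exact code):

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;

public class ResponseCompressionFlusherMiddleware
{
    private readonly RequestDelegate _next;

    public ResponseCompressionFlusherMiddleware(RequestDelegate next) => _next = next;

    public async Task Invoke(HttpContext context)
    {
        // Let the rest of the pipeline write the (compressed) response first.
        await _next(context);

        // Flush the compression wrapper asynchronously so its buffered data is
        // written out before ResponseCompressionMiddleware disposes it
        // synchronously on the way back up the pipeline.
        await context.Response.Body.FlushAsync();
    }
}

// Registered after UseResponseCompression so its post-await code runs
// before the compression middleware's Dispose:
//   app.UseResponseCompression();
//   app.UseMiddleware<ResponseCompressionFlusherMiddleware>();
```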
|
Issue Title
Service with HTTPS endpoint and response compression locks up serving large files.
General
Just signed up with GitHub so that I could report the issue. Luckily I was able to isolate it to a minimal repro solution that mimics our application setup.
The self-hosted test service uses .NET Core v2.1.6 on Windows 10 x64 (and VS 2017 v15.9.1). The test client is a .NET 4.6.1 console application that uses WebClient to communicate with the service.
The service serves files on a configured HTTPS endpoint. Response compression is on for HTTP and HTTPS. We respond to the client with the "application/octet-stream" content type, the same content type we have response compression configured for, which is validated via Postman.
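For context, a minimal sketch of a compression setup matching that description, using the standard ASP.NET Core 2.x APIs (this is an assumption of the configuration, not the code from the attached repro):

```csharp
using System.Linq;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.ResponseCompression;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        services.AddResponseCompression(options =>
        {
            // Compression over HTTPS is off by default and must be opted into.
            options.EnableForHttps = true;

            // Also compress the content type the files are served with.
            options.MimeTypes = ResponseCompressionDefaults.MimeTypes
                .Concat(new[] { "application/octet-stream" });
        });
    }

    public void Configure(IApplicationBuilder app)
    {
        app.UseResponseCompression();
        // ...endpoint that streams the requested file as application/octet-stream...
    }
}
```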
After serving a few large files, the client locks up. It happens more often than not if the files served are large enough - a few hundred KB or over 1 MB. Eventually the client errors out with a WebException: "operation has timed out". Sometimes the service shows a "Connection id "??????????", Request id "?????????:??????????": the connection was closed because the response was not read by the client at the specified minimum data rate" message after a while, but not all the time.
The odd part is that the lock-up always happens after serving a few larger files. I could never reproduce it serving small files, no matter how many I tried (for instance, I ran the test against an existing source code base, processing .cs files). Additionally, lock-ups happen with the HTTPS endpoint only. If I switch to the HTTP endpoint then it always works fine.
To reproduce the issue, build the attached solution. Run the service, then run the console app. The test console application gets a list of all files in the "bin" folder and asks the service for each file's contents. It then saves the contents to "bin\test", effectively copying the file. You can run the test against the HTTP or HTTPS endpoint. Hit Enter to proceed with the HTTPS endpoint and wait for it to lock up. If it doesn't happen, run the same console app again. Eventually it does lock up.
At this point it's a blocking issue for us. Between choosing the HTTPS endpoint or turning response compression off, we have to let response compression go, which is a shame. So, I'd appreciate any feedback. Thanks!
ResponseCompressionTest.zip