-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove forced serialization of async-over-sync in Stream base methods #53389
Conversation
Tagging subscribers to this area: @carlossanlop Issue DetailsFixes #45089 The base implementation of Stream.BeginRead/Write queue a work items that invoke the abstract Read/Write methods. When Stream.BeginRead/Write were introduced long ago, for reasons I’m not privy to, someone decided it would be a good idea to add protection to these methods, such that if you try to call either BeginRead or BeginWrite while a previous BeginRead or BeginWrite operation was still in flight, the synchronous call to BeginXx would synchronously block. Yuck. Back in .NET Framework 4.5 when we added Stream.Read/WriteAsync, we had to add the base implementations as wrappers for the BeginRead/Write methods, since Read/WriteAsync should pick up the overrides of those methods if they existed. The idea of propagating that synchronous blocking behavior to Read/WriteAsync was unstomachable, but for compatibility we made it so that Read/WriteAsync would still serialize, just asynchronously (later we added a fast path optimization that would skip BeginRead/Write entirely if they weren’t overridden by the derived type). That serialization, however, even though it was asynchronous, was also misguided. In addition to adding overhead, both in terms of needing a semaphore and somewhere to store it and in terms of using that semaphore for every operation, it prevents the concurrent use of read and write. In general, streams aren’t meant to be used concurrently at all, but some streams are duplex and support up to a single reader and single writer at a time. This serialization ends up blocking such duplex streams from being used (if they don’t override Read/WriteAsync), but worse, it ends up hiding misuse of streams that shouldn’t be used concurrently by masking the misuse and turning it into behavior that might appear to work but is unlikely to actually be the desired behavior. This PR deletes that serialization and then cleans up all the cruft that was built up around it. This is a breaking change, as it’s possible code could have been relying on this (undocumented) protection; the fix for such an app is to stop doing that erroneous concurrent access, which could include applying its own serialization if necessary. BufferedStream was explicitly using the same serialization mechanism; I left that intact, though we might want to revisit that. BufferedFileStreamStrategy was also using it, as FileStream kinda sorta somewhat tries to enable concurrent (not parallel) usage when useAsync == true on Windows. cc: @jkotas, @geoffkizer, @adamsitnik, @marklio
|
The base implementation of Stream.BeginRead/Write queue a work items that invoke the abstract Read/Write methods. When Stream.BeginRead/Write were introduced long ago, for reasons I’m not privy to, someone decided it would be a good idea to add protection to these methods, such that if you try to call either BeginRead or BeginWrite while a previous BeginRead or BeginWrite operation was still in flight, the synchronous call to BeginXx would synchronously block. Yuck. Back in .NET Framework 4.5 when we added Stream.Read/WriteAsync, we had to add the base implementations as wrappers for the BeginRead/Write methods, since Read/WriteAsync should pick up the overrides of those methods if they existed. The idea of propagating that synchronous blocking behavior to Read/WriteAsync was unstomachable, but for compatibility we made it so that Read/WriteAsync would still serialize, just asynchronously (later we added a fast path optimization that would skip BeginRead/Write entirely if they weren’t overridden by the derived type). That serialization, however, even though it was asynchronous, was also misguided. In addition to adding overhead, both in terms of needing a semaphore and somewhere to store it and in terms of using that semaphore for every operation, it prevents the concurrent use of read and write. In general, streams aren’t meant to be used concurrently at all, but some streams are duplex and support up to a single reader and single writer at a time. This serialization ends up blocking such duplex streams from being used (if they don’t override Read/WriteAsync), but worse, it ends up hiding misuse of streams that shouldn’t be used concurrently by masking the misuse and turning it into behavior that might appear to work but is unlikely to actually be the desired behavior. This PR deletes that serialization and then cleans up all the cruft that was built up around it. This is a breaking change, as it’s possible code could have been relying on this (undocumented) protection; the fix for such an app is to stop doing that erroneous concurrent access, which could include applying its own serialization if necessary. BufferedStream was explicitly using the same serialization mechanism; I left that intact. BufferedFileStreamStrategy was also using it, as FileStream kinda sorta somewhat tries to enable concurrent (not parallel) usage when useAsync == true on Windows.
ec40755
to
bca4acd
Compare
I haven't yet wrapped my head around the changes yet. If someone is broken by this change, how easy is it for them to recognize there is an issue, diagnose the break, and take action to fix their problematic code (assuming they own the code in question)? |
It would mean they're issuing concurrent Read/WriteAsync calls on a Stream that wasn't intended to be used concurrently and that doesn't override Read/WriteAsync to provide a better implementation. It would likely manifest as exceptions (e.g. argument out of range) or corrupted data, since in that situation it's effectively removing a lock that was causing operations to be serialized. Fixing the code would entail not using the Stream in that problematic manner, which could simply mean using their own semaphore to provide that same serialization if really desired. |
I think the possibility of corruption is the piece that worries me here the most, particularly in cases where concurrent use is probabilistic. I'd rather have an exception/crash than data corruption. It seems completely possible for someone to run validation and not catch corruption. How possible is it for us to deliver a reliable, diagnostic exception for problematic concurrent use? |
cc @dotnet/compat |
It's not while also achieving the goals of the removal. Doing that would require tracking the in-flight usage, so we wouldn't be to remove all fields. And we have no way of knowing whether concurrent usage is acceptable or not, so we'd be throwing an exception even if it's something that should be allowed, e.g. for a duplex stream that supports a reader and writer concurrently. At that point, there's little benefit in taking a change here at all. |
if (useAsync) | ||
|
||
// To ensure that the buffer of a FileStream opened for async IO is flushed | ||
// by FlushAsync in asynchronous way, we aquire a lock for every buffered WriteAsync. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: "aquire" => "acquire"
I'm going to close this for .NET 6. I'd like to revisit for .NET 7 when we have more runway. |
Is this something we should consider for 10 @stephentoub ? It seems like the runway is as long as it gets.... |
Yeah, I think that'd be reasonable. |
Fixes #45089
The base implementation of Stream.BeginRead/Write queue a work items that invoke the abstract Read/Write methods. When Stream.BeginRead/Write were introduced long ago, for reasons I’m not privy to, someone decided it would be a good idea to add protection to these methods, such that if you try to call either BeginRead or BeginWrite while a previous BeginRead or BeginWrite operation was still in flight, the synchronous call to BeginXx would synchronously block. Yuck. Back in .NET Framework 4.5 when we added Stream.Read/WriteAsync, we had to add the base implementations as wrappers for the BeginRead/Write methods, since Read/WriteAsync should pick up the overrides of those methods if they existed. The idea of propagating that synchronous blocking behavior to Read/WriteAsync was unstomachable, but for compatibility we made it so that Read/WriteAsync would still serialize, just asynchronously (later we added a fast path optimization that would skip BeginRead/Write entirely if they weren’t overridden by the derived type). That serialization, however, even though it was asynchronous, was also misguided. In addition to adding overhead, both in terms of needing a semaphore and somewhere to store it and in terms of using that semaphore for every operation, it prevents the concurrent use of read and write. In general, streams aren’t meant to be used concurrently at all, but some streams are duplex and support up to a single reader and single writer at a time. This serialization ends up blocking such duplex streams from being used (if they don’t override Read/WriteAsync), but worse, it ends up hiding misuse of streams that shouldn’t be used concurrently by masking the misuse and turning it into behavior that might appear to work but is unlikely to actually be the desired behavior.
This PR deletes that serialization and then cleans up all the cruft that was built up around it. This is a breaking change, as it’s possible code could have been relying on this (undocumented) protection; the fix for such an app is to stop doing that erroneous concurrent access, which could include applying its own serialization if necessary.
BufferedStream was explicitly using the same serialization mechanism; I left that intact, though we might want to revisit that. BufferedFileStreamStrategy was also using it, as FileStream kinda sorta somewhat tries to enable concurrent (not parallel) usage when useAsync == true on Windows.
cc: @jkotas, @geoffkizer, @adamsitnik, @marklio