-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enforce scatter/gather file I/O Windows API requirements et. al. #57424
Enforce scatter/gather file I/O Windows API requirements et. al. #57424
Conversation
And use pinned GCHandles and IntPtrs instead of MemoryHandles when passing the segment array to the bottom-most method.
Tagging subscribers to this area: @dotnet/area-system-io Issue DetailsThe Windows APIs that perform scatter/gather file I/O have more usage restrictions than their Unix counterparts. Besides requiring the file handle to be opened with buffering disabled, they also require each buffer segment to be aligned at page size boundaries, and will read/write one page from/to each segment. Until now, the gather/scatter runtime/src/libraries/System.Private.CoreLib/src/System/IO/RandomAccess.Windows.cs Line 468 in 0ebc7ec
Memory<byte>.Length property at all, which is what raised my suspicions and led us here.
This PR refactors it to additionally check whether each buffer segment is aligned and is exactly as long as a page. The previous check that the total buffer size fits in 32 bits is consolidated with the other two, in a common function for both scatter and gather, that also got the responsibility to pin the segments while checking their alignment (to prevent the GC from potentially moving them around in between). If these requirements are found to be violated, the buffers are unpinned and the operation will be performed with multiple read/write syscalls (like before when the file handle was opened synchronously or with buffering enabled). Furthermore the
|
src/libraries/System.Private.CoreLib/src/System/IO/RandomAccess.Windows.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/IO/RandomAccess.Windows.cs
Outdated
Show resolved
Hide resolved
…ddrOfPinnedObject.
src/libraries/System.Private.CoreLib/src/System/IO/RandomAccess.Windows.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/IO/RandomAccess.Windows.cs
Outdated
Show resolved
Hide resolved
330449f
to
e1b94a9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please provide a single unit test, where you create a buffer that does not meet ReadFileScatter|WriteFileGather
size and alignment requirements, but meets ReadFile|WriteFile
size and alignment requirements for file handles opened with NO_BUFFERING
.
I added the test @adamsitnik. I also noticed in lines like https://github.com/dotnet/runtime/blob/bae5e4ec0ebbc93542779800a2c5a64e0c8dc620/src/libraries/System.IO.FileSystem/tests/RandomAccess/NoBuffering.Windows.cs#L147 that we always open the handle with |
src/libraries/System.IO.FileSystem/tests/RandomAccess/NoBuffering.Windows.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/IO/RandomAccess.Windows.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/IO/RandomAccess.Windows.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/IO/RandomAccess.Windows.cs
Outdated
Show resolved
Hide resolved
|
||
MemoryHandle[] memoryHandles = new MemoryHandle[buffersCount]; | ||
MemoryHandle pinnedSegments = fileSegments.AsMemory().Pin(); | ||
long* segmentsArray = (long*) NativeMemory.Alloc((nuint)(buffersCount + 1), sizeof(long)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this change needs to be benchmarked as I would expect managed allocator to be faster than native for small arrays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was @jkotas' idea. Allocating a managed array has less overhead, but it also has to be pinned for the duration of the async operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@teo-tsirpanis could you please benchmark your changes? The easiest way would be to contribute some new RandomAccess
type benchmarks to dotnet/performance repo similar to FileStream benchmarks and run them using two CoreRun.exe: with and without your changes https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md#dotnet-runtime-prerequisites
dotnet run -c Release -f net6.0 --filter *RandomAccess* --corerun $pathToCoreRunWithoutYourChanges $pathToCoreRunWithYourChanges
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried but my machine ran out of space. 😕
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried but my machine ran out of space
@teo-tsirpanis could you please share the benchmark source code? I am would be very happy to help
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@teo-tsirpanis I've implemented some benchmarks (dotnet/performance#1967) and run them against your fork.
This is what is going to be required to run them once dotnet/performance#1967 is merged (I am sharing that so you can run some benchmarks in the future):
git clone https://github.com/teo-tsirpanis/dotnet-runtime.git runtime
cd .\runtime\
.\build.cmd -c Release -subset clr+libs
robocopy .\artifacts\bin\testhost\net6.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\ .\artifacts\bin\testhost\net6.0-windows-Release-x64\shared\Microsoft.NETCore.App\before /E
git checkout windows-vectored-io-refactor
taskkill /IM "dotnet.exe" /F
.\build.cmd -c Release -subset clr.corelib+clr.nativecorelib+libs.PreTest
robocopy .\artifacts\bin\testhost\net6.0-windows-Release-x64\shared\Microsoft.NETCore.App\7.0.0\ .\artifacts\bin\testhost\net6.0-windows-Release-x64\shared\Microsoft.NETCore.App\after /E
cd ..
git clone https://github.com/dotnet/performance.git performance
py .\performance\scripts\benchmarks_ci.py -f net6.0 --filter *Perf_RandomAccess_NoBuffering* --corerun .\runtime\artifacts\bin\testhost\net6.0-windows-Release-x64\shared\Microsoft.NETCore.App\before\corerun.exe .\runtime\artifacts\bin\testhost\net6.0-windows-Release-x64\shared\Microsoft.NETCore.App\after\corerun.exe
src/libraries/System.Private.CoreLib/src/System/IO/RandomAccess.Windows.cs
Show resolved
Hide resolved
src/libraries/System.IO.FileSystem/tests/RandomAccess/NoBuffering.Windows.cs
Show resolved
Hide resolved
In this particular file where we test
FILE_FLAG_OVERLAPPED == But we should most probably extend the tests and cover all possible scenarios:
|
@teo-tsirpanis I've sent #57717 to address that |
bae5e4e
to
749b6c7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code LGTM, the perf also. @teo-tsirpanis once again thank you!
/backport to release/6.0 |
Started backporting to release/6.0: https://github.com/dotnet/runtime/actions/runs/1185379030 |
The Windows APIs that perform scatter/gather file I/O have more usage restrictions than their Unix counterparts. Besides requiring the file handle to be opened with buffering disabled, they also require each buffer segment to be aligned at page size boundaries, and will read/write one page from/to each segment.
Until now, the gather/scatter
RandomAccess
API Windows implementation only enforces the first requirement, allowing buffers containing segments without the proper alignment or length to be passed to the APIs, leading to potential undefined behavior like buffer overflows or underflows. For example, code aroundruntime/src/libraries/System.Private.CoreLib/src/System/IO/RandomAccess.Windows.cs
Line 468 in 0ebc7ec
Memory<byte>.Length
property at all, which is what raised my suspicions and led us here.This PR refactors it to additionally check whether each buffer segment is aligned and is exactly as long as a page. The previous check that the total buffer size fits in 32 bits is consolidated with the other two, in a common function for both scatter and gather, that also got the responsibility to pin the segments while checking their alignment (to prevent the GC from potentially moving them around in between). If these requirements are found to be violated, the buffers are unpinned and the operation will be performed with multiple read/write syscalls (like before when the file handle was opened synchronously or with buffering enabled).
Furthermore the
RandomAccess
APIs on Windows now special-case buffers with only one or zero (new in this PR) segments earlier in the call chain, and there were some other smaller changes.