Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove C FILE PAL wrappers #98238

Merged
merged 8 commits into from
Feb 15, 2024
Merged

Remove C FILE PAL wrappers #98238

merged 8 commits into from
Feb 15, 2024

Conversation

jkoritzinsky
Copy link
Member

Remove the PAL wrappers around the C FILE APIs and update locations that were using the existing features.

The existing PAL layer provides the following features:

  • Explicit "Text" mode:
    • Normalize line endings on read from CRLF to LF
    • Specify 't' as part of the mode parameter to fopen
    • This was used by ildasm for writing. Writing with text mode in the PAL does nothing (as the line endings are already normalized). Replaced with an #ifdef to only use explicit text mode on Windows.
  • Implicit "Text" mode:
    • Normalize line endings on read from CRLF to LF
    • This was used across the product, but only used in two places that might read files with CRLF-style line endings.
      • Text-based PGO.
        • This format is a non-shipping format that is used for debugging/diagnostics purposes.
        • Text editors on Windows now support LF-based line endings, so I updated this to use the explicit non-text mode so files produced in the future can be read on all platforms (like the existing behavior).
      • ildasm's handling for showing source-line information
        • ildasm explicitly checks for CRLF line endings even when using the PAL, so we don't need the PAL to handle this.
  • EINTR errno results
    • EINTR errors can only occur in the syscalls used by the C file APIs when a syscall is interrupted by a signal handler not marked with SA_RESTART. All signal handlers CoreCLR installs are marked with SA_RESTART.
  • Clear ferror errors with clearerr before each operation.
    • I couldn't really validate that we don't depend on this behavior today, but given the minimal usage of these APIs in the runtime (mostly reading files in /sys and /proc), I'm not particularly worried.
    • Most file-manipulation scenarios cases in the CoreCLR code-base use the Win32 PAL (CreateFile and family) instead of the C PAL.

@ryujit-bot
Copy link

Diff results for #98238

Throughput diffs

Throughput diffs for linux/arm64 ran on windows/x64

MinOpts (-0.01% to +0.00%)
Collection PDIFF
libraries.pmi.linux.arm64.checked.mch -0.01%

Details here


@ryujit-bot
Copy link

Diff results for #98238

Throughput diffs

Throughput diffs for linux/arm64 ran on linux/x64

MinOpts (-0.00% to +0.01%)
Collection PDIFF
libraries.pmi.linux.arm64.checked.mch +0.01%

Details here


@ryujit-bot
Copy link

Diff results for #98238

Throughput diffs

Throughput diffs for linux/arm64 ran on windows/x64

MinOpts (-0.00% to +0.01%)
Collection PDIFF
libraries.pmi.linux.arm64.checked.mch +0.01%

Details here


@ryujit-bot
Copy link

Diff results for #98238

Throughput diffs

Throughput diffs for linux/arm64 ran on windows/x64

MinOpts (-0.00% to +0.01%)
Collection PDIFF
libraries.pmi.linux.arm64.checked.mch +0.01%

Details here


@ryujit-bot
Copy link

Diff results for #98238

Throughput diffs

Throughput diffs for osx/arm64 ran on windows/x64

MinOpts (-0.00% to +0.01%)
Collection PDIFF
libraries.pmi.osx.arm64.checked.mch +0.01%

Details here


@jkoritzinsky jkoritzinsky marked this pull request as ready for review February 10, 2024 20:45
@janvorli
Copy link
Member

EINTR errors can only occur in the syscalls used by the C file APIs when a syscall is interrupted by a signal handler not marked with SA_RESTART. All signal handlers CoreCLR installs are marked with SA_RESTART.

How about signal handlers installed by user code? E.g. custom host or 3rd party libraries?

src/coreclr/gc/unix/gcenv.unix.cpp Outdated Show resolved Hide resolved
src/coreclr/gc/unix/cgroup.cpp Outdated Show resolved Hide resolved
src/coreclr/pal/src/thread/process.cpp Outdated Show resolved Hide resolved
@jkoritzinsky
Copy link
Member Author

EINTR errors can only occur in the syscalls used by the C file APIs when a syscall is interrupted by a signal handler not marked with SA_RESTART. All signal handlers CoreCLR installs are marked with SA_RESTART.

How about signal handlers installed by user code? E.g. custom host or 3rd party libraries?

For signal handlers installed with the PosixSignalRegistration API, it looks like we're good. For custom hosts or 3rd party libraries that install signal handlers without SA_RESTART, we'd have a problem.

However, if the GC doesn't use the PAL headers (as you said in the later comments on some of the files), then the only places that use the fopen API are off-by-default logging scenarios, all of which other than DOTNET_ReadyToRunLogFile are Debug/Checked only, superpmi, or in non-CoreCLR processes like ilasm or ildasm. Everywhere else that does file manipulation uses the Win32 PAL, not the C APIs.

Therefore, I don't think we need to worry about EINTR error codes, especially as even the Mono runtime doesn't seem to check for them when using the C API instead of direct syscalls (they do check when directly calling write or open, but not fwrite or fopen).

@jkoritzinsky
Copy link
Member Author

cc: @EgorBo for review of the text-based PGO format change (using LF line endings across all platforms instead of CRLF). We can also change this back and have it be LF on non-Windows and CRLF on Windows. I'd rather not support CRLF on non-Windows for this though.

Copy link
Member

@EgorBo EgorBo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Text PGO changes LGTM cc @AndyAyersMS @jakobbotsch hope it won't break any internal tooling for you

@ryujit-bot
Copy link

Diff results for #98238

Throughput diffs

Throughput diffs for windows/arm64 ran on windows/x64

MinOpts (-0.00% to +0.01%)
Collection PDIFF
libraries.pmi.windows.arm64.checked.mch +0.01%

Details here


@lambdageek
Copy link
Member

especially as even the Mono runtime doesn't seem to check for them when using the C API instead of direct syscalls (they do check when directly calling write or open, but not fwrite or fopen).

Mono probably should check for EINTR in more places, to be honest - we're not a great role model for this.

(Although I'd guess we mostly use the FILE* apis in non-shipping debug scenarios and in the AOT compiler. The AOT compiler doesn't load arbitrary 3rd party native code and I'm not even sure if we set up our own signal handlers in it.)

@jkoritzinsky
Copy link
Member Author

@janvorli do you have any more feedback here?

@jkoritzinsky jkoritzinsky requested a review from jkotas February 15, 2024 22:33
Copy link
Member

@janvorli janvorli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@jkoritzinsky jkoritzinsky merged commit 22c5e9f into dotnet:main Feb 15, 2024
129 checks passed
@jkoritzinsky jkoritzinsky deleted the file-pal branch February 15, 2024 23:43
@github-actions github-actions bot locked and limited conversation to collaborators Mar 19, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants