Use memfd_create when available #105178

Merged on Aug 20, 2024 (15 commits)
@@ -0,0 +1,34 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

using System.Runtime.InteropServices;
using System.Threading;
using Microsoft.Win32.SafeHandles;

internal static partial class Interop
{
    internal static partial class Sys
    {
        [LibraryImport(Libraries.SystemNative, EntryPoint = "SystemNative_MemfdCreate", StringMarshalling = StringMarshalling.Utf8, SetLastError = true)]
        internal static partial SafeFileHandle MemfdCreate(string name);

        [LibraryImport(Libraries.SystemNative, EntryPoint = "SystemNative_MemfdSupported", SetLastError = true)]
        private static partial int MemfdSupportedImpl();

        private static volatile sbyte s_memfdSupported;

        internal static bool MemfdSupported
        {
            get
            {
                sbyte memfdSupported = s_memfdSupported;
                if (memfdSupported == 0)
                {
                    Interlocked.CompareExchange(ref s_memfdSupported, (sbyte)(MemfdSupportedImpl() == 1 ? 1 : -1), 0);
                    memfdSupported = s_memfdSupported;
                }
                return memfdSupported > 0;
            }
        }
    }
}
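
The property above caches a one-time probe in a tri-state field (0 = not probed yet, 1 = supported, -1 = not supported). A minimal C sketch of the same pattern with C11 atomics follows; `probe_feature` and `s_probe_calls` are hypothetical stand-ins for illustration and are not part of this PR.

```c
#include <assert.h>
#include <stdatomic.h>

// Tri-state cache: 0 = not probed yet, 1 = supported, -1 = not supported.
static _Atomic signed char s_supported = 0;
static int s_probe_calls = 0; // counts probes, for illustration only

// Hypothetical stand-in for an expensive capability check.
static int probe_feature(void)
{
    s_probe_calls++;
    return 1;
}

static int feature_supported(void)
{
    signed char cached = atomic_load(&s_supported);
    if (cached == 0)
    {
        signed char expected = 0;
        signed char probed = (signed char)(probe_feature() == 1 ? 1 : -1);
        // Publish only if the field is still 0; if another thread published
        // first, its value is kept and ours is discarded.
        atomic_compare_exchange_strong(&s_supported, &expected, probed);
        cached = atomic_load(&s_supported);
        assert(cached != 0); // some result has been published by now
    }
    return cached > 0;
}
```

In the single-threaded case, repeated calls to `feature_supported()` run the probe once. Under a race, more than one thread may run the probe, but only the first published result is kept.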
@@ -91,6 +91,8 @@
Link="Common\Interop\Unix\Interop.Libraries.cs" />
<Compile Include="$(CommonPath)Interop\Unix\Interop.Errors.cs"
Link="Common\Interop\Unix\Interop.Errors.cs" />
<Compile Include="$(CommonPath)Interop\Unix\System.Native\Interop.Close.cs"
Link="Common\Interop\Unix\System.Native\Interop.Close.cs" />
<Compile Include="$(CommonPath)Interop\Unix\System.Native\Interop.Fcntl.cs"
Link="Common\Interop\Unix\Interop.Fcntl.cs" />
<Compile Include="$(CommonPath)Interop\Unix\Interop.IOErrors.cs"
@@ -119,6 +121,8 @@
Link="Common\Interop\Unix\Interop.MAdvise.cs" />
<Compile Include="$(CommonPath)Interop\Unix\System.Native\Interop.ShmOpen.cs"
Link="Common\Interop\Unix\Interop.ShmOpen.cs" />
<Compile Include="$(CommonPath)Interop\Unix\System.Native\Interop.MemfdCreate.cs"
Link="Common\Interop\Unix\Interop.MemfdCreate.cs" />
<Compile Include="$(CommonPath)Interop\Unix\System.Native\Interop.Unlink.cs"
Link="Common\Interop\Unix\Interop.Unlink.cs" />
</ItemGroup>
@@ -190,7 +190,14 @@ private static SafeFileHandle CreateSharedBackingObject(Interop.Sys.MemoryMapped
do
{
mapName = GenerateMapName();
- fd = Interop.Sys.ShmOpen(mapName, flags, (int)perms); // Create the shared memory object.
+ if (Interop.Sys.MemfdSupported)
+ {
+ fd = Interop.Sys.MemfdCreate(mapName);
Member

If the information in flags and perms isn't factored in here, where does it get incorporated?

Member Author

memfd_create flags do not have direct equivalents for read-only or read-write permissions. The flags used with memfd_create are mainly related to file descriptor behavior (e.g., closing on exec and allowing sealing), not the memory protection levels. Therefore, it makes sense to keep MFD_CLOEXEC hardcoded in C.

The mmap call that sets the protection was missing; I have just added it. Inheritance is set the same way as with shm_open (default: CLOEXEC; the flag is cleared if Inheritable is requested, from line 244).
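
To illustrate the point that access protection is chosen at `mmap` time rather than through `memfd_create` flags, here is a minimal Linux-only C sketch. `create_backing_fd` and `map_view` are hypothetical helper names, not functions from this PR, and the sketch assumes a glibc that provides the `memfd_create` wrapper (2.27 or later).

```c
#define _GNU_SOURCE // exposes memfd_create in <sys/mman.h> on glibc
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: create an anonymous memory file of `size` bytes.
// The descriptor starts close-on-exec; `inheritable` clears that flag.
static int create_backing_fd(const char* name, size_t size, int inheritable)
{
    assert(name != NULL);
    int fd = memfd_create(name, MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (fd < 0) return -1;

    // Give the file its capacity before mapping it.
    if (ftruncate(fd, (off_t)size) != 0) { close(fd); return -1; }

    if (inheritable)
    {
        int fdflags = fcntl(fd, F_GETFD);
        if (fdflags < 0 || fcntl(fd, F_SETFD, fdflags & ~FD_CLOEXEC) != 0)
        {
            close(fd);
            return -1;
        }
    }
    return fd;
}

// Hypothetical helper: read-only vs. read-write is decided here, per mapping.
// Returns NULL on failure.
static void* map_view(int fd, size_t size, int writable)
{
    int prot = writable ? (PROT_READ | PROT_WRITE) : PROT_READ;
    void* p = mmap(NULL, size, prot, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

A read-only view created this way only restricts that one mapping; the descriptor itself stays writable unless it is sealed.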

Member

If flags and perms aren't relevant to the if block, should they be moved to the else block? They're only ever used there. I realize it's inside of a retry loop, but we expect retries to be rare bordering on non-existent.

Member

That said, would there be any hardening benefits to using seals as a stand-in for what perms was being used for?

Member Author @am11, Jul 22, 2024

When using shm_open with read-only permissions (e.g., O_RDONLY) and mapping it with mmap with read-only protections (e.g., PROT_READ), the resulting memory mapping will not allow writing through that specific file descriptor and mapping. However, if another process has opened the same shared memory object with read-write permissions (e.g., O_RDWR), it can still write to the shared memory, and those changes will be visible to the read-only mappings.

With memfd_create there is no protection on the fd by default. We can write(fd) unless we implement write sealing: am11@f421782. Sealing makes it read-only for the current process (same as shm_open) as well as for other processes (unlike shm_open).

It is not exactly a drop-in replacement, but I think it is a benefit that we will be more hardened than with shm_open.

Member @stephentoub, Jul 22, 2024

Yeah, other than the extra syscall, there doesn't appear to be a downside to setting seals, and it will help to harden the permissions. I suggest we add it. At that point, since there would then be multiple interop calls involved, having completely separate code paths for memfd_create vs shm_open, including error handling, would seem to make sense.
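
The write-sealing idea discussed in this thread can be sketched in C as follows. This is a Linux-only illustration under stated assumptions, not the code from the referenced commit: `create_write_sealed_fd` and `write_is_rejected` are hypothetical names, and the sketch fills the file before sealing because no writes are possible afterwards.

```c
#define _GNU_SOURCE // exposes memfd_create and F_ADD_SEALS on glibc
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: create a memfd, fill it, then forbid further writes.
// Returns the descriptor, or -1 on failure.
static int create_write_sealed_fd(const char* name, const void* data, size_t len)
{
    assert(name != NULL && data != NULL);
    // MFD_ALLOW_SEALING is required, otherwise F_ADD_SEALS is refused.
    int fd = memfd_create(name, MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (fd < 0) return -1;

    if (write(fd, data, len) != (ssize_t)len ||
        fcntl(fd, F_ADD_SEALS, F_SEAL_WRITE) != 0)
    {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

// Returns 1 if a write through fd is rejected with EPERM, 0 otherwise.
static int write_is_rejected(int fd)
{
    errno = 0;
    return write(fd, "x", 1) < 0 && errno == EPERM;
}
```

The seal is a property of the underlying file rather than of one descriptor, so it applies to every process that holds the file. That is the hardening difference from `shm_open` described above.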

+ }
+ else
+ {
+ fd = Interop.Sys.ShmOpen(mapName, flags, (int)perms); // Create the shared memory object.
+ }

if (fd.IsInvalid)
{
@@ -204,7 +211,7 @@ private static SafeFileHandle CreateSharedBackingObject(Interop.Sys.MemoryMapped
// the result of native shm_open does not work well with our subsequent call to mmap.
return null;
}
- else if (errorInfo.Error == Interop.Error.ENAMETOOLONG)
+ else if (!Interop.Sys.MemfdSupported && errorInfo.Error == Interop.Error.ENAMETOOLONG)
{
Debug.Fail($"shm_open failed with ENAMETOOLONG for {Encoding.UTF8.GetByteCount(mapName)} byte long name.");
// in theory it should not happen anymore, but just to be extra safe we use the fallback
@@ -219,10 +226,13 @@ private static SafeFileHandle CreateSharedBackingObject(Interop.Sys.MemoryMapped

try
{
- // Unlink the shared memory object immediately so that it'll go away once all handles
- // to it are closed (as with opened then unlinked files, it'll remain usable via
- // the open handles even though it's unlinked and can't be opened anew via its name).
- Interop.CheckIo(Interop.Sys.ShmUnlink(mapName));
+ if (!Interop.Sys.MemfdSupported)
+ {
+ // Unlink the shared memory object immediately so that it'll go away once all handles
+ // to it are closed (as with opened then unlinked files, it'll remain usable via
+ // the open handles even though it's unlinked and can't be opened anew via its name).
+ Interop.CheckIo(Interop.Sys.ShmUnlink(mapName));
+ }

// Give it the right capacity. We do this directly with ftruncate rather
// than via FileStream.SetLength after the FileStream is created because, on some systems,
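
The comment in the hunk above compares `shm_unlink` to opening a file and unlinking it right away. A minimal C sketch of that analogy with an ordinary temporary file follows; `open_unlinked_temp` and `name_is_gone` are hypothetical helper names, and a regular file stands in for the shared memory object.

```c
#define _POSIX_C_SOURCE 200809L // for mkstemp and pread
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

// Creates a temporary file, removes its name right away, and returns the
// still-open descriptor (or -1). `path_template` must end in "XXXXXX" and is
// overwritten with the generated name.
static int open_unlinked_temp(char* path_template)
{
    assert(path_template != NULL);
    int fd = mkstemp(path_template);
    if (fd < 0) return -1;

    // After this call the file can no longer be opened by name, but the
    // open descriptor keeps the data alive until it is closed.
    if (unlink(path_template) != 0) { close(fd); return -1; }
    return fd;
}

// Returns 1 if `path` no longer names an existing file.
static int name_is_gone(const char* path)
{
    struct stat st;
    return stat(path, &st) != 0;
}
```

A descriptor from `memfd_create` never has a name in the file system, which is why the memfd path can skip the unlink step.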
@@ -8,6 +8,6 @@ namespace System
public static partial class Environment
{
public static long WorkingSet =>
- (long)(Interop.procfs.TryReadProcessStatusInfo(Interop.procfs.ProcPid.Self, out Interop.procfs.ProcessStatusInfo status) ? status.ResidentSetSize : 0);
+ (long)(Interop.procfs.TryReadProcessStatusInfo(ProcessId, out Interop.procfs.ProcessStatusInfo status) ? status.ResidentSetSize : 0);
}
}
1 change: 1 addition & 0 deletions src/native/libs/Common/pal_config.h.in
@@ -11,6 +11,7 @@
#cmakedefine01 HAVE_F_DUPFD
#cmakedefine01 HAVE_F_FULLFSYNC
#cmakedefine01 HAVE_O_CLOEXEC
#cmakedefine01 HAVE_MEMFD_CREATE
#cmakedefine01 HAVE_GETIFADDRS
#cmakedefine01 HAVE_UTSNAME_DOMAINNAME
#cmakedefine01 HAVE_STAT64
2 changes: 2 additions & 0 deletions src/native/libs/System.Native/entrypoints.c
@@ -62,6 +62,8 @@ static const Entry s_sysNative[] =
DllImportEntry(SystemNative_Close)
DllImportEntry(SystemNative_Dup)
DllImportEntry(SystemNative_Unlink)
DllImportEntry(SystemNative_MemfdSupported)
DllImportEntry(SystemNative_MemfdCreate)
DllImportEntry(SystemNative_ShmOpen)
DllImportEntry(SystemNative_ShmUnlink)
DllImportEntry(SystemNative_GetReadDirRBufferSize)
42 changes: 42 additions & 0 deletions src/native/libs/System.Native/pal_io.c
@@ -369,6 +369,48 @@ int32_t SystemNative_Unlink(const char* path)
return result;
}

int32_t SystemNative_MemfdSupported(void)
{
#if HAVE_MEMFD_CREATE
#ifdef TARGET_LINUX
    struct utsname uts;
    int32_t major, minor;

    // memfd_create is only known to work properly on kernel version >= 3.17; on older kernels it raises SIGSEGV instead of failing with ENOTSUP
    if (uname(&uts) == 0 && sscanf(uts.release, "%d.%d", &major, &minor) == 2 && (major < 3 || (major == 3 && minor < 17)))
    {
        return 0;
    }
#endif

    int32_t fd = memfd_create("test", MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (fd < 0) return 0;

    close(fd);
    return 1;
#else
    errno = ENOTSUP;
    return 0;
#endif
}
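
The kernel version gate in the function above can be isolated into a small, testable helper. The following is a minimal sketch, not the PR's code: `release_at_least` and `running_kernel_at_least` are hypothetical names, and unlike the function above, this sketch reports an unparseable release string as -1 instead of falling through to the probe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/utsname.h>

// Returns 1 if `release` (e.g. "5.15.0-91-generic") is at least major.minor,
// 0 if it is older, and -1 if the string cannot be parsed.
static int release_at_least(const char* release, int major, int minor)
{
    assert(release != NULL);
    int rel_major, rel_minor;
    if (sscanf(release, "%d.%d", &rel_major, &rel_minor) != 2) return -1;
    return rel_major > major || (rel_major == major && rel_minor >= minor);
}

// Applies the same check to the running kernel.
// Returns -1 if uname fails or the release string cannot be parsed.
static int running_kernel_at_least(int major, int minor)
{
    struct utsname uts;
    if (uname(&uts) != 0) return -1;
    return release_at_least(uts.release, major, minor);
}
```

With this helper, the gate above corresponds to skipping the probe when `running_kernel_at_least(3, 17)` returns 0.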

intptr_t SystemNative_MemfdCreate(const char* name)
{
#if HAVE_MEMFD_CREATE
#if defined(SHM_NAME_MAX) // macOS
    assert(strlen(name) <= SHM_NAME_MAX);
#elif defined(PATH_MAX) // other Unixes
    assert(strlen(name) <= PATH_MAX);
#endif

    return memfd_create(name, MFD_CLOEXEC | MFD_ALLOW_SEALING);
Member Author

We can avoid SetSealWrite P/Invoke call if we call fcntl here. I will test.

Current benchmarks:

| Faster | base/diff | Base Median (ns) | Diff Median (ns) | Modality |
| --- | ---: | ---: | ---: | --- |
| System.IO.MemoryMappedFiles.Tests.Perf_MemoryMappedFile.CreateNew(capacity: 1000 | 11.10 | 17470.22 | 1574.24 | |
| System.IO.MemoryMappedFiles.Tests.Perf_MemoryMappedFile.CreateNew(capacity: 1000 | 10.81 | 16811.50 | 1554.71 | |
| System.IO.MemoryMappedFiles.Tests.Perf_MemoryMappedFile.CreateNew(capacity: 1000 | 10.42 | 16351.91 | 1568.75 | |
| System.IO.MemoryMappedFiles.Tests.Perf_MemoryMappedFile.CreateNew(capacity: 1000 | 10.28 | 16327.87 | 1587.61 | |
| System.IO.MemoryMappedFiles.Tests.Perf_MemoryMappedFile.CreateFromFile(capacity: | 1.75 | 50085.65 | 28573.40 | |
| System.IO.MemoryMappedFiles.Tests.Perf_MemoryMappedFile.CreateFromFile(capacity: | 1.63 | 47557.25 | 29193.17 | |
| System.IO.MemoryMappedFiles.Tests.Perf_MemoryMappedFile.CreateFromFile(capacity: | 1.62 | 47405.59 | 29194.67 | |
| System.IO.MemoryMappedFiles.Tests.Perf_MemoryMappedFile.CreateFromFile(capacity: | 1.62 | 47616.49 | 29359.55 | |

#else
    (void)name;
    errno = ENOTSUP;
    return -1;
#endif
}

intptr_t SystemNative_ShmOpen(const char* name, int32_t flags, int32_t mode)
{
#if defined(SHM_NAME_MAX) // macOS
14 changes: 14 additions & 0 deletions src/native/libs/System.Native/pal_io.h
@@ -369,6 +369,20 @@ PALEXPORT intptr_t SystemNative_Dup(intptr_t oldfd);
*/
PALEXPORT int32_t SystemNative_Unlink(const char* path);

/**
* Check if the system supports memfd_create(2).
*
* Returns 1 if memfd_create is supported, 0 if not supported.
*/
PALEXPORT int32_t SystemNative_MemfdSupported(void);

/**
* Create an anonymous file descriptor. Implemented as shim to memfd_create(2).
*
* Returns file descriptor or -1 on failure. Sets errno on failure.
*/
PALEXPORT intptr_t SystemNative_MemfdCreate(const char* name);

/**
* Open or create a shared memory object. Implemented as shim to shm_open(3).
*
4 changes: 4 additions & 0 deletions src/native/libs/configure.cmake
@@ -137,6 +137,10 @@ check_symbol_exists(
fcntl.h
HAVE_F_FULLFSYNC)

check_function_exists(
memfd_create
HAVE_MEMFD_CREATE)

check_function_exists(
getifaddrs
HAVE_GETIFADDRS)