Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CHERI, take 2021.10 #402

Merged
merged 21 commits into from
Oct 20, 2021
Merged

CHERI, take 2021.10 #402

merged 21 commits into from
Oct 20, 2021

Conversation

nwf
Copy link
Collaborator

@nwf nwf commented Oct 18, 2021

Getting smaller every time; down to <500 changed lines now! Passes most of the test harness on my desk, even (I am unaware of it failing any but I'm impatient and some of the tests seem to trigger some performance pathologies in CHERI qemu). Let's see what CI thinks.

For the humans looking at this PR, this is something of a rampage and I'm happy to split the stack into multiple PRs if desired (already #401 is included here). In rough overview, from bottom to top of the patch stack:

  • NFC changes just tidying parts of the tree I was working in
  • Things found while porting that change behavior but aren't really CHERI specific
  • NFC changes "towards" CHERI
  • refactoring Metaslab/ChunkRecord to include a MetaCommon that holds a high authority pointer to the chunk for CHERI
  • Fixing some things the CHERI compiler noted
  • Adding RISC-V and FreeBSD/RISC-V support
  • The CHERI AAL and FreeBSD PAL adaptation
  • Workarounds for known bugs in CheriBSD

Perhaps best reviewed by commit rather than by cumulative delta.


#include "../ds/defines.h"

#include <cheri.h>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use cherintrin.h instead?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two files seem to define the same macros. Why do both exist?

We should probably avoid either and use the builtins directly here until the headers are stable.

Copy link

@arichardson arichardson Oct 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We added cheriintrin.h as a header that simply exposes the intrinsics+permission macros under shorter names. The cheri.h header also defines macros for the non-purecap case where they are simply no-ops. I can't find a link to the discussion but I if I recall correctly we wanted a *intrin.h header that only provides macros if the feature is supported without a fallback to different behaviour. Additionally cheri.h provides inline functions with fixed void* argument types whereas the builtins are overloaded to preserve the argument type + constness (and also accept __intcap).

I would consider cheriintrin.h stable now, but using the builtins directly sounds might be the simplest solution.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved to use the builtins everywhere.

src/aal/aal_cheri.h Outdated Show resolved Hide resolved
Copy link
Member

@mjp41 mjp41 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is surprisingly simple. I guess all the ground work has made the final push for CHERI support pretty simple.

src/mem/localalloc.h Show resolved Hide resolved
@@ -20,6 +20,14 @@ namespace snmalloc
*
* It cannot unreserve memory, so this does not require the
* usual complexity of a buddy allocator.
*
* TODO: This manages pieces of memory smaller than (1U << MIN_CHUNK_BITS)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we also allocate allocators (CoreAlloc) from here? We might also do slightly more once I rebuild the PalNotification mechanism. This was allocating meta-data objects.

Also, there is issues that in CHECK_CLIENT I have placed meta-data in a larger range with guard pages to make OOB accessing the meta-data harder, which would also need engineering.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment updated to be a little more CHERI-specific.

src/mem/metaslab.h Outdated Show resolved Hide resolved
src/aal/aal_riscv.h Show resolved Hide resolved

#include "../ds/defines.h"

#include <cheri.h>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two files seem to define the same macros. Why do both exist?

We should probably avoid either and use the builtins directly here until the headers are stable.

src/aal/aal_cheri.h Show resolved Hide resolved
* capability. It does not carry provenance: it is not a capability, but
* merely an address.
*/
typedef ptraddr_t address_t;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks as if it comes from stddef.h, which isn't included here. Given that the base AAL tells you the address width, it might be a better idea to use that rather than depend on a non-standard declaration from a standard header - the glibc version will undoubtedly hide this behind some feature-check macro and so our headers will be broken by random build-system flags.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The correct place for ptraddr_t is <stddef.h>, CheriBSD also provides it in sys/sys/_stdint.h and in sys/sys/stddef.h. I take the view that any CHERI-aware <stddef.h> must provide ptraddr_t when building in __CHERI_PURE_CAPABILITY__ mode, and I would hope that the GLibc port does that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm include-ing <stddef.h> now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a fixed-with integer type would also be slightly wrong, as it's not necessarily the same type as ptraddr_t. On Darwin, uint64_t is unsigned long long but size_t is unsigned long, so ptraddr_t would likely also be unsigned long.

Comment on lines 26 to 30
// The "Zihintpause" extension steals some dead space of the "fence"
// instruction and so should be available everywhere even if it doesn't do
// anything on a particular microarchitecture. Our assemblers don't
// understand it, yet, tho', so thus this hilarity.
__asm__ volatile(".byte 0xF; .byte 0x0; .byte 0x0; .byte 0x1");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, Zihintpause was the first extension ratified under the RISC-V 'fast track' process, which meant that no one who understood the point of the thing that they were trying to implement was able to give feedback. As such, it doesn't actually do what snmalloc needs. It is defined to pause the HART for an impdef number of cycles. We want a hint to tell the processor not to keep filling the pipeline with instructions on the assumption that a read of an address remains the same multiple times. The Arm, x86, and Loongarch variants of this instruction have the right semantics.

It may be that RISC-V implementers do the right thing instead of what the spec says but until we have some examples of this I'd rather we didn't put anything here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somehow it seems even worse. The Zihintpause spec is... vague, not even promising to pause the HART, and focused on instruction retirement rather than on instruction dispatch, which seems odd to me:

indicates the current hart’s rate of instruction retirement should be temporarily reduced or paused.
The duration of its effect must be bounded and may be zero.  No architectural state is changed.

We should, at the very least, ensure that CHERI-RISC-V cores DTRT. However, I have replaced this with a larger block comment.

src/backend/address_space_core.h Outdated Show resolved Hide resolved
Comment on lines +47 to 48
inline static address_t
signed_prev(address_t curr, address_t next, const FreeListKey& key)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this take a pointer type, rather than an address type? I would expect this to be the place to add in sealing of freelist pointers in CHERI.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only ever traverse the next links; the prev ones are just there for probabilistic integrity checking. By that token, I thought we'd eventually seal the forward pointers but leave the backward links just addresses. Haven't thought very hard about this yet, so could be convinced otherwise.

@@ -3,6 +3,10 @@
#if defined(__FreeBSD__) && !defined(_KERNEL)
# include "pal_bsd_aligned.h"

# if defined(__CHERI_PURE_CAPABILITY__)
# include <cheri/cheric.h>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is unavoidable, but I don't really like all of the things that this header leaks into the global namespace.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cheri/cherireg.h brings slightly less, so I've jumped to that instead.

src/pal/pal_posix.h Show resolved Hide resolved
@nwf nwf force-pushed the 202110-cheri branch 7 times, most recently from f271143 to d4e35e1 Compare October 19, 2021 13:31
triple: riscv64-linux-gnu
rtld: ld-linux-riscv64-lp64d.so.1
extra-packages: binutils-riscv64-linux-gnu
ld: /usr/bin/riscv64-linux-gnu-ld
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is all to work around the fact that riscv on Linux requires linker relaxations (even crt1.o requires it, so it's not a thing we can work around in our own compilation commands), and llvm doesn't have those implemented, so we have to use the GNU tools.

@nwf
Copy link
Collaborator Author

nwf commented Oct 19, 2021

The snmalloc2 work changed external_pointer to just do address arithmetic rather than rederiving a pointer from something privileged, so I have ripped capptr_rebound out of the AAL.

@nwf nwf requested review from davidchisnall and mjp41 October 19, 2021 13:44

set(CROSS_LINKER_FLAGS "-Wl,--dynamic-linker=/usr/${triple}/lib/$ENV{RTLD_NAME},-rpath,/usr/${triple}/lib")
if((DEFINED SNMALLOC_LINKER) AND NOT ("${SNMALLOC_LINKER}" MATCHES "^$"))
string(PREPEND CROSS_LINKER_FLAGS "-fuse-ld=${SNMALLOC_LINKER} ")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will trigger a warning with newer llvm where they want you to use --ld-path=/path/to/foo (either with or without -fuse-ld=bfd to select the linker type)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How's what's there now?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@nwf nwf marked this pull request as draft October 19, 2021 16:22
@nwf
Copy link
Collaborator Author

nwf commented Oct 19, 2021

Hm. While this appears to pass the test suite, please do not merge quite yet. I appear to be handling ChunkUser chunks incorrectly -- there is some subversive amplification to Chunks happening. Will investigate.

We'll want user_address_control_type in some particular PALs, so it can't live
in pal.h.

While here, make the spelling be capptr::is_spatial_refinement.
Even on debug builds, these little things should be inlined
There is no such thing as "struct Slab" any more.

We use alignof(RemoteAllocator) below, so we already require the complete type
definition at this point.
capptr_rebound was only ever going to be used for external_pointer, which now
operates entirely using pointer_offset.  So instead, just make external_pointer
use capptr::AllocWild<void>, capptr_from_client, and a new capptr_reveal_wild.
Avoid computing bits::next_pow2_bits(1 << n).  Even if the compiler can see
through enough of the algebra, it's surely more direct to just use n.

While here, slightly expand documentation about what's going on with the
"sizeclass" encoded into MetaEntry-s.
- Grab a larger second allocation on the first allocator to dodge the sizeclass
  of the prior alloc on that allocator *and* any implicit, bootstrapping slabs
  that get opened (e.g., for remote queue message stubs).

- De-FAST_PATH the domestication function.  No need to always inline it, here.

- Document things a little better
The use of void* had let an overzealous unsafe_ptr() leak a pointer with address
space control to the client (in LocalAllocator::alloc_not_small, specifically).
Correct this to call capptr_chunk_is_alloc() (to capture our intent) and
capptr_to_user_address_control() (to do the bounding) and defer the conversion
to void* until the very periphery of the allocator, using capptr_reveal()
(again, to capture intent).
This preserves the chunk pointer through the use of a chunk as a slab.  It does
grow the structure by one pointer, but on non-CHERI it is still padded to 64
bytes, even with CHECK_CLIENT guards in place:

 0: MetaCommon chunk pointer
 8: next pointer
16: builder head[0]
24: builder head[1]
32: builder tail[0]
40: builder tail[1]
48: builder length[0] (uint16_t)
50: builder length[1] (uint16_t)
52: padding (4 bytes)
56: needed (uint16_t)
58: sleeping (bool)

(Sadly, on CHERI, even without CHECK_CLIENT guards and with no padding, there
are now four pointers in the structure -- chunk, next, head, tail -- plus five
extra bytes.  We will likely wish to explore encoding the head and tail offsets
relative to the chunk pointer.)

This lets us remove the "subversive amplification" in dealloc() in favor of just
preserving the chunk pointer.  Speaking of, be sure to assign that in all the
right places, and ASSERT that we've got it right.
This dates back to the much earlier design that required the use of an authority
map.  Since the current CHERI design does not, go ahead and ask the platform
whenever underlying allocation requests are sufficiently aligned.
So make signed_prev return one.  This distinction only matters on CHERI, of
course; everywhere else address_t *is* a uintptr_t.
@nwf nwf force-pushed the 202110-cheri branch 3 times, most recently from 1c64e95 to 4126e95 Compare October 20, 2021 02:27
This adds a CHERI AAL and expands the FreeBSD PAL to cover CHERI.  It updates a
comment in ds/address.h now that there is an example architecture that
differentiates uintptr_t and address_t.
The correct thing to do, of course, is to fix these upstream, but that requires
understanding exactly what's wrong, and that's harder than just not tickling the
bugs.
@nwf
Copy link
Collaborator Author

nwf commented Oct 20, 2021

Cumulative differences since last review: https://github.com/microsoft/snmalloc/compare/d4e35e1..8dfbb44 ; the most semantically meaningful one is 83dc540 and the rest is just some noise of tidying (esp 1e2f7ff). I think this take on CI addresses Alex's comment, above.

@nwf nwf marked this pull request as ready for review October 20, 2021 02:42

set(CROSS_LINKER_FLAGS "-fuse-ld=${SNMALLOC_LINKER_FLAVOUR} -Wl,--dynamic-linker=/usr/${triple}/lib/$ENV{RTLD_NAME},-rpath,/usr/${triple}/lib")
if((DEFINED SNMALLOC_LINKER) AND NOT ("${SNMALLOC_LINKER}" MATCHES "^$"))
string(APPEND CROSS_LINKER_FLAGS " --ld-path=${SNMALLOC_LINKER}")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires Clang 12, but I assume the variable will only be set when that requirement is met?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea; despite the generic name, AFAICT this file is used only in cross-build jobs, where we're already requiring clang-13.

@nwf nwf merged commit 342d826 into microsoft:snmalloc2 Oct 20, 2021
@nwf nwf deleted the 202110-cheri branch October 20, 2021 11:02
@nwf nwf mentioned this pull request Nov 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants