Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make jemalloc bound returned pointers to their usable size #2147

Merged
merged 5 commits into from
Nov 6, 2024

Conversation

markjdb
Copy link
Contributor

@markjdb markjdb commented Jul 10, 2024

This allows MRS to handle realloc() loops more efficiently, since it can avoid going back to the allocator every time the allocation size increases. See #2133. It is also consistent with snmalloc. However, tight bounds might be desirable in some cases, particularly when giving demos. Thus, this PR also adds a new option to MRS to tighten bounds on pointers returned by the backing memory allocator. This option is enabled by default, so default behaviour doesn't change, but this should be reconsidered in the future. In particular, setting _RUNTIME_NOBOUND_CHERI_POINTERS=1 in the environment causes MRS to stop applying bounds, so one gets bounds corresponding to the allocation's usable size.

I also changed jemalloc's malloc_usable_size() implementation to avoid clamping to the capability bounds if the tag bit is clear. This made it possible to compare bounds set by jemalloc with the underlying allocation size, e.g., by comparing cheri_getlen(p) with malloc_usable_size(cheri_cleartag(p)). I used this to write a test program which checks that these values are equal for pointers returned by malloc(), realloc() and posix_memalign() for various allocation sizes and alignments. It's not required otherwise though, so may be dropped.

@bsdjhb
Copy link
Collaborator

bsdjhb commented Jul 10, 2024

Given that snmalloc already did what jemalloc now does, I actually think it's more useful for MRS to not do its own bounds by default. I also think it's a bit odd that in the case where revocation is disabled, MRS is no longer a stub since it narrows the bounds of the underlying allocator. I kind of think the !quarantining cases should probably still pass through to the underlying allocator directly.

@markjdb
Copy link
Contributor Author

markjdb commented Jul 10, 2024

Given that snmalloc already did what jemalloc now does, I actually think it's more useful for MRS to not do its own bounds by default.

No objection from me! The main problem with MRS not bounding pointers is that it may affect demos and so on in a negatively surprising way.

I also think it's a bit odd that in the case where revocation is disabled, MRS is no longer a stub since it narrows the bounds of the underlying allocator. I kind of think the !quarantining cases should probably still pass through to the underlying allocator directly.

It is odd - MRS is not really the right place to implement this, it's just more convenient than doing this directly in jemalloc and snmalloc. I made this option orthogonal to the quarantining condition since it's logically unrelated. For testing purposes it is convenient to be able to enable or disable bounding independent of whether revocation is enabled (I made use of this when testing the change); to me this usefulness kind of outweighs the layering violation. I could make it conditional on quarantining though if we really think that that's important.

@bsdjhb
Copy link
Collaborator

bsdjhb commented Jul 10, 2024

Given that snmalloc already did what jemalloc now does, I actually think it's more useful for MRS to not do its own bounds by default.

No objection from me! The main problem with MRS not bounding pointers is that it may affect demos and so on in a negatively surprising way.

Yes, I guess my thought here is that if we had stuck with snmalloc instead of switching back to jemalloc then those demos would already have the same issues? Perhaps though this is a way of "fixing" snmalloc as another way of thinking about it.

I also think it's a bit odd that in the case where revocation is disabled, MRS is no longer a stub since it narrows the bounds of the underlying allocator. I kind of think the !quarantining cases should probably still pass through to the underlying allocator directly.

It is odd - MRS is not really the right place to implement this, it's just more convenient than doing this directly in jemalloc and snmalloc. I made this option orthogonal to the quarantining condition since it's logically unrelated. For testing purposes it is convenient to be able to enable or disable bounding independent of whether revocation is enabled (I made use of this when testing the change); to me this usefulness kind of outweighs the layering violation. I could make it conditional on quarantining though if we really think that that's important.

Yeah, it's really just the layering violation I'm not a fan of I guess, but I see why it is a convenient place to enforce this.

@markjdb
Copy link
Contributor Author

markjdb commented Jul 10, 2024

Given that snmalloc already did what jemalloc now does, I actually think it's more useful for MRS to not do its own bounds by default.

No objection from me! The main problem with MRS not bounding pointers is that it may affect demos and so on in a negatively surprising way.

Yes, I guess my thought here is that if we had stuck with snmalloc instead of switching back to jemalloc then those demos would already have the same issues? Perhaps though this is a way of "fixing" snmalloc as another way of thinking about it.

I wasn't aware of that history. I'd tend to think that we prefer a more performance-friendly setting out of the box, so MRS shouldn't be bounding pointers by default. But, maybe there is some kind of demo checklist that I should update as a part of that change?

if (p == NULL)
return (p);
else if (bound_pointers)
return (cheri_setbounds(p, size));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these should be a macro or inline function to reduce explicit instances of cheri_setbounds. Philosophically that is because each setbounds is a point potentially requiring audit when you bring in foreign code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in the latest push. I also added a comment above the new function to explain the intent behind the flag.

@brooksdavis
Copy link
Member

Given that snmalloc already did what jemalloc now does, I actually think it's more useful for MRS to not do its own bounds by default.

No objection from me! The main problem with MRS not bounding pointers is that it may affect demos and so on in a negatively surprising way.

Yes, I guess my thought here is that if we had stuck with snmalloc instead of switching back to jemalloc then those demos would already have the same issues? Perhaps though this is a way of "fixing" snmalloc as another way of thinking about it.

I wasn't aware of that history. I'd tend to think that we prefer a more performance-friendly setting out of the box, so MRS shouldn't be bounding pointers by default. But, maybe there is some kind of demo checklist that I should update as a part of that change?

Technically what you've got MRS doing is UB since you must past exactly the same pointer to free that malloc returned (I've literally implemented that requirement in a cherified dlmalloc where there was space in the out of bounds metadata to stash a copy of the returned pointer and it made that check extremely cheap (just __builtin_cheri_equal_exact against the stashed copy)).

@markjdb
Copy link
Contributor Author

markjdb commented Jul 16, 2024

Given that snmalloc already did what jemalloc now does, I actually think it's more useful for MRS to not do its own bounds by default.

No objection from me! The main problem with MRS not bounding pointers is that it may affect demos and so on in a negatively surprising way.

Yes, I guess my thought here is that if we had stuck with snmalloc instead of switching back to jemalloc then those demos would already have the same issues? Perhaps though this is a way of "fixing" snmalloc as another way of thinking about it.

I wasn't aware of that history. I'd tend to think that we prefer a more performance-friendly setting out of the box, so MRS shouldn't be bounding pointers by default. But, maybe there is some kind of demo checklist that I should update as a part of that change?

Technically what you've got MRS doing is UB since you must past exactly the same pointer to free that malloc returned (I've literally implemented that requirement in a cherified dlmalloc where there was space in the out of bounds metadata to stash a copy of the returned pointer and it made that check extremely cheap (just __builtin_cheri_equal_exact against the stashed copy)).

That seems like another argument for making MRS not bound pointers by default. Do you think this particular UB should prevent us from making it an option at all?

@brooksdavis
Copy link
Member

That seems like another argument for making MRS not bound pointers by default. Do you think this particular UB should prevent us from making it an option at all?

I think current allocators are OK with this in practice so it's ok as an option for now.

@markjdb markjdb force-pushed the dev-jemalloc-bounds branch from 148845a to e80bd12 Compare July 17, 2024 14:12
@markjdb
Copy link
Contributor Author

markjdb commented Jul 17, 2024

That seems like another argument for making MRS not bound pointers by default. Do you think this particular UB should prevent us from making it an option at all?

I think current allocators are OK with this in practice so it's ok as an option for now.

Ok. I did however switch the default, so now MRS does not touch the bounds unless you ask it to.

Copy link
Collaborator

@bsdjhb bsdjhb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given Brooks comment about UB, it seems like the cleanest approach would be for each backing allocator (jemalloc, snmalloc) to have a flag to determine if bounds are tight or not rather than doing this in mrs. This would avoid the UB of malloc vs free mismatches and keep mrs as a revocation-only wrapper. We could use a consistent name across both allocators for this setting at least.

mrs_bound_pointer(void *p, size_t size)
{
if (p != NULL && bound_pointers)
p = cheri_setbounds(p, size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to make sure that size is at least 1.

It's possible for an allocator to actually support zero sized allocations, but it needs a special pool so they can't be confused with an adjacent allocation manipulated to point to one past the end with bounds of 0. No current allocator does this (maybe out dlmalloc port does by accident).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I modified the routine to only bound the pointer if size > 0.

@markjdb markjdb force-pushed the dev-jemalloc-bounds branch from e80bd12 to 25a6bee Compare July 24, 2024 14:51
bsdjhb pushed a commit to bsdjhb/cheribsd that referenced this pull request Aug 20, 2024
 CTSRD-CHERI#2147 archive_string: clean up strncat_from_utf8_to_utf8 (36047967a)
 CTSRD-CHERI#2153 archive_match: check archive_read_support_format_raw()
       return value (0ce1b4c38)
 CTSRD-CHERI#2154 archive_match: turn counter into flag (287e05d53)
 CTSRD-CHERI#2155 lha: Do not allow negative file sizes (93b11caed)
 CTSRD-CHERI#2156 tests: setenv LANG to en_US.UTF-8 in bsdunzip test_I.c (83e8b0ea8)

Obtained from:		libarchive
Libarchive commit:	83e8b0ea8c3b07e07ac3dee90a8724565f8e53fd
bsdjhb pushed a commit to bsdjhb/cheribsd that referenced this pull request Aug 20, 2024
 CTSRD-CHERI#2147 archive_string: clean up strncat_from_utf8_to_utf8 (36047967a)
 CTSRD-CHERI#2153 archive_match: check archive_read_support_format_raw()
       return value (0ce1b4c38)
 CTSRD-CHERI#2154 archive_match: turn counter into flag (287e05d53)
 CTSRD-CHERI#2155 lha: Do not allow negative file sizes (93b11caed)
 CTSRD-CHERI#2156 tests: setenv LANG to en_US.UTF-8 in bsdunzip test_I.c (83e8b0ea8)

MFC after:	3 days
bsdjhb pushed a commit to bsdjhb/cheribsd that referenced this pull request Aug 20, 2024
 CTSRD-CHERI#2147 archive_string: clean up strncat_from_utf8_to_utf8 (36047967a)
 CTSRD-CHERI#2153 archive_match: check archive_read_support_format_raw()
       return value (0ce1b4c38)
 CTSRD-CHERI#2154 archive_match: turn counter into flag (287e05d53)
 CTSRD-CHERI#2155 lha: Do not allow negative file sizes (93b11caed)
 CTSRD-CHERI#2156 tests: setenv LANG to en_US.UTF-8 in bsdunzip test_I.c (83e8b0ea8)

MFC after:	3 days
Copy link
Collaborator

@bsdjhb bsdjhb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think that the bounding option belongs in the backing allocator instead of mrs, but I do think the jemalloc fixes are good and would be fine with moving the bounding option down into the backing allocators later.

@markjdb
Copy link
Contributor Author

markjdb commented Oct 23, 2024

This would avoid the UB of malloc vs free mismatches and keep mrs as a revocation-only wrapper

I don't really agree with the "revocation-only" remark, because the problem which motivates this change is entirely caused by MRS and revocation in general. That's the main reason I've been a bit reluctant to handle this within individual allocators.

The memory returned by je_allocm() and je_rallocm() is allocated using
je_mallocx() and je_(x|r)allocx(), respectively, and those functions apply
bounds to their return values by virtue of being implemented using
imalloc().

This makes it easier to make jemalloc bound pointers to the usable size
instead of the allocation size.
This behaviour should be safe and is useful for testing whether the bounds
applied by jemalloc match what malloc_usable_size() reports.  That is, one
can validate bounds with a check like,

    assert(cheri_getlen(p) == malloc_usable_size(cheri_cleartag(p)));

In particular, malloc_usable_size() will look up the usable size using
allocator metadata, so it's handy to be able to compare that size with the
capability bounds.
Previously jemalloc would return pointers bounded to the allocation size,
which is stricter but makes realloc() loops slow since they force MRS to
make a new allocation.  Instead, bound returned pointers to their usable
size; MRS can now optionally further tighten bounds if so desired.
@markjdb markjdb force-pushed the dev-jemalloc-bounds branch from 25a6bee to 0ae431a Compare October 23, 2024 16:25
@markjdb markjdb merged commit 5e45e15 into CTSRD-CHERI:dev Nov 6, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants