Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bypass PagedAllocator/PagedArrayPool by default for ASan build #94906

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

alvinhochun
Copy link
Contributor

Forward allocations to ASan malloc directly so that they can be more effectively tested by ASan. This can be disabled by setting the environment variable ALLOCATORS_DO_NOT_USE_ASAN_MALLOC=1.

When disabled, PagedAllocator will manually poison unallocated chunks instead. Though this is not very effective in detecting memory errors.


This change only affects building with ASan enabled.

PagedAllocator keeps its own memory pool and hand allocations out by pieces. This prevents ASan from working effectively for allocations handled by PagedAllocator. Even if it poisons the unallocated chunks, it hinders these checks:

  • Heap buffer overflow/underflow checks: ASan works by having redzones before and after allocations. PagedAllocator hands out pieces from contiguous pages without redzones in between pieces, so there is a high chance any overflow or underflow will end up accessing memory of neighbouring pieces which does not trip ASan. (Though there is little reason for code to index into the these allocations, so overflow/underflow normally should not happen for them.)
  • Use-after-free: ASan quarantines freed memory for some time to allow use-after-free errors to be caught. PagedAllocator does not have a quarantine – in fact it prefers handing out pieces that was most recently freed.
  • Double free: PagedAllocator will have to check this manually.

In addition, ASan poisoning/unpoisoning does not keep the stack traces, so if the code does hit an error that poisoning is able to catch, you can only see the backtrace of where the whole page was allocated (which can be anywhere), instead of the more helpful backtraces of where the memory was allocated and freed that ASan can provide.

This had managed to catch a use-after-free: #94832.

@RandomShaper
Copy link
Member

RandomShaper commented Jul 29, 2024

I had ideas about this in the back of my mind and I'm happy there's a PR for it!

Thoughts:

  • Do you think PagedArrayPool could benefit from the same?
  • If explicit poisoning is not very helpful, it may be a good idea to keep this more black-or-white. I mean, you either fully embrace asan (allocate normally so it tracks) or let it work by default without trying to assist it by poisoning.
  • The way of controlling this feature, an env var, may be a bit contradictory with, for instance, the way it's done for render API validation, especially (--gpu-validation, --gpu-abort CLI switches).

@alvinhochun
Copy link
Contributor Author

alvinhochun commented Jul 29, 2024

I had ideas about this in the back of my mind and I'm happy there's a PR for it!
Thanks.

Thoughts:

  • Do you think PagedArrayPool could benefit from the same?

Looks like it could. Do you want me to do it in the same PR?

  • If explicit poisoning is not very helpful, it may be a good idea to keep this more black-or-white. I mean, you either fully embrace asan (allocate normally so it tracks) or let it work by default without trying to assist it by poisoning.

Well, poisoning may still catch something and it's not like an expensive operation anyway. It also doesn't interfere with other types of tracking (in theory).

  • The way of controlling this feature, an env var, may be a bit contradictory with, for instance, the way it's done for render API validation, especially (--gpu-validation, --gpu-abort CLI switches).

ASan requires a speciality build. Having command line flags that normally doesn't do anything doesn't seem useful. Unless you want to be able to bypass PagedAllocator even on a normal build? Still doesn't seem very useful.

The concern is that the option needs to be set before the very first allocation, and must not change during the lifetime of the process. If the allocator is used at all before or when parsing the command line, then using a command line flag wouldn't work (I don't know if this actually happens or not.)

@RandomShaper
Copy link
Member

Looks like it could. Do you want me to do it in the same PR?

That'd be great.


Well, poisoning may still catch something and it's not like an expensive operation anyway. It also doesn't interfere with other types of tracking (in theory).

No strong opinions here. Just wanted to discuss it a bit. Sounds good to keep it.


ASan requires a speciality build. Having command line flags that normally doesn't do anything doesn't seem useful. Unless you want to be able to bypass PagedAllocator even on a normal build? Still doesn't seem very useful.

The concern is that the option needs to be set before the very first allocation, and must not change during the lifetime of the process. If the allocator is used at all before or when parsing the command line, then using a command line flag wouldn't work (I don't know if this actually happens or not.)

Certain CLI arguments we already have are restricted to build types (--gpu-abort, --test). I don't think there would be an issue in adding an #ifdef ASAN_ENABLED-scoped additional option. The code that ends up defining ASAN_ENABLED would have to be moved to somewhere else (such as memory.h or typedefs.h, which has to happen anyway if addressing PagedArrayPool. I think the CLI args are parsed early enough to be in time to inform how the allocators work. In addition, a CLI switch would help with discoverability.

All this said, I'd like to hear more opinions. Let's summon @Calinou, for instance.

Forward allocations to ASan malloc directly so that they can be more
effectively tested by ASan. This can be disabled by setting the
environment variable `ALLOCATORS_DO_NOT_USE_ASAN_MALLOC=1`.

When disabled, PagedAllocator will manually poison unallocated chunks
instead. Though this is not very effective in detecting memory errors.
This is less of a straightforward bypass compared to PagedAllocator,
because we need to keep track of the allocated memory by its page_id to
be able to free the allocation.
@alvinhochun alvinhochun force-pushed the paged-allocator-asan-malloc branch from c1ed04e to 0fe14ca Compare July 29, 2024 14:05
@alvinhochun alvinhochun changed the title Bypass PagedAllocator by default for ASan build Bypass PagedAllocator/PagedArrayPool by default for ASan build Jul 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants