Commit
[XLA:CPU] Fix random crashes on Windows due to out-of-order RuntimeDyld sections.

On Windows, the JAX test suite has random crashes that turn out to come from the LLVM fatal error "IMAGE_REL_AMD64_ADDR32NB relocation requires an ordered section layout", which originates from the LLVM runtime dyld code: https://github.com/llvm/llvm-project/blob/43d05f4cc959371a93000aef187dab576380b851/llvm/lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFX86_64.h#L117

It appears that LLVM may emit IMAGE_REL_AMD64_ADDR32NB COFF relocations when referring to read-only data. However, IMAGE_REL_AMD64_ADDR32NB relocations require that the read-only data section follow within 2GB after the code. Oddly enough, nothing in the LLVM SectionMemoryManager enforces this (llvm/llvm-project#55386)! It appears TF/XLA users have also hit this in the past (tensorflow/tensorflow#56207).

SectionMemoryManager maps each code and data section individually. On Linux, mmap() receives a "near" hint that indicates where the memory block should be located. On Windows, the same hint is passed to VirtualAlloc() (https://github.com/llvm/llvm-project/blob/43d05f4cc959371a93000aef187dab576380b851/llvm/lib/Support/Windows/Memory.inc#L133). However, if the allocation fails (e.g., because that virtual address is already in use), LLVM falls back to allocating without the "near" hint. This does not work with IMAGE_REL_AMD64_ADDR32NB relocations, where the order of the sections is mandatory.

Since none of the memory managers in the LLVM tree obey the necessary ordering constraints, we need to roll our own. Instead of mapping each section separately, on all platforms we map() one large block of memory and suballocate it for each section. This is easy enough to do because of the llvm::RuntimeDyld::MemoryManager::reserveAllocationSpace() hook, which ensures that LLVM will tell us ahead of time the total sizes of all the relevant sections. We also know that XLA isn't going to do any more complicated memory management: we allocate the sections once and we are done.

PiperOrigin-RevId: 563218997
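The reserve-then-suballocate idea described above can be sketched in a few lines of standalone C++. This is only an illustration, not the code in this commit: the class name ContiguousSectionArena, its Reserve() and Allocate() methods, and the 4096-byte region alignment are all hypothetical, and a heap buffer stands in for the mmap()/VirtualAlloc() mapping and page-permission changes a real memory manager would need. The point is that once the total sizes are known up front, the code, read-only data, and read-write data regions can be laid out in a fixed order inside one block, so read-only data always follows the code at a small, known distance.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical sketch: one contiguous block, laid out as code | rodata | rwdata.
// Not the XLA or LLVM API; a heap buffer stands in for a real memory mapping.
class ContiguousSectionArena {
 public:
  enum Kind { kCode = 0, kRoData = 1, kRwData = 2 };

  // Plays the role of a reserveAllocationSpace()-style hook: called once,
  // before any section is allocated, with the total bytes needed per kind.
  void Reserve(std::size_t code_size, std::size_t ro_size, std::size_t rw_size) {
    const std::size_t sizes[3] = {code_size, ro_size, rw_size};
    std::size_t offset = 0;
    for (int k = 0; k < 3; ++k) {
      begin_[k] = next_[k] = offset;
      // Each region ends on a kRegionAlign boundary (assumed page size).
      offset = AlignUp(offset + sizes[k], kRegionAlign);
      end_[k] = offset;
    }
    // Over-allocate so the base address itself can be aligned.
    storage_ = std::make_unique<std::uint8_t[]>(offset + kRegionAlign);
    base_ = reinterpret_cast<std::uint8_t*>(AlignUp(
        reinterpret_cast<std::uintptr_t>(storage_.get()), kRegionAlign));
  }

  // Bump-allocates one section inside the region for `kind`.
  // `align` must be a power of two. Returns nullptr if the region is full
  // or Reserve() has not been called.
  std::uint8_t* Allocate(Kind kind, std::size_t size, std::size_t align) {
    if (base_ == nullptr) return nullptr;
    const std::size_t start = AlignUp(next_[kind], align);
    if (start > end_[kind] || size > end_[kind] - start) return nullptr;
    next_[kind] = start + size;
    return base_ + start;
  }

 private:
  static constexpr std::size_t kRegionAlign = 4096;

  // Rounds v up to a multiple of a; a must be a power of two.
  static std::uintptr_t AlignUp(std::uintptr_t v, std::uintptr_t a) {
    return (v + a - 1) & ~(a - 1);
  }

  std::unique_ptr<std::uint8_t[]> storage_;
  std::uint8_t* base_ = nullptr;
  std::size_t begin_[3] = {0, 0, 0};
  std::size_t next_[3] = {0, 0, 0};
  std::size_t end_[3] = {0, 0, 0};
};
```

A caller would invoke Reserve() once with the totals and then request each section through Allocate(); because every section is carved from the same block, no fallback allocation can place read-only data before or far away from the code.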
1 parent 32af89f, commit cb732a9
Showing 1 changed file with 202 additions and 1 deletion.