Jumpstub fixes
- Reserve space for jump stubs for precodes and other code fragments at the end of each code heap segment. This helps
ensure that the eventual allocation of jump stubs for precodes and other code fragments succeeds. The accounting is
conservative (it reserves more than strictly required), so it wastes a bit of address space, but no actual memory.
This reserve is not used to allocate jump stubs for JITed code, since JITing can now recover from a failure to
allocate a jump stub. Fixes #14996.

- Improve the algorithm for reusing HostCodeHeap segments: maintain an estimate of the size of the largest free block
in each HostCodeHeap. The estimate is updated when an allocation request fails and when memory is returned to the
HostCodeHeap (see the sketch below). Fixes #14995.

- Retry JITing when a jump stub cannot be allocated. Failure to allocate a jump stub during JITing is no longer fatal:
the retry reserves extra memory for jump stubs so that, with high probability, it succeeds in allocating the jump
stubs it needs.

- Respect CodeHeapRequestInfo::getRequestSize for HostCodeHeap. CodeHeapRequestInfo::getRequestSize is used to
throttle code heap segment size for large workloads; ignoring it in HostCodeHeap led to too many undersized code
heap segments in large workloads.

- Switch the HostCodeHeap nibble map to be allocated on the regular heap. This simplifies the math required to
estimate the nibble map size, and allocating it on the regular heap is an overall improvement since the nibble map
does not need to be executable.
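
A minimal sketch of the largest-free-block estimate mentioned in the second bullet; all names here are hypothetical
and the real HostCodeHeap bookkeeping is more involved:

// Sketch only: track a rough estimate of the largest free block so that
// allocation can skip heaps that are unlikely to satisfy a request.
#include <cstddef>

struct HostCodeHeapSketch
{
    size_t cbLargestFreeBlockEstimate = 0;

    // An allocation of cbRequest bytes just failed, so no free block of that
    // size exists right now; lower the estimate accordingly.
    void OnAllocationFailed(size_t cbRequest)
    {
        if (cbRequest < cbLargestFreeBlockEstimate)
            cbLargestFreeBlockEstimate = cbRequest;
    }

    // A block of cbFreed bytes was returned to the heap; the largest free
    // block is now at least roughly that big (coalescing may make it bigger).
    void OnMemoryReturned(size_t cbFreed)
    {
        if (cbFreed > cbLargestFreeBlockEstimate)
            cbLargestFreeBlockEstimate = cbFreed;
    }

    // Only bother scanning this heap if the estimate suggests the request fits.
    bool MayFit(size_t cbRequest) const
    {
        return cbRequest <= cbLargestFreeBlockEstimate;
    }
};
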
jkotas committed Dec 1, 2017
1 parent 1344d8e commit 5d0a5a8
Showing 14 changed files with 425 additions and 460 deletions.
27 changes: 8 additions & 19 deletions Documentation/design-docs/jump-stubs.md
@@ -188,8 +188,9 @@ still reach their intended target with a rel32 offset, so jump stubs are
 not expected to be required in most cases.
 
 If this attempt to create a jump stub fails, then the generated code
-cannot be used, and we hit a fatal error; we have no mechanism currently
-to recover from this failure, or to prevent it.
+cannot be used, and the VM restarts the compilation with reserving
+extra space in the code heap for jump stubs. The reserved extra space
+ensures that the retry succeeds with high probability.
 
 There are several problems with this system:
 1. Because the VM doesn't know whether a `IMAGE_REL_BASED_REL32`
@@ -205,8 +206,6 @@ code because the JIT generates `IMAGE_REL_BASED_REL32` relocations for
 intra-function jumps and calls that it expects and, in fact, requires,
 not be replaced with jump stubs, because it doesn't expect the register
 used by jump stubs (RAX) to be trashed.
-3. We don't have any mechanism to recover if a jump stub can't be
-allocated.
 
 In the NGEN case, rel32 calls are guaranteed to always reach, as PE
 image files are limited to 2GB in size, meaning a rel32 offset is
@@ -217,8 +216,8 @@ jump stubs, as described later.
 
 ### Failure mitigation
 
-There are several possible mitigations for JIT failure to allocate jump
-stubs.
+There are several possible alternative mitigations for JIT failure to
+allocate jump stubs.
 1. When we get into "rel32 overflow" mode, the JIT could always generate
 large calls, and never generate rel32 offsets. This is obviously
 somewhat expensive, as every external call, such as every call to a JIT
@@ -469,19 +468,9 @@ bytes allocated, to reserve space for one jump stub per FixupPrecode in
 the chunk. When the FixupPrecode is patched, for LCG methods it will use
 the pre-allocated space if a jump stub is required.
 
-For the non-LCG, non-FixupPrecode cases, we need a different solution.
-It would be easy to similarly allocate additional space for each type of
-precode with the precode itself. This might prove expensive. An
-alternative would be to ensure, by design, that somehow shared jump stub
-space is available, perhaps by reserving it in a shared area when the
-precode is allocated, and falling back to a mechanism where the precode
-reserves its own jump stub space if shared jump stub space cannot be
-allocated.
-
-A possibly better implementation would be to reserve, but not allocate,
-jump stub space at the end of the code heap, similar to how
-CodeHeapReserveForJumpStubs works, but instead the reserve amount should
-be computed precisely.
+For non-LCG, we are reserving, but not allocating, a space at the end
+of the code heap. This is similar and in addition to the reservation done by
+COMPlus_CodeHeapReserveForJumpStubs. (See https://github.com/dotnet/coreclr/pull/15296).
 
 ## Ready2Run
 
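
The "reserve, but do not allocate" idea above can be pictured with a small sketch. This is illustrative only and uses
made-up names; in the actual change the per-heap bookkeeping is the reserveForJumpStubs field that shows up in the
FakeHeapList diff below:

// Sketch: keep a tail reserve in each code heap segment that only jump stubs
// for precodes and similar fragments may consume; regular code allocation
// stops short of it. Names are hypothetical.
#include <cstddef>
#include <cstdint>

struct CodeHeapSegmentSketch
{
    uint8_t* current;              // next free byte
    uint8_t* end;                  // end of the segment's address range
    size_t   reserveForJumpStubs;  // tail bytes kept back for jump stubs
};

// Regular (JIT/precode) allocations must leave the tail reserve untouched.
void* AllocCode(CodeHeapSegmentSketch& seg, size_t size)
{
    uint8_t* limit = seg.end - seg.reserveForJumpStubs;
    size_t available = (seg.current < limit) ? (size_t)(limit - seg.current) : 0;
    if (size > available)
        return nullptr;            // caller grows the heap or picks another segment
    void* p = seg.current;
    seg.current += size;
    return p;
}

// Jump-stub allocations may dip into the reserve, so a stub can still be placed
// within rel32 range of the code that needs it.
void* AllocJumpStub(CodeHeapSegmentSketch& seg, size_t size)
{
    size_t available = (size_t)(seg.end - seg.current);
    if (size > available)
        return nullptr;
    void* p = seg.current;
    seg.current += size;
    return p;
}
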
4 changes: 1 addition & 3 deletions src/debug/daccess/fntableaccess.h
@@ -41,9 +41,7 @@ struct FakeHeapList
     DWORD_PTR mapBase; // changed from PBYTE
     DWORD_PTR pHdrMap; // changed from DWORD*
     size_t maxCodeHeapSize;
-    DWORD cBlocks;
-    bool bFull; // Heap is considered full do not use for new allocations
-    bool bFullForJumpStubs; // Heap is considered full do not use for new allocations of jump stubs
+    size_t reserveForJumpStubs;
 };
 
 typedef struct _FakeHpRealCodeHdr
2 changes: 1 addition & 1 deletion src/inc/clrconfigvalues.h
@@ -599,7 +599,7 @@ RETAIL_CONFIG_STRING_INFO(INTERNAL_WinMDPath, W("WinMDPath"), "Path for Windows
 // Loader heap
 //
 CONFIG_DWORD_INFO_EX(INTERNAL_LoaderHeapCallTracing, W("LoaderHeapCallTracing"), 0, "Loader heap troubleshooting", CLRConfig::REGUTIL_default)
-RETAIL_CONFIG_DWORD_INFO(INTERNAL_CodeHeapReserveForJumpStubs, W("CodeHeapReserveForJumpStubs"), 2, "Percentage of code heap to reserve for jump stubs")
+RETAIL_CONFIG_DWORD_INFO(INTERNAL_CodeHeapReserveForJumpStubs, W("CodeHeapReserveForJumpStubs"), 1, "Percentage of code heap to reserve for jump stubs")
 RETAIL_CONFIG_DWORD_INFO(INTERNAL_NGenReserveForJumpStubs, W("NGenReserveForJumpStubs"), 0, "Percentage of ngen image size to reserve for jump stubs")
 RETAIL_CONFIG_DWORD_INFO(INTERNAL_BreakOnOutOfMemoryWithinRange, W("BreakOnOutOfMemoryWithinRange"), 0, "Break before out of memory within range exception is thrown")
 
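
Since the knob is a percentage, the reserve scales with the segment size. A hypothetical back-of-the-envelope
computation, not the actual EEJitManager code:

#include <cstddef>

// Illustrative only: convert the CodeHeapReserveForJumpStubs percentage into a
// byte count for a code heap segment of a given size.
static size_t ComputeJumpStubReserve(size_t segmentSize, size_t reservePercent)
{
    return (segmentSize * reservePercent) / 100;
}

// Example: a 64 MB segment with the new default of 1% keeps back roughly 655 KB
// of address space for jump stubs (the old default of 2% kept back about twice that).
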
6 changes: 3 additions & 3 deletions src/inc/loaderheap.h
@@ -417,7 +417,7 @@ class UnlockedLoaderHeap
 #endif
 
 protected:
-    void *UnlockedAllocMemForCode_NoThrow(size_t dwHeaderSize, size_t dwCodeSize, DWORD dwCodeAlignment);
+    void *UnlockedAllocMemForCode_NoThrow(size_t dwHeaderSize, size_t dwCodeSize, DWORD dwCodeAlignment, size_t dwReserveForJumpStubs);
 
     void UnlockedSetReservedRegion(BYTE* dwReservedRegionAddress, SIZE_T dwReservedRegionSize, BOOL fReleaseMemory);
 };
@@ -838,10 +838,10 @@ class ExplicitControlLoaderHeap : public UnlockedLoaderHeap
 
 
 public:
-    void *AllocMemForCode_NoThrow(size_t dwHeaderSize, size_t dwCodeSize, DWORD dwCodeAlignment)
+    void *AllocMemForCode_NoThrow(size_t dwHeaderSize, size_t dwCodeSize, DWORD dwCodeAlignment, size_t dwReserveForJumpStubs)
     {
         WRAPPER_NO_CONTRACT;
-        return UnlockedAllocMemForCode_NoThrow(dwHeaderSize, dwCodeSize, dwCodeAlignment);
+        return UnlockedAllocMemForCode_NoThrow(dwHeaderSize, dwCodeSize, dwCodeAlignment, dwReserveForJumpStubs);
     }
 
     void SetReservedRegion(BYTE* dwReservedRegionAddress, SIZE_T dwReservedRegionSize, BOOL fReleaseMemory)
4 changes: 2 additions & 2 deletions src/utilcode/loaderheap.cpp
@@ -1731,7 +1731,7 @@ void *UnlockedLoaderHeap::UnlockedAllocAlignedMem(size_t dwRequestedSize,
 
 
 
-void *UnlockedLoaderHeap::UnlockedAllocMemForCode_NoThrow(size_t dwHeaderSize, size_t dwCodeSize, DWORD dwCodeAlignment)
+void *UnlockedLoaderHeap::UnlockedAllocMemForCode_NoThrow(size_t dwHeaderSize, size_t dwCodeSize, DWORD dwCodeAlignment, size_t dwReserveForJumpStubs)
 {
     CONTRACT(void*)
     {
@@ -1753,7 +1753,7 @@ void *UnlockedLoaderHeap::UnlockedAllocMemForCode_NoThrow(size_t dwHeaderSize, s
     //
     // Thus, we'll request as much heap growth as is needed for the worst case (we request an extra dwCodeAlignment - 1 bytes)
 
-    S_SIZE_T cbAllocSize = S_SIZE_T(dwHeaderSize) + S_SIZE_T(dwCodeSize) + S_SIZE_T(dwCodeAlignment - 1);
+    S_SIZE_T cbAllocSize = S_SIZE_T(dwHeaderSize) + S_SIZE_T(dwCodeSize) + S_SIZE_T(dwCodeAlignment - 1) + S_SIZE_T(dwReserveForJumpStubs);
     if( cbAllocSize.IsOverflow() )
     {
         RETURN NULL;
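
The S_SIZE_T arithmetic above is an overflow-checked sum. A standalone sketch of the same check in plain C++, for
illustration only (the real code uses the safe-integer S_SIZE_T type):

#include <cstddef>
#include <limits>

// Returns false instead of wrapping around if the total would overflow size_t,
// mirroring the IsOverflow()/RETURN NULL path above. Assumes codeAlignment >= 1.
static bool ComputeAllocSize(size_t headerSize, size_t codeSize, size_t codeAlignment,
                             size_t reserveForJumpStubs, size_t* total)
{
    const size_t parts[] = { headerSize, codeSize, codeAlignment - 1, reserveForJumpStubs };
    size_t sum = 0;
    for (size_t part : parts)
    {
        if (part > std::numeric_limits<size_t>::max() - sum)
            return false;   // overflow
        sum += part;
    }
    *total = sum;
    return true;
}
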
25 changes: 23 additions & 2 deletions src/vm/amd64/cgenamd64.cpp
@@ -692,7 +692,8 @@ UMEntryThunk* UMEntryThunk::Decode(LPVOID pCallback)
     return (UMEntryThunk*)pThunkCode->m_uet;
 }
 
-INT32 rel32UsingJumpStub(INT32 UNALIGNED * pRel32, PCODE target, MethodDesc *pMethod, LoaderAllocator *pLoaderAllocator /* = NULL */)
+INT32 rel32UsingJumpStub(INT32 UNALIGNED * pRel32, PCODE target, MethodDesc *pMethod,
+    LoaderAllocator *pLoaderAllocator /* = NULL */, bool throwOnOutOfMemoryWithinRange /*= true*/)
 {
     CONTRACTL
     {
@@ -721,11 +722,31 @@ INT32 rel32UsingJumpStub(INT32 UNALIGNED * pRel32, PCODE target, MethodDesc *pMe
     TADDR hiAddr = baseAddr + INT32_MAX;
     if (hiAddr < baseAddr) hiAddr = UINT64_MAX; // overflow
 
+    // Always try to allocate with throwOnOutOfMemoryWithinRange:false first to conserve reserveForJumpStubs until when
+    // it is really needed. LoaderCodeHeap::CreateCodeHeap and EEJitManager::CanUseCodeHeap won't use the reserved
+    // space when throwOnOutOfMemoryWithinRange is false.
+    //
+    // The reserved space should be only used by jump stubs for precodes and other similar code fragments. It should
+    // not be used by JITed code. And since the accounting of the reserved space is not precise, we are conservative
+    // and try to save the reserved space until it is really needed to avoid throwing out of memory within range exception.
     PCODE jumpStubAddr = ExecutionManager::jumpStub(pMethod,
                                                     target,
                                                     (BYTE *)loAddr,
                                                     (BYTE *)hiAddr,
-                                                    pLoaderAllocator);
+                                                    pLoaderAllocator,
+                                                    /* throwOnOutOfMemoryWithinRange */ false);
+    if (jumpStubAddr == NULL)
+    {
+        if (!throwOnOutOfMemoryWithinRange)
+            return 0;
+
+        jumpStubAddr = ExecutionManager::jumpStub(pMethod,
+                                                  target,
+                                                  (BYTE *)loAddr,
+                                                  (BYTE *)hiAddr,
+                                                  pLoaderAllocator,
+                                                  /* throwOnOutOfMemoryWithinRange */ true);
+    }
 
     offset = jumpStubAddr - baseAddr;
 
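
The [loAddr, hiAddr] window above encodes the standard x86-64 rel32 reachability check; a jump stub (and, first, the
no-throw attempt that avoids dipping into the reserve) is only needed when the real target falls outside it. A small
sketch of that check, with an illustrative helper name:

#include <cstdint>
#include <limits>

// baseAddr is the address of the byte immediately after the rel32 field, i.e. the
// value the CPU adds the signed 32-bit displacement to.
static bool FitsInRel32(uint64_t baseAddr, uint64_t target)
{
    int64_t delta = (int64_t)(target - baseAddr);
    return delta >= std::numeric_limits<int32_t>::min() &&
           delta <= std::numeric_limits<int32_t>::max();
}
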
3 changes: 2 additions & 1 deletion src/vm/amd64/cgencpu.h
@@ -379,7 +379,8 @@ void EncodeLoadAndJumpThunk (LPBYTE pBuffer, LPVOID pv, LPVOID pTarget);
 
 
 // Get Rel32 destination, emit jumpStub if necessary
-INT32 rel32UsingJumpStub(INT32 UNALIGNED * pRel32, PCODE target, MethodDesc *pMethod, LoaderAllocator *pLoaderAllocator = NULL);
+INT32 rel32UsingJumpStub(INT32 UNALIGNED * pRel32, PCODE target, MethodDesc *pMethod,
+    LoaderAllocator *pLoaderAllocator = NULL, bool throwOnOutOfMemoryWithinRange = true);
 
 // Get Rel32 destination, emit jumpStub if necessary into a preallocated location
 INT32 rel32UsingPreallocatedJumpStub(INT32 UNALIGNED * pRel32, PCODE target, PCODE jumpStubAddr);