This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Jumpstub fixes #15296

Merged: jkotas merged 1 commit into dotnet:master on Dec 1, 2017

Conversation

jkotas (Member) commented Nov 30, 2017

This set of changes fixes #14995 and #14996 and improves the reliability of jump stub allocation. The key changes are:

  • Retry JITing on failure to allocate a jump stub. Failure to allocate a jump stub during JITing is no longer fatal. Extra memory is reserved for jump stubs on the retry, so that the retry succeeds in allocating the jump stubs it needs with high probability (see the sketch after this list).

  • Reserve space for jump stubs for precodes and other code fragments at the end of each code heap segment. This helps ensure that the eventual allocation of jump stubs for precodes and other code fragments succeeds. Accounting is done conservatively: it reserves more than strictly required, which wastes a bit of address space but no actual memory. Also, this reserve is not used to allocate jump stubs for JITed code, since JITing can now recover from a failure to allocate a jump stub.
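
The retry mechanism can be pictured roughly as follows. This is a minimal sketch, not the actual runtime code: `JitMethodWithRetry`, `TryJitMethod`, and the `CompilationResult` fields are hypothetical names standing in for the real driver logic.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for runtime types, for illustration only.
typedef uintptr_t PCODE;
struct MethodDesc;
struct CompilationResult
{
    bool   rel32Overflow;     // a jump stub could not be allocated
    PCODE  nativeCode;
    size_t requestedReserve;  // extra space to reserve on the next try
};
CompilationResult TryJitMethod(MethodDesc* pMD, size_t reserveForJumpStubs); // hypothetical

// Sketch: if the JIT cannot allocate a jump stub for a rel32 fixup, the
// compilation is abandoned and retried with extra address space reserved
// for jump stubs, so the retry succeeds with high probability instead of
// failing the process.
PCODE JitMethodWithRetry(MethodDesc* pMD) // hypothetical helper
{
    size_t reserveForJumpStubs = 0;
    for (;;)
    {
        CompilationResult result = TryJitMethod(pMD, reserveForJumpStubs);
        if (!result.rel32Overflow)
            return result.nativeCode;

        // Grow the reserve and recompile; the retried compilation also
        // switches to 64-bit addressing for data accesses.
        reserveForJumpStubs += result.requestedReserve;
    }
}
```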

My plan is to port these changes to .NET Framework as well since these issues affect important workloads there.

jkotas (Member Author) commented Nov 30, 2017

@dotnet-bot test Windows_NT x64 corefx_baseline
@dotnet-bot test Ubuntu x64 corefx_baseline

BruceForstall (Member) reviewed the diff:

@@ -721,11 +722,26 @@ INT32 rel32UsingJumpStub(INT32 UNALIGNED * pRel32, PCODE target, MethodDesc *pMe
TADDR hiAddr = baseAddr + INT32_MAX;
if (hiAddr < baseAddr) hiAddr = UINT64_MAX; // overflow

+// Always try to allocate with throwOnOutOfMemoryWithinRange=false first to conserve
+// reserved space untill when it is really needed

BruceForstall (Member):

Typo: untill

What is the benefit of trying with 'false' first here? It seems like the next try with 'true' would do the same thing, but also try to use the emergency reserve for jump stubs, and if that fails it would throw anyway, as before. I probably missed something.

jkotas (Member Author):

There was a missing piece of logic to conserve the reserved space in CanUseCodeHeap. Fixed, and added the needed comments to explain it; @vitek-karas was confused by this as well.

BruceForstall (Member):

I see, makes sense now, thanks!
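
For context, the allocation pattern under discussion has this shape. A sketch under stated assumptions: `throwOnOutOfMemoryWithinRange` is the real parameter from the diff above, while `AllocateJumpStubWithinRange`, `GetJumpStub`, and their parameters are illustrative stand-ins.

```cpp
#include <cstdint>

typedef uintptr_t PCODE; // hypothetical stand-in for the runtime type

// Hypothetical allocator modeled on rel32UsingJumpStub in the diff above.
PCODE AllocateJumpStubWithinRange(PCODE target, PCODE loAddr, PCODE hiAddr,
                                  bool throwOnOutOfMemoryWithinRange);

PCODE GetJumpStub(PCODE target, PCODE loAddr, PCODE hiAddr)
{
    // First attempt: do not throw, and (via the CanUseCodeHeap logic
    // mentioned above) do not touch the emergency reserve, so the
    // reserve is conserved until it is really needed.
    PCODE stub = AllocateJumpStubWithinRange(target, loAddr, hiAddr,
                                             false /* throwOnOutOfMemoryWithinRange */);
    if (stub == 0)
    {
        // Second attempt: may use the per-heap reserve, and throws if
        // even that fails, as before.
        stub = AllocateJumpStubWithinRange(target, loAddr, hiAddr,
                                           true /* throwOnOutOfMemoryWithinRange */);
    }
    return stub;
}
```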

@jkotas jkotas force-pushed the jumpstub-fixes branch 2 times, most recently from 9abbc29 to 02b9e3c on December 1, 2017 00:16
@@ -599,7 +599,7 @@ RETAIL_CONFIG_STRING_INFO(INTERNAL_WinMDPath, W("WinMDPath"), "Path for Windows
// Loader heap
//
CONFIG_DWORD_INFO_EX(INTERNAL_LoaderHeapCallTracing, W("LoaderHeapCallTracing"), 0, "Loader heap troubleshooting", CLRConfig::REGUTIL_default)
-RETAIL_CONFIG_DWORD_INFO(INTERNAL_CodeHeapReserveForJumpStubs, W("CodeHeapReserveForJumpStubs"), 2, "Percentage of code heap to reserve for jump stubs")
+RETAIL_CONFIG_DWORD_INFO(INTERNAL_CodeHeapReserveForJumpStubs, W("CodeHeapReserveForJumpStubs"), 1, "Percentage of code heap to reserve for jump stubs")

Member:

I assume we reduce this because JIT-required jump stubs are no longer covered by this number... so we don't need as much space anymore... right?

jkotas (Member Author):

Right.
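
For reference, the knob above is a percentage of the code heap size, so the per-segment reserve works out to roughly the following. A minimal sketch, assuming a simple percentage computation; the exact expression in the runtime may differ, and the helper name is hypothetical.

```cpp
#include <cstddef>

// 'percentReserve' stands for the CodeHeapReserveForJumpStubs value from
// the diff above (2 before this change, 1 after). Illustrative only.
size_t ComputeJumpStubReserve(size_t heapReserveSize, size_t percentReserve)
{
    // A percentage of the code heap's reserved size is set aside for
    // jump stubs; this costs address space, not committed memory.
    return (heapReserveSize * percentReserve) / 100;
}
```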

@@ -721,11 +722,26 @@ INT32 rel32UsingJumpStub(INT32 UNALIGNED * pRel32, PCODE target, MethodDesc *pMe
TADDR hiAddr = baseAddr + INT32_MAX;
if (hiAddr < baseAddr) hiAddr = UINT64_MAX; // overflow

+// Always try to allocate with throwOnOutOfMemoryWithinRange=false first to conserve
+// reserved space untill when it is really needed

Member:

Nit: "until" (single l)

@@ -721,11 +722,26 @@ INT32 rel32UsingJumpStub(INT32 UNALIGNED * pRel32, PCODE target, MethodDesc *pMe
TADDR hiAddr = baseAddr + INT32_MAX;
if (hiAddr < baseAddr) hiAddr = UINT64_MAX; // overflow

+// Always try to allocate with throwOnOutOfMemoryWithinRange=false first to conserve
+// reserved space untill when it is really needed

Member:

I would expand the comment to make it very explicit that we're talking about conserving JumpStubReserve (not a reserved VM space, which is what it sounds like to me now).

delta = rel32UsingJumpStub(fixupLocation, (PCODE)target, m_pMethodBeingCompiled, NULL, false /* throwOnOutOfMemoryWithinRange */);
if (delta == 0)
{
m_fRel32Overflow = TRUE;

Member:

I think this is worth a comment on what the effect of this will be. Something like "this forces the JIT to retry the method, which allows us to reserve more space for jump stubs and have a higher chance that we will find space for them". The "unintended" side-effect of this is also that we will recompile the method with "long-pointers" for all data accesses, even though it might not have been necessary otherwise. Worth a note, but I think it's OK to keep it that way: if we're already struggling with jump stubs, then making the method work at all is a victory; small perf gains like rel32 data accesses are fine to ignore.

jkotas (Member Author) commented Dec 1, 2017:

"unintended" side-effect of this is also that we will recompile the method with "long-pointers" for all data accesses

BTW: By the time we get here, we are already recompiling the method with "long-pointers". (It is problem 1 described in https://github.com/dotnet/coreclr/blob/master/Documentation/design-docs/jump-stubs.md#jump-stubs-and-the-jit .)

@@ -2101,11 +2123,18 @@ HeapList* LoaderCodeHeap::CreateCodeHeap(CodeHeapRequestInfo *pInfo, LoaderHeap
{
if (loAddr != NULL || hiAddr != NULL)
{
#ifdef _DEBUG
// Always exercise the fallback path with force relocs

Member:

Nit: I would word this "Always exercise the fallback path in the caller when forced relocs are turned on". (It took me a couple of minutes to realize what you meant by this.)

@@ -2675,49 +2664,6 @@ EEJitManager::DomainCodeHeapList *EEJitManager::GetCodeHeapList(CodeHeapRequestI
return pList;
}

HeapList* EEJitManager::GetCodeHeap(CodeHeapRequestInfo *pInfo)

Member:

Please remove mention of this method from the comment in codeman.cpp:565

// pCurrent is the first (and possibly only) heap that would satisfy
pResult = pCurrent;
}
// We use the initial creation size as a discriminator (i.e largest heap)

Member:

If I read it correctly, your change removed this behavior: we don't prefer the largest heap anymore, we simply take the first one available...
Is that intentional? There doesn't seem to be any reasoning in the code as to why we tried to get the largest heap before, so I don't know if removing that functionality is OK or not.

jkotas (Member Author):

Right. The code to prefer the largest heap was added on 2004/07/02 by @briansull. Brian, do you happen to remember why it was done?

briansull commented Dec 1, 2017:

No, I don't recall why.
Probably just so we would start using the new, largest heap for newly JITted code, leaving the older, mostly used-up heaps around in case we needed them for jump stubs.
Back then we didn't have all of the code to reserve extra space for jump stubs.
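
The selection change under discussion, pictured as a sketch (an illustrative loop, not the actual EEJitManager code; `FindCodeHeap` and the list shape are hypothetical, while `CanUseCodeHeap` is the helper named earlier in this review):

```cpp
#include <cstddef>

struct HeapList { HeapList* next; /* ... */ };            // simplified
bool CanUseCodeHeap(HeapList* pHeap, size_t requestSize); // see thread above

// After this change: take the first heap that can satisfy the request.
// Before: all heaps were scanned and the one with the largest initial
// creation size was preferred.
HeapList* FindCodeHeap(HeapList* pFirst, size_t requestSize) // hypothetical
{
    for (HeapList* pCurrent = pFirst; pCurrent != NULL; pCurrent = pCurrent->next)
    {
        if (CanUseCodeHeap(pCurrent, requestSize))
            return pCurrent; // first fit
    }
    return NULL;             // caller creates a new code heap
}
```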

#endif

size_t nibbleMapSize = HEAP2MAPSIZE(ROUND_UP_TO_PAGE(pHp->maxCodeHeapSize));
pHp->pHdrMap = new DWORD[nibbleMapSize / sizeof(DWORD)];

Member:

Is it OK to just new up the nibble map memory here?
I totally agree that we should not allocate the nibble map from the code heap itself (that is executable memory, so we should not use it for data); I'm just wondering if we should allocate the nibble map from one of the other heaps (the low-frequency heap, maybe)?

jkotas (Member Author):

I do not see a problem with it.

We need to be able to free these dynamic method heaps independently. Allocating on LoaderHeaps (like the low-frequency heap) would not allow us to do that.
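
The lifetime pairing that makes this work, sketched below; the struct is a heavily simplified, hypothetical stand-in for HostCodeHeap, not its real layout.

```cpp
#include <cstdint>

typedef uint32_t DWORD; // stand-in for the Windows typedef

struct HostCodeHeapSketch // hypothetical, for illustration only
{
    DWORD* pHdrMap = nullptr; // nibble map: plain data memory, not executable

    ~HostCodeHeapSketch()
    {
        // Because the nibble map is an ordinary new[] allocation, it is
        // freed together with its owning heap, keeping each dynamic
        // method heap independently destroyable. A LoaderHeap
        // suballocation would instead live as long as its loader
        // allocator.
        delete[] pHdrMap;
    }
};
```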

}
else
{
LOG((LF_BCL, LL_INFO100, "Level2 - CodeHeap [0x%p] - allocation failed:\n\tm_pLastAvailableCommittedAddr: 0x%X\n\tsizeToCommit: 0x%X\n\tm_pBaseAddr: 0x%X\n\tm_TotalBytesAvailable: 0x%X\n", this, m_pLastAvailableCommittedAddr, sizeToCommit, m_pBaseAddr, m_TotalBytesAvailable));
return NULL;
// Update largest availble block size

Member:

Spelling: "availble" → "available"

jkotas (Member Author) commented Dec 1, 2017

> Does it make sense to also update https://github.com/dotnet/coreclr/blob/master/Documentation/design-docs/jump-stubs.md?

Updated to match the current state.

jkotas (Member Author) commented Dec 1, 2017

@vitek-karas Thanks for the review - all feedback addressed.

- Reserve space for jump stubs for precodes and other code fragments at the end of each code heap segment. This helps
ensure that the eventual allocation of jump stubs for precodes and other code fragments succeeds. Accounting is done
conservatively: it reserves more than strictly required. This wastes a bit of address space, but no actual memory.
Also, this reserve is not used to allocate jump stubs for JITed code, since JITing can now recover from a failure to
allocate a jump stub. Fixes #14996.

- Improve the algorithm to reuse HostCodeHeap segments: maintain an estimated size of the largest free block in each
HostCodeHeap. The estimate is updated when an allocation request fails, and also when memory is returned to the
HostCodeHeap (see the sketch after this list). Fixes #14995.

- Retry JITing on failure to allocate a jump stub. Failure to allocate a jump stub during JITing is not fatal anymore.
There is extra memory reserved for jump stubs on retry to ensure that the retry succeeds in allocating the jump stubs
it needs with high probability.

- Respect CodeHeapRequestInfo::getRequestSize for HostCodeHeap. CodeHeapRequestInfo::getRequestSize is used to
throttle code heap segment size for large workloads. Not respecting it in HostCodeHeap led to too many
too-small code heap segments in large workloads.

- Switch the HostCodeHeap nibble map to be allocated on the regular heap. This simplified the math required to estimate
the nibble map size, and allocating on the regular heap is overall goodness since the map does not need to be executable.
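
A sketch of the "estimated largest free block" bookkeeping from the second bullet above; the type, field, and method names are illustrative, not the actual HostCodeHeap members.

```cpp
#include <cstddef>

struct FreeBlockEstimate // hypothetical bookkeeping helper
{
    size_t estLargestFreeBlock; // estimated size of the largest free block

    // Starts at the heap's total free size when the heap is created.
    explicit FreeBlockEstimate(size_t initialFreeSize)
        : estLargestFreeBlock(initialFreeSize) {}

    // A request of this size just failed, so the largest free block must
    // be smaller than requestSize: lower the estimate.
    void OnAllocationFailed(size_t requestSize)
    {
        if (requestSize < estLargestFreeBlock)
            estLargestFreeBlock = requestSize;
    }

    // Memory was returned to the heap, so there is now a free block at
    // least this large (possibly larger after coalescing): raise the
    // estimate accordingly.
    void OnMemoryReturned(size_t freedSize)
    {
        if (freedSize > estLargestFreeBlock)
            estLargestFreeBlock = freedSize;
    }

    // Lets the allocator cheaply skip heaps where the request cannot fit,
    // instead of repeatedly scanning them and failing.
    bool MightFit(size_t requestSize) const
    {
        return requestSize <= estLargestFreeBlock;
    }
};
```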
@jkotas jkotas merged commit c1e44d9 into dotnet:master Dec 1, 2017
@jkotas jkotas deleted the jumpstub-fixes branch December 1, 2017 07:31