Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement custom memory manager for LLVM #16777

Merged
merged 2 commits into from
Jun 15, 2016
Merged

Implement custom memory manager for LLVM #16777

merged 2 commits into from
Jun 15, 2016

Conversation

yuyichao
Copy link
Contributor

@yuyichao yuyichao commented Jun 6, 2016

Use various ways to reuse the page that has the page protection set in order to avoid wasting a page of memory for each JIT event.

Fixes #14626

Tested on Linux with LLVM 3.7.1 and 3.8 with OrcJIT.
This reduces the memory used in the subarray test by ~100MB (There are around 25k JIT events each page protects at least one page of memory even though the total allocation is only around 10-20MB.)

Should in principle work on OSX, using the mkstemp tmpfile fallback code path.

Does not work on windows yet due to how we handle windows unwinding (we patch the code section after the code is generated, causing a segfault since the load address is never writable). It might be enough to modify those code to write to local address instead of load address. @Keno also mentioned some way to fix this properly. (This is what the request for help is mainly for....) Fixed

@yuyichao yuyichao added the compiler:codegen Generation of LLVM IR and native code label Jun 6, 2016
@yuyichao
Copy link
Contributor Author

yuyichao commented Jun 6, 2016

Another note is that LLVM 3.7.1 seems to generate corrupted eh_frame when using Keno's remote memory manager. Not sure why that happens but my current suspicion is the reuse of local address. Fixed

@vtjnash
Copy link
Member

vtjnash commented Jun 6, 2016

also mentioned some way to fix this properly. (This is what the request for help is mainly for....)

at present, this has been blocked on the need to write our own custom memory manager (specifically, one that supports the small memory model required by ntdll on win64 for all dynamic libraries), and then patch the LLVM runtime relocation code to add support for the ADDR32NB relocation type.

@yuyichao
Copy link
Contributor Author

yuyichao commented Jun 6, 2016

specifically, one that supports the small memory model required by ntdll on win64 for all dynamic libraries

What's the requirement? All sections to be smaller and within 4G to each other? The current custom memory manager doesn't do that yet but it's not that hard to add (especially if we can use the reserve function).

@vtjnash
Copy link
Member

vtjnash commented Jun 6, 2016

yes, and consecutively allocated (images must be non-overlapping and can't be interleaved)

@yuyichao
Copy link
Contributor Author

yuyichao commented Jun 6, 2016

(images must be non-overlapping and can't be interleaved)

Doesn't this require the current allocation scheme? The page' where we have allocations in basically can't be reused anymore. (edit: or no W^X)

@yuyichao
Copy link
Contributor Author

yuyichao commented Jun 6, 2016

Or maybe we can waste virtual memory but not physical memory?

@vtjnash
Copy link
Member

vtjnash commented Jun 6, 2016

I'm not sure. It appears that the actual requirements are undocumented and considered to be an implementation detail. We may be able to get away with some lying, since we are JIT and not statically linked. But unfortunately, there's no way to verify the source code of ntdll to learn it's expectations.

@Keno
Copy link
Member

Keno commented Jun 6, 2016

Why would there be an expectation that the sections are non-interleaved?

@vtjnash
Copy link
Member

vtjnash commented Jun 6, 2016

Actually, maybe only dbghelp cares (https://msdn.microsoft.com/en-us/library/windows/desktop/ms681353(v=vs.85).aspx), so perhaps it would only cause problems for WinDbg, but not ntdll.

@yuyichao yuyichao force-pushed the yyc/codegen/memmgr branch 10 times, most recently from b9a83ae to c63e4ae Compare June 6, 2016 20:14
@yuyichao yuyichao force-pushed the yyc/codegen/memmgr branch 9 times, most recently from 7159938 to e855467 Compare June 7, 2016 00:29
@@ -129,8 +131,17 @@ static void addOptimizationPasses(T *PM)
//PM->add(createCFGSimplificationPass()); // Merge & remove BBs
}

#ifdef USE_MCJIT
RTDyldMemoryManager* createRTDyldMemoryManager(void);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this to codegen_internal.h as well.

@yuyichao yuyichao force-pushed the yyc/codegen/memmgr branch 2 times, most recently from 73dbc4f to 07ea952 Compare June 10, 2016 00:10
@yuyichao
Copy link
Contributor Author

I think I addressed all the comments apart from the one about cache flushing. I was hoping that mprotect will flush the write to the memory and calling the LLVM helper will flush the instruction cache for the runtime address. Not sure what's still missing.

@yuyichao yuyichao force-pushed the yyc/codegen/memmgr branch from 07ea952 to 6d79549 Compare June 10, 2016 18:54
yuyichao added 2 commits June 11, 2016 11:35
Use various ways to reuse the page that has the page protection set in order
to avoid wasting a page of memory for each JIT event.

Fixes #14626
@yuyichao yuyichao force-pushed the yyc/codegen/memmgr branch from 6d79549 to 8533a1c Compare June 11, 2016 15:43
@Keno
Copy link
Member

Keno commented Jun 15, 2016

@yuyichao I think this should be merged. Will you do the honor?

@yuyichao
Copy link
Contributor Author

Sure, I was just waiting for one of you to merge it ;-p

@yuyichao yuyichao merged commit b029f50 into master Jun 15, 2016
@yuyichao yuyichao deleted the yyc/codegen/memmgr branch June 15, 2016 18:26
@Keno
Copy link
Member

Keno commented Jun 15, 2016

This may be triggering a compiler bug in gcc 4.8, complaining that the copy constructor of Block was deleted (even though the use in question should have been a move constructor). We may have to comment out that deletion for the GCC 4.8 series.

@yuyichao
Copy link
Contributor Author

Where does it need a copy constructor for Block (whether move or not)? temp_buff.emplace_back()?

@tkelman
Copy link
Contributor

tkelman commented Jun 15, 2016

See buildbot log if it's more useful - https://build.julialang.org/builders/build_ubuntu14.04-x64/builds/940/steps/shell_1/logs/stdio
also quick to reproduce / test things since it only takes a couple minutes to hit.

@Keno
Copy link
Member

Keno commented Jun 15, 2016

SmallVector.

@yuyichao
Copy link
Contributor Author

That one is supposed to use a placement new with the default constructor... I guess the LLVM 3.7 SmallVector implementation might be constructing a Block and use the placement new to move..... GCC 6.1.1 didn't complain though......

@Keno
Copy link
Member

Keno commented Jun 15, 2016

What default constructor would it be using? How would it get the Block into its storage without using the move constructor?

@yuyichao
Copy link
Contributor Author

new (ptr) Block()? The default constructor with no argument is provided.

@Keno
Copy link
Member

Keno commented Jun 15, 2016

So you were relying on that particular function not being instantiated because you're not using it?

@yuyichao
Copy link
Contributor Author

And I want to make sure I don't copy/move it accidentally since it can cause memory leak or memory free'd at the wrong time.

@Keno
Copy link
Member

Keno commented Jun 15, 2016

Make the move constructor assert(false)?

@yuyichao
Copy link
Contributor Author

Actually moving construct is always fine, moving assignment is not and is deleted. I've updated #16950 to implement the moving constructor properly.


static int init_self_mem()
{
int fd = open("/proc/self/mem", O_RDWR | O_SYNC | O_CLOEXEC);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That kernel is actually too old for this method to work although I guess we should still compile this method for this to be included in the generic binary.... Should be fixed in 3f4e1e7

@JeffBezanson
Copy link
Member

BTW, the memory savings from this is absolutely great. Awesome work @yuyichao !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:codegen Generation of LLVM IR and native code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Approaches for avoiding fragmentation in code memory allocation
5 participants