Fix dlmalloc for allocations bigger than 2GB #18055

miwelc · 2022-10-13T17:20:16Z

Background of this PR: #17747

miwelc · 2022-10-13T17:24:22Z

I've introduced unwanted changes due to whitespace removal. I'll fix it.

sbc100 · 2022-10-13T17:38:04Z

system/lib/dlmalloc.c

@@ -1678,7 +1678,12 @@ extern size_t getpagesize();
 #define TWO_SIZE_T_SIZES    (SIZE_T_SIZE<<1)
 #define FOUR_SIZE_T_SIZES   (SIZE_T_SIZE<<2)
 #define SIX_SIZE_T_SIZES    (FOUR_SIZE_T_SIZES+TWO_SIZE_T_SIZES)
+#if __EMSCRIPTEN__
+/* Emscripten's sbrk can interpret unsigned values greater than (MAX_SIZE_T / 2U) (2GB) correctly */
+#define HALF_MAX_SIZE_T     (MAX_SIZE_T)


But clearly the value of this macro is now incorrect, and dlmalloc was not designed to have HALF_MAX_SIZE_T == MAX_SIZE_T on any system.

Does does dlmalloc deal with requests for memory that are larger than HALF_MAX_SIZE_T on other systems? Shouldn't it just call sbrk multiple times in this case to get enough continuous memory?

As far as I can see, dlmalloc doesn't do multiple calls when ssize < HALF_MAX_SIZE_T. The problem is not that it doesn't get enough continuous memory, is that it doesn't even call sbrk in the first place.
You are right, HALF_MAX_SIZE_T is not MAX_SIZE_T. But that macro is only used with the purpose of checking the parameter passed to sbrk(). It is the minimal change required to fix this issue and that's why I opted for it.
I can change all conditionals instead to ssize < MAX_SIZE_T or remove them since they would become always true.

So this is a fundamental limit of dlmalloc? It cannot allocate anything larger than half of size_t? At least not in MORECORE mode.

Yes. As the comment in their source code says, they limit this because sbrk parameter is signed. I don't know if that is documented anywhere but I've checked it and in MacOS, for example, sbrk is defined as void *sbrk(int); in unistd.h under some conditions at least (it seems they later changed it to intprt_t). As you've said, this only affects MORECORE mode. When using mmap this works as expected.

But Emscripten's sbrk doesn't have that interface or at least interprets unsigned values greater than HALF_MAX_SIZE_T correctly.

If we are going to patch dlmalloc like this perhaps we want to introduce another setting which completely ignores HALF_MAX_SIZE_T? i.e. in this mode we could not defined HALF_MAX_SIZE_T at all and #ifdef out all the places it is checked?

How about UNLIMITED_MORECORE?

Interesting idea. I think that would work. I assume dlmalloc is being careful here and not assuming it can call sbrk twice and have something make changes in between, but for us that is a safe assumption: only sbrk will modify the memory size, and we lock around dlmalloc so even pthreads builds should be ok. The only risk I can think of is someone calling emscripten_resize_heap in between on another thread, but I think even that might work (since that function takes the final requested size, not a delta).

Perhaps the safe thing is to call in a loop but assert on later iterations seeing that nothing changed in between.

I believe dlmalloc is already built with that assumption:

MORECORE_CONTIGUOUS default: 1 (true) if HAVE_MORECORE If true, take advantage of fact that consecutive calls to MORECORE with positive arguments always return contiguous increasing addresses. This is true of unix sbrk. It does not hurt too much to set it true anyway, since malloc copes with non-contiguities. Setting it false when definitely non-contiguous saves time and possibly wasted space it would take to discover this though.

Also, I don't think emscripten_resize_heap changes the sbrk pointer... only sbrk can do that. emscripten_resize_heap just resizes the memory. It should really be called emscripten_resize_memory .. just like all HEAPU8 and friends should really have been called MEMU8.

Interesting, but MORECORE_CONTIGUOUS isn't enough, I think, since someone else can call it in between your calls? You would allocate [A, B), they would allocate [B, C), and you'd then allocate [C, D), but your two allocations are not contiguous even though from sbrk's point of view they are. But probably I'm missing something...

Good point about emscripten_resize_heap, I think that's ok.

Yes, between calls to sbrk some other code calling sbrk on its own can interfere. dlmalloc detects this in some cases (I can think of at least one case where it doesn't and seems like a bug to me with security implications) and sets non-contiguous mode from then on.

Dividing this in two calls increases the risk of this happenning and even if we fix dlmalloc to detect it in all cases, not being able to trim wastes space (considering we are talking about HALF_SIZE_T allocations makes it worse).

I'd rather keep changes as minimal as possible, so that it's easier to update dlmalloc to a new version in the future and to minimize the risk of introducing a bug. But it's your call.

…== 0

sbc100

Nice!

I wonder if this actually saves a few bytes on codesize too?

Can you run ./test/runner other.*code_size* other.*metadce* --rebase to see if anything changes? (To know for sure you really need to run that command twice, once before you change to get a new baseline, and then once after you change).

system/lib/dlmalloc.c

sbc100 · 2022-10-13T19:37:57Z

Thanks for working on this, excited to see if we save some codesize too .. I think maybe the whole of sys_trim might be dead code for most programs after this change!

… even more memory by error

miwelc · 2022-10-14T16:40:20Z

Nice!

I wonder if this actually saves a few bytes on codesize too?

Can you run ./test/runner other.*code_size* other.*metadce* --rebase to see if anything changes? (To know for sure you really need to run that command twice, once before you change to get a new baseline, and then once after you change).

I've run it but I don't see any changes at all. I've also had to skip one test with EMTEST_SKIP_NODE I don't know why, as far I know NODE_JS and JS_ENGINES are correctly set.

sbc100 · 2022-10-14T17:21:01Z

Nice!
I wonder if this actually saves a few bytes on codesize too?
Can you run ./test/runner other.*code_size* other.*metadce* --rebase to see if anything changes? (To know for sure you really need to run that command twice, once before you change to get a new baseline, and then once after you change).

I've run it but I don't see any changes at all. I've also had to skip one test with EMTEST_SKIP_NODE I don't know why, as far I know NODE_JS and JS_ENGINES are correctly set.

Thats is very odd. Can you attach you .emscripten file? You will need node configured in order to run those tests.

brendandahl · 2022-10-14T17:25:47Z

Worth noting that the tests expect a specific version of node 14.18.2_64bit. If you install emsdk I'd recommend pointing it at the version it installs in emsdk/node/14.18.2_64bit/bin/node.

miwelc · 2022-10-14T17:27:05Z

Thats is very odd. Can you attach you .emscripten file? You will need node configured in order to run those tests.

Here is my .emscripten file:

import os
emsdk_path = "/Users/miwelc/Development/emsdk"
NODE_JS = emsdk_path + '/node/14.18.2_64bit/bin/node'
PYTHON = emsdk_path + '/python/3.9.2_64bit/bin/python3'
LLVM_ROOT = emsdk_path + '/upstream/bin'
BINARYEN_ROOT = emsdk_path + '/upstream'
EMSCRIPTEN_ROOT = "/Users/miwelc/emscripten"
COMPILER_ENGINE = NODE_JS
JS_ENGINES = [NODE_JS]

It's the same config I use when using Emscripten to compile my projects, but with EMSCRIPTEN_ROOT changed. Am I doing something wrong?

sbc100 · 2022-10-14T17:50:07Z

Thats is very odd. Can you attach you .emscripten file? You will need node configured in order to run those tests.

Here is my .emscripten file:
import os
emsdk_path = "/Users/miwelc/Development/emsdk"
NODE_JS = emsdk_path + '/node/14.18.2_64bit/bin/node'
PYTHON = emsdk_path + '/python/3.9.2_64bit/bin/python3'
LLVM_ROOT = emsdk_path + '/upstream/bin'
BINARYEN_ROOT = emsdk_path + '/upstream'
EMSCRIPTEN_ROOT = "/Users/miwelc/emscripten"
COMPILER_ENGINE = NODE_JS
JS_ENGINES = [NODE_JS]
It's the same config I use when using Emscripten to compile my projects, but with EMSCRIPTEN_ROOT changed. Am I doing something wrong?

I'm pretty sure EMSCRIPTEN_ROOT doesn't do anything anymore.. you can remove that line.

Is this .emscripten inside of the your emscripten directory? (if not where is it?). I recommend putting it inside the emscripten checkout, or alternatively you can just set EM_CONFIG to point to the one in the emsdk directory.

miwelc · 2022-10-14T18:58:18Z

Thats is very odd. Can you attach you .emscripten file? You will need node configured in order to run those tests.

Here is my .emscripten file:
import os
emsdk_path = "/Users/miwelc/Development/emsdk"
NODE_JS = emsdk_path + '/node/14.18.2_64bit/bin/node'
PYTHON = emsdk_path + '/python/3.9.2_64bit/bin/python3'
LLVM_ROOT = emsdk_path + '/upstream/bin'
BINARYEN_ROOT = emsdk_path + '/upstream'
EMSCRIPTEN_ROOT = "/Users/miwelc/emscripten"
COMPILER_ENGINE = NODE_JS
JS_ENGINES = [NODE_JS]
It's the same config I use when using Emscripten to compile my projects, but with EMSCRIPTEN_ROOT changed. Am I doing something wrong?
I'm pretty sure EMSCRIPTEN_ROOT doesn't do anything anymore.. you can remove that line.

Is this .emscripten inside of the your emscripten directory? (if not where is it?). I recommend putting it inside the emscripten checkout, or alternatively you can just set EM_CONFIG to point to the one in the emsdk directory.

Yes, it's inside my emscripten directory. I've tried both export EM_CONFIG='/Users/miwelc/emscripten/.emscripten' and export EM_CONFIG='/Users/miwelc/Development/emsdk/.emscripten' with no luck. It does find the file, because if I set a wrong path it errors on that immediately.

On a happier note, if I remove emscripten/cache between runs now I get consistent size reduction of about 70 bytes in the tests.

sbc100

lgtm! But can we write a test for this?

kripken · 2022-10-17T21:36:31Z

For a test, the best place is likely the browser test suite. It's safest there since the tests run sequentially, so there is no risk of OOMing the entire test runner. We prefix those tests with zzz_zzz to ensure they run at the very end (to minimize the risk of OOM). See this example test which you can use as starting point:

emscripten/test/test_browser.py

Lines 5296 to 5301 in 03ccf3a

    
           # Tests that emmalloc supports up to 4GB Wasm heaps. 
        
           @no_firefox('no 4GB support yet') 
        
           def test_zzz_zzz_emmalloc_4gb(self): 
        
             self.btest(test_file('mem_growth.cpp'), 
        
                        expected='-65536', # == 4*1024*1024*1024 - 65536 casted to signed 
        
                        args=['-sMALLOC=emmalloc', '-sABORTING_MALLOC=0', '-sALLOW_MEMORY_GROWTH=1', '-sMAXIMUM_MEMORY=4GB'])

system/lib/dlmalloc.c

miwelc · 2022-10-18T17:19:19Z

Sure, I'll write a test. Thanks for the example test!

Also, out of curiosity, will it be possible in the future to shrink the wasm module's ArrayBuffer? This PR is built on the assumption that Emscripten's sbrk receives an unsigned parameter. In the future, if we want to support shrinking wasm memory it will be easily supported in wasm64 by setting UNDEFINED_MORECORE to 0 but for wasm32 it will need a different approach or making that option mutually exclusive with allocating chunks bigger than 2GB.

We've discussed requesting memory in a loop when >2GB and I don't think the extra binary size and complexity of the required error handling is worth it now but in the scenario of wanting to support this use case at the same time as supporting giving memory back to the browser for wasm32... this should be revisited.

miwelc · 2022-10-20T14:07:49Z

The new test seems to be passing but CI fails with too many unresponsive tests. I've reused an existing test for emmalloc testing 3GB allocations.

kripken

Great! And good idea about the test, it is best to reuse that existing one.

One question I'm still thinking about is what our plan is for upgrades to dlmalloc in the future. Perhaps having a section with a list of our changes to it would be useful. Or we could mark them using // XXX EMSCRIPTEN as we've done in other parts of the system library. But the git history might be good enough, and I don't think that needs to block - we can improve that later if we want - so let's land this.

Thanks for working on this @miwelc !

sbc100 · 2022-10-24T14:42:18Z

Great! And good idea about the test, it is best to reuse that existing one.

One question I'm still thinking about is what our plan is for upgrades to dlmalloc in the future.

I don't think dlmalloc is being actively worked on so I think that is quite unlikely.

kleisauke · 2022-10-24T15:33:07Z

I don't think dlmalloc is being actively worked on so I think that is quite unlikely.

FWIW, musl v1.2.1 replaced their original dlmalloc-like allocator with a new "mallocng" malloc implementation, since the former suffered from fundamental design problems. See the v1.2.1 release notes.

Unfortunately, as mentioned in #12833 (comment), it depends on the availability of an OS-level mmap API. Perhaps the memory-control proposal could help with that in the future.

kripken · 2022-10-24T19:13:22Z

I see, thanks @sbc100 @kleisauke

tiran · 2022-10-26T08:58:55Z

I believe this change set broke over-allocation on WASM64. One of my Python test cases now fails with an assertion error assert(requestedSize > oldSize); in emscripten_resize_heap. The test case attempts to allocate 0x7fffffffffffffff and assumes an allocation error.

test_constructor (test.test_io.CBufferedReaderTest.test_constructor) ... 
Aborted(Assertion failed)
exiting due to exception: RuntimeError: Aborted(Assertion failed),RuntimeError: Aborted(Assertion failed)
    at abort (/cpython/builddir/wasm64-emscripten-node-debug/python.js:1010:11)
    at assert (/cpython/builddir/wasm64-emscripten-node-debug/python.js:485:5)
    at _emscripten_resize_heap (/cpython/builddir/wasm64-emscripten-node-debug/python.js:7125:7)
    at sbrk (wasm://wasm/02557b1e:wasm-function[10612]:0x52c2d0)
    at dlmalloc (wasm://wasm/02557b1e:wasm-function[10603]:0x529d0e)
    at _PyMem_RawMalloc (wasm://wasm/02557b1e:wasm-function[2018]:0xbc92a)
    at PyMem_Malloc (wasm://wasm/02557b1e:wasm-function[2042]:0xbe3ea)
    at _buffered_init (wasm://wasm/02557b1e:wasm-function[7035]:0x315dae)
    at _io_BufferedReader___init__ (wasm://wasm/02557b1e:wasm-function[7034]:0x315cf6)
    at wrap_init (wasm://wasm/02557b1e:wasm-function[2351]:0xd9028)

The values are

requestedSize 20324352 oldSize 20971520

This reproducer passes on 3.1.24 and aborts on tot-upstream (3.1.25-git):

// emcc -sMEMORY64 -sASSERTIONS -sALLOW_MEMORY_GROWTH t.c -o t.js
#include <stdlib.h>

int main(void) {
    malloc(0x7fffffffffffffff);
    malloc(0x7fffffffffffffff);
}

miwelc · 2022-10-26T14:40:44Z

These changes simply remove the check that ensured sbrk was only called with values smaller than half size_t (e.g. (requested_size + other_internal_data_and_padding) < HALF_MAX_SIZE_T). If I'm not mistaken in your test you are requesting exactly half size_t which previously would silently fail returning 0 and no allocation happened (and neither sbrk or emscripten_resize_heap were actually called). In your test, your first malloc was already failing. If you change it so that you request the same total memory but in 3 smaller chunks it should fail with a previous Emscripten version.

I'm not sure but I think the issue is due to sbrk only checking for 32-bit overflow in

emscripten/system/lib/sbrk.c

Line 74 in 6153cb3

if (increment > 0 && (uint32_t)new_brk <= (uint32_t)old_brk) {

All of this is just theory, I'll look into it and get back to you.

sbc100 · 2022-10-26T18:15:14Z

It looks like that code in sbrk.c should be wrapped in #ifdef __wasm32__. Im not sure if that will fix this issue but it looks like it would be more correct.

miwelc · 2022-10-26T18:57:04Z

After some testing, it turns out it's just that emscripten_resize_heap() is receiving a correct size but bigger than what a JS Number can store as integer. If I call it with 2^53-1 it doesn't abort (but it correctly complains that the requested size is bigger than the limit). If I call it with 2^53 it aborts.

In summary:

The bug fixed by this PR prevented the test from actually testing what intended.
Now valid values reach sbrk and emscripten_resize_heap but the latter is not able to load correctly the value in JS.

Also, I've been testing the overflow check mentioned before and it seems there is actually a bug in that code although a different one:

// wasm64
uintptr_t old_brk = (uintptr_t)0xFFFFFFFF; // 2^31
uintptr_t new_brk = old_brk + 10;
if((uint32_t)new_brk <= (uint32_t)old_brk) {
    // Condition IS met even if new_brk is valid
}

uintptr_t old_brk = (uintptr_t)0xFFFFFFFF + 1; // 2^31 + 1
uintptr_t new_brk = old_brk + 10;
if((uint32_t)new_brk <= (uint32_t)old_brk) {
    // Condition is NOT met
}

uintptr_t old_brk = (uintptr_t)0xFFFFFFFF; // 2^31
uintptr_t new_brk = old_brk + 10;
if(new_brk <= old_brk) {
    // Condition is NOT met
}

So it turns out that if old_brk is some multiple of 2^31, it detects a false overflow.
I recommend just removing those (uint32_t) casts as these variables are declared as uintptr_t which is uint32_t under wasm32 and uint64_t under wasm64 and that code would automatically work for both environments.

miwelc · 2022-10-31T12:39:29Z

So it turns out that if old_brk is some multiple of 2^31, it detects a false overflow. I recommend just removing those (uint32_t) casts as these variables are declared as uintptr_t which is uint32_t under wasm32 and uint64_t under wasm64 and that code would automatically work for both environments.

@sbc100 Should I open a PR fixing this bug?

sbc100 · 2022-10-31T19:30:18Z

Yes please!

Fix dlmalloc for allocations bigger than 2GB.

2bdec18

Remove unwanted changes

2cc696f

sbc100 reviewed Oct 13, 2022

View reviewed changes

miwelc mentioned this pull request Oct 13, 2022

Allocating a >2GB pointer only seems to work when MALLOC='emmalloc' flag is set. Why isn't this documented anywhere? #17747

Closed

miwelc added 3 commits October 13, 2022 20:53

Added UNSIGNED_MORECORE option. Take it into account when trimming.

8db6d3d

MORECORE_CANNOT_TRIM should always be defined.

27260ba

Unconditionally set MORECORE_CANNOT_TRIM to 1 when UNSIGNED_MORECORE …

b23e891

…== 0

sbc100 reviewed Oct 13, 2022

View reviewed changes

system/lib/dlmalloc.c Outdated Show resolved Hide resolved

miwelc added 2 commits October 14, 2022 13:52

Remove unneeded changes

9125882

Avoid trying to trim when UNSIGNED_MORECORE so that we don't allocate…

c7d033e

… even more memory by error

sbc100 reviewed Oct 17, 2022

View reviewed changes

kripken reviewed Oct 17, 2022

View reviewed changes

system/lib/dlmalloc.c Outdated Show resolved Hide resolved

miwelc added 3 commits October 20, 2022 13:36

Add test

dee20f8

Fixed indentation

38b4e49

Fix MORECORE_CANNOT_TRIM use

2cc29a6

kripken approved these changes Oct 21, 2022

View reviewed changes

kripken merged commit 00daf40 into emscripten-core:main Oct 21, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix dlmalloc for allocations bigger than 2GB #18055

Fix dlmalloc for allocations bigger than 2GB #18055

miwelc commented Oct 13, 2022

miwelc commented Oct 13, 2022

sbc100 Oct 13, 2022

miwelc Oct 13, 2022 •

edited

Loading

sbc100 Oct 13, 2022

miwelc Oct 13, 2022 •

edited

Loading

sbc100 Oct 13, 2022

kripken Oct 13, 2022

sbc100 Oct 13, 2022

sbc100 Oct 13, 2022 •

edited

Loading

kripken Oct 13, 2022

miwelc Oct 13, 2022 •

edited

Loading

sbc100 left a comment

sbc100 commented Oct 13, 2022

miwelc commented Oct 14, 2022

sbc100 commented Oct 14, 2022

brendandahl commented Oct 14, 2022

miwelc commented Oct 14, 2022

sbc100 commented Oct 14, 2022

miwelc commented Oct 14, 2022

sbc100 left a comment

kripken commented Oct 17, 2022

miwelc commented Oct 18, 2022

miwelc commented Oct 20, 2022

kripken left a comment

sbc100 commented Oct 24, 2022

kleisauke commented Oct 24, 2022

kripken commented Oct 24, 2022

tiran commented Oct 26, 2022 •

edited

Loading

miwelc commented Oct 26, 2022 •

edited

Loading

sbc100 commented Oct 26, 2022

miwelc commented Oct 26, 2022 •

edited

Loading

miwelc commented Oct 31, 2022 •

edited

Loading

sbc100 commented Oct 31, 2022

Fix dlmalloc for allocations bigger than 2GB #18055

Fix dlmalloc for allocations bigger than 2GB #18055

Conversation

miwelc commented Oct 13, 2022

miwelc commented Oct 13, 2022

sbc100 Oct 13, 2022

Choose a reason for hiding this comment

miwelc Oct 13, 2022 • edited Loading

Choose a reason for hiding this comment

sbc100 Oct 13, 2022

Choose a reason for hiding this comment

miwelc Oct 13, 2022 • edited Loading

Choose a reason for hiding this comment

sbc100 Oct 13, 2022

Choose a reason for hiding this comment

kripken Oct 13, 2022

Choose a reason for hiding this comment

sbc100 Oct 13, 2022

Choose a reason for hiding this comment

sbc100 Oct 13, 2022 • edited Loading

Choose a reason for hiding this comment

kripken Oct 13, 2022

Choose a reason for hiding this comment

miwelc Oct 13, 2022 • edited Loading

Choose a reason for hiding this comment

sbc100 left a comment

Choose a reason for hiding this comment

sbc100 commented Oct 13, 2022

miwelc commented Oct 14, 2022

sbc100 commented Oct 14, 2022

brendandahl commented Oct 14, 2022

miwelc commented Oct 14, 2022

sbc100 commented Oct 14, 2022

miwelc commented Oct 14, 2022

sbc100 left a comment

Choose a reason for hiding this comment

kripken commented Oct 17, 2022

miwelc commented Oct 18, 2022

miwelc commented Oct 20, 2022

kripken left a comment

Choose a reason for hiding this comment

sbc100 commented Oct 24, 2022

kleisauke commented Oct 24, 2022

kripken commented Oct 24, 2022

tiran commented Oct 26, 2022 • edited Loading

miwelc commented Oct 26, 2022 • edited Loading

sbc100 commented Oct 26, 2022

miwelc commented Oct 26, 2022 • edited Loading

miwelc commented Oct 31, 2022 • edited Loading

sbc100 commented Oct 31, 2022

miwelc Oct 13, 2022 •

edited

Loading

miwelc Oct 13, 2022 •

edited

Loading

sbc100 Oct 13, 2022 •

edited

Loading

miwelc Oct 13, 2022 •

edited

Loading

tiran commented Oct 26, 2022 •

edited

Loading

miwelc commented Oct 26, 2022 •

edited

Loading

miwelc commented Oct 26, 2022 •

edited

Loading

miwelc commented Oct 31, 2022 •

edited

Loading