Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

_emscripten_memcpy_js runtime crash with MAIN/SIDE_MODULE + WASM_BIGINT #22161

Closed
Faless opened this issue Jun 29, 2024 · 9 comments · Fixed by #22171
Closed

_emscripten_memcpy_js runtime crash with MAIN/SIDE_MODULE + WASM_BIGINT #22161

Faless opened this issue Jun 29, 2024 · 9 comments · Fixed by #22171

Comments

@Faless
Copy link
Contributor

Faless commented Jun 29, 2024

Version of emscripten/emsdk:

emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.61 (67fa4c16496b157a7fc3377afd69ee0445e8a6e3)
clang version 19.0.0git (https:/github.com/llvm/llvm-project 7cfffe74eeb68fbb3fb9706ac7071f8caeeb6520)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /media/Storage/emsdk/upstream/bin

Failing command line in full:

I've created a small test showing the issue (patch: test.diff.txt ):

  @with_env_modify({'EMCC_FORCE_STDLIBS': '1'})
  def test_memops_bulk_memory_side(self):
    create_file('pre.js', '''
    Module["dynamicLibraries"] = ["libside.wasm"];
    ''')
    create_file('side.c', '''
#include <assert.h>
#include <string.h>
#include <emscripten/console.h>
#include <emscripten/emscripten.h>

void EMSCRIPTEN_KEEPALIVE side_main() {
  char buffer[1024];
  char out[1024];
  memset(buffer, 'a', 1024);
  buffer[sizeof(buffer) - 1] = '\\0';
  memcpy(out, buffer, sizeof(buffer));
  assert(strcmp(buffer, out) == 0);
  emscripten_console_log(buffer);
}
    ''')
    create_file('main.c', '''
#include <assert.h>
#include <string.h>

extern void side_main();

int main() {
  side_main();
  return 0;
}
    ''')
    common_args = [] # OK
    common_args += ['-sMIN_SAFARI_VERSION=150000'] # Breaks
    #common_args += ['-sWASM_BIGINT=1'] # Also breaks (because it forces MIN_SAFARI_VERSION to 150000)
    #common_args += ['-mbulk-memory'] # Fixes the build, and can also be only applied to the SIDE MODULE
    side_args = ['-sSIDE_MODULE=2'] + common_args
    main_args = ['-sMAIN_MODULE=1', '-sEXPORT_ALL=1', '-sWARN_ON_UNDEFINED_SYMBOLS=0', '--pre-js=pre.js'] + common_args
    self.run_process([EMCC, 'side.c', '-o', 'libside.wasm'] + side_args)
    self.run_process([EMCC, 'main.c', '-o', 'test.js'] + main_args)
    self.assertContained('a' * 1023, self.run_js('test.js', interleaved_output=True))

The gist is that:

The following works:

emcc -sSIDE_MODULE=2 side.c -o libside.wasm
emcc -sMAIN_MODULE=1 -sEXPORT_ALL=1 main.c -o main.js

If we add the -sWASM_BIGINT flag instead, it will crash at runtime with:

Aborted(external symbol '_emscripten_memcpy_js' is missing. perhaps a side module was not linked in? if this function was expected to arrive from a system library, try to build the MAIN_MODULE with EMCC_FORCE_STDLIBS=1 in the environment).

As far as I understood, WASM_BIGINT bump the minimum supported Safari version (and in fact, the -ssMIN_SAFARI_VERSION=150000 also breaks the build), resulting in an incorrect BULK_MEMORY configuration (main module with bulk memory side module without).

Indeed, adding -mbulk-memory to the (side module) command fixes the WASM_BIGINT builds.

This does not seem to happen when PTRHEADS is also enabled (since AFAU that will always force bulk memory on even on the side module).

@sbc100
Copy link
Collaborator

sbc100 commented Jun 30, 2024

So as long as the main module and the side module agree on whether bulk memory is enabled it works? That makes sense to me. Are you saying you want to be able to have them disagree and things still work?

@sbc100
Copy link
Collaborator

sbc100 commented Jun 30, 2024

I can't seem to reproduce this locally when I add -sWASM_BIGINT. I must be missing something:

$ ./emcc -sWASM_BIGINT -sSIDE_MODULE=2 side.c -o libside.wasm
$ ./emcc -sMAIN_MODULE=1 main.c -o main.js -sWASM_BIGINT --pre-js=pre.js -sWARN_ON_UNDEFINED_SYMBOLS=0
$ node ./main.js 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

Furthermore I'm rather confused the undefined reference to _emscripten_memcpy_js. This symbol is only referenced from within emscripten_memcpy.c which is part of libc. Since libc is only ever included in the main module I don't see any way for that symbol to be referenced from the side module directly... unless you are somehow linking libc into your side module.. which I don't think is possible.

@Faless
Copy link
Contributor Author

Faless commented Jun 30, 2024

Are you saying you want to be able to have them disagree and things still work?

The problem is that using the same flags for side and main module (WASM_BIGINT) seems to produce incompatible binaries unless I also specify the bulk memory option... or at least that was my diagnosis of Godot dlink builds being broken with that error after updating emscripten.

Looking at your snippet, the only difference I can spot is that the test also sets the EMCC_FORCE_STDLIBS=1 env variable (similar to what we do in Godot EMCC_FORCE_STDLIBS=libc,libc++,libc++abi so extensions can rely on main module stdlibs).

I've tried removing the env variable from the test, and indeed it started working once I remove that from the test... So the issue is maybe the interaction between EMCC_FORCE_STDLIBS and bulk memory inference?

I'm pretty much at a loss here, the only thing I know is that our single threaded dlink-enabled builds are broken since we update to newer emscripten (from 3.1.39) and added WASM_BIGINT support. All other builds (nothreads, threads, threads+dlink) all work.

I've opened #22167 with the failing test, to see if we can reproduce in CI.

@Faless
Copy link
Contributor Author

Faless commented Jun 30, 2024

So the test is indeed failing in CI with:

-- begin program output --
Aborted(external symbol '_emscripten_memcpy_js' is missing. perhaps a side module was not linked in? if this function was expected to arrive from a system library, try to build the MAIN_MODULE with EMCC_FORCE_STDLIBS=1 in the environment)
/tmp/emtest_5uqgpmcf/emscripten_test_other__fbk0mvy/test.js:125
      throw ex;
      ^

RuntimeError: Aborted(external symbol '_emscripten_memcpy_js' is missing. perhaps a side module was not linked in? if this function was expected to arrive from a system library, try to build the MAIN_MODULE with EMCC_FORCE_STDLIBS=1 in the environment)
    at abort (/tmp/emtest_5uqgpmcf/emscripten_test_other__fbk0mvy/test.js:623:11)
    at __emscripten_memcpy_js (/tmp/emtest_5uqgpmcf/emscripten_test_other__fbk0mvy/test.js:7417:91)
    at test.wasm.__memcpy (wasm://wasm/test.wasm-00b762e6:wasm-function[900]:0xb51d3)
    at libside.wasm.side_main (wasm://wasm/libside.wasm-351cd8ae:wasm-function[7]:0x1c4)
    at _side_main (/tmp/emtest_5uqgpmcf/emscripten_test_other__fbk0mvy/test.js:15958:34)
    at test.wasm.__original_main (wasm://wasm/test.wasm-00b762e6:wasm-function[291]:0x95282)
    at test.wasm.main (wasm://wasm/test.wasm-00b762e6:wasm-function[292]:0x952a7)
    at callMain (/tmp/emtest_5uqgpmcf/emscripten_test_other__fbk0mvy/test.js:40613:15)
    at doRun (/tmp/emtest_5uqgpmcf/emscripten_test_other__fbk0mvy/test.js:40663:23)
    at run (/tmp/emtest_5uqgpmcf/emscripten_test_other__fbk0mvy/test.js:40678:5)
    at runCaller (/tmp/emtest_5uqgpmcf/emscripten_test_other__fbk0mvy/test.js:40586:19)
    at removeRunDependency (/tmp/emtest_5uqgpmcf/emscripten_test_other__fbk0mvy/test.js:592:7)
    at /tmp/emtest_5uqgpmcf/emscripten_test_other__fbk0mvy/test.js:2052:11
Thrown at:
    at abort (/tmp/emtest_5uqgpmcf/emscripten_test_other__fbk0mvy/test.js:623:11)
    at __emscripten_memcpy_js (/tmp/emtest_5uqgpmcf/emscripten_test_other__fbk0mvy/test.js:7417:91)
    at $__memcpy (wasm://wasm/test.wasm-00b762e6:1:741844)
    at $side_main (wasm://wasm/libside.wasm-351cd8ae:1:453)
    at _side_main (/tmp/emtest_5uqgpmcf/emscripten_test_other__fbk0mvy/test.js:15958:34)
    at $__original_main (wasm://wasm/test.wasm-00b762e6:1:610947)
    at $main (wasm://wasm/test.wasm-00b762e6:1:610984)
    at callMain (/tmp/emtest_5uqgpmcf/emscripten_test_other__fbk0mvy/test.js:40613:15)
    at doRun (/tmp/emtest_5uqgpmcf/emscripten_test_other__fbk0mvy/test.js:40663:23)
    at run (/tmp/emtest_5uqgpmcf/emscripten_test_other__fbk0mvy/test.js:40678:5)


Node.js v18.20.3
-- end program output --
test_memops_bulk_memory_side (test_other.other) ... FAIL

If I read the stack trace correctly, the side is calling the main module __memcpy symbol as expected, but then the main module __memcpy tries to call into __emscripten_memcpy_js which is marked as a stub?

@sbc100
Copy link
Collaborator

sbc100 commented Jul 1, 2024

I will see if I can reproduce with EMCC_FORCE_STDLIBS.

BTW, are you using EMCC_FORCE_STDLIBS in your build? Do you know why you need to use that?

@sbc100
Copy link
Collaborator

sbc100 commented Jul 1, 2024

I was able to reproduce using EMCC_FORCE_STDLIBS=1

sbc100 added a commit to sbc100/emscripten that referenced this issue Jul 1, 2024
In particular `EMCC_FORCE_STDLIBS=1` would previously place libc before
libbulkmemory which don't work.  libbulkmemory always needs to come
first on the link line.

Fixes: emscripten-core#22161
sbc100 added a commit to sbc100/emscripten that referenced this issue Jul 1, 2024
In particular `EMCC_FORCE_STDLIBS=1` would previously place libc before
libbulkmemory which don't work.  libbulkmemory always needs to come
first on the link line.

Fixes: emscripten-core#22161
sbc100 added a commit to sbc100/emscripten that referenced this issue Jul 1, 2024
In particular `EMCC_FORCE_STDLIBS=1` would previously place libc before
libbulkmemory which don't work.  libbulkmemory always needs to come
first on the link line.

Fixes: emscripten-core#22161
@Faless
Copy link
Contributor Author

Faless commented Jul 1, 2024

BTW, are you using EMCC_FORCE_STDLIBS in your build? Do you know why you need to use that?

Yes, we are, we build Godot in the following way when supporting extensions (via dynamic linking):

  • A minimal runtime (web_runtime.cpp) which only calls a godot_web_main extern, with MAIN_MODULE=1, EMCC_FORCE_STDLIBS=libc,libc++,libc++abi and all our JS support libraries.
  • All the actual godot code with SIDE_MODULE=2 (and -fvisibility=hidden), exposing the "true main" (and the extension API).

Godot extensions (which are developed by third parties, and built as dynamic libraries), can then be successfully linked against the minimal runtime, even if the Godot code doesn't use a specific stdlib feature.

In the past (the first builds dates back to 2020), without using EMCC_FORCE_STDLIBS, parts of the standard libraries would be stripped, end extensions using e.g. std::cout instead of the godot-exposed godot_print would result in linking errors (because std:cout was being stripped away since Godot doesn't use it).

Similarly, I couldn't build the whole godot code with MAIN_MODULE=2, or again, the dead code elimination would strip too much (seemingly also ignoring the EMCC_FORCE_STDLIBS).

By building the Godot code as SIDE_MODULE=2 and having an almost empty main module things works as expected.

I'm not sure there's a better way to allow extension developers to use the C++ standard library, without EMCC_FORCE_STDLIBS but I'm open to suggestions.

I was able to reproduce using EMCC_FORCE_STDLIBS=1

Thank you so much 💙 , I tested the patch, and confirmed it fixes both my test, and Godot nothreads+dlink builds.

@sbc100
Copy link
Collaborator

sbc100 commented Jul 1, 2024

The whole of the C/C++ standard library should be included when you build with -sMAIN_MODULE=1.
You should not need EMCC_FORCE_STDLIBS=1. Can you try removing it?

@Faless
Copy link
Contributor Author

Faless commented Jul 2, 2024

You should not need EMCC_FORCE_STDLIBS=1. Can you try removing it?

I've tested, and indeed it's not necessary, thank you!

I've opened godotengine/godot#93853 to remove the env variable from our builds.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants