Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LLVM [clang]: minor LSDA issue in WASM EH, bugreport. #192

Closed
mf-RDP opened this issue Jan 8, 2022 · 26 comments
Closed

LLVM [clang]: minor LSDA issue in WASM EH, bugreport. #192

mf-RDP opened this issue Jan 8, 2022 · 26 comments

Comments

@mf-RDP
Copy link

mf-RDP commented Jan 8, 2022

Hi,

I decided to open this here, as I see the relevant ppl are all looking here. Please advise if I shall open an issue for LLVM.

I did a full implementation in an own VM for this and generally things look fine, I use an own implementation of _Unwind_CallPersonality and did implement some scan_eh_tab() from the cxx abi in some slightly different form.

Two issues I noticed:

First, see attached cpp, test25clrNOK_LLVM_CATCHPADS_ISSUE.
The lsda gets broken I think. There should really be more catch-switches (actions), something seems to get optimized away that should be there.

grafik

Second, minor issue, see function test25clrNOK_LLVM_RETURN_O0ISSUE
When compiling -O2, things are fine, when compiling -O0, there is an error with the return inside the catchblock. Please see comment in code. Thank You.
llvm_e_lsdaissue.cpp.txt

@aheejin
Copy link
Member

aheejin commented Jan 11, 2022

First of all, which compiler / which version are you using? Emscripten? It looks like you have implemented your own libc++abi (personality function). Then are you using the upstream LLVM with your own library?

I compiled your program with Emscripten with -fwasm-exceptions, and both of -O0 and -O2 work correctly for me.

First, see attached cpp, test25clrNOK_LLVM_CATCHPADS_ISSUE. The lsda gets broken I think. There should really be more catch-switches (actions), something seems to get optimized away that should be there.

grafik

It looks correct to me. The number of actions is not the same as the number of catches in a function. LLVM does some optimization to share the "end tail" of action chain if possible. Please check EHStreamer::computeActionsTable for details. Also check the structure of LSDA info here.

This assembly with comments can be easier to read.
You can create this by:

emcc -fwasm-exceptions test.cpp -S -o test.s
GCC_except_table1:
.Lexception0:
  .int8  255                             # @LPStart Encoding = omit
  .int8  0                               # @TType Encoding = absptr
  .uleb128 .Lttbase0-.Lttbaseref0
.Lttbaseref0:
  .int8  1                               # Call site Encoding = uleb128
  .uleb128 .Lcst_end0-.Lcst_begin0
.Lcst_begin0:
  .int8  0                               # >> Call Site 0 <<
                                        #   On exception at call site 0
  .int8  9                               #   Action: 5
  .int8  1                               # >> Call Site 1 <<
                                        #   On exception at call site 1
  .int8  21                              #   Action: 11
  .int8  2                               # >> Call Site 2 <<
                                        #   On exception at call site 2
  .int8  13                              #   Action: 7
.Lcst_end0:
  .int8  1                               # >> Action Record 1 <<
                                        #   Catch TypeInfo 1
  .int8  0                               #   No further actions
  .int8  2                               # >> Action Record 2 <<
                                        #   Catch TypeInfo 2
  .int8  125                             #   Continue to action 1
  .int8  3                               # >> Action Record 3 <<
                                        #   Catch TypeInfo 3
  .int8  125                             #   Continue to action 2
  .int8  4                               # >> Action Record 4 <<
                                        #   Catch TypeInfo 4
  .int8  125                             #   Continue to action 3
  .int8  5                               # >> Action Record 5 <<
                                        #   Catch TypeInfo 5
  .int8  125                             #   Continue to action 4
  .int8  8                               # >> Action Record 6 <<
                                        #   Catch TypeInfo 8
  .int8  117                             #   Continue to action 1
  .int8  7                               # >> Action Record 7 <<
                                        #   Catch TypeInfo 7
  .int8  125                             #   Continue to action 6
  .int8  6                               # >> Action Record 8 <<
                                        #   Catch TypeInfo 6
  .int8  0                               #   No further actions
  .int8  2                               # >> Action Record 9 <<
                                        #   Catch TypeInfo 2
  .int8  125                             #   Continue to action 8
  .int8  7                               # >> Action Record 10 <<
                                        #   Catch TypeInfo 7
  .int8  125                             #   Continue to action 9
  .int8  3                               # >> Action Record 11 <<
                                        #   Catch TypeInfo 3
  .int8  125                             #   Continue to action 10
  .p2align  2
                                        # >> Catch TypeInfos <<
  .int32  _ZTI10testclass6                # TypeInfo 8
  .int32  _ZTI10testclass3                # TypeInfo 7
  .int32  _ZTI10testclass5                # TypeInfo 6
  .int32  _ZTIi                           # TypeInfo 5
  .int32  _ZTIl                           # TypeInfo 4
  .int32  _ZTI10testclass1                # TypeInfo 3
  .int32  _ZTI10testclass4                # TypeInfo 2
  .int32  0                               # TypeInfo 1

In your figure, action [11] is not an action (I think it is just a zero padding or something), so there are 11 actions in total. index: N in your figure denotes the TypeInfo index, and next@ -N means where to find the next action. Those numbers are in bytes, so -3 means -1 in the action index. 0 means no further actions. -11 means -7 in the action index.

I'll refer to indices as written in the comments in the assembly file. There are three call sites. In wasm EH, "callsite" means the region covered by one try. Callsite 0 represents the innermost try, 1 represents the middle try and 2 is for the outermost try.
CallSite 0 says go to action 5. In each "action" section, you can see what TypeInfo it catches and what's the next action. So for the call site 0's first action is 5, and the sequence of actions is 5(int)->4(long)->3(testclass1)->2(testclass4)->1(...). I wrote the TypeInfo within (), which you can crosscheck. This matches the sequence of catch clauses in the innermost try. We can check similarly for the middle and the outer trys.
Callsite 1's action sequence: 11(testclass1)->10(testclass3)->9-(testclass4)>8(testclass5)
Callsite 2's action sequence: 7(testclass3)->6(testclass6)->1(...)

So even if the total number of catch in the function is 12, the number of actions is 11, because the action 1 (catch (...)) is shared between the innermost and the outermost try in the tail of the action sequences.

Second, minor issue, see function test25clrNOK_LLVM_RETURN_O0ISSUE When compiling -O2, things are fine, when compiling -O0, there is an error with the return inside the catchblock. Please see comment in code. Thank You. llvm_e_lsdaissue.cpp.txt

Don't really get it. I compiled the program with Emscripten and both -O0 and -O2 returned 312321311 for me. It would help if you specify the exact compiler, version, command line, and any other setting changes you made.

@mf-RDP
Copy link
Author

mf-RDP commented Jan 11, 2022

Hi,

thanks for taking care on this.

I use clang, most recent commit, and compile bare-metal for wasm32-wasi-unknown (or wasm64).
Emscripten is not involved at all.

clang++ -fwasm-exceptions -DDEBUG -mbulk-memory -mmutable-globals -O2 -I.. -I.... -I. -Wl,--max-memory=327680 -Wl,--shared-memory -matomics -target wasm32-wasi --sysroot /project/wasi_libc_orig/sysroot_testing -Wl,--export-all -Wl,-L. -Wl,--entry=__main_void -Wl,--allow-undefined -o llvm_e_simple_o2.cpp.wasm llvm_e_simple.cpp

Thanks for pointing me on the better view (GCC_except_table1). Really, thanks for explaining as information on this is hard to get except from libcxxabi or LLVM sources.

I will review both issues I thought I found and report back.

However, some issue with codegen for the blocks must be there, please see attached. The calls for malloc and the division by zero are not within the try ....

Many Thanks

llvm_try_block_missing.cpp.txt

@mf-RDP
Copy link
Author

mf-RDP commented Jan 11, 2022

Hi Heejin,

regarding the second, 'return issue', I can confirm you are right.
I did misinterpret the immediate for 'br' out of the catch block. (I thought the catchblock itself counts, it obviously does not.)

Thank You.

@aheejin
Copy link
Member

aheejin commented Jan 11, 2022

I was testing your example in #192 (comment) and I don't think I can reproduce your results. First of all it doesn't even have main so the whole function is optimized away. Even if I add main myself the result was somewhat different.

But in general, Wasm EH's compilation results don't reflect the C++'s original try-catch structure 100%. The only thing that's guaranteed is if something can throw is inside a try-catch in C++, it goes into try-catch in Wasm. A div cannot throw, so it can be outside the try. (It can trap, but that's different from throwing, and Wasm's catch instruction doesn't catch traps.) A call can be outside the try if it is guaranteed not to throw. But many library functions, even if they don't throw, are assumed to be able to throw because the compiler cannot prove otherwise. In my compilation result, malloc was inside the try, even if it doesn't throw.

@mf-RDP
Copy link
Author

mf-RDP commented Jan 11, 2022

Thanks. My VM can throw for a div by zero (although maybe this is not per wasm specs), but so, unfortunately, I cannot use this to catch div by zero reliably.

I am still looking at the LSDA problems, but for the other comments I can confirm now it is working as per spec.

@mf-RDP
Copy link
Author

mf-RDP commented Jan 11, 2022

I think this cleared out my problems with the LSDA. Reason: the callsite table emitted for WASM is quite different (no 'length' and 'landingPad') to what is coded in cxa_personality.cpp. I was really unaware of this, maybe there could be some hint in the specs or cxa_personality.cpp.

The only thing I consider unfavorable is that the try/catch blocks in CPP differ from the WASM ones....

Regards

@sbc100
Copy link
Member

sbc100 commented Jan 11, 2022

Would this be helped if we upstreamed our compiler-rt changes from emscripten into LLVM? (I see some changes to cxa_personality.cpp but not a huge amount). See emscripten-core/emscripten#15097

@mf-RDP
Copy link
Author

mf-RDP commented Jan 11, 2022

Sure. Thanks @ALL for your efforts regarding wasm. I hope there will be no further changes to this proposal and it gets finalized soon.

@aheejin
Copy link
Member

aheejin commented Jan 11, 2022

Thanks. My VM can throw for a div by zero (although maybe this is not per wasm specs), but so, unfortunately, I cannot use this to catch div by zero reliably.

What do you mean by "My VM can throw"? Wasm's div instruction cannot throw by the spec. It can trap, but it's not throwing, and Wasm's catch cannot catch traps. We discussed about this a long time ago and decided against catching traps for other reasons. #1 (comment) is one of the comments with which I summarized the rationale.

I think this cleared out my problems with the LSDA. Reason: the callsite table emitted for WASM is quite different (no 'length' and 'landingPad') to what is coded in cxa_personality.cpp. I was really unaware of this, maybe there could be some hint in the specs or cxa_personality.cpp.

I'm not sure about your version of cxa_personality.cpp, but Wasm's LSDA table shares most of the structures with something called "SjLj EH", which also exists in the upstream libc++abi's cxa_personality.cpp. This is the link to the upstream libc++abi's cxa_personality.cpp: https://github.com/llvm/llvm-project/blob/4edb9983cb8c8b850083ee5941468f5d0ef6fe2c/libcxxabi/src/cxa_personality.cpp#L78-L127 We share this structure with Wasm EH.

The only thing I consider unfavorable is that the try/catch blocks in CPP differ from the WASM ones....

Please note that no platforms can exactly preserve C++-level try-catch; the LLVM-level and machine-level instructions change during many optimization/transformation passes as long as they preserve the execution semantics.

@mf-RDP
Copy link
Author

mf-RDP commented Jan 11, 2022

Regarding 'divzero', I implemented a feature that checks the divisor, creates an exn if 0 and throws it from the VM inside (This is more or less experimental and I will likely remove this. I just liked the idea.) My personality function would look for a named classtype (use RTTI) that would select a predefined catchpad for this.

I adapted my LSDA reader already. Works perfectly fine now. In fact, call sites differ (two values are removed as stated above) + some slight changes to the logic actions.

Special thanks for pointing to the clang -S platform disassembly, I was not aware it is so nice and detailed.

@aheejin
Copy link
Member

aheejin commented Jan 11, 2022

Regarding 'divzero', I implemented a feature that checks the divisor, creates an exn if 0 and throws it from the VM inside (This is more or less experimental and I will likely remove this. I just liked the idea.) My personality function would look for a named classtype (use RTTI) that would select a predefined catchpad for this.

I am not sure if I understand your approach. So when your VM reads a Wasm div instruction, it internally converts it to a sequence including a throw instruction at runtime? I don't think that is going to work then. The toolchain (LLVM) treats throwing and non-throwing instructions differently and is free to move non-throwing instructions in and out of try. If you change a non-throwing instruction to a throwing one at runtime, that changes the execution semantics.

Special thanks for pointing to the clang -S platform disassembly, I was not aware it is so nice and detailed.

You're welcome! :)

@mf-RDP
Copy link
Author

mf-RDP commented Jan 11, 2022

Divbyzero: No, I have an also 'own' JIT compiler. It creates machine code like this....

`(stack value0 in place lhs)
(stack value1 in place rhs)
[duplicate (value1)]
[if (duplicate of value1==0)]
[throw new EXN, i.e. jump to catch or escape frame]
[division]
[...]

`

This of course has a huge overhead which is the main reason I will remove this. Besides, it has other problems. But I had a nice prototype :)

@aheejin
Copy link
Member

aheejin commented Jan 11, 2022

I'm not sure what the difference between what I described and JIT is, but again, changing code at runtime can be at your own risk of violating the original code semantics. If you create a throw instruction outside of a try, you are changing the code semantics.

@aheejin
Copy link
Member

aheejin commented Jan 20, 2022

I'll close this. Please reopen if you have any remaining issues.

@aheejin aheejin closed this as completed Jan 20, 2022
@mf-RDP
Copy link
Author

mf-RDP commented Mar 5, 2022

Hi Ahn,

I think there is an issue with a recent change in lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp.
Compiling with -fwasm-exceptions will now always hit another assertion:

clang++ -fwasm-exceptions -DDEBUG -mbulk-memory -mmutable-globals -O0 -I.. -I.... -I. -Wl,--max-memory=327680 -Wl,--shared-memory -matomics -target wasm32-wasi --sysroot /project/wasi-libc/sysroot-intrin-bulk-memory -Wl,--export-all -Wl,-L. -Wl,--entry=__main_void -Wl,--allow-undefined -o llvm_e_simple.cpp.wasm llvm_e_simple.cpp
fatal error: error in backend: only -ftls-model=local-exec is supported for now on non-Emscripten OSes: variable __wasm_lpad_context
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0. Program arguments: C:\LLVM\BUILDLLDB\Release\bin\clang++.exe -cc1 -triple wasm32-unknown-wasi -emit-obj -mrelax-all --mrelax-relocations -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name llvm_e_simple.cpp -mrelocation-model static -mframe-pointer=none -ffp-contract=on -fno-rounding-math -mconstructor-aliases -target-feature +exception-handling -mllvm -wasm-enable-eh -target-cpu generic -target-feature +bulk-memory -target-feature +mutable-globals -target-feature +atomics -fvisibility hidden -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -fcoverage-compilation-dir=c:\project\ALP\WASM\exc\exc_newproposal -resource-dir C:\LLVM\BUILDLLDB\Release\lib\clang\15.0.0 -D DEBUG -I .. -I ..\.. -I . -isysroot /project/wasi-libc/sysroot-intrin-bulk-memory -internal-isystem C:\LLVM\BUILDLLDB\Release\lib\clang\15.0.0\include -internal-isystem /project/wasi-libc/sysroot-intrin-bulk-memory/include/wasm32-wasi -internal-isystem /project/wasi-libc/sysroot-intrin-bulk-memory/include -O0 -fdeprecated-macro -fdebug-compilation-dir=c:\project\ALP\WASM\exc\exc_newproposal -ferror-limit 19 -fmessage-length=240 -fgnuc-version=4.2.1 -fcxx-exceptions -fexceptions -exception-model=wasm -fcolor-diagnostics -o C:\Users\I\AppData\Local\Temp\llvm_e_simple-51cd15.o -x c++ llvm_e_simple.cpp

  1.  <eof> parser at end of file
    
  2.  Code generation
    
  3.  Running pass 'Function Pass Manager' on module 'llvm_e_simple.cpp'.
    
  4.  Running pass 'WebAssembly Instruction Selection' on function '@_Z10catchfunc1v'
    

After removing the assertion from WebAssemblyTargetLowering::LowerGlobalTLSAddress, things generally work again, but I discovered the use of global GOT.data.internal.__wasm_lpad_context is more or less effectless, it never gets assigned a value and remains always at it's initial 0.

Many Thanks & Kind Regards
S.

@mf-RDP mf-RDP changed the title LLVM [clang]: minor LSDA issue in WASM EH, bugreport LLVM [clang]: minor LSDA issue in WASM EH, bugreport. Mar 5, 2022
@mf-RDP
Copy link
Author

mf-RDP commented Mar 5, 2022

I just see, __wasm_lpad_context is probably just some TLS address to be filled in by host VM? However, we need to get rid of the assertion :)

@sbc100
Copy link
Member

sbc100 commented Mar 5, 2022

I think the problem is that -matomics and bulk memory are not yet support except with the emscripten triple. The wasi triple does not yet support threading or TLS.

Doe it work if you remove -mbulk-memory and -matomics?

@mf-RDP
Copy link
Author

mf-RDP commented Mar 6, 2022

I need both -mbulk-memory and -matomics otherwise also (my code uses these). As well as bare-metal -fwasm-exceptions. It did not work without -matomics either.

Solution is to simply remove the assertion WebAssemblyISelLowering.cpp ( if (GV->getThreadLocalMode() != GlobalValue::LocalExecTLSModel && !Subtarget->getTargetTriple().isOSEmscripten()) ...).

It is just not compatible to the change [WebAssembly] Make EH/SjLj vars unconditionally thread local
GV->setThreadLocalMode(GlobalValue::GeneralDynamicTLSModel);
It did work fine before.

Thank You.

@sbc100
Copy link
Member

sbc100 commented Mar 6, 2022

I need both -mbulk-memory and -matomics otherwise also (my code uses these).

What do you mean exactly when you say "my code uses these"? If you are talking about atomics usage in C/C++ those should not need these flags. LLVM should lower away all uses of TLS and atomics so in the final final. Or are you saying you are actaully trying to make a multi-threaded program with shared memory. WASI doesn't support multi threaded programs today so you would need to use the wasm32-unknown-emscripten triple if you want and actual shared memory multi-threaded program.

In summary:

  • If you are building a single threaded WASI program, you can use atomics in the source language without the need for -mbulk-memory or -matomics.
  • If you are trying to build an actual multi-threaded WASI program that is not something that is possible today (as the assertion says only emscripten support this today).

If you want to try adding support for multi-threaded WASI output (i don't know of any runtimes where you could run such a program today) then yes indeed, you would want to remove that assertion and build you own version of llvm.

@mf-RDP
Copy link
Author

mf-RDP commented Mar 6, 2022

I am developing an own VM with implementations for atomics, bulk-memory and exception handling (and SIMD). I was always using wasm32-unknown-unknown or wasm32-wasi. My VM does implement e.g. atomic RMWs, memory.copy etc.
It does not need shared memory, it does multithreading - but as an own implementation [nothing special from LLVM needed].

You can consider this as bare-metal compilation.

My issue is just that right now, in fact bare-metal compilation with wasm32-unknown-unknown is broken for -fwasm-exceptions. This is due to __wasm_lpad_context now conflicting with mentioned assertion. I think the assertion above should be changed to allow at least wasm32-unknown-unknown or even better, be removed (it just blocks from otherwise both perfect compilation and resulting binaries - all OK from what I've seen).

@sbc100
Copy link
Member

sbc100 commented Mar 6, 2022

Surely if you are implementing threading then you will need to use shared memory? IIRC atomic instructions require the use of a shared memory.

In any case, it sounds like you are trying to do something new/uncharted that is not yet supported by llvm. WebAssembly in LLVM does not yet support TLS except when using the emscripten triple is used because an official TLS ABI has yet be defined. The one use by emscripten is considered experimental. If you want to use the experimental TLS support then you need to use the emscripten triple. At least that is how it works today.

@mf-RDP
Copy link
Author

mf-RDP commented Mar 6, 2022

I just checked and yes, I can use the emscripten triple without any obvious problems. However, that is really not nice as my work (and probably other ppls work) has nothing to do with emscripten. I really just need wasm32-unknown-unknown or wasm64-unknown-unknown and wasi-libc, thus I used the wasi triple.

BTW, Threading works like this: heap above _heap_base is shared and below _heap_base (TLS and stack) is local per thread. load/store instructions take care of selecting the correct memory segment. The linker created flags are not needed.

Therefore, maybe you can modify or remove this assertion at some time. I would do so but I'm not a LLVM contributor as of now.

@sbc100
Copy link
Member

sbc100 commented Mar 6, 2022

I just checked and yes, I can use the emscripten triple without any obvious problems. However, that is really not nice as my work (and probably other ppls work) has nothing to do with emscripten. I really just need wasm32-unknown-unknown or wasm64-unknown-unknown and wasi-libc, thus I used the wasi triple.

You are also relying on experimental TLS support in LLVM so I don't think its true that you "just need wasm32-unknown-unknown". Adding TLS support to wasm32-unknown-unknown would require some consensus building on what that should look like.

One alternative is that perhaps we could add a flag to allow users of non-emscripten targets to opt into the experimental TLS support? I'm not what what that flag would look like or where it would get processed, so it might be easier for now to just use the emscripten triple if you want this support.

@mf-RDP
Copy link
Author

mf-RDP commented Mar 6, 2022

Ok. Can you provide an information or something where one can investigate what is different in -emscripten vs. -wasi or -unknown? So far I see the main(argc,argv) resolving into __main_void thing and how startup args are passed through.

@sbc100
Copy link
Member

sbc100 commented Mar 6, 2022

I'm afraid I don't know of any place where the differences are documented. My understanding is that there are just a few differences that mostly relate to features such as TLS and exception handling. The core ABI and codegen should not be effected by the triple.

If we were to add some documentation on the differences perhaps a good place to do that would might be: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/WebAssembly/README.txt.

@mf-RDP
Copy link
Author

mf-RDP commented Mar 6, 2022

Ok. No problem. From what I found out right now, differences are indeed minimal.

  • program entry ("main")
  • EH, but -emscripten fully supports both models from what I see and -emscripten + -fwasm-exceptions is just exactly the same

If I encounter serious difficulties, I would like to come back here.

Thank You.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants