Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linking.md: How to handle calls to weakly defined functions #46

Closed
sbc100 opened this issue Mar 6, 2018 · 11 comments
Closed

Linking.md: How to handle calls to weakly defined functions #46

sbc100 opened this issue Mar 6, 2018 · 11 comments

Comments

@sbc100
Copy link
Member

sbc100 commented Mar 6, 2018

We need to find a way to handle the following pattern, specifically in case where foo() is not defined at link time:

int foo() __attribute__((weak));

int main() {
  if (&foo) {
    return foo();
  }
  return 0;
}

Currently llvm will generate a call instruction to foo and a relocation at the call site to inject the function index. If foo turns out to be undefined at link time there is no valid function which the linker can insert.

@NWilson and I have discuess a few possible options (see https://reviews.llvm.org/D44028):

  1. Have the linker error out. This seems bad as the above code is valid and should link.
  2. Have the linker generate a synthetic function that abort (i.e. contains only an unreachable) and use the index of the synthetic function here. The doenside is that this adds complexity to the linker and that it would need to generate a new function for each signature.
  3. Have llc always emit call_indirect when calling weak references. Then bad FUNCTION_INDEX relocations against weak symbols in the object format.

(3) seems like the nicest so long as it is feasilbe in llvm and that use increased use of call_indirect doesn't cause a performance drop. My guess is that weak symbols are not that common, but it also seems valid to assume that declaring a symbol as weak should not have a runtime overhead.

@sbc100
Copy link
Member Author

sbc100 commented Mar 6, 2018

@binji
Copy link
Member

binji commented Mar 6, 2018

Can the compiler generate the synthetic function instead? If there is a naming convention, it can just use that to find the correct one.

@sbc100
Copy link
Member Author

sbc100 commented Mar 6, 2018

Interesting. That could work. I think @NWilson 's current linker patch doesn't add to much complexity, but maybe having the compiler do it would mean less synthetic function generation in the linker. The cost would be that we'd need to define some kind of convention. i.e. "For each weak undefined the symbol there much be an accompanying defined symbol that the linker can use".. this could be a naming convension or it could be in the symbol table it self.

@NWilson
Copy link
Contributor

NWilson commented Mar 6, 2018

For the record - I initially wanted (3) and wrote a patch on that basis. Then I changed my mind (based on thoughts from Sam) and went for (2) instead.

If we can do it in the linker, we should do it in the linker, since it's then more "universal", making fewer assumptions about the origin of the of the linker's inputs. If we solved the problem in LLC, we'd still have to do something about it in the linker to check/validate the input file, and we'd have to mention the corner-case in Linking.md (whether to allow or disallow it, the spec needs to say). So whatever we do it'll infect the linker, may as well keep LLC out of the mess if it doesn't have to be involved. That was my eventual thinking anyway. Just doing it in the LLD is probably the least fuss overall, even though doing it in LLC might make the LLD-specific change smaller.

NB: Bugzilla link to issue reported there in Jan: https://bugs.llvm.org/show_bug.cgi?id=36039

@NWilson
Copy link
Contributor

NWilson commented Mar 7, 2018

Here's another idea (which I don't necessarily favour, it's just for completeness): Rename R_WEBASSEMBLY_FUNCTION_INDEX_LEB to R_WEBASSEMBLY_CALL, and make it a 6-byte relocation not a 5-byte one. It doesn't just patch a 5-byte LEB operand, it specifically patches a 6-byte opcode+LEB sequence. Then, if we want to patch a "null" call, the relocation can be processed to write out 0x000101010101 (unreachable then five nops). So the call itself can be patched out to abort directly. If the linker has permission to go back a byte and remove the call instruction, that gets rid of the need for the synthetic stubs.

@dschuff
Copy link
Member

dschuff commented Mar 7, 2018

Is the linker currently already generating the wasm_nullptr function (i.e. the thing that gets index 0 in the table)? Having the linker generate dummy weak functions isn't too much different from that, is it? Presumably they could even use the same stubs?

@sbc100
Copy link
Member Author

sbc100 commented Mar 7, 2018

For the null table entry there is no need to generate anything, we simply leave the entry empty to get the desired effect.

@binji
Copy link
Member

binji commented Mar 7, 2018

@NWilson I had the same thought about patching the call into unreachable. I kind of like this idea, but patching to a synthesized function that traps is probably clearer, assuming we can get a stack trace.

@NWilson
Copy link
Contributor

NWilson commented Mar 7, 2018

The idea of patching the entire call instruction rather than just the SLEB operand is ugly because it would introduce a third "type" of relocation (5-byte SLEB / 4-byte I32 / 6-byte CALL) for not very good reasons. More code overall I'd reckon, since it hooks deeper into LLC.

The better messaging and nicer stack traces for the "stub" approach are another plus ("error: trap in function __undefined" or similar would at least indicate that something was undefined).

@sbc100
Copy link
Member Author

sbc100 commented Mar 10, 2018

This is now fixed I think. Via linker-generated stubs.

@sbc100 sbc100 closed this as completed Mar 10, 2018
@NWilson
Copy link
Contributor

NWilson commented Mar 10, 2018

Thanks! The corresponding update to Linking.md needs to be merged in #47

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants