Describe the relocations that LLVM is currently using. #7

Merged 6 commits on Mar 13, 2017. Showing changes from 2 commits.
85 changes: 53 additions & 32 deletions Linking.md
@@ -19,7 +19,7 @@ In order to achieve this the following tasks need to be performed:

The linking technique described here is designed to be fast, and avoids having
to disassemble the code section. The relocation information required by
the linker is stored in custom sections whos names begin with "reloc.". For
the linker is stored in custom sections whose names begin with "reloc.". For
each section that requires relocation a "reloc" section will be present in the
wasm file. By convention the reloc section names end with the name of the section
that they refer to: e.g. "reloc.CODE" for code section relocations. However
@@ -29,7 +29,12 @@ encoded in the reloc section itself.
Relocation Sections
-------------------

A "reloc" section is defined as:
A relocation section is a user-defined section with a name starting with
"reloc." Relocation sections start with an identifier specifying which
section they apply to, and must be sequenced in the module after that
section.

Relocation sections contain the following fields:

| Field | Type | Description |
| -----------| ------------------- | ------------------------------ |
@@ -39,47 +44,63 @@ A "reloc" section is defined as:
| count | `varuint32` | count of entries to follow |
| entries | `relocation_entry*` | sequence of relocation entries |

a `relocation_entry` is:
A `relocation_entry` is:

| Field | Type | Description |
| -------- | ------------------- | ------------------------------ |
| type | `varuint32` | the relocation type |

A relocation type can be one of the following:

- `0 / R_FUNCTION_INDEX` - a function index encoded as an LEB128. Used
for the immediate argument of a `call` instruction in the code section.
- `1 / R_TABLE_INDEX` - a table index encoded as an SLEB128. Used
for the immediates that refer to the table index space. e.g. loading the
address of the function using `i32.const`.
- `2 / R_GLOBAL_INDEX` - a global index encoded as an LEB128. Points to
the immediate value of `get_global` / `set_global` instructions.
- `3 / R_DATA` - an index into the global space which is used store the address
of a C global

For relocation types other than `R_DATA` the following fields are present:
- `0 / R_WEBASSEMBLY_FUNCTION_INDEX_LEB` - a function index encoded as a 5-byte
[varuint32]. Used for the immediate argument of a `call` instruction.
- `1 / R_WEBASSEMBLY_TABLE_INDEX_SLEB` - a function table index encoded as a
5-byte [varint32]. Used to refer to the immediate argument of a `i32.const`
instruction, e.g. taking the address of a function.
- `2 / R_WEBASSEMBLY_TABLE_INDEX_I32` - a function table index encoded as a
[uint32], e.g. taking the address of a function in a static data initializer.
- `3 / R_WEBASSEMBLY_GLOBAL_ADDR_LEB` - a global index encoded as a 5-byte
[varuint32]. Used for the immediate argument of a `load` or `store`
instruction, e.g. directly loading from or storing to a C++ global.
- `4 / R_WEBASSEMBLY_GLOBAL_ADDR_SLEB` - a global index encoded as a 5-byte
[varint32]. Used for the immediate argument of a `i32.const` instruction,
e.g. taking the address of a C++ global.
- `5 / R_WEBASSEMBLY_GLOBAL_ADDR_I32` - a global index encoded as a [uint32],
e.g. taking the address of a C++ global in a static data initializer.

[varuint32]: https://github.com/WebAssembly/design/blob/master/BinaryEncoding.md#varuintn
[varint32]: https://github.com/WebAssembly/design/blob/master/BinaryEncoding.md#varintn
[uint32]: https://github.com/WebAssembly/design/blob/master/BinaryEncoding.md#uintn
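
As an illustration only, the six relocation types listed above can be written down as an enum. This is a minimal Python sketch, not code from LLVM or from this PR: the helper names `patched_width` and `has_addend` are hypothetical, and the widths assume that a padded [varuint32]/[varint32] always occupies 5 bytes and a [uint32] occupies 4 bytes.

```python
from enum import IntEnum


class RelocType(IntEnum):
    """Relocation type codes as listed in the proposed Linking.md text."""
    R_WEBASSEMBLY_FUNCTION_INDEX_LEB = 0
    R_WEBASSEMBLY_TABLE_INDEX_SLEB = 1
    R_WEBASSEMBLY_TABLE_INDEX_I32 = 2
    R_WEBASSEMBLY_GLOBAL_ADDR_LEB = 3
    R_WEBASSEMBLY_GLOBAL_ADDR_SLEB = 4
    R_WEBASSEMBLY_GLOBAL_ADDR_I32 = 5


def patched_width(reloc_type):
    """Number of bytes occupied by the value at the relocation site.

    Hypothetical helper: _I32 relocations target a fixed 4-byte uint32,
    the LEB/SLEB ones target a 5-byte padded encoding.
    """
    name = RelocType(reloc_type).name
    return 4 if name.endswith("_I32") else 5


def has_addend(reloc_type):
    """Hypothetical helper: True for the GLOBAL_ADDR family of relocations."""
    return RelocType(reloc_type).name.startswith("R_WEBASSEMBLY_GLOBAL_ADDR")
```

Keeping the width next to the type lets a linker check that a rewrite never changes the size of the section it patches.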

For `R_WEBASSEMBLY_FUNCTION_INDEX_LEB`, `R_WEBASSEMBLY_TABLE_INDEX_SLEB`,
and `R_WEBASSEMBLY_TABLE_INDEX_I32` relocations the following fields are
present:

| Field | Type | Description |
| ------ | ---------------- | ---------------------------------------- |
| offset | `varuint32` | offset of the value to rewrite |
| index | `varuint32` | the index of the function used |

For `R_WEBASSEMBLY_GLOBAL_ADDR_LEB`, `R_WEBASSEMBLY_GLOBAL_ADDR_SLEB`,
and `R_WEBASSEMBLY_GLOBAL_ADDR_I32` relocations the following fields are
present:

| Field | Type | Description |
| ------ | ---------------- | ----------------------------------- |
| offset | `varuint32` | offset of [S]LEB within the section |

For `R_DATA` relocations the following fields are present:

| Field | Type | Description |
| ------------- | ----------------- | ------------------------------ |
| global\_index | `varuint32` | the index of the global used |
| offset | `varuint32` | offset of the value to rewrite |
| index | `varuint32` | the index of the global used |
| addend | `varint32` | addend to add to the address |
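
The entry layout in the tables above could be decoded roughly as follows. This is a sketch under assumptions, not the reference reader: it assumes the fields appear in the order `type`, `offset`, `index`, and that the `addend` field is present only for the three `R_WEBASSEMBLY_GLOBAL_ADDR_*` types (codes 3, 4 and 5). The function names are hypothetical.

```python
def read_varuint(buf, pos):
    """Decode an unsigned LEB128 value from buf starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if byte & 0x80 == 0:
            return result, pos


def read_varint(buf, pos):
    """Decode a signed LEB128 value from buf starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if byte & 0x80 == 0:
            if byte & 0x40:  # sign bit of the final byte
                result -= 1 << shift
            return result, pos


# Assumption: only the GLOBAL_ADDR relocation types carry an addend.
GLOBAL_ADDR_TYPES = {3, 4, 5}


def read_relocation_entry(buf, pos):
    """Decode one relocation_entry; returns (entry dict, next position)."""
    rtype, pos = read_varuint(buf, pos)
    offset, pos = read_varuint(buf, pos)
    index, pos = read_varuint(buf, pos)
    entry = {"type": rtype, "offset": offset, "index": index}
    if rtype in GLOBAL_ADDR_TYPES:
        entry["addend"], pos = read_varint(buf, pos)
    return entry, pos
```

For example, the bytes `04 80 01 03 7C` would decode under these assumptions as a type 4 entry at offset 128 referring to global 3 with an addend of -4.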

Merging Global Section
----------------------

Merging of globals sections requires re-numbering of the globals. To enable
this an `R_GLOBAL_INDEX` entry in the `reloc` section is generated for each
`get_global` / `set_global` instruction. The immediate values of all
`get_global` / `set_global` instructions are stored as padded LEB128 such that
they can be rewritten without altering the size of the code section. The
relocation points to the offset of the padded immediate value within the code
section, allowing the linker to both read the current value and write an updated
one.
Merging of globals sections requires re-numbering of the globals.

This convention requires the first global in the global section to be a mutable
i32 global initialized to "STACKTOP" in the "env" module. During linking, only
one of these globals is kept, and it remains the first global. This is a
simple convention which allows code to reference global `0` without needing to
be rewritten.
Member

What about the fact that any imported globals come first in the index space of globals? Are we relying on the fact that there will be no imported globals in the final linked executable?


Merging Function Sections
-------------------------
@@ -95,9 +116,9 @@ stored in the code section:

The immediate arguments of all such instructions are stored as padded LEB128
such that they can be rewritten without altering the size of the code section.
For each such instruction a `R_FUNCTION_INDEX_LEB` or `R_FUNCTION_INDEX_SLEB`
`reloc` entry is generated pointing to the offset of the immediate within the
code section.
For each such instruction a `R_WEBASSEMBLY_FUNCTION_INDEX_LEB` or
`R_WEBASSEMBLY_TABLE_INDEX_SLEB` `reloc` entry is generated pointing to the
offset of the immediate within the code section.
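
To make the padding idea concrete, here is a minimal Python sketch of rewriting a function index in place. It assumes the immediate is always stored as a 5-byte padded unsigned LEB128, as the `_LEB` relocation descriptions state. The names `encode_padded_varuint32` and `patch_function_index` are hypothetical and do not come from any existing linker.

```python
def encode_padded_varuint32(value):
    """Encode value as a fixed-width 5-byte unsigned LEB128."""
    if not 0 <= value < (1 << 32):
        raise ValueError("value does not fit in 32 bits")
    out = bytearray()
    for i in range(5):
        byte = (value >> (7 * i)) & 0x7F
        if i < 4:
            byte |= 0x80  # continuation bit on every byte except the last
        out.append(byte)
    return bytes(out)


def patch_function_index(code, offset, new_index):
    """Overwrite the 5-byte immediate at offset without resizing code."""
    code[offset:offset + 5] = encode_padded_varuint32(new_index)


# A `call` instruction (opcode 0x10) with a padded immediate of 1,
# followed by `end` (0x0b).
code = bytearray([0x10]) + encode_padded_varuint32(1) + bytearray([0x0B])
patch_function_index(code, 1, 300)
```

Because the replacement is always exactly 5 bytes, the offsets recorded in every other relocation entry stay valid after the patch.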

The same technique applies for all function calls whether the function is
imported or defined locally.