[mono] Update runtime docs. (dotnet#99699)
* [mono] Update wasm docs.

* Add AOT docs.

* Fix typos.

* Add documentation for wrappers.

* Fix typos.
vargaz authored Mar 22, 2024
1 parent 6940c19 commit 51782dc
Showing 3 changed files with 290 additions and 7 deletions.
144 changes: 144 additions & 0 deletions docs/design/mono/aot.md
@@ -0,0 +1,144 @@
# Ahead of Time Compilation

## Introduction

The mono Ahead of Time (AOT) compiler enables the compilation of the IL code in a .NET assembly to
a native object file. This file is called an AOT image. This AOT image can be used by the runtime to avoid
having to JIT the IL code.

## Usage

The AOT compiler is integrated into the mono runtime executable, and can be run using the `--aot` command
line argument, i.e.
`<mono-executable> --aot HelloWorld.dll`

## Source code structure

- `aot-compiler.c`: The AOT compiler
- `aot-runtime.c`: Code used at runtime to load AOT images
- `image-writer.c`: Support code for emitting textual assembly
- `dwarfwriter.c`: Support code for emitting DWARF debug info

## Configurations

### Desktop AOT

In this mode, the AOT compiler creates a platform shared object file (.so/.dylib), e.g. `HelloWorld.dll.so`. During execution, when
an assembly is loaded, the runtime loads the corresponding shared object and uses it to avoid having to JIT the methods in the
assembly.

Emission of the native code is done by first emitting an assembly (.s) file, then compiling and linking it with the system tools
(`as`/`ld`, or `clang`).

### Static AOT

In this mode, the AOT compiler creates a platform object file (.o). This file needs to be linked into the application and registered
with the runtime.

Static compilation is enabled by using the `static` aot option, e.g. `--aot=static,...`. The resulting object file contains a linking
symbol named `mono_aot_module_<assembly name>_info`. This symbol needs to be passed to a runtime function before the
runtime is initialized, e.g.:
`mono_aot_register_module (mono_aot_module_HelloWorld_info);`

### Full AOT

In this mode, which can be combined with the other modes, the compiler generates additional code which enables the runtime to
function without any code being generated at runtime. This includes 2 types of code:
- code for 'extra' methods, i.e. generic instances, runtime generated wrapper methods, etc.
- trampolines

This is enabled by using the `full` aot option, e.g. `--aot=full,...`. At runtime, all assemblies need to have a full-aot-ed AOT image
present in order for the app to work. This is used on platforms which don't allow runtime code generation, like iOS.

### LLVM support

LLVM support can be enabled using the `llvm` aot option, e.g. `--aot=llvm`. In this mode, instead of generating native code,
the AOT compiler generates an LLVM bitcode (.bc) file, then compiles it to native code using the `opt`/`llc` LLVM tools. The
various AOT data structures are also emitted into the .bc file instead of as assembly.
Since the LLVM backend currently doesn't support all .NET methods, a smaller assembly file is still emitted, and linked together
with the `opt`/`llc` compiled object file into the final shared object file.

## Versioning

The generated AOT images have a dependency on the exact version of the input assembly used to generate them and on the versions of all the
referenced assemblies. This means the GUIDs of the assemblies have to match. If there is a mismatch, the AOT image will fail to load.
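
The check described above can be pictured with a small C sketch. This is only an illustration: the `ToyAotImageInfo` structure, the `toy_aot_image_usable` function and the string GUIDs are hypothetical and do not mirror the real `MonoAotFileInfo` layout or the runtime's loader code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record of what an AOT image was built against. */
typedef struct {
    const char *assembly_guid;            /* GUID of the input assembly */
    const char *const *referenced_guids;  /* GUIDs of the referenced assemblies */
    size_t n_referenced;
} ToyAotImageInfo;

/* Returns 1 if the image matches the assemblies loaded at runtime, 0 otherwise. */
int toy_aot_image_usable(const ToyAotImageInfo *info,
                         const char *loaded_guid,
                         const char *const *loaded_referenced_guids,
                         size_t n_loaded_referenced)
{
    assert(info != NULL);
    if (strcmp(info->assembly_guid, loaded_guid) != 0)
        return 0; /* the assembly itself was rebuilt */
    if (info->n_referenced != n_loaded_referenced)
        return 0;
    for (size_t i = 0; i < info->n_referenced; i++) {
        if (strcmp(info->referenced_guids[i], loaded_referenced_guids[i]) != 0)
            return 0; /* a referenced assembly changed */
    }
    return 1;
}
```

In this model, rebuilding either the assembly or any of its references changes a GUID and makes the image unusable, which is the behavior the paragraph describes.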

## File structure

The AOT image exports one symbol named `mono_aot_module_<assembly name>_info` which points to a `MonoAotFileInfo` structure,
which contains pointers to the tables/structures. The AOT image contains:
- the native code
- data structures required to load the code
- cached data intended to speed up runtime operation

The AOT image contains serialized versions of many .NET objects like methods/types etc. This uses ad-hoc binary encodings.

## Runtime support

The `aot-runtime.c` file contains the runtime support for loading AOT images.

### Loading AOT images

When an assembly is loaded, the corresponding AOT image is either loaded using the system dynamic linker (`dlopen`), or
found among the statically linked AOT images.

### Loading methods

Every method in the AOT image is assigned an index. The AOT methods corresponding to 'normal' .NET methods are assigned
an index corresponding to their metadata token index, while the 'extra' methods are assigned subsequent indexes. There is
a hash table inside the AOT image mapping extra methods to their AOT indexes. Loading a method consists of
- finding its method index
- finding the method code/data corresponding to the method index

The mapping from method index to the code is done in an architecture specific way, designed to minimize the amount of
runtime relocations in the AOT image. In some cases, this involves generating an extra table with assembly call instructions to
all the methods, then disassembling this table at runtime.
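
The two lookup steps can be sketched in C as follows. This is a toy model under stated assumptions: the table names, the string descriptions and the linear search (standing in for the hash table) are hypothetical, and the "row minus one" rule for normal methods is an assumption about how the metadata token index maps to a method index.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: these names and layouts are hypothetical, not mono's. */
#define N_NORMAL_METHODS 3  /* methods present in the assembly metadata */
#define N_EXTRA_METHODS 2   /* generic instances, wrappers, etc. */

typedef struct {
    const char *description; /* stands in for a serialized method description */
    int method_index;        /* index assigned by the AOT compiler */
} ExtraMethodEntry;

/* 'Extra' methods receive the indexes following the normal methods. */
const ExtraMethodEntry extra_methods[N_EXTRA_METHODS] = {
    { "List<int>.Add", N_NORMAL_METHODS + 0 },
    { "runtime_invoke_void", N_NORMAL_METHODS + 1 },
};

/* Integers stand in for code addresses; the table is indexed by method index. */
const int method_code[N_NORMAL_METHODS + N_EXTRA_METHODS] = { 100, 101, 102, 200, 201 };

/* Normal methods: assume the index is the 1-based metadata row minus one. */
int index_for_normal_method(int token_row)
{
    assert(token_row >= 1 && token_row <= N_NORMAL_METHODS);
    return token_row - 1;
}

/* Extra methods: a linear search stands in for the hash table in the image. */
int index_for_extra_method(const char *description)
{
    for (size_t i = 0; i < N_EXTRA_METHODS; i++) {
        if (strcmp(extra_methods[i].description, description) == 0)
            return extra_methods[i].method_index;
    }
    return -1; /* not present in this AOT image */
}

/* Second step: map the method index to its code. */
int code_for_method_index(int method_index)
{
    assert(method_index >= 0 && method_index < N_NORMAL_METHODS + N_EXTRA_METHODS);
    return method_code[method_index];
}
```

The sketch leaves out the architecture specific part: in a real image the index-to-code mapping is not a plain table of addresses, since that would need one relocation per method.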



### Runtime constants

The generated code needs to access data which is only available at runtime. For example, for an `ldstr "Hello"` instruction, the
`"Hello"` string is a runtime constant.

These constants are stored in a global table called the GOT which is modelled after the Global Offset Table in ELF images. The GOT
table contains pointers to runtime objects. The AOT image contains descriptions of these runtime objects so the AOT runtime can
compute them. The entries in the GOT are initialized either when the AOT image is loaded (for frequently used entries), or before
the method which uses them is first executed.
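
A minimal C sketch of lazily initialized GOT slots follows. All names here (`GotSlotInfo`, `resolve_slot`, `init_method_got_slots`) are hypothetical, and a string literal stands in for the runtime object that the real AOT runtime would compute from the slot description.

```c
#include <assert.h>
#include <stddef.h>

#define GOT_SIZE 4

/* Hypothetical description of a runtime object, as stored in the AOT image. */
typedef struct {
    const char *literal; /* e.g. the string used by an ldstr instruction */
} GotSlotInfo;

static const GotSlotInfo got_info[GOT_SIZE] = {
    { "Hello" }, { "World" }, { "foo" }, { "bar" },
};

/* The GOT itself: NULL means "not resolved yet". */
static const void *got[GOT_SIZE];
static int resolve_count;

/* Stands in for the runtime computing the object from its description. */
const void *resolve_slot(int slot)
{
    resolve_count++;
    return got_info[slot].literal;
}

/* Called before a method is first executed, with the slots it uses. */
void init_method_got_slots(const int *slots, size_t n_slots)
{
    for (size_t i = 0; i < n_slots; i++) {
        int slot = slots[i];
        assert(slot >= 0 && slot < GOT_SIZE);
        if (got[slot] == NULL)
            got[slot] = resolve_slot(slot);
    }
}
```

Slots not used by any executed method stay unresolved, and initializing the same method twice does not resolve its slots again.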

### Initializing methods

Before an AOTed method can be executed, it might need some initialization. This involves:
- executing its class cctor
- initializing the GOT slots used by the method

For methods compiled by the mono JIT, initialization is done when the method is loaded. This means that it's not possible to
have direct calls between methods. Instead, calls between methods go through small pieces of generated code called PLT
(Program Linkage Table) entries, which transfer control to the runtime, which loads the called method before executing it.
For methods compiled by LLVM, the method entry contains a call to the runtime which initializes the method.
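
The PLT idea can be modelled in plain C with a function pointer that is filled in on first use. This is a sketch only: real PLT entries are small pieces of machine code, and the names `PltEntry`, `toy_load_method` and `call_through_plt` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*MethodFn)(int);

/* The callee; in a real image this would be AOT compiled code. */
int square(int x)
{
    return x * x;
}

int methods_loaded; /* counts how often the "runtime" had to load a method */

/* Stands in for the runtime loading and initializing the called method. */
MethodFn toy_load_method(int plt_index)
{
    assert(plt_index == 0); /* this toy image has a single callee */
    methods_loaded++;
    return square;
}

/* One entry per call target; target is NULL until the first call. */
typedef struct {
    MethodFn target;
} PltEntry;

PltEntry plt[1];

/* Callers never call the method directly, only through its PLT entry. */
int call_through_plt(int plt_index, int arg)
{
    PltEntry *entry = &plt[plt_index];
    if (entry->target == NULL)
        entry->target = toy_load_method(plt_index); /* slow path, first call only */
    return entry->target(arg);
}
```

After the first call the entry points straight at the callee, so later calls skip the runtime.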

## Trampolines

In full-aot mode, the AOT compiler needs to emit all the trampolines which will be used at runtime. This is done in
the following way:
- For most trampolines, the AOT compiler calls the normal trampoline creation function with the `aot` argument set
to TRUE, then saves the returned native code into the AOT image, along with some relocation information like the
GOT slots used by the trampolines.
- For some small trampolines, the AOT compiler directly emits platform specific assembly.

The runtime might require an unbounded number of certain trampolines, but the AOT image can only contain a fixed
number of them. To solve this problem, on some platforms (iOS), it's possible to have infinite trampolines. This is
implemented by emitting a different version of these trampolines which reference their corresponding data using
relative addressing. At runtime, a page of these trampolines is mapped using `mmap` next to a writable page
which contains their corresponding data. The same page of trampolines is mapped multiple times at multiple
addresses.
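
The relative addressing trick can be illustrated without `mmap` by a toy model in which each "mapping" is a flat array whose first half plays the role of the trampoline page and whose second half plays the role of the adjacent data page. The names and the page size below are hypothetical, and integers stand in for both code and data.

```c
#include <assert.h>

#define SLOTS_PER_PAGE 4 /* hypothetical: trampolines per page */

/* words[0..3] model the trampoline page, words[4..7] the data page after it. */
typedef struct {
    int words[2 * SLOTS_PER_PAGE];
} TrampolineMapping;

/* Address of trampoline i inside one mapping. */
const int *trampoline_at(const TrampolineMapping *mapping, int i)
{
    assert(i >= 0 && i < SLOTS_PER_PAGE);
    return &mapping->words[i];
}

/* The runtime writes a trampoline's argument into the writable data page. */
void set_trampoline_arg(TrampolineMapping *mapping, int i, int arg)
{
    assert(i >= 0 && i < SLOTS_PER_PAGE);
    mapping->words[SLOTS_PER_PAGE + i] = arg;
}

/* What every trampoline does: read the data one "page" past its own address. */
int trampoline_read_arg(const int *self)
{
    return self[SLOTS_PER_PAGE];
}
```

Because the trampoline only uses an offset from its own address, the same trampoline contents work at every address where the page is mapped, and each mapping gets its own data.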

## Cross compilation

It's possible to use the AOT compiler to target a platform different than the host. This requires a separate cross compiler
build of the runtime.
The generated code depends on offsets inside runtime structures like `MonoClass`/`MonoVTable` etc. which could
differ between the host and the target. This is handled by having a tool called the offsets-tool, which is a Python
script which uses the clang Python interface to compute and emit a C header file containing these offsets. The header
file is passed as a cmake argument during the runtime build. Inside the runtime code, the `MONO_STRUCT_OFFSET`
C macro reads the data from the offsets file to produce the offset corresponding to the target platform.
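
The following C sketch shows the general shape such a macro could take. It is not the real `MONO_STRUCT_OFFSET` definition: `ToyClass`, `TOY_STRUCT_OFFSET`, `TOY_CROSS_COMPILING` and the offset value `8` are all hypothetical stand-ins for what a generated offsets header would provide.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical runtime structure whose layout differs between platforms. */
typedef struct {
    void *parent;
    int flags;
    void *vtable;
} ToyClass;

#ifdef TOY_CROSS_COMPILING
/* In a cross build, a generated header would supply target offsets like this. */
#define TOY_OFFSET_ToyClass_vtable 8 /* made-up value for a 32 bit target */
#define TOY_STRUCT_OFFSET(type, field) (TOY_OFFSET_##type##_##field)
#else
/* In a native build, host and target agree, so the compiler can compute it. */
#define TOY_STRUCT_OFFSET(type, field) ((int) offsetof(type, field))
#endif

/* Code generators ask for offsets through the macro, never through offsetof. */
int toy_vtable_offset(void)
{
    int offset = TOY_STRUCT_OFFSET(ToyClass, vtable);
    assert(offset >= 0);
    return offset;
}
```

Routing every offset through one macro means the code generator does not need to know whether it is running natively or as a cross compiler.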
110 changes: 110 additions & 0 deletions docs/design/mono/runtime-ilgen.md
@@ -0,0 +1,110 @@
# IL generation at runtime

## Introduction

The mono runtime makes extensive use of generating IL methods at runtime. These
methods are called 'wrappers' in the runtime code, because some of them 'wrap' other
methods, like a managed-to-native wrapper would wrap the native function being called.
Wrappers have the `MonoMethod.wrapper_type` field set to the type of the wrapper.

## Source code structure

- `wrapper-types.h`: Enumeration of wrapper types
- `marshal*`: Functions for generating wrappers
- `method-builder*`: Low level functions for creating new IL methods/code at runtime

## WrapperInfo

Every wrapper has an associated `WrapperInfo` structure which describes the wrapper.
This can be retrieved using the `mono_marshal_get_wrapper_info ()` function.
Some wrappers have subtypes, which are stored in `WrapperInfo.subtype`.

## Caching wrappers

Wrappers should be unique, i.e. there should be only one instance of every wrapper. This is
achieved by caching wrappers in wrapper type specific hash tables, which are stored in
`MonoMemoryManager.wrapper_caches`.
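
A minimal C sketch of per-type wrapper caching is shown below. The `Wrapper` structure, the `get_wrapper` function and the linked lists (standing in for hash tables) are hypothetical and are not the actual contents of `MonoMemoryManager.wrapper_caches`.

```c
#include <assert.h>
#include <stdlib.h>

#define N_WRAPPER_TYPES 4 /* hypothetical number of wrapper types */

typedef struct Wrapper {
    int wrapper_type;
    const void *wrapped;  /* the method this wrapper was created for */
    struct Wrapper *next; /* next entry in the same cache */
} Wrapper;

/* One cache per wrapper type; a linked list stands in for a hash table. */
static Wrapper *wrapper_caches[N_WRAPPER_TYPES];

/* Returns the unique wrapper of the given type for 'wrapped', creating it once. */
Wrapper *get_wrapper(int wrapper_type, const void *wrapped)
{
    assert(wrapper_type >= 0 && wrapper_type < N_WRAPPER_TYPES);

    for (Wrapper *w = wrapper_caches[wrapper_type]; w != NULL; w = w->next) {
        if (w->wrapped == wrapped)
            return w; /* cache hit: reuse the existing instance */
    }

    Wrapper *w = calloc(1, sizeof *w);
    if (w == NULL)
        return NULL;
    w->wrapper_type = wrapper_type;
    w->wrapped = wrapped;
    w->next = wrapper_caches[wrapper_type];
    wrapper_caches[wrapper_type] = w;
    return w;
}
```

Asking twice for the same wrapper type and method yields the same pointer, which is the uniqueness property described above. The toy never frees its entries, so it only suits caches that live as long as the process.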

## Generics and wrappers

Wrappers for generic instances should be created by doing:
instance method -> generic method definition -> generic wrapper -> inflated wrapper

## AOT support

In full-aot mode, the AOT compiler will collect and emit the wrappers needed by the
application at runtime. This involves serializing/deserializing the `WrapperInfo` structure.

## Wrapper types

### Managed-to-native

These wrappers are used to make calls to native code. They are responsible for marshalling
arguments and result values, setting up EH structures etc.

### Native-to-managed

These wrappers are used to call managed methods from native code. When a delegate is passed to
native code, the native code receives a native-to-managed wrapper.

### Delegate-invoke

Used to handle more complicated cases of delegate invocation that the fastpaths in the JIT can't handle.

### Synchronized

Used to wrap synchronized methods. The wrapper does the locking.

### Runtime-invoke

Used to implement `mono_runtime_invoke ()`.

### Dynamic-method

These are not really wrappers, but methods created by user code using the `DynamicMethod` class.

Note that these have no associated `WrapperInfo` structure.

### Alloc

SGEN allocator methods.

### Write-barrier

SGEN write barrier methods.

### Castclass

Used to implement complex casts.

### Stelemref

Used to implement stelem.ref.

### Unbox

Used to unbox the receiver before calling a method.

### Managed-to-managed/other

The rest of the wrappers, distinguished by their subtype.

#### String-ctor

Used to implement string ctors. The first argument is ignored, and a new string is allocated.

#### Element-addr

Used to implement ldelema in multi-dimensional arrays.

#### Generic-array-helper

Used to implement the implicit interfaces on arrays, like `IList<T>`. These wrappers delegate to helper methods on the `Array` class.

#### Structure-to-ptr

Used to implement Marshal.StructureToPtr.

#### Ptr-to-structure

Used to implement Marshal.PtrToStructure.
43 changes: 36 additions & 7 deletions docs/design/mono/wasm-aot.md
@@ -6,15 +6,29 @@ The LLVM backend of the Mono JIT is used to generate an llvm .bc file for each a
compiled to webassembly using emscripten, then the resulting wasm files are linked into the final app. The 'bitcode'/'llvmonly'
variant of the LLVM backend is used since webassembly doesn't support inline assembly etc.

## Source code structure

- `mini-llvm.c`: The LLVM backend.
- `mini-wasm.h/c`: The wasm backend. This is a minimal version of a normal mono JIT backend which only supports llvm.
- `llvm-runtime.cpp`: Code to throw/catch C++ exceptions.
- `aot-runtime-wasm.c`: Code related to interpreter/native transitions on wasm.
- `llvmonly-runtime.c`: Runtime support for the generated AOT code.

WASM specific code is inside `HOST_WASM/TARGET_WASM` defines.

## GC Support

On wasm, the execution stack is not stored in linear memory, so it's not possible to scan it for GC references. However, there
is an additional C stack which stores variables whose addresses are taken. Variables which hold GC references are marked as
'volatile' in the llvm backend, forcing llvm to spill those to the C stack so they can be scanned.
is an additional C stack in linear memory which is managed explicitly by the generated wasm code. This stack is already
scanned by the mono GC as on other platforms.
To make GC references in AOTed methods visible to the GC, every method allocates a gc_pin area in its prolog, and
stores arguments/locals with a reference type into it. This will cause the GC to pin those references so the rest of
the generated code can treat them normally as LLVM values.

## Interpreter support

It's possible for AOTed and interpreted code to interop; this is called mixed mode.
On wasm, the two supported execution modes are interpreter, or aot+interpreter. This means it's always
possible to fall back to the interpreter if needed.
For the AOT -> interpreter case, every call from AOTed code which might end up in the interpreter is
emitted as an indirect call. When the callee is not found, a wrapper function is used which
packages up the arguments into an array and passes control to the interpreter.
@@ -24,6 +38,22 @@ AOTed code. There is usually one aot->interp and interp->aot wrapper for each si
some sharing. These wrappers are generated by the AOT compiler when the 'interp' aot option
is used.

## Exception handling

On wasm, it's not possible to walk the stack, so the normal mono exception handling/unwind code
cannot be used as is. It's also hard to map the .NET exception handling concepts like filter clauses
to the llvm concepts. Instead, C++/wasm exceptions are used to implement unwinding, and the
interpreter is used to execute EH code.
When an exception needs to be thrown, we store the exception info in TLS, and throw a dummy C++ exception instead.
Internally, this is implemented by emscripten either by calling into JS, or by using the wasm exception handling
spec.
The C++ exception is caught in the generated AOT code using the relevant llvm catch instructions. Then execution is
transferred to the interpreter. This is done by creating a data structure on the stack containing all the IL level state like
the IL offset and the values of all the IL level variables. The generated code continuously updates this state during
execution. When an exception is caught, this IL state is passed to the interpreter which continues execution from
that point. This process is called `deopt` in the runtime code.
Exceptions are also caught in various other places like the interpreter-aot boundary.
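
The control flow described above can be modelled in C, with `setjmp`/`longjmp` standing in for the C++/wasm exception and a plain function standing in for the interpreter. Everything here is a hypothetical toy (`ToyILState`, `toy_throw`, `toy_interp_resume`, `toy_aot_method`); a global variable plays the role of the exception info stored in TLS.

```c
#include <assert.h>
#include <setjmp.h>

/* IL level state the generated code keeps up to date while it runs. */
typedef struct {
    int il_offset;
    int local0;
} ToyILState;

static jmp_buf toy_catch_point;
static int toy_pending_exception; /* stands in for the exception info in TLS */

/* Models throwing: record the real exception, then unwind with a dummy one. */
void toy_throw(int exception_code)
{
    toy_pending_exception = exception_code;
    longjmp(toy_catch_point, 1);
}

/* Models the interpreter running the handler from the saved IL state. */
int toy_interp_resume(ToyILState state)
{
    assert(state.il_offset >= 0);
    return -state.local0; /* the "catch clause" negates the local */
}

/* Models an AOTed method with a try/catch around its body. */
int toy_aot_method(int arg)
{
    /* volatile so the values survive the longjmp back into this frame */
    volatile ToyILState state = { 0, arg };

    if (setjmp(toy_catch_point) != 0) {
        /* caught: hand the IL state to the "interpreter" (the deopt step) */
        ToyILState saved = { state.il_offset, state.local0 };
        return toy_interp_resume(saved);
    }

    state.local0 = arg * 2;
    state.il_offset = 1;
    if (state.local0 > 10)
        toy_throw(7);
    state.il_offset = 2;
    return state.local0;
}
```

The cost the design carries is visible even in the toy: the method has to store its IL state to memory as it runs, whether or not an exception is ever thrown.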

## Null checks

Since wasm has no signal support, we generate explicit null checks.
Expand Down Expand Up @@ -59,8 +89,7 @@ if (vt_entry == null)
vt_entry = init_vt_entry ();
```

### GC overhead
### Exception handling

Since GC variables are marked as volatile and stored on the C stack, they are loaded/stored on every access,
even if there is no GC safe point between the accesses. Instead, they should only be loaded/stored around
GC safe points.
It might be possible to implement EH in the generated code without involving the interpreter. The
current design adds a lot of overhead to methods which contain IL clauses.
