Mono AOT full program analyses and interprocedural optimizations #80942

kotlarmilos · 2023-01-20T19:00:28Z

Description

For target platforms that do not support dynamic code generation (e.g., iOS), programs are compiled in full AOT mode. Currently, the Mono AOT compiler compiles such programs one managed assembly at a time. One of the biggest advantages of this approach is that the whole application does not need to be recompiled when a single assembly changes.

On the other hand, the compiler has no knowledge of cross-assembly references and relies heavily on the passes performed by the ILLinker, the tool that performs full program analysis in the AOT pipeline. This prevents the compiler from removing more unreachable code and from performing better inter-procedural optimizations.

This experiment checks if inter-procedural optimizations can be achieved when all assemblies are compiled together.
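To make the motivation concrete, here is a toy sketch of why a per-assembly view keeps more code than a whole-program view. It is not how Mono or the ILLinker are implemented; the call graph, the method names, and the `PUBLIC` set are all invented for illustration.

```python
from collections import deque

# Hypothetical call graph: "Assembly::Method" -> callees. All names are invented.
CALLS = {
    "App::Main": ["Lib::Parse"],
    "Lib::Parse": ["Lib::Tokenize"],
    "Lib::Tokenize": [],
    "Lib::Format": ["Lib::Pad"],  # public, but never called by App
    "Lib::Pad": [],
}
# Methods assumed to be visible outside their own assembly.
PUBLIC = {"App::Main", "Lib::Parse", "Lib::Format"}


def reachable(roots, calls):
    """Breadth-first walk of the call graph starting from the given roots."""
    seen, queue = set(roots), deque(roots)
    while queue:
        for callee in calls.get(queue.popleft(), []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def per_assembly_keep(assembly, calls, public):
    """One assembly compiled alone: every public method has to be a root."""
    prefix = assembly + "::"
    roots = [m for m in calls if m.startswith(prefix) and m in public]
    return {m for m in reachable(roots, calls) if m.startswith(prefix)}


def whole_program_keep(entry, calls):
    """All assemblies compiled together: only the entry point is a root."""
    return reachable([entry], calls)
```

In this model, `per_assembly_keep("Lib", CALLS, PUBLIC)` retains `Lib::Format` and `Lib::Pad` because an unknown caller might use them, while `whole_program_keep("App::Main", CALLS)` can drop both.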

Tasks

Conducting experiments and obtaining results
(optional) Integration with the main branch as an experimental feature

/cc: @SamMonoRT

ghost · 2023-01-20T19:01:37Z

Tagging subscribers to 'os-ios': @steveisok, @akoeplinger
See info in area-owners.md if you want to be subscribed.

Issue Details

Tasks

Mono AOT compiler updated to produce single library in full AOT mode
Conducting experiments and obtaining results
(optional) Integration with the main branch as an experimental feature

/cc: @SamMonoRT

Author:	kotlarmilos
Assignees:	kotlarmilos, ivanpovazan, LeVladIonescu
Labels:	`area-Codegen-AOT-mono`, `os-ios`
Milestone:	Future

EgorBo · 2023-01-20T20:39:43Z

I thought that in the end for the iOS case you mentioned it will be a set of static files (per assembly) which will be linked all together (presumably, with LTO?) into a single binary?

Also, I assume that Mono does inline methods from external assemblies if they're small, doesn't it?

kotlarmilos · 2023-01-20T22:31:18Z

> I thought that in the end for the iOS case you mentioned it will be a set of static files (per assembly) which will be linked all together (presumably, with LTO?) into a single binary?

@EgorBo good point! Currently, Mono AOT for iOS produces a set of static files (per assembly) that are then linked all together by a native linker into a single binary. It means that assemblies are passed into the AOT compiler separately.

By passing them together into the AOT compiler (somewhat similar to the dedup feature in #80419), we want to check whether the ILLinker "pre-pass" will bring inter-procedural optimizations.

> Also, I assume that Mono does inline methods from external assemblies if they're small, doesn't it?

Not sure about that, maybe @vargaz or @ivanpovazan can confirm.

Ideas or feedback on how it can be improved are more than welcome :)

EgorBo · 2023-01-20T22:36:21Z

Well, I assume that if you combine the whole thing into one piece of LLVM IR, LLVM will be able to perform cross-managed-assembly inlining on its own, so it should be good for perf anyway? 🙂

AndyAyersMS · 2023-01-20T23:28:35Z

You might find this prototype we did to add IPA to crossgen 2 interesting (note it was never merged): https://github.com/erozenfeld/runtime/commits/Crossgen2WPO

kotlarmilos · 2023-01-21T10:58:18Z

> Well, I assume that if you combine the whole thing into one piece of LLVM IR, LLVM will be able to perform cross-managed-assembly inlining on its own, so it should be good for perf anyway? 🙂

Good, we will figure it out.

> You might find this prototype we did to add IPA to crossgen 2 interesting (note it was never merged): https://github.com/erozenfeld/runtime/commits/Crossgen2WPO

Thanks for sharing it. I see changes in ILCompiler that might be aligned with #80941.

LeVladIonescu · 2023-02-02T17:00:26Z

Assuming that wasm's AOT compilation passes all assemblies to the compiler at the same time to produce a single output file, I started investigating how this works in order to do something similar for iOS. Here's the outcome:

Steps:

Enable AOT compilation by setting the _WasmShouldAOT and RunAOTCompilation parameters to true
Build the browser sample from the project directory using: ./../../../../../dotnet.sh build /p:TargetOS=browser /p:TargetArchitecture=wasm /p:Configuration=Debug /p:RunAOTCompilation=true /bl

After inspecting the binlog, I noticed that all assemblies are passed separately to the AOT compiler, which means it follows the same principle as iOS.

For example:

/runtime/artifacts/bin/mono/browser.wasm.Debug/cross/browser-wasm/mono-aot-cross --debug --llvm "--aot=no-opt,static,direct-icalls,deterministic,dwarfdebug,llvm-path=/runtime/src/mono/wasm/emsdk/upstream/bin/,static,dedup-skip,llvmonly,interp,asmonly,llvm-outfile=/runtime/artifacts/obj/mono/Wasm.Browser.Sample/wasm/Debug/browser-wasm/wasm/for-build/System.Private.CoreLib.dll.bc.tmp" "System.Private.CoreLib.dll"
/runtime/artifacts/bin/mono/browser.wasm.Debug/cross/browser-wasm/mono-aot-cross --debug --llvm "--aot=no-opt,static,direct-icalls,deterministic,dwarfdebug,llvm-path=/runtime/src/mono/wasm/emsdk/upstream/bin/,static,dedup-skip,llvmonly,interp,asmonly,llvm-outfile=/runtime/artifacts/obj/mono/Wasm.Browser.Sample/wasm/Debug/browser-wasm/wasm/for-build/Wasm.Browser.Sample.dll.bc.tmp" "Wasm.Browser.Sample.dll"
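The two invocations above differ only in the input assembly and the `llvm-outfile` value. A small sketch of that per-assembly pattern follows; the paths and flags are copied from the log lines above, while the helper names (`aot_command`, `per_assembly_commands`) are hypothetical and not part of any build task.

```python
# Paths and flags copied verbatim from the binlog excerpt; helper names are invented.
AOT_CROSS = "/runtime/artifacts/bin/mono/browser.wasm.Debug/cross/browser-wasm/mono-aot-cross"
OUT_DIR = "/runtime/artifacts/obj/mono/Wasm.Browser.Sample/wasm/Debug/browser-wasm/wasm/for-build"
AOT_FLAGS = (
    "no-opt,static,direct-icalls,deterministic,dwarfdebug,"
    "llvm-path=/runtime/src/mono/wasm/emsdk/upstream/bin/,"
    "static,dedup-skip,llvmonly,interp,asmonly"
)


def aot_command(assembly):
    """Build the argument list for compiling exactly one assembly."""
    aot_arg = f"--aot={AOT_FLAGS},llvm-outfile={OUT_DIR}/{assembly}.bc.tmp"
    return [AOT_CROSS, "--debug", "--llvm", aot_arg, assembly]


def per_assembly_commands(assemblies):
    """One compiler invocation per assembly, as observed in the binlog."""
    return [aot_command(a) for a in assemblies]
```

Calling `per_assembly_commands(["System.Private.CoreLib.dll", "Wasm.Browser.Sample.dll"])` yields two argument lists, each naming a single input assembly and its own `.bc.tmp` output.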

Question: Is there another flag that can be set to pass all assemblies at the same time?

Proposal for next step:

Try changing how mono_aot_assemblies() works. Currently, this function calls aot_assembly() for every assembly and also emits the AOT image for each one. Instead, we can emit a single AOT image for all the assemblies in mono_aot_assemblies() after all of them have been compiled.

We can split the AOT compilation into three phases:

collect phase – aot_assembly() without compile_methods() and emit_aot_image()
compilation phase – compile_methods() having full program analysis and with linkonce enabled – here we should have all the methods from all the assemblies in the MonoAotCompile acfg variable
emit phase – emit_aot_image()

One concern with this strategy is how duplicates are handled before emitting the AOT image: will llvm-linkonce take care of them?
Another concern is the MonoAotCompile struct. To collect all the methods in the same MonoAotCompile acfg variable, we would need to append the data collected and processed by aot_assembly() to the acfg variable that stores all those methods. Would this be a good approach?
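The collect / compile / emit split proposed above can be sketched as a toy model. `ToyAcfg` is only a stand-in and does not mirror the real MonoAotCompile struct, and deduplicating by method name is an assumption made for the example; whether llvm-linkonce would handle duplicates in the real compiler remains the open question raised above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAcfg:
    """Stand-in for a shared compile context; NOT the real MonoAotCompile struct."""
    methods: dict = field(default_factory=dict)   # method name -> owning assembly
    compiled: dict = field(default_factory=dict)  # method name -> placeholder code


def collect(acfg, assembly, method_names):
    """Phase 1: record methods only; the first assembly to add a name owns it."""
    for name in method_names:
        acfg.methods.setdefault(name, assembly)


def compile_methods(acfg):
    """Phase 2: compile once, with every collected method visible."""
    for name in acfg.methods:
        acfg.compiled[name] = f"<code for {name}>"


def emit_image(acfg):
    """Phase 3: emit a single image listing all compiled methods."""
    return sorted(acfg.compiled)


def aot_all(assemblies):
    """Run the three phases over a mapping of assembly name -> method names."""
    acfg = ToyAcfg()
    for assembly, method_names in assemblies.items():
        collect(acfg, assembly, method_names)
    compile_methods(acfg)
    return emit_image(acfg)
```

For example, if both `A.dll` and `B.dll` contribute a method named `List<int>::Add`, `aot_all` compiles and emits it only once in this model.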

vargaz · 2023-02-02T17:54:56Z

This will require a large amount of changes. Both the AOT compiler and the AOT runtime expect a one-to-one mapping between assemblies and AOT images.

kotlarmilos · 2023-02-02T18:59:24Z

@LeVladIonescu good progress!

> This will require a large amount of changes. Both the AOT compiler and the AOT runtime expect a one-to-one mapping between assemblies and AOT images.

I agree that it would require a large amount of changes and would be hard to test properly. What we want to achieve with a single output file is full program analysis by LLVM. That does not necessarily mean the code has to be emitted in a single AOT image. It might be possible to collect methods from all assemblies and provide them during the LLVM compilation, but to emit only the subset that belongs to the corresponding assembly.
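A rough sketch of that idea: analyze with every method visible, but still emit one image per assembly so the one-to-one mapping is preserved. The method table, the sizes, and the `INLINE_LIMIT` threshold are invented for the example, and the "inline plan" is only a stand-in for whatever LLVM would actually decide.

```python
# Hypothetical method table: name -> (owning assembly, body size, callees).
METHODS = {
    "App::Main":   ("App.dll", 40, ["Lib::Add"]),
    "Lib::Add":    ("Lib.dll", 3,  []),
    "Lib::Report": ("Lib.dll", 90, ["Lib::Add"]),
}
INLINE_LIMIT = 10  # arbitrary threshold chosen for the example


def inline_plan(methods, limit=INLINE_LIMIT):
    """Whole-program step: for each caller, list the callees small enough to
    inline, regardless of which assembly they come from."""
    return {
        caller: [c for c in callees if methods[c][1] <= limit]
        for caller, (_, _, callees) in methods.items()
    }


def emit_per_assembly(methods, plan):
    """Emission step: each image holds only its own assembly's methods,
    keeping the one-to-one assembly/image mapping."""
    images = {}
    for name, (assembly, _, _) in methods.items():
        images.setdefault(assembly, {})[name] = plan[name]
    return images
```

Here `App::Main` in `App.dll` gets `Lib::Add` as an inlining candidate even though it lives in `Lib.dll`, while the emitted images stay split by assembly.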

LeVladIonescu · 2023-02-03T15:55:17Z

After an offline sync, we agreed to change the strategy.

The next step is to try preventing the AOT compiler from adding methods to the method_id -> method table, so that LLVM can optimize those methods better.
We will target the HelloiOS app and compare the results.

kotlarmilos · 2024-07-30T14:21:31Z

Obsolete, lower priority compared to other tasks.

kotlarmilos added the area-Codegen-AOT-mono label Jan 20, 2023

kotlarmilos added this to the Future milestone Jan 20, 2023

kotlarmilos assigned kotlarmilos, ivanpovazan and LeVladIonescu Jan 20, 2023

kotlarmilos mentioned this issue Jan 20, 2023

Mono AOT size savings efforts #80938

Open

16 tasks

kotlarmilos added the os-ios Apple iOS label Jan 20, 2023

kotlarmilos modified the milestones: Future, 8.0.0, 9.0.0 Jul 13, 2023

SamMonoRT unassigned LeVladIonescu Oct 25, 2023

ivanpovazan modified the milestones: 9.0.0, Future Feb 9, 2024

kotlarmilos closed this as completed Jul 30, 2024

github-actions bot locked and limited conversation to collaborators Aug 30, 2024
