This repository has been archived by the owner on Nov 24, 2022. It is now read-only.
Is your feature request related to a problem? Please describe.
We've been bothered by the linker's excessive memory usage for a long time. There have been various tricks (e.g. lazy serialization of Map values) and efforts (e.g. #605) to improve it. In this ticket, we seek a deeper refactoring that addresses a piece of technical debt: the need to load all objects at link time.
Until now, the linker codebase has mostly worked with the AsteriusModule type, which consists of several symbol-indexed immutable hash maps. This keeps the implementation simple, but retains a lot of heap data. This is especially true for the input to the gc-sections pass: most input entities will be thrown away, but they still have to be present in memory for some time.
In the past, we used laziness to reduce memory consumption here. The hash maps in AsteriusModule have lazy values; when deserializing, each lazy value is a thunk which decodes to the actual entity, and each thunk retains a reference to the whole ByteString holding the object file content. This takes a lot less memory and time than strictly deserializing and keeping every decoded entity in the heap. Still, we have to perform this kind of lazy loading for all object files, which requires at least as much memory as the combined file sizes of the input archives and objects.
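To make the retention problem concrete, here is a minimal sketch of the lazy-value trick. The `Entity` type, `decodeEntity`, and the `(symbol, offset, length)` table are hypothetical stand-ins for illustration, not the actual Asterius definitions.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BC
import qualified Data.Map.Lazy as M

type Symbol = String

-- | Hypothetical decoded entity; a real one would be a function, a data
-- segment, and so on.
newtype Entity = Entity String
  deriving (Eq, Show)

-- | Stand-in decoder (assumption: the real decoder parses a binary format).
decodeEntity :: BS.ByteString -> Entity
decodeEntity = Entity . BC.unpack

-- | Build a map with lazy values from (symbol, offset, length) triples.
-- Each value is an unevaluated thunk that closes over @blob@, so the whole
-- object file stays reachable for as long as any thunk is unevaluated.
lazyModule :: BS.ByteString -> [(Symbol, Int, Int)] -> M.Map Symbol Entity
lazyModule blob table =
  M.fromList
    [ (sym, decodeEntity (BS.take len (BS.drop off blob)))
    | (sym, off, len) <- table
    ]
```

Nothing is decoded until a value is looked up and forced, which is cheap, but the map keeps the entire blob alive in the meantime.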
Production linkers like lld don't load all object files at link time. It's about time we do the same and replace the previous "lazy loading" trick with a proper solution.
Describe the solution you'd like
Refactor the AsteriusCachedModule type and serialization logic. The new logic should support:
Only reading the symbol dependency map, without touching the rest of the file
Given an entity's symbol, deserialize it starting from a certain offset of the file, without reading the whole file.
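One way to meet both requirements is to put a symbol index at the front of each object file and address entity payloads by offset. The sketch below assumes a made-up layout (8-byte index length, then the index, then the payloads) and uses the `binary`, `bytestring` and `containers` packages; the names `writeObject`, `readIndex` and `readEntity` are hypothetical, and this is not the actual Asterius object format.

```haskell
import Data.Binary (decode, encode)
import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL
import qualified Data.Map.Strict as M
import Data.Word (Word64)
import System.IO (IOMode (ReadMode), SeekMode (AbsoluteSeek), hSeek, withBinaryFile)

type Symbol = String

-- | (payload offset, payload length, symbols this entity depends on).
-- Offsets are relative to the start of the payload region.
type IndexEntry = (Word64, Word64, [Symbol])

type Index = M.Map Symbol IndexEntry

-- | Write header, index and payloads (assumed layout, see above).
writeObject :: FilePath -> [(Symbol, [Symbol], BS.ByteString)] -> IO ()
writeObject path entities = do
  let step :: (Word64, Index) -> (Symbol, [Symbol], BS.ByteString) -> (Word64, Index)
      step (off, acc) (sym, deps, payload) =
        let len = fromIntegral (BS.length payload)
         in (off + len, M.insert sym (off, len, deps) acc)
      (_, index) = foldl step (0, M.empty) entities
      indexBytes = encode index
      header = encode (fromIntegral (BL.length indexBytes) :: Word64)
  BL.writeFile path (header <> indexBytes <> BL.fromChunks [p | (_, _, p) <- entities])

-- | Read only the index; returns it with the absolute offset of the payload region.
readIndex :: FilePath -> IO (Index, Word64)
readIndex path = withBinaryFile path ReadMode $ \h -> do
  header <- BS.hGet h 8
  let indexLen = decode (BL.fromStrict header) :: Word64
  indexBytes <- BS.hGet h (fromIntegral indexLen)
  let index = decode (BL.fromStrict indexBytes) :: Index
  index `seq` pure (index, 8 + indexLen)

-- | Read the bytes of a single entity by seeking to its offset.
readEntity :: FilePath -> Word64 -> IndexEntry -> IO BS.ByteString
readEntity path base (off, len, _) = withBinaryFile path ReadMode $ \h -> do
  hSeek h AbsoluteSeek (fromIntegral (base + off))
  BS.hGet h (fromIntegral len)
```

`readIndex` touches only the header and index bytes, and `readEntity` reads exactly one payload, so neither needs the whole file in memory.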
The gc sections pass is split into smaller passes:
Given the input (archives, objects, in-memory AsteriusModule), scan the files and build up the symbol dependency map. After this completes, there shouldn't be any heap-resident blobs of input file contents. It's also possible to include a separate archive-level symbol index file in the archive, since we have our own ahc-ar now; but directly scanning the archive entries should have the same effect.
Run gc sections using the symbol dependency map
We now have a list of symbols, and also the origins of these symbols (which object file and what offset into that file). Deserialize the entities indexed by those symbols and form the AsteriusModule as the gc-sections output. Until we introduce proper streaming logic into the rest of the linker logic, the entities should still be lazily decoded; each thunk will read its own object file and perform decoding at its offset.
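The middle pass needs nothing but the symbol dependency map. A minimal sketch, assuming the map is a plain `Map Symbol [Symbol]` and the roots are given as a list (the name `gcSections` and the second result component are illustrative choices, not existing linker API):

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S

type Symbol = String

-- | Symbol dependency map: each defined symbol and the symbols it references.
type DepMap = M.Map Symbol [Symbol]

-- | Worklist reachability from the root symbols. Returns the live symbols
-- and, separately, the symbols that were referenced but never defined.
gcSections :: DepMap -> [Symbol] -> (S.Set Symbol, S.Set Symbol)
gcSections deps = go S.empty S.empty
  where
    go live missing [] = (live, missing)
    go live missing (s : rest)
      | s `S.member` live || s `S.member` missing = go live missing rest
      | otherwise = case M.lookup s deps of
          Nothing -> go live (S.insert s missing) rest
          Just ds -> go (S.insert s live) missing (ds ++ rest)
```

For example, with `main -> [f, missing_sym]`, `f -> [g]`, `g -> []` and `unused -> [g]`, starting from `main` yields the live set `{f, g, main}` and the unresolved set `{missing_sym}`; `unused` is dropped without its entity ever being read from disk.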
This should further reduce linker memory usage by avoiding retaining object file content blobs in memory.
Describe alternatives you've considered
Another option we've considered: using lld for linking, and adapting to the wasm linking convention as described here. This means we need to emit object files which are valid .wasm modules themselves, containing additional custom sections describing the linking-related metadata.
A major problem with using lld is that we have different atoms (entities) compared to vanilla wasm. lld knows about data segments and functions, but we have JSFFI imports/exports that are also indexed by symbols and need to be garbage collected. It is not yet clear how to make lld garbage-collect such custom entities and report the result.
Another feature which lld lacks is graceful handling of unresolved symbols. We have the Barf primitive which is capable of producing a meaningful runtime error message for unresolved symbols, but lld will only emit unreachable.
And finally: we may still want to perform link-time rewriting on the tree-formed IR in certain cases. We would need to store the serialized tree IR as custom sections, and it's also unclear how to make lld handle those.