-
Notifications
You must be signed in to change notification settings - Fork 128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a custom steps API to run logic on mark #1774
Conversation
namespace Mono.Linker.Steps | ||
{ | ||
|
||
public interface IPerAssemblyStep |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs much better name. Looking at the code this runs during Mark step only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any suggestions? I tentatively changed it to IMarkAssemblyStep
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the registration-based model I instead called it IMarkHandler
/// Extensibility point for custom steps that run during MarkStep, | ||
/// for every marked assembly. | ||
/// </summary> | ||
public interface IMarkAssemblyStep |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is the right step forward. You are adding ondemand step in the way it's not very clear to me how it should be used. I'm also not sure OnAssemblyMarked is actually the point where the custom steps would like to hook up. I'd expect them to ignore assembly-mark completely and hook up into OnTypeMarked or OnMethodMarked instead.
Secondly, does this also mean there is no way to actually mark assembly when it's loaded via custom step?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My current thinking:
- We want the ability for custom steps to react to marking of certain things, for starters assemblies, but eventually also types and maybe even methods.
- Each item type will need its pipeline at least internally - we simply need a list of custom steps to run for each item type - for example for each assembly. Whether we explicitly expose the notion of pipelines or not is currently a secondary concern (I think we might as well)
- To not hurt perf, if a custom step wants to react to only assemblies, it should not need to in any way subscribe to type/methods.
SubStepDispatcher
solves this by making all sub-steps implement everything, but then it uses an enum to tell the system which of these the step is interested in. I would choose a more declarative approach. Have an interface for each, so (names obviously pending discussion):IMarkAssemblyStep
IMarkTypeStep
IMarkMethodStep
- If we want to we could also create base interface
IMarkStep
which would have theInitialize
method and all of the above would "inherit" from it. - Each interface would have basically one method, something like
OnAssemblyMarked
/OnTypeMarked
/OnMethodMarked
. The semantics is rather obvious. - Note that there's a difference between loading and assembly and marking an assembly - steps are allowed to load assemblies as they please (they can also freely call
.Resolve()
) and they are even allowed to mark additional assemblies. Once assembly is marked and theMarkStep
gets to it, it would call all custom steps implementingIMarkAssemblyStep
and "notify" them. - As discussed in other PRs, marking in custom steps would be always deferred (using
Annotations
and actual marking happens once theMarkStep
gets to it). - I think we're doing
OnMarkAssembly
first because we think we'll need it to implement some of existing steps in the linker. Once we start porting over Xamarin steps we should definitely consider addingOnMarkType
or maybe evenOnMarkMethod
(or maybeOnMarkMember
? not sure) - I think assembly and type should suffice for .NET 6, but we'll see.
Other considerations:
- Existing
IStep
infra would remain basically intact. The main difference is that it won't automatically get the entire assembly closure. It could load the closure on demand, but it won't ever be guaranteed to be complete (that's why we're doing Enable processing of dynamically referenced assemblies #1666) if the step us running beforeMarkStep
. - Even the new custom steps could in theory load the closure aggressively, but it's not recommended for perf reasons. That said, linker won't react to assembly loading as an event with anything. The only event which has meaning is marking an assembly.
- "marking an assembly" in the above means that linker will preserve the assembly in some form in the output (trimmer or not). We will need to make sure that the
OnMarkAssembly
gets called for all assemblies written to the output (I don't know if today we internally actually markcopy
assemblies for example, if not we should fix that). - There will be certain overlap between the "preserve" mechanics in annotations (
SetPreserve
andAddPreservedMethod
) and theOnMarkType
. It will be possible to implement the preserve semantics withOnMarkType
and simple marking. I don't think this is inherently bad, just two options which will end up doing the same thing. - We could allow one step to potentially implement more than one such interface - they would all share the same instance (we would ever only create one instance of the custom step), but it could belong to multiple pipelines (this gets a bit tricky with ordering though, not sure yet)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pretty much agree with Vitek's thoughts - I see us moving in a direction where custom logic will just extend the marking logic by adding conditional dependencies like "source -> destination" which means "if source is marked, destination should be marked". The current mechanism for this is AddPreservedMethod
, which requires custom steps to discover the "source" types by scanning for them eagerly, but in the future extension points like IMarkTypeStep
/OnMarkType
would be better since they avoid an extra scan of unmarked types.
IMarkAssemblyStep
(or whatever we name it) would let most existing steps work with little to no modification, even when we add lazy loading, as long as they only use AddPreservedMethod
/SetPreserve
. The scope of the scanning will be reduced to marked assemblies only.
Any steps which need to scan for types to Mark
unconditionally (rather than conditionally AddPreservedMethod
/SetPreserve
) will have two options:
- If it is enough to discover types in marked assemblies, they can use
IMarkAssemblyStep
without further modification - If they need to discover types in otherwise unmarked assemblies, they can continue using
IStep
and look at the loaded closure (or explicitly load it, when we turn on lazy loading).
I don't have a strong preference on whether the hooks should be declared using interfaces or an enum similar to SubStepTargets
. Either way, if we allow one custom step to hook into multiple events, we need to decide how to handle the filtering capability of ISubStep
. The IsActiveFor
check is nice because it currently prevents unnecessary calls to the substep for types/members/etc for assemblies where it's not active. With IMarkAssemblyStep
it just works the same as it does today. But if we add, say IMarkMethod
, then we would want to avoid calling the custom step method logic for a marked method inside of an assembly for which it's not active. Essentially some of the SubStepDispatcher logic would become built-in to MarkStep or the linker pipeline. I think just adding IMarkAssemblyStep
is a good way to make progress, while leaving that possibility open.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are two needs which might be better to handle separately.
- Run additional processing after assembly load
It's something that is needed primarily by linker steps. I'm not convinced we need to expose this functionality publicly at this point to custom steps therefore this can be implemented as simple
method calls. The important part is that this logic can be triggered from any step.
Something like this might be enough.
class AssemblyResolver
{
public override AssemblyDefinition Resolve (AssemblyNameReference name, ReaderParameters parameters)
{
// some existing code which loads assembly
// This could be even refactored into single class, etc.
BlacklistStep.Process (context, assembly);
LinkAttributesStep.Process (context, assembly);
}
}
- Conditional Marking
Many of the existing custom steps include logic which marks additional code based on a specific
pattern. This API should be exposed but I think we need to validate it first by trying to
rewrite a few of the existing custom steps. I'd like to see API along following the quick draft.
public interface IMarkStepProcessing
{
void Initialize (LinkContext context, MarkProcessingRegistration register);
}
public class MarkProcessingRegistration
{
public void AsemblyProcessed (Action<AssemblyDefinition> action);
public void TypeProcessed (Action<TypeDefinition> action);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree we should try to convert the steps which linker does always into non-step code - just like we've converted const prop and basically all of XML loading. Maybe there will be something which would still make sense as a "step" - we'll see as we go through these.
I also agree that it sounds like a good idea to try to convert a couple of Xamarin steps and design the new extensibility along the way.
-
I would like to avoid linker doing almost anything on assembly load. I think the current idea is that linker would do nothing eagerly. LinkAttributes will be loaded on demand when somebody asks for linker attributes on something from an assembly which didn't load them yet. Substitutions will be loaded also on demand once the constant prop touches something from a new assembly. Descriptors should only be processed when the assembly is marked so it will be called from the MarkAssembly (or nearby). There might be some tweaks to this with regard to copy assemblies and their dependencies, but hopefully not.
-
This looks good - I'm not sure I like the word "processed", but that's naming - and I suck at it 😉
0535c23
to
ad3fd8e
Compare
59a3575
to
66c0324
Compare
I made some local changes to test xamarin-android with the new registration-based API: dotnet/android@master...sbomer:illinkMarkHandler. In my analysis of the custom steps, most of them "just work" if we call them only for marked members, because they ultimately either:
Most of the steps have been updated in a straightforward way based on the above. The exceptions are:
|
test/Mono.Linker.Tests.Cases/Extensibility/MarkHandlerDispatcherUsage.cs
Outdated
Show resolved
Hide resolved
protected MarkHandlerDispatcher (IEnumerable<ISubStep> subSteps) => throw null; | ||
|
||
public void Add (ISubStep substep) => throw null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, do we have any use case for this? It seems to me this is a more complicated version of MarkContext
methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a convenience helper for steps that want to scan for all members (not just marked members) in marked assemblies. It is basically a copy of SubStepsDispatcher that works only for marked assemblies, to make it easier for existing substeps to transition - this way they don't have to re-write the code that scans all members in an assembly.
It's really only used for ApplyPreserveAttribute and PreserveExportedTypes (here) - so maybe it's not worth keeping around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather see the public interface simple. Looking quickly at the XA code is seems that both are not needed
ApplyPreserveAttribute - that's normal step that should just run before MarkStep, it does not need any of the changes
PreserveExportedTypes - should run only when a type is marked
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If ApplyPreserveAttribute runs before MarkStep it won't see the reference closure and it won't run for dynamically referenced assemblies - is it safe to look for the attribute only in root assemblies? (Do we not care about the link-all case?)
My understanding of ExportAttribute (from https://docs.microsoft.com/en-us/xamarin/android/platform/java-integration/working-with-jni#exportattribute-and-exportfieldattribute) was that it is used on members which are reachable from java interop, but maybe that link doesn't tell the whole story. What is it that marks the required types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, at a closer look, they need to run once assembly is marked. Thinking about it more it's probably fine to have both approaches available.
Few suggestions though.
- I'd rename this type to be
MarkSubStepsDispatcher
to make it clear this type should be used to batch all custom mark substeps. - Ensure
MarkHandlerDispatcher.Initialize
is always called (the user type can implement explicitly IMarkHandler again).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the second suggestion - not sure if this is what you meant, but I changed Initialize to be a virtual method so that the user type can call base.Initialize
.
These steps are initialized once when the pipeline processing begins, and they get called for each marked assembly. This includes a change that allows per-assembly custom steps, and makes SubStepDispatcher a per-assembly step. The per-assembly custom steps do not currently support insertion at arbitrary points in the pipeline. They are run in the order added. Steps which use SubStepsDispatcher will no longer be called on the entire closure of loaded assemblies.
- Rename IPerAssemblyStep to IMarkAssemblyStep - Keep SubStepsDispatcher
- Add test for IMarkHandler - Add missing license header
- MarkHandlerDispatcher -> MarkSubStepsDispatcher - Make Initialize virtual so that it can be called by derived types
a371556
to
2cc0fd2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few more comments and LGTM
substeps.Add (substep); | ||
} | ||
|
||
public void Initialize (LinkContext context, MarkContext markContext) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be virtual (in ref code as well)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually made it virtual this time... thanks.
- Always initialize substep lists - Actually make Initialize virtual (and test this) - Remove default ctor and Add method
Context: dotnet/linker#1953 Context: dotnet/linker#1774 Context: dotnet/linker#1774 (comment) Context: dotnet/linker#1979 Context: dotnet/linker#2019 Context: dotnet/linker#2045 Context: dotnet/java-interop@2573dc8 Context: #5870 Context: #5878 Update the custom linker steps to use the new `IMarkHandler` API which runs logic during "MarkStep". (See [this list][0] of pipeline steps for additional context.) As part of this, I've removed the workaround that loads all referenced assemblies up-front and simplified some of the linker MSBuild targets. Some of the `MonoDroid.Tuner` steps were duplicated and changed to implement the new `IMarkHandler` interface, to avoid touching the `MonoDroid.Tuner` code. In my analysis of the custom steps, most of them "just work" if we call them only for marked members, because they ultimately either: - Just call `AddPreserved*()` to conditionally keep members of types (which with the new API will just happen when the type is marked) - In the case of `FixAbstractMethodsStep()`, inject missing interface implementations which will only be kept if they are marked for some other reason. Most of the steps have been updated in a straightforward way based on the above. The exceptions are: - `AddKeepAlivesStep` needs to run on all code that survived linking, and can run as a normal step. - `ApplyPreserveAttribute`: this step globally marks members with `PreserveAttribute`. - The step is only active for non-SDK link assemblies. Usually we root non-SDK assemblies, but it will be a problem if linking all assemblies. - For now, I updated it to use the new custom step API on assembly mark -- so it will scan for the attribute in all marked assemblies -- but we should investigate whether this is good enough. - This can't be migrated to use `IMarkHandler` because it needs to scan code for attributes, including unmarked code. - `PreserveExportedTypes`: similar to the above, this globally marks members with `Export*Attribute`. It's used for java interop in cases where the java methods aren't referenced statically, like from xml, or for special serialization methods. - It's only active for non-SDK assemblies, so like above it will be a problem only if linking all assemblies. - Like above, I've made it scan marked assemblies. - `PreserveApplications`: globally scans for `ApplicationAttribute` on types/assemblies, and sets the `TypePreserve` annotation for any types referenced by the attribute. - This step technically does a global scan, but it's likely to work on "mark type"/"mark assembly", as `ApplicationAttribute` is only used on types/assembies that are already kept. - I've updated it to use the new `IMarkHandler` API. Additionally, as per dotnet/java-interop@2573dc8c, stop using `TypeDefinitionCache` everywhere and instead use `IMetadataResolver` to support caching `TypeDefinition`s across shared steps. For .NET 6, `LinkContextMetadataResolver` implements the `IMetadataResolver` interface in terms of `LinkContext`. [0]: https://github.com/mono/linker/blob/main/src/linker/Linker/Driver.cs#L714-L730
This adds a new API which lets custom steps register an action to be called when a member gets marked. This includes a helper, MarkSubStepsDispatcher, which behaves similarly to SubStepDispatcher and is used to scan in marked assemblies. * PR feedback - Rename IPerAssemblyStep to IMarkAssemblyStep - Keep SubStepsDispatcher * Use --custom-step for both step interfaces * Change to a registration-based model * Rename test file * Cleanup * Add another handler for marked methods * PR feedback - Add test for IMarkHandler - Add missing license header * Add a warning to the API docs * PR feedback - MarkHandlerDispatcher -> MarkSubStepsDispatcher - Make Initialize virtual so that it can be called by derived types * Fix analyzer warning * PR feedback - Always initialize substep lists - Actually make Initialize virtual (and test this) - Remove default ctor and Add method Commit migrated from dotnet/linker@40abde7
This adds a new API which lets custom steps register an action to be called when a member gets marked.
Older notes below - but note that the design has changed from a pipeline that runs per-assembly to a registration-based model.
These steps are initialized once when the pipeline processing begins, and they get called for each marked assembly.
This includes a change that allows per-assembly custom steps, and makes SubStepDispatcher a per-assembly step. The per-assembly custom steps do not currently support insertion at arbitrary points in the pipeline. They are run in the order added. (Do we need to support arbitrary insertion?)
Steps which use SubStepsDispatcher will no longer be called on the entire closure of loaded assemblies.
#1666 will turn more of the built-in steps into per-assembly steps, but here the goal is just to make the API available.
The per-assembly step initialization happens before any of the normal pipeline steps run. (A per-assembly step could in theory load assemblies and operate on the world at this point, but the expected use is just to do some simple setup of global state.)
The --custom-step option still exists, but with #1666, "global" custom steps will no longer get access to all assemblies, unless they do the work to load assemblies explicitly, or run after MarkStep (in which case they will see everything referenced from used code and possibly more).
Contributes to #1735.