-
Notifications
You must be signed in to change notification settings - Fork 128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add proposal for enhanced dataflow analysis #988
Conversation
This would let us expand on what reflection use can the linker reason about. The main problem it's trying to tackle is whether we can build a solution that has: 1) Actionable warnings when code is not linker-safe 2) Gives confidence that when there are no warnings, no problems will occur at runtime Being able to make arbitrary code linker-safe is an explicit non-goal – from our past experience we know we can’t provide tools that meet goal 1 & 2 for that: it would be solving the halting problem.
docs/design/reflection-flow.md
Outdated
|
||
* Linker-safe: most code will fall into this category. Linking can be done safely based on information in the static callgraph. | ||
* Potentially unsafe: call to the method is unsafe if the linker cannot reason about a parameter value (e.g. `Type.GetType` with a type name string that could be unknown) | ||
* Always unsafe: calls to these methods are never safe (e.g. `Assembly.ExportedTypes`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this example to be an interesting one.
Consider
int totalExportedTypes = 0;
foreach (var type in typeof(int).Assembly.ExportedTypes)
{
totalExportedTypes++;
}
if (totalExportedTypes != <some expected number>)
throw;
typeof(int)
is easily seen by the linker. Assembly
is easy. and so is ExportedTypes
. The linker could mark all of the exported types. The behavior of the linked assemblies would match the original program.
So the linker could correctly mark all of the dependencies of the reflection API, however, doing so would result in over preserving in most real world use cases since I imagine inside that loop there would normally be some sort of filtering happening on the types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's confusing to call anything just linker-unsafe - linker could keep the entire app and everything would be safe. So maybe a better way is to say that we choose not to support the pattern in the linker as we're not able to achieve good trimming results with it.
Code patterns like this example are typical in Dependency Injection (or similar) scenarios - I think that we can potentially provide different ways of solving that problem. But that's outside of this proposal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
confusing to call anything just linker-unsafe
Would it be better to talk about this linker-compatible/incompatible or linker-friendly/unfriendly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
safe/unsafe meaning typically relates to security. This has not much to do with security.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've been using linker-friendly in some places already - so I'd be happy with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The linker could mark all of the exported types
It could, but it negates a lot of benefits of the linking (one unfortunate API call and suddenly none of the public types get linked away). It doesn't seem to be worth the effort to make this work.
The problem is also that once we have the type back, people are going to do other things with it (not just count them). So we either need to understand IEnumerable<Type>
or just keep all members to be sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"safe/unsafe meaning typically relates to security" I agree that it's good to not pollute the use of this convention, even though that's all it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree, we wouldn't want to support the case of assembly getting types.
However, what about typeof(Foo).GetFields()
? in our linker we will mark all of the fields of Foo
if we see this. Our rational was, if you are getting the fields of the specific type you probably care about those fields and the odds of an undesired ripple effect are much lower. I will say I was/am on the fence about it. I'm curious what stance you guys take.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, what about
typeof(Foo).GetFields()
I think keeping all fields on Foo
in this case would align with how member access is treated in the proposed MemberKinds
enum - type-level granularity feels more acceptable than assembly level granularity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the current proposal is to mark this with AllFields
which would get us basically the same behavior as you already have. I personally think that we should be very careful about including more types than "necessary", but I'm much more comfortable including more members.
static void Main() | ||
{ | ||
// SAFE: _otherType was annotated as "some type that has all methods kept" | ||
_otherType.GetMethod("RandomMethod"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate on this example more or point to where the pattern is used in CoreLib so I can understand the need a little better?
On the surface this pattern could become always safe by putting the GetType
and GetMethod
together.
Type.GetType("Foo, FooAssembly").GetMethod("RandomMethod");
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure which pattern in CoreLib had @MichalStrehovsky in mind, but Lazy<T>
almost fits this as well. Some of the .ctors
which don't take a valueCreator
effectively need the T
to have a default .ctor
(and be instantiatable). Internally it's implemented by eventually calling the Activator.CreateInstance
on it.
Now the way the code is currently written can't be directly annotated with the proposed attributes (since it actually doesn't store the Type
instead it stores a boolean
and relies on the T
), but it could be refactored relatively easily to fit the pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put GetMethod
, this this one is more interesting with Activator.CreateInstance
. E.g. here:
Of course they could cache the ConstructorInfo
instead, eliminating the need for the annotation.
But the building block (being able to annotate a field) is necessary to get custom attributes that have System.Type
properties or constructor arguments working. I would like to make those work because they're pretty common.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it could be refactored relatively easily to fit the pattern.
Would this refactoring make Lazy<T>
more expensive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our linker would detect and correctly mark all of the reflection calls in LicenseInteropProxy()
without the help of any annotations. So those are OK.
You are right that object licContext = Activator.CreateInstance(_licInfoHelper)!;
is the tricky part.
Thinking out loud here. If we detected the type that is assigned to the field, then we could store that knowledge. Then when we come across the Activator.CreateInstance
having the field passed to it we could mark the ctor of all types that had been assigned to that field. So I think we could actually automatically handle this scenario.
I guess what I'm dancing around is I'm trying to figure out is the pros/cons of this proposal vs attributes such as [RequiredMember]
that I've discussed here #797 (comment). We are accumulating support for a variety of these attributes to start trying them out. I don't want to introduce things that would conflict with the wider .NET ecosystem but at the same time our use cases are slightly different which is why I'm trying to understand exactly what scenarios you are trying to solve.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case we can store just the Type - it will seem redundant since the T is already there, but if we call it "typeToCreate" or something along those lines it should be OK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Storing Type would be better, but it still not clear how it would be done without introducing regression. Lazy<T>
is perf critical e.g. for FSharp, and great deal of work went into making it as lean as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Lazy<T>
would be solvable by one of two ways:
Mark the Lazy<T>
constructor overload that makes it call into CreateInstance<T>
as LinkerUnsafe
& mark the CreateViaDefaultConstructor()
as "linker shut up, this is safe".
Or,
We can implement the extension Vitek added as a TODO to the doc:
class Lazy<[DynamicallyAccessedMembers(MemberKinds.DefaultConstructor)] T>
{
// ...
}
This would ensure the Activator.CreateInstance<T>
within lazy is not marked linker unsafe.
The drawback of the second approach is that now every use of Lazy<T>
from generic code needs to be annotated because the annotations are contagious, even though the default constructor might not be needed if a factory delegate was provided:
void Frob<T>()
{
// This is now unsafe because the T is not marked [DynamicallyAccessedMembers(MemberKinds.DefaultConstructor)]
// To fix it, we need to mark the T in Frob too
Lazy<T> lazy = ...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with annotations that just say "keep this" is that they don't have a connection to the specific reflection API usage within the method - they cannot be used in validating that the actual usage is safe. The linker either has to trust that the annotation on the method itself
Yes, I agree that is a down side. We have users doing things that would fall into the "you should use source generators" category. Many of them want something that is better than nothing which is where [RequireMember]
style attributes come in.
I like the double checking that the connection allows, but I'm wondering if the fact that the connection can be made means that the linker can just auto detect these things. @MichalStrehovsky Is there a use case where the connection is maintained but the linker cannot auto detect the usage? As I mentioned in my previous comment, the example you pointed to could be automatically detected I believe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To some degree it's a tradeoff between complexity in the linker and explicit behavior in the API. For example, in some cases linker may be able to "propagate" these annotations to callers (over several jumps) - but then if one of the callers doesn't satisfy the necessary requirement (for example it passes in a Type which can't be figured out) the warning message would be relatively confusing as there would be no notion of it in the source code. The warning would say something like "The parameter inputType
of GetSomeData
requires a known type with all its fields" but the method GetSomeData
would have nothing on it which would hint at that fact. Making the annotations explicit means that the behavior is to some degree self-documenting.
As for cases which we can't hope the linker to figure out: Casts (to and from object
specifically), collections, and so on. In these cases the annotations could be used to hint the linker what's going on, but I don't think the linker could figure it out on its own.
Co-Authored-By: Petr Onderka <[email protected]>
dotnet/runtime#33861 Are the API proposals with the attribute additions. |
I think there's one more thing we should make part of the proposal - and that's the behavior of some of the core reflection methods with regard to the proposed attributes. Namely (and similar APIs):
If the linker can determine the name and type statically it will mark only that member, but what happens if it can't determine these values statically? For example, would we mark the I think we'll have to try it, but my gut feel would be something like:
Reasoning is that including all fields is cheap, properties should be relatively small and not bring in more dependencies (same for events), Including all methods effectively means including the entire type... so that is "dangerous". Thoughts? |
And another detail to specify: What is the behavior of these annotations with regard to Static/Instance members. For example if a type is annotated as |
We already put in a precedent for constructors in the prototype (
I'll add a note - it would be both static and instance. We have enough bits that we could expose additional flags on The reason why I went with the public/non-public split is because the "overloads for lazy people" ( The other thing to think about is that these APIs (except for the one that deal with ctors) search in base types too, so if we don't know the specific name, this could be referring to any member from the bases too. |
* `Type.GetProperty` | ||
* `Type.GetField` | ||
* `Type.GetMember` | ||
* `Type.GetNestedType` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are GetDeclaredMethod
, etc. intentionally missing in this list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I added this as a response to Vitek's comment - it's not trying to be an exhaustive list. We already e.g. special case methods on System.Linq.Expressions, so this will be a pretty long list. I'll add "for example" because trying to keep this up-to-date would be a vain effort.
* `Type.GetField` | ||
* `Type.GetMember` | ||
* `Type.GetNestedType` | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Are going ..." reads like a question, but it is not actually question. Maybe say "The following methods are going to be special cased by linker:" in front of the list instead.
Co-Authored-By: Jan Kotas <[email protected]>
Co-Authored-By: Petr Onderka <[email protected]> Co-Authored-By: Jan Kotas <[email protected]> Commit migrated from dotnet/linker@e4568b5
This would let us expand on what reflection use can the linker reason about.
The main problem it's trying to tackle is whether we can build a solution that has:
Being able to make arbitrary code linker-safe is an explicit non-goal – from our past experience we know we can’t provide tools that meet goal 1 & 2 for that: it would be solving the halting problem.
Cc @vitek-karas @marek-safar