-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
S.R.CS.Unsafe: Add unsafe operations for ref returns and locals #17968
Comments
TL;DRI propose changing the API surface to: public static class Unsafe
{
public static ref U As<T, U>(ref T source);
public static ref T Add<T>(ref T source, int elementOffset);
public static bool Equals<T>(ref T a, ref T b);
} My initial thoughts are centered around two things:
NecessityAs far as I can tell these new API additions are there for convenience only, since they can be expressed via the excisting unsafe API surface. AsRefref int r = ref Unsafe.AsRef<byte, int>(ref b[0]); can be expressed via: ref int r = Unsafe.AsRef<int>(Unsafe.AsPointer(ref b[0])); Note how the existing API does not need to specify source type RefAdd (move to index)ref int r1 = ref Unsafe.RefAdd(ref a[0], 1); can be expressed via: ref int r = Unsafe.AsRef<int>(Unsafe.AsPointer(ref a[0]) + Unsafe.SizeOf<int>() * 1); Clearly, RefEqualsUnsafe.RefEquals(ref a[0], ref a[0]); can be expressed via: Unsafe.AsPointer(ref a[0]) == Unsafe.AsPointer(ref a[0]); My ViewPersonally, I think these additions are worth making as it makes a lot of scenarios easier and NamingWhat strock me first is that public static class Unsafe
{
public static ref U As<T, U>(ref T source);
public static ref T Add<T>(ref T source, int elementOffset); // Note I changed offset to elementOffset to make it clear
public static bool Equals<T>(ref T a, ref T b);
} This will allow writing the following: ref int r = ref Unsafe.AsRef<byte, int>(ref b[0]);
ref int a1 = ref Unsafe.RefAdd(ref a[0], 1);
var sameAddress = Unsafe.RefEquals(ref a[0], ref a[0]); instead as: ref int r = ref Unsafe.As<byte, int>(ref b[0]);
ref int a1 = ref Unsafe.Add(ref a[0], 1);
var sameAddress = Unsafe.Equals(ref a[0], ref a[0]); Since I think |
Nope, they cannot. The equivalents that you show are incorrect because the use of
We already have a
Seems confusing. I can imagine one asking "Does this method compare references or does it compare the referenced values?". |
I think the biggest problem with this API is that That said, I wonder if it wouldn't be better to reverse the generic arguments: public static ref U AsRef<U, T>(ref T source); in the hope that a future language version could do some sort of partial type inference so |
Ah yes, I made the incorrect assumption that things would be
For me, the What if in the future generic pointers are allowed, but changing type of the pointer is not allowed via casting for example, would the following API addition make sense? public static U* As<T, U>(T* source); Would this then have to be called this
Yes, I agree with that. Not sure
This just seems counterintuitive to all other conventions in .NET. Perhaps, the compiler might as well infer it from the target e.g. ref byte b = ref a[0];
ref int i = Unsafe.As(ref b); // Infer from assignment? Although it is almost too magical. I suggested a fluent API for something like this when we first discussed the |
The other two IL operations on refs that I have seen used are:
I do not claim that these are good names, but functionality could be useful. |
@VSadov good suggestions. Out of interest, would you elaborate on what |
Yeah, I suppose it doesn't make sense to do that in the hope that C# will ever do partial type inference like C++ does.
Hmm, I suppose it's fine. Whatever name suggests to the user that the references are compared and not the values 😄.
To add to @nietras question: how would this be implemented?! |
Thank you for a great feedback! I like the suggestions.
Byrefs on stack are implicitly pinned, so it can be used to assert that it is safe to convert to raw pointer without pinning - example from CoreCLR. Unfortunately, there is no way to implement it in portable way. It would have to be runtime or platform specific that is not pretty given the current shape of library. If it keeps showing up as needed API, it should be looked into as separate issue.
I agree that it is useful operation to have. BTW: It is less useful with the current byref locals and returns than one may think because of the single assignment limitations of byref locals. I have noticed in my experiments that one tends to operate on indices and then convert to byref as the last step and never go back - the style of the pointer math is different from unmanaged pointers. It may be be also nice to have Subtract variant that takes elementOffset for convenience and symmetry with Add. So the updated proposal is:
More suggestions for refinements are welcomed. Including @jamesqo @KrzysztofCwalina that I forgot to include yesterday. |
Since we're taking 64-bit platforms into account, the return type should probably be long instead.
This overload is bothering me a little. It doesn't read very smoothly in my brain; if I saw something like I think we should instead name this Unsafe.ConvertRef<Foo, Bar>(ref x); // convert *from* ref Foo *to* ref Bar It would also work smoothly if a future version of C# adds type inference based on return type assignment, e.g. as mentioned above ref byte b = ref a[0];
ref int i = Unsafe.ConvertRef(ref b); // We converted a ref, and now we have a ref int
ref int i = Unsafe.As(ref b); // As... what? And why is the parameter passed as ref? :( |
If this is used on array elements, the 32-bit return type is sufficient because of array indices can only be 32-bit integers in mainstream .NET runtimes. The problem only exists if somebody uses it for a general pointer math on unmanaged pointers cast to refs, or on .NET runtime variants that allow arrays with >2MB elements. A similar problem exists for Add as well. 32-bit offset argument is not ideal for general pointer math on 64-bit platforms. Changing the offsets to 64-bit type would make it lower performance on 32-bit platforms, and harder to work with. Another option is to make the offsets native int (IntPtr in C#), but it would again make it harder to work with because of one cannot do much with IntPtr in C# directly today. I think the best way out is to document that these operations only work well on arrays with <2MB elements.
Convert suggests significant change of representation in my mind, e.g. going to from bool to string. I agree with @nietras observation that it should called |
@jkotas Agree with your points on
Unfortunate that we didn't name the existing overload If we are to keep it as Also regarding the other |
I think I agree overall that the common use case is probably gonna be with 32-bit offsets and that these will be signed integers.
@jamesqo I do agree that the readability of Should this not be allowed for
I do not know if there is any reason for If ref T As<T>(ref void source) And be able to write: ref int i = ref a[3]
ref byte b = Unsafe.As<byte>(ref i); I do understand this requires language design changes not only for C# but also for C++/CLI, VB.NET and F# perhaps, but I assume this is needed in some way for ref locals and returns anyway. So is allowing Would it then make sense to allow something like: void* p = ....;
ref void v = Unsafe.AsRef<void>(p); |
I think it would make sense. And you can actually create a The funny thing about this is that if the language allows such a conversion then it should probably allow many other conversion between ref types and that may make this particular
That wouldn't work unless the language is changed to allow void to be used as type argument. I think there were some discussions about that but nothing happened. |
Yes, it is a problem in making C# more functional.
Not sure this would be true, with pointers there is still the issue that generic casts are not allowed. I would assume this would be the same for possible Although, this is mainly due to the restriction of C# not supporting generic pointers, if generic |
I don't see any reason why At least conversions to |
@nietras @mikedn Maybe we could go back to the earlier suggestion using a fluent API? It could look something like this (translated to IL of course): public static Interpreter<T> Interpret<T>(ref T source) => new Interpreter<T>(ref source);
public struct Interpreter<T>
{
private IntPtr _ptr;
public Interpreter(ref T source)
{
_ptr = (IntPtr)source;
}
public ref U As<U>() => (ref U)_ptr;
} That way we could use it like ref int i = Unsafe.Interpret(ref b).As<int>(); @nietras mentioned earlier this could be 'bad from a perf perspective', but since the JIT should basically eliminate these copies in Release mode I don't see why not (even if it's a little more IL to write). It's able to do partial type inference and looks much better than the other prototype which requires to to specify both types. |
Ha ha, no thanks. I think I'll start hating fluent APIs with a passion. They have their uses but these days they're more like abuses. |
@mikedn OK then, I'm going back to my earlier position about switching the type parameters. :) Regarding the |
The implementation you have suggested would not even work. It has GC hole because of intermediate unmanaged pointer.
The prevalent order in the .NET APIs is "source, destination". The Copy methods on unsafe class are intentionally violating it to be in sync with the low-level order used in C and IL (discussion in dotnet/corefx#7966). Hard to come up with the "right" answer. The different order between .NET and C is endless source of mistakes when one is using both. E.g. |
@KrzysztofCwalina I think single name should be selected for both because it is just scalar vs sequence. imo, |
string s1 = "foo";
string s2 = s1;
Unsafe.ReferenceEquals(ref s1, ref s2);
// vs
object.ReferenceEquals(s1, s2); It confuses me. |
Regarding
I don't think there should be I'd split
|
Ah, you seem to be right. I don't know if IL allows you to store a
Actually, in the end I don't think the order of the type parameters really matter; when people start typing the method into VSCode / Visual Studio, they should see the name of the type parameter pop up (e.g.
If someone wrote enumerable.As<int>(); I would think that they were somehow converting an IEnumerable to an int, whereas if someone wrote enumerable.Cast<int>(); I think I'd better understand each element of the enumerable was being cast, although maybe that's just because it's the API we have today.
That's actually a good point; I agree too that calling it
|
I'm not too sure about the idea of returning/accepting an
Redundant, you can just do this for the byte difference: var byteDifference = Unsafe.Subtract(ref a, ref b) * Unsafe.SizeOf<T>(); Guaranteed to not overflow (I believe) since the maximum number of bytes 2 pointers can be apart is 2^63 / 64 or somewhere around there. I also think the redundant div (in Subtract)/mul (in SizeOf) should be eliminated by the JIT. If it's not, it should be. Maybe if the method names were a little shorter then it would be OK... @nietras Regarding Subtract, even though it's redundant (I said so myself earlier) I still think it may be worth including. |
Lots of great input and food for thought. Naming is hard and my comment got pretty long again :|
All these have existing meanings in .NET, none are without prior bagage.
I lack a proper terminology definition for .NET for talking about these in a consistent manner (does any exist?) so hope it is clear from context. Other possible wordings could be For the first version of T As<T>(object o)
void* AsPointer<T>(ref T value)
ref T AsRef<T>(void* source) That is why I still do not like ref byte b = ...;
ref int i = Unsafe.As<byte, int>(ref b);
ref int i = Unsafe.Convert<byte, int>(ref b); // Yes
I am sure there are inconsistencies in my argumentation here :) However, for me I would much rather stick with the existing verb I would then define the method as (if ref TTo As<TFrom, TTo>(ref TFrom source) which leads to the other suggestions for type parameter names and order. @jamesqo suggested For the order of parameters I think one has to look to Wouldn't it be pertinent to ask the Roslyn team what there thoughts are on
I agree bool AreRefsEqual(ref T a, ref T b)
I think all "iterator" operations (can't help to feel that we are pretty much implementing C++ iterator behaviour for I think Whether the offset should be
I can live with adding this as well as |
Maybe just add * and / to IntPtr? I think IntPtr should behave like any other integer type as much as possible. I saw proposed C# language changes about that on the Roslyn Github presence. The pointer difference representation can safely be IntPtr on all platforms. The CLR can just promise to make that work. I see no issues with that. |
Is same format used for Vector reinterpret public static Vector<Byte> AsVectorByte<T>(Vector<T> value) where T : struct
public static Vector<Single> AsVectorSingle<T>(Vector<T> value) where T : struct
https://github.com/dotnet/corefx/issues/10457 Operators should be exposed for |
Stacks of different threads can be interleaved with GC heap segments. It is actually pretty common to have this situation in large workloads (on Windows at least). |
I did not know about this. I always assumed that OS allocates stack segments in the higher addresses separately from heaps. I think I might have seen code in the past that relies on such assumptions. Interesting... |
Here is the updated proposal with feedback incorporated:
|
You can probably ask, but I am 99% sure the answer will be no; I don't think they're going to be very keen on adding further unsafe features to C#, e.g. they chose ref returns over generic pointers. Besides this contrived use case, what other uses could
Was writing this post when I saw @jkotas' update.... :) Everything in the updated proposal looks good. @jkotas I really like |
To avoid specifying two type parameters in the public class Unsafe
{
public static class As<TTo>
{
public static ref TTo From<TFrom>(ref TFrom source)
{
//magic
}
}
. . .
}
int x = 1;
ref uint y = ref Unsafe.As<uint>.From(ref x); //TFrom is inferred from the arg |
@VSadov This appears to be giving a compiler error: http://tryroslyn.azurewebsites.net/#K4Zwlgdg5gBAygTxAFwKYFsDcAoADsAIwBswBjGUogQxBBgGEYBvbGNmfYsmANwHswAExgBBEAB4AKgD4AFHwIArVKWQwAZnz4BKZq3YBffW2MdCJcpRp0xU6aZbsYRg0A== |
@nietras 'ref' in C# does not apply to the type, it applies to the signature of the method. In such prospective 'ref void M()' would not make much sense. Void method does not return anything. And 'ref void' does that by reference? |
@jamesqo class Program
{
static void Main(string[] args)
{
int x = 1;
ref uint y = ref Unsafe.As<uint>.From(ref x);
}
}
public class Unsafe
{
public static class As<TTo>
{
public static ref TTo From<TFrom>(ref TFrom source)
{
//dummy implementation
return ref (new TTo[1])[0];
}
}
} |
@VSadov There is already an existing method |
@jamesqo - I did not realize that Unsafe is an already existing class and has something in it. Not having to specify two types in 'As' would have mostly an aesthetical value. In reality you are still supplying two type arguments, just by splitting them between type/method you could let the compiler to infer the TFrom one from the argument. |
@nietras Since it's possible to elide specifying both types if we have an inner class, do you still stick to your earlier position of using |
I really like this too, much better. MSTest uses it as well. In fact
@jamesqo good question. I did think a bit about it before, but didn't think there was precedence for doing something like that. I would specify it as: public class Unsafe
{
public static class To<TTo>
{
public static ref TTo From<TFrom>(ref TFrom source) { ... }
}
} The question then is, whether: int x = 1;
ref uint y = ref Unsafe.To<uint>.From(ref x); is better than: int x = 1;
ref uint y = ref Unsafe.As<int, uint>(ref x); ? In this case, we don't really save much regarding typing.
@VSadov couldn't the same argument be made with void* Foo(int*) This seems contrary to the logic that I would say |
int x = 1;
ref uint y = ref Unsafe.To<uint>.From(ref x); Doesn't suggest reinterpret "in-place", but transfer and change.
void* ptr;
void* ptr2 = ptr + 1; // nope
void* ptr3 = (void*)((char*)ptr + 1); // ok
void thing1 = *ptr; // nope
auto thing2 = *ptr; // nope
auto thing3 = *(char*)ptr; // ok Is what you are asking the ability to cast a int x = 1;
ref uint y = ref Unsafe.As<int, uint>(ref x);
uint* pY = &y; // cast to pointer
void* pV = (void*)&y; // cast to void pointer |
I like this proposal. One small question: is there anything actually unsafe about |
|
That is true I guess it could be improved by calling it
No, and it wouldn't support the scenarios that
How would you write it using normal C# code? Although, that of course is not the same as it needing to be unsafe...
@jkotas Been thinking about this. Can´t |
I think the parameters to AreSame should be called "left" and "right". This is to mimic conventions we use for operator== overloads. |
For unsafe bool AreSame(ref int left, ref int right)
{
fixed (int* pLeft = &left)
fixed (int* pRight = &right)
{
return pLeft == pRight;
}
} Can't really do it for generic types? Though it might be more a normal thing to test (than unsafe); like if you were given two structs from an array. might want to check if they are the same one. |
Can |
Yes, refs can be used for both managed and unmanaged memory. The GC does not touch them if they point to unmanaged memory. Basically, the algorithm for refs during the GC root scanning is:
Fixed.
Yes. |
In unsafe context only? Would be interesting to have update: May be it is just easier to use pointers then. |
The null refs cannot be manufactured in C# directly, but they are possible to manufacture by unsafe code. They can be checked for null using |
@omariom refs existed in C# since v1.0 in a form of ref parameters. So far there was not a lot of need to null-check them. :-) It is possible to manufacture a "null" ref, but language ignores such possibility as conceptually impossible, - just like regular locals somehow not having actual storage behind them. Behavior in such cases is not specified and at best would lead to NREs. In practice the only cases of "null" refs that I have ever seen were results of bugs in the compiler or JIT. |
Roslyn is adding support for ref returns and locals (dotnet/roslyn#118). S.R.CS.Unsafe should provide operations that allow taking advantage of ref returns and locals in unsafe code.
Edit: Updated with the revised proposal
The text was updated successfully, but these errors were encountered: