implement copy(::Void) = nothing #15546
In the linked issue, it seems the presence of |
The bug that was uncovered was not actually the fact that the copy was happening, but that the wrong value was being returned in the first place, so technically, the copy failing just prevents you from seeing the real problem. I don't care much but copying |
I disagree. If you have |
So copy should fail on anything that's not a mutable collection? That's currently very far from the case:
julia> copy(1)
1
julia> copy(1.5)
1.5
julia> copy((1,2.5))
(1,2.5)
julia> copy("foo")
"foo" |
Yes, I suspect a better change would be to remove those behaviors. Or, as you suggest, add a |
I would argue for keeping those behaviors. If you write a function that works with any index-able type it would be annoying if |
Yes, but why are you copying it in that case? If you're copying it in order to mutate the result, it still won't work on an NTuple. |
I would guess that removing these methods would break a lot of code. |
I agree that's possible but I'd like to know why. What is that code doing? It would be very interesting to try it and see exactly what breaks. |
I'm trying this and found the first interesting data point: #3037 For example In that issue I said "I'm fine with adding a copy in those cases." Well, that was 3 years ago, and I no longer find this line of thinking amusing. Are we really going to write |
The reasoning for the |
I don't think a pure function should ever need to call
? |
So you think it's totally ok that |
We need a new term here: alias-stable.
|
If we want to continue down the
|
See @mbauman's comment. Identity is definitely fine since the result always aliases the argument. I don't know about |
I don't like where this is going. Consider my
If we're going to make tuples instead of doing addition, suddenly it doesn't make sense to copy the argument. We need to know whether the argument function returns aliases, or maybe structures that indirectly contain aliases... |
Please, just answer the question about |
Apologies, I did not intend to ignore the point. I advocate just removing the call to Reduce examples:
|
The point of this example is that you can "fix" the case of |
Found another interesting data point, from
|
@JeffBezanson: Do you see any way to fix Regarding |
The reduce example strikes me as a red herring. The The
b = convert(T, a)
a === b && (b = copy(b)) |
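The convert-then-copy idiom quoted above can be wrapped in a small helper. The following is only a sketch in current Julia syntax; the name `convert_fresh` is hypothetical and is not part of Base or of this pull request. It copies only when `convert` returned the very same object it was given.

```julia
# Hypothetical helper (not in Base): convert `a` to type `T`, and copy only
# when `convert` handed back the identical object.
function convert_fresh(::Type{T}, a) where {T}
    b = convert(T, a)
    return b === a ? copy(b) : b
end

ints = [1, 2, 3]
same_type  = convert_fresh(Vector{Int}, ints)      # convert aliases here, so a copy is made
other_type = convert_fresh(Vector{Float64}, ints)  # convert already allocated a new array

# Mutating either result leaves the original untouched.
same_type[1] = 10
other_type[1] = 20.0
```

Either way the caller gets an array it can mutate without affecting `ints`, at the cost of an identity check after every conversion.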
The advantage of having the caller make the copy is that the caller knows if copying is needed or not from its perspective. The disadvantage of expecting the caller to make the copy is that the caller may not know whether copying is needed or not from the callee's perspective – for example in |
I agree that The particular case of |
That is type-unstable for some array types:
julia> function f{T}(::Type{T}, A)
           B = convert(AbstractArray{T}, A)
           A === B && (B = copy(B))
           B
       end
f (generic function with 1 method)
julia> @code_warntype f(Int, sub(1:10, 2:3))
Variables:
#s14::Type{Int64}
A::SubArray{Int64,1,UnitRange{Int64},Tuple{UnitRange{Int64}},1}
B::Any
Body:
begin # none, line 2:
B = A::SubArray{Int64,1,UnitRange{Int64},Tuple{UnitRange{Int64}},1} # none, line 3:
unless A::SubArray{Int64,1,UnitRange{Int64},Tuple{UnitRange{Int64}},1} === B::SubArray{Int64,1,UnitRange{Int64},Tuple{UnitRange{Int64}},1}::Bool goto 0
B = (Base.copy!)((top(ccall))(:jl_new_array,(top(apply_type))(Base.Array,Int64,1)::Type{Array{Int64,1}},(top(svec))(Base.Any,Base.Any)::SimpleVector,Array{Int64,1},0,(top(getfield))(B::SubArray{Int64,1,UnitRange{Int64},Tuple{UnitRange{Int64}},1},:dims)::Tuple{Int64},0)::Array{Int64,1},B::SubArray{Int64,1,UnitRange{Int64},Tuple{UnitRange{Int64}},1})::Array{Int64,1}
goto 0
0: # none, line 4:
return B::Union{Array{Int64,1},SubArray{Int64,1,UnitRange{Int64},Tuple{UnitRange{Int64}},1}}
end::Union{Array{Int64,1},SubArray{Int64,1,UnitRange{Int64},Tuple{UnitRange{Int64}},1}}
Personally, I think conversion is ok to sometimes-alias. We just need a more obvious way to write the non-aliasing method in a simple and type-stable way. |
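One possible way to write a never-aliasing conversion is to always allocate a destination and copy into it, rather than branching on identity. This is a sketch in current Julia, where `copyto!` is the present-day name of the `copy!` call visible in the output above. The helper name `fresh_array` is hypothetical, and the sketch assumes `similar(A, T)` returns a newly allocated mutable array for the input type.

```julia
# Hypothetical helper: always returns a freshly allocated array with element
# type T, so the result never aliases A and no identity check is needed.
fresh_array(::Type{T}, A::AbstractArray) where {T} = copyto!(similar(A, T), A)

v = view(collect(1:10), 2:3)   # a SubArray, as in the example above
B = fresh_array(Int, v)

B[1] = 0                       # does not write through to v or its parent
```

Because the result type is whatever `similar(A, T)` produces, there is no branch that could yield a `Union` of two array types. The trade-off is that it copies even when `convert` would have produced a new array anyway.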
I don't think
but this doesn't work for the The bottom line is that I don't think we can realistically institute a policy of asking every function to be alias-stable. I'm in the "programs should be 80% purely functional" camp. Alias-stability requires worrying about copying everywhere, when it's only a real issue in a minority of cases. Alias-stability is supposed to be convenient for the caller, but the caller needs some way to know which functions are aliasing, so you still need to think about it to some degree. And unlike type-stability, we have no tools for dealing with alias-stability and I don't wish to prioritize developing them. I do think we should have an |
We should then make it easier to, for example, stack-allocate mutable arrays, or alternatively make tuples less painful to use as arrays. The reason for 80% of my mutating functions is that creating Julia arrays is very expensive, and the only way I can get good performance is to preallocate and mutate. Sorry for the slightly off-topic comment. |
I think the argument against Data point: Scala has both mutable and immutable collections, and the mutable collections implement One case I've found so far where code was written to need |
The actual issue at hand here is whether it makes sense to write generic code that copies immutable values and expect it to work. I'm not arguing that we should make copies of everything everywhere to ensure perfect alias-stability, I'm arguing that allowing |
I could maybe see that |
|
|
Yes that sounds quite good. I guess it can then make decisions like which arrays to copy, e.g. if one views a small part of another? |
I guess this falls in the same "alias unstable"-camp. Is there an "official" policy regarding things like this?
julia> S = sprand(5,5,0.5);
julia> pointer_from_objref(S)
Ptr{Void} @0x00007f8902bd5d20
julia> pointer_from_objref(S*UniformScaling(1))
Ptr{Void} @0x00007f8902bd5d20
julia> pointer_from_objref(S*UniformScaling(2))
Ptr{Void} @0x00007f8902bd6980 |
IMO, the official policy is that the aliasing behavior of any function that might compute a novel value is currently undefined. |
Pardon my ignorance, but what is a "novel value"? I didn't manage to get anything useful out of Google. |
What I'm trying to get at is that some functions must return aliases, for example indexing one element out of an array of arrays. Those functions must return a value that already exists somewhere; they don't compute anything new. Functions without this property have undefined aliasing behavior. |
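A minimal illustration of the "must return an alias" case described above, written as a sketch with made-up variable names: indexing into an array of arrays hands back the existing inner array, so mutating the result is visible through the outer array.

```julia
outer = [[1, 2], [3, 4]]

inner = outer[1]     # indexing returns the inner array that already exists, not a copy
push!(inner, 99)     # mutation is therefore visible through `outer` as well
```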
Ok, I get it. Thank you! |
This strikes me as not desirable. Value-dependent "unstable" aliasing is just asking for subtle bugs. We don't have a uniform policy on this stuff yet, but I think we're going to need to settle on something for the sake of predictability. |
While it does sound bad for anything to be "undefined", sometimes the alternatives are worse. At issue is code like this:
If we go down the path of defined aliasing behavior, we are instructing people to write code like this, and to look up the aliasing behavior of But even worse, enacting this policy has a lot of overhead. Ensuring alias-stability is yet another thing everybody writing Julia code will have to worry about, and take care to test and document, all for the sake of the marginal use case of mutating random values in the middle of a computation. And, as I have argued above, in the case of higher-order functions it is nearly unworkable. If aliasing behavior is undefined, the bug surface area is limited to code that does mutation, which is the uncommon case. If we go the other way, the bug surface area is everywhere, still including mutation code, since it needs to make accurate aliasing assumptions (presumably by consulting the docs). |
I'm not sure I have a clear opinion yet on whether it should be part of the contract of the function itself; however, I'm pretty sure it should be stable for a given tuple of argument types. Value-dependent aliasing is just asking for trouble IMO, because no amount of testing can cover every value special case, whereas you probably already explore the space of types that you care about. |
👍 to the idea that Julia's preferred mode of operation is non-mutating. The problem seems to come from the interface between two programming models -- mutating and non-mutating. If there is an algorithm where one needs to have a new copy of a value, then it should be possible to write this algorithm in an exclusively mutating style. For example, instead of
a = f(b)
mutate!(a)
you might write
a::Type
f!(a, b)
mutate!(a)
which makes it clear that |
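The preallocate-then-mutate style from this comment could look roughly like the sketch below. `double!` and `bump_first!` are hypothetical stand-ins for `f!` and `mutate!`; neither exists in Base.

```julia
# Hypothetical stand-ins for f! and mutate! from the comment above.
double!(dest, src) = (dest .= 2 .* src; dest)   # writes into dest; src is only read
bump_first!(a) = (a[1] += 1; a)

b = [1, 2, 3]
a = similar(b)      # the caller allocates and owns the output buffer
double!(a, b)
bump_first!(a)
```

Since the caller allocated `a` itself, it knows `a` shares no storage with `b` and can mutate it freely.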
If this use case were marginal, we wouldn't need to do it so often. Unless you restrict yourself to code that only uses immutable types or never modifies mutable objects, this is a real issue: you need to know what you're writing to and which bindings will be modified. Designing APIs such that you have to read not just docs, but also the source of every method you might call on any mutable object, in order to answer these questions doesn't seem like something we should try to do intentionally. |
Yes, reading the source would be even worse. That's just a special case of a very general principle that you shouldn't couple your code to implementation details of libraries. If you find yourself reading library source hopefully it raises a red flag in a way that reading docs doesn't. I'm not saying mutation is rare. Rather I think the vast majority of uses of mutation are obviously safe, e.g. allocating an array and using only mutating functions on it as @eschnett describes. |
I'm also in the non-mutating-by-default camp, but I can think of a few problems where
|
That's a good example. It's similar to the case of an iterator that has a length if the iterator it wraps does. Maybe the problem generators could belong to a type hierarchy that has a default |
See JuliaDebug/Gallium.jl#52. This seems like the only reasonable definition for copying nothing. I couldn't figure out a good place to put this and I wonder if we shouldn't make copying singletons (i.e. instances of types with no fields) work automatically like this, although I'm not sure how.