Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement copy(::Void) = nothing #15546

Closed
wants to merge 1 commit into from
Closed

Conversation

StefanKarpinski
Copy link
Member

See JuliaDebug/Gallium.jl#52. This seems like the only reasonable definition for copying nothing. I couldn't figure out a good place to put this and I wonder if we shouldn't make copying singletons (i.e. instances of types with no fields) work automatically like this, although I'm not sure how.

@JeffBezanson
Copy link
Member

In the linked issue, it seems the presence of nothing was actually a bug, so the lack of a copy method helped catch it sooner. Given that this hasn't come up much, I would err on the side of not adding the method.

@StefanKarpinski
Copy link
Member Author

The bug that was uncovered was not actually the fact that the copy was happening, but that the wrong value was being returned in the first place, so technically, the copy failing just prevents you from seeing the real problem. I don't care much but copying nothing should fairly obviously produce nothing.

@JeffBezanson
Copy link
Member

I disagree. If you have copy(f(x)), and f returning nothing is a problem, then failing in copy catches the problem sooner than having copy propagate the value. I think if some code is bothering to copy something, it's expecting a mutable collection.

@StefanKarpinski
Copy link
Member Author

So copy should fail on anything that's not a mutable collection? That's currently very far from the case:

julia> copy(1)
1

julia> copy(1.5)
1.5

julia> copy((1,2.5))
(1,2.5)

julia> copy("foo")
"foo"

@JeffBezanson
Copy link
Member

Yes, I suspect a better change would be to remove those behaviors. Or, as you suggest, add a copy(x::Any) that allows no-op copying a larger class of values by default. Right now we're somewhere in between. I guess I would like to know if there are good use cases for copy(1) working.

@KristofferC
Copy link
Member

I would argue for keeping those behaviors. If you write a function that works with any index-able type it would be annoying if copy worked when you gave it a Vector but not a NTuple.

@JeffBezanson
Copy link
Member

Yes, but why are you copying it in that case? If you're copying it in order to mutate the result, it still won't work on an NTuple.

@StefanKarpinski
Copy link
Member Author

I would guess that removing these methods would break a lot of code.

@JeffBezanson
Copy link
Member

I agree that's possible but I'd like to know why. What is that code doing? It would be very interesting to try it and see exactly what breaks.

@mbauman
Copy link
Member

mbauman commented Mar 18, 2016

Ref #9246 and #9270, where I went down a very similar path... Also stemming from some package code that hit a method error for copy(::Date).

@JeffBezanson
Copy link
Member

I'm trying this and found the first interesting data point: #3037

For example power_by_squaring(x, p) is the identity function when p==1. It was requested to call copy in this case, so that "computing something" always involves returning a new object. This in turn requires defining copy for integers and other immutable things.

In that issue I said "I'm fine with adding a copy in those cases." Well, that was 3 years ago, and I no longer find this line of thinking amusing. Are we really going to write copy(x) every frigging time the answer is just x?

@StefanKarpinski
Copy link
Member Author

The reasoning for the power_by_squaring case remains the same: you don't want to return x, you want to return a value equal to x but not identical to it – which can be distinguished when the argument is mutable. I'm not sure what you're proposing instead. That power_by_squaring(x,k) simply return x when k == 1? Then it becomes unpredictable whether mutating A or A^k after computing A^k is safe or not – it depends on the specific value of k with 1 being a special and dangerous value.

@JeffBezanson
Copy link
Member

I don't think a pure function should ever need to call copy. Should we define

identity(x) = copy(x)

?
The fact that this leads to needing copy methods for all sorts of immutable values is part of the problem.

@StefanKarpinski
Copy link
Member Author

So you think it's totally ok that A and A^k do or don't alias each other depending on the value of k?

@mbauman
Copy link
Member

mbauman commented Mar 18, 2016

We need a new term here: alias-stable.

identity is just fine because it is alias-stable. Without that copy call, power_by_squaring would be alias-unstable, depending upon both the type of the first argument and the value of the second.

@JeffBezanson
Copy link
Member

If we want to continue down the copy path, we're going to need a lot more calls:

julia> a = Vector{Int}[[1]];

julia> reduce(+, a) === a[1]
true

julia> a = [1];

julia> convert(Vector{Int}, a) === a
true

@StefanKarpinski
Copy link
Member Author

See @mbauman's comment. Identity is definitely fine since the result always aliases the argument. I don't know about convert but at least whether the result aliases or not is predictable from the types of the arguments and not their values, which is the situation with power_by_squaring.

@JeffBezanson
Copy link
Member

I don't like where this is going. Consider my reduce example above. It's alias-unstable. We need to add a call to copy. But what if I change the example to this:

julia> a = Vector{Int}[[1]];

julia> reduce(tuple, a) === a[1]
true

If we're going to make tuples instead of doing addition, suddenly it doesn't make sense to copy the argument. We need to know whether the argument function returns aliases, or maybe structures that indirectly contain aliases...

@StefanKarpinski
Copy link
Member Author

Please, just answer the question about power_by_squaring – it doesn't commit you to anything, but we can't have a real discussion if you just keep ignoring that point. The reduce examples would be considerably clearer if you gave both aliasing and non-aliasing cases.

@JeffBezanson
Copy link
Member

Apologies, I did not intend to ignore the point. I advocate just removing the call to copy; I thought that was clear.

Reduce examples:

julia> a = Vector{Int}[[1]];

julia> reduce(+, a) === a[1]    # aliases
true

julia> a = Vector{Int}[[1],[2]];

julia> reduce(+, a)    # new value; doesn't alias anything
1-element Array{Int64,1}:
 3

julia> a = Vector{Int}[[1]];

julia> reduce(tuple, a) === a[1]    # also aliases; `tuple` is never called
true

julia> a = Vector{Int}[[1],[2]];

julia> reduce(tuple, a)    # returns a novel value, but containing aliases to `a`
([1],[2])

@JeffBezanson
Copy link
Member

The point of this example is that you can "fix" the case of + by adding a call to copy when there is one element in the collection, but if the passed function is tuple you no longer expect any copies at all. This is why, when writing a pure function, I don't want to have to think about where to put in defensive copy calls.

@JeffBezanson
Copy link
Member

Found another interesting data point, from test/read.jl:

    to = IOBuffer(Vector{UInt8}(copy(text)), false, true)

text is a string. The intent seems to be to copy the data, but since strings are immutable copy is identity and nothing is actually copied. If copy were not defined the problem would have been caught.

@toivoh
Copy link
Contributor

toivoh commented Mar 19, 2016

@JeffBezanson: Do you see any way to fix reduce then so that it would behave correctly in both cases?

Regarding power_by_squaring maybe it's better to return the input times one when the exponent is one?

@StefanKarpinski
Copy link
Member Author

The reduce example strikes me as a red herring. The tuple example always returns a structure that aliases the input by the nature of the tuple function. The real concern with reduce is that the result aliases the input when reducing with + (for example) over a single value, but not over many values. But that's arguably an issue with the reduction function, +, not reduce itself.

The convert issue is a much tougher one because it does mean that it's hard to write generic code that converts a value to a different type and then mutates the result expecting not to modify the original. The way around the issue in the current scheme of things would seem to be to guard against identity like so:

b = convert(T, a)
a === b && (b = copy(b))

@StefanKarpinski
Copy link
Member Author

The advantage of having the caller make the copy is that the caller knows if copying is needed or not from its perspective. The disadvantage of expecting the caller to make the copy is that the caller may not know whether copying is needed or not from the callee's perspective – for example in A^k when k == 1 or in convert(T, a) when a is already of type T. Both approaches are suboptimal: in one case unnecessary copies may be made, in the other, it's far too easy to accidentally write subtly broken code due to unexpected aliasing. I tend to prefer erring on the side of inefficiency than incorrectness here, but a third approach that avoids the pitfalls of both of these two would be better.

@nalimilan
Copy link
Member

I agree that copy(::Void) should be supported. It is true that not providing this method can detect some bugs, but that's true of many other corner-case methods too (basically, all of those which are no-ops in some cases, like vec, full, float...).

The particular case of convert has already been discussed at #12441.

@mbauman
Copy link
Member

mbauman commented Mar 21, 2016

b = convert(T, a)
a === b && (b = copy(b))

That is type-unstable for some array types:

julia> function f{T}(::Type{T}, A)
           B = convert(AbstractArray{T}, A)
           A === B && (B = copy(B))
           B
       end
f (generic function with 1 method)

julia> @code_warntype f(Int, sub(1:10, 2:3))
Variables:
  #s14::Type{Int64}
  A::SubArray{Int64,1,UnitRange{Int64},Tuple{UnitRange{Int64}},1}
  B::Any

Body:
  begin  # none, line 2:
      B = A::SubArray{Int64,1,UnitRange{Int64},Tuple{UnitRange{Int64}},1} # none, line 3:
      unless A::SubArray{Int64,1,UnitRange{Int64},Tuple{UnitRange{Int64}},1} === B::SubArray{Int64,1,UnitRange{Int64},Tuple{UnitRange{Int64}},1}::Bool goto 0
      B = (Base.copy!)((top(ccall))(:jl_new_array,(top(apply_type))(Base.Array,Int64,1)::Type{Array{Int64,1}},(top(svec))(Base.Any,Base.Any)::SimpleVector,Array{Int64,1},0,(top(getfield))(B::SubArray{Int64,1,UnitRange{Int64},Tuple{UnitRange{Int64}},1},:dims)::Tuple{Int64},0)::Array{Int64,1},B::SubArray{Int64,1,UnitRange{Int64},Tuple{UnitRange{Int64}},1})::Array{Int64,1}
      goto 0
      0:  # none, line 4:
      return B::Union{Array{Int64,1},SubArray{Int64,1,UnitRange{Int64},Tuple{UnitRange{Int64}},1}}
  end::Union{Array{Int64,1},SubArray{Int64,1,UnitRange{Int64},Tuple{UnitRange{Int64}},1}}

Personally, I think conversion is ok to sometimes-alias. We just need a more obvious way to write the non-aliasing method in a simple and type-stable way.

@JeffBezanson
Copy link
Member

I don't think reduce is a red herring, What should it do in the 1-element case? If it wants to "inherit" the aliasing behavior of the passed function it could call it with one argument, which could work if we had

+(x) = copy(x)

but this doesn't work for the tuple function, and we also might not want to require that any function passed to reduce also accept 1 argument.

The bottom line is that I don't think we can realistically institute a policy of asking every function to be alias-stable. I'm in the "programs should be 80% purely functional" camp. Alias-stability requires worrying about copying everywhere, when it's only a real issue in a minority of cases. Alias-stability is supposed to be convenient for the caller, but the caller needs some way to know which functions are aliasing, so you still need to think about it to some degree. And unlike type-stability, we have no tools for dealing with alias-stability and I don't wish to prioritize developing them.

I do think we should have an isalias(a, b) function that takes two arrays of any kind and tells you whether they share data in any way. That would be very handy. Algorithms that require distinct arrays could begin with @precondition !isalias(a,b).

@KristofferC
Copy link
Member

programs should be 80% purely functional

We should then make it easier to, for example, stack allocate mutable arrays alternatively make tuples not be a pain to use as arrays. The reason for 80% of my mutating functions is because creating julia arrays is very expensive and the only way I can get good performance is to preallocate and mutate. Sorry for the slightly off topic comment.

@JeffBezanson
Copy link
Member

I think the argument against copy(::Void) is stronger than just "it might help catch some bugs". For some functions, being a no-op is a perfectly legitimate implementation of the meaning of the function. If the contract for full is "return the data in a dense representation", then full(x::Array) = x is simply correct, and not covering any bugs. But copy is inherently tied to mutability. Without mutation, copy makes no sense.

Data point: Scala has both mutable and immutable collections, and the mutable collections implement clone but the immutable ones do not.

One case I've found so far where code was written to need copy(::Number) was in FFTW.jl. The user passes in a region argument that can be an array or a number. The function saves a reference to region, so it makes a "defensive copy". It's notable that the copy is just defensive and not actually needed. But it would also be easy to rewrite this code any number of ways, for example accepting only a tuple as the region, or intercepting a Number argument and wrapping it in an array first.

@StefanKarpinski
Copy link
Member Author

The actual issue at hand here is whether it makes sense to write generic code that copies immutable values and expect it to work. I'm not arguing that we should make copies of everything everywhere to ensure perfect alias-stability, I'm arguing that allowing copy to work on immutable values – in particular nothing – is not wrong or bad. In particular, I think that the case of power_by_squaring is a case where we should be alias-stable and thus where applying copy to mutable and immutable values makes sense.

@JeffBezanson
Copy link
Member

I could maybe see that copy is part of the collections API, and ok to provide on immutables for the sake of making defensive copies. Hence OK to provide for numbers, since they are iterable. But I don't see the case for copy(nothing).

@eschnett
Copy link
Contributor

copy_if_aliasing(a, b...)?

@StefanKarpinski
Copy link
Member Author

a, b = dealias(a, b) and generalized to arbitrarily many arguments?

@JeffBezanson
Copy link
Member

Yes that sounds quite good. I guess it can then make decisions like which arrays to copy, e.g. if one views a small part of another?

@KristofferC
Copy link
Member

I guess this falls in the same "alias unstable"-camp. Is there an "official" policy regarding things like this?

julia> S = sprand(5,5,0.5);

julia> pointer_from_objref(S)
Ptr{Void} @0x00007f8902bd5d20

julia> pointer_from_objref(S*UniformScaling(1))
Ptr{Void} @0x00007f8902bd5d20

julia> pointer_from_objref(S*UniformScaling(2))
Ptr{Void} @0x00007f8902bd6980

@JeffBezanson
Copy link
Member

IMO, the official policy is that the aliasing behavior of any function that might compute a novel value is currently undefined.

@KristofferC
Copy link
Member

Pardon my ignorance, but what is a "novel value". Didn't manage to get something good out of google.

@JeffBezanson
Copy link
Member

What I'm trying to get at is that some functions must return aliases, for example indexing one element out of an array of arrays. Those functions must return a value that already exists somewhere; they don't compute anything new. Functions without this property have undefined aliasing behavior.

@KristofferC
Copy link
Member

Ok, I get it. Thank you!

@tkelman
Copy link
Contributor

tkelman commented Mar 28, 2016

undefined aliasing behavior.

This strikes me as not desirable. Value-dependent "unstable" aliasing is just asking for subtle bugs. We don't have a uniform policy on this stuff yet, but I think we're going to need to settle on something for the sake of predictability.

@JeffBezanson
Copy link
Member

While it does sound bad for anything to be "undefined", sometimes the alternatives are worse. At issue is code like this:

x = f(y)
mutate!(x)

If we go down the path of defined aliasing behavior, we are instructing people to write code like this, and to look up the aliasing behavior of f in the docs to see if it will do what they expect. I would argue that this state of affairs, even if we can bring it about, is not highly desirable. Needing to look up the aliasing behavior of functions is fiddly, and encourages a very subtle kind of coupling. It would be very difficult to change or deprecate these behaviors.

But even worse, enacting this policy has a lot of overhead. Ensuring alias-stability is yet another thing everybody writing julia code will have to worry about, and take care to test and document, all for the sake of the marginal use-case of mutating random values in the middle of a computation. And, as I have argued above, in the case of higher-order functions it is nearly unworkable.

If aliasing behavior is undefined, the bug surface area is limited to code that does mutation, which is the uncommon case. If we go the other way, the bug surface area is everywhere, still including mutation code, since it needs to make accurate aliasing assumptions (presumably by consulting the docs).

@carnaval
Copy link
Contributor

I'm not sure I have a clear opinion yet on whether it should be part of the contract of the function itself, however I'm pretty sure it should be stable for a given tuple of argument types. Value dependent is just asking for trouble IMO, because no amount of test can test for all value special cases whereas the space of types that you care about you already probably explore.

@eschnett
Copy link
Contributor

👍 to the idea that Julia's preferred mode of operation is non-mutating.

The problem seems to come from the interface between two programming models -- mutating and non-mutating.

If there is an algorithm where one needs to have a new copy of a value, then it should be possible to write this algorithm in an exclusively mutating style. For example, instead of

a = f(b)
mutate!(a)

you might write

a::Type
f!(a, b)
mutate!(a)

which makes it clear that a doesn't alias b. Here I'm proposing a rule: a mutating function (with exclamation mark) must not introduce aliasing.

@tkelman
Copy link
Contributor

tkelman commented Mar 28, 2016

for the sake of the marginal use-case of mutating random values in the middle of a computation

If this use case were marginal, we wouldn't need to do it so often. Unless you restrict yourself to code that only uses immutable types or never modifies mutable objects, this is a real issue - you need to know what you're writing to and which bindings will be modified. Designing API's such that you have to read not just docs, but also source for all methods you might call on any mutable objects, in order to answer these questions doesn't seem like something we should try to do intentionally.

@JeffBezanson
Copy link
Member

Yes, reading the source would be even worse. That's just a special case of a very general principle that you shouldn't couple your code to implementation details of libraries. If you find yourself reading library source hopefully it raises a red flag in a way that reading docs doesn't.

I'm not saying mutation is rare. Rather I think the vast majority of uses of mutation are obviously safe, e.g. allocating an array and using only mutating functions on it as @eschnett describes.

@cstjean
Copy link
Contributor

cstjean commented Mar 31, 2016

I'm also in the non-mutating-by-default camp, but I can think of a few problems where copy being the identify is useful. For instance:

type ProblemSolver{T, U}
     ....
     next_problem_generator::T
     learner::U
end

Base.copy(ps::ProblemSolver) = ProblemSolver(copy(ps.next_problem_generator), copy(ps.learner))

next_problem_generator might be stateful, but it might also be stateless (solving the same problem over and over again), or nothing to trigger some default behaviour. Likewise, learner might be a "dummy" that doesn't actually learn anything.

@JeffBezanson
Copy link
Member

That's a good example. It's similar to the case of an iterator that has a length if the iterator it wraps does.

Maybe the problem generators could belong to a type hierarchy that has a default copy method? Globally adding copy(x) = x seems pretty extreme.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants