-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Preserve type on integer arithmetic #8028
Conversation
This would be a good time to "genericize" the code in |
The inevitable fight during bootstrap makes "boxing" an eerily appropriate term :) |
@StefanKarpinski: Do you mean by "genericize" to have for loops over the integer types generating the respective functions? Regarding the spec: Well, |
Yes, that's what I meant. Glad you understood my very vague description of what I meant :-) |
I think "genericize" also meant using type parameters. Last I looked, you could have collapsed a lot of the methods down to just one with a |
Ok I generalized some of the code, which was mostly hitting I have not yet looked at the issue with the unicode functions. If some unicode expert here has a hint this would be great. |
I might have missed something, but do we really want to use -{T<:Integer}(a::T) = [...] instead of for T in (Int8, Int16, Int32, Int64, Int128, [...])
-(a::T) = [...]
end It seems to me like these definitions rely on implementation details of the integers, rather than a generic |
@ivarne: No you have not missed something. This is just me digging into a new area of base where I have no experience yet. Would it be an alternative to define some union like the |
ok I have fixed the utf16 code and the entire test suite runs successfully now. I have further introduced the |
I have changed the title from WIP to RFC because the test suite passes and beside the utf16 change the transition was mostly straight forward. Implementing the |
I wouldn't merge this, though, until the issues with Also: while I think this is great and I am truly excited about it, strategically I wonder if we should wait a bit before merging this. This could well be the most painful breaking change in 0.4. With the possible exception of @quinnj 's awesome Dates addition (and |
Tim, sure there is absolutely to haste in doing so and I am fine letting this laying around for some time. I did this for fun and because I have argumented at some places in the past for this change. So why just argumenting and not making a PR? However, independent of this PR, lets please merge things more early in the 0.4 cylce. In the past 8 month master was basically on hold (with the exception of the REPL) because many people thought that 0.3 would be released soon. So I would say once 0.3 is out: Merge all the important things to master even if master is unstable for two month and breaks every package out there. Once we move from It is of course great to keep master stable but effectively 0.3 was a little like a rolling release and people where encouraged to use 0.3pre instead of 0.2 because the later is outdated and some packages where not functional on 0.2. |
And it would indeed be a good idea to point to a stable branch in the "Source Download and Compilation" section of the README.md. |
I added this to |
Ignoring |
It is also a change that is likely to give silent errors that does not trigger test failures. If you don't explicitly test for overflow, you will often not get a test failure. |
Good point. |
@ivarne: Is my implementation with the Regarding the reduction methods: This is about |
I'm not sure it is complete but reading base/reduce.jl suggests:
|
Thanks! Regarding the "danger" of this PR I don't expect major breakage. We currently have
and this PR is mostly about consistency. And if I am wrong and everything breaks and nobody likes the change it is better to determine this early rather than later. |
No, it's going to be hugely breaking. I'm not arguing we shouldn't do it, I'm just saying we need to be realistic. The real problem is that (1) people probably haven't written tests to catch this, and (2) tests mostly look like this:
Even if you had written that test with |
@tknopp, you raise some good points. As long as we have new versions of the reduction functions in place, I won't object to merging this. |
Reductions should default to using Int or larger for accumulation. Reducing mod 2^8 is not a commonly useful operation. The type stability arguments for + don't really apply to sum. |
Let me expand on that reasoning a bit: the reason you want |
@StefanKarpinski that's how I felt too, but then I started changing my thinking in #8028 (comment). Naturally I still find your arguments pretty compelling. What do you think? |
Why not just implement reductions with an arbitrary accumulator type and then default the accumulator type to |
Please note that I am also unsure what the right thing is for the reductions. When I prepared the PR the aim was to remove all special casing to reach consistency. And type stability is indeed the main motivation behind all this. I am also fine to question the hole idea again. Now that we have this branch, people can try it out and look if they are fine with the idea. |
No, we should definitely do this. @StefanKarpinski, the reason I made those points is that there seemed to be some thought that it would be OK to merge these changes before fixing the reduction functions. |
I am fine reverting the reduction changes for the moment. It was even not intentional. I had remove |
convert{T<:Signed64}(::Type{T}, x::Float32) = box(T,checked_fptosi(T,unbox(Float32,x))) | ||
convert{T<:Signed64}(::Type{T}, x::Float64) = box(T,checked_fptosi(T,unbox(Float64,x))) | ||
convert{T<:Unsigned64}(::Type{T}, x::Float32) = box(T,checked_fptoui(T,unbox(Float32,x))) | ||
convert{T<:Unsigned64}(::Type{T}, x::Float64) = box(T,checked_fptoui(T,unbox(Float64,x))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these have an oddity (shared with many BLAS methods) that they don't specify quite what you intended, which could lead to incorrect answers / incorrect dispatch, since the type signature Type{TypeVar(:T, Signed64)}
also matches any union-of-types Type. For example:
julia> Type{Union(Int8,Int16)} <: Type{TypeVar(:T,Base.SmallSigned)}
true
Instead, what you really want is:
typealias Signed64Types Union([Type{T} for T in Signed64.types]...)
convert{T<:Signed64Types}(::T, x::Float32) = ...
(this change also needs to be made to many of the BLAS/linalg functions in base – replacing Type
with Array
– to avoid dispatching to the wrong method)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vtjnash: Ok, I don't understand this. But this maybe because I do not understand the difference between Int32
and Type{Int32}
. Is the difference described in the manual?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vtjnash I don't think that way of specifying type in convert
is specific to BLAS. grep -r "::Type{T}" base
gives quite a bit output so I'd say it has been pretty standard.
Is the problem you mention something you have seen in practice? The old, but wrong, way is a bit nicer to the eyes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is frequently correct. However, sometimes, such as when interacting with ccall or other IntrinsicFunctions, it is probably not. For example, the following method match was probably not expected by the author:
julia> @which convert(Union(Int32, Int64), float16(2))
convert{T<:Integer}(::Type{T<:Integer},x::Float16) at float16.jl:103
And the author almost certainly was not expecting this:
a = Array(Union(Float32,Float64),1)
@which copy!(a,1:1,a,1:1)
copy!{T<:Union(Float32,Complex{Float64},Float64,Complex{Float32}),Ti<:Integer}(dest::Array{T<:Union(Float32,Complex{Float64},Float64,Complex{Float32}),N},rdest::Union(UnitRange{Ti<:Integer},Range{Ti<:Integer}),src::Array{T<:Union(Float32,Complex{Float64},Float64,Complex{Float32}),N},rsrc::Union(UnitRange{Ti<:Integer},Range{Ti<:Integer})) at linalg/blas.jl:828
generally, these eventually get caught somewhere else before turning into a ccall parameter.
@tknopp x::Int32
asserts that x is of type 32-bit integer. x::Type{Int32}
asserts that x is Int32
, the type of a thing that is a 32-bit integer.
@quinnj Could you elaborate on your comment in #7291 (comment) if this PR is a step in the right direction? According to @ivarne your tests pass on this branch. |
Sure, it basically came down to promote_type(Uint8,Uint16) == Uint64 on 64-bit while on 32-bit promote_type(Uint8,Uint16) == Uint32 While on your branch both promote to Additionally, these tests were kind of particular because I specifically needed to compare the promoting operations to On another note, It seems like your branch hasn't incorporated the Dates module yet. It'd be great if you could rebase it in to make sure everything's happy over there, though I would imagine it'll be ok since everything gets pushed to Int64 pretty quick. |
Thanks @quinnj. I had observed a similar issue when looking at the factorial code (see #6579 (comment)). When one has a pattern like
the function would become unstable if an
|
convert{T<:SignedUpto64}(::Type{T}, x::Float32) = box(T,checked_fptosi(T,unbox(Float32,x))) | ||
convert{T<:SignedUpto64}(::Type{T}, x::Float64) = box(T,checked_fptosi(T,unbox(Float64,x))) | ||
convert{T<:UnsignedUpto64}(::Type{T}, x::Float32) = box(T,checked_fptoui(T,unbox(Float32,x))) | ||
convert{T<:UnsignedUpto64}(::Type{T}, x::Float64) = box(T,checked_fptoui(T,unbox(Float64,x))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cross-reference: #8028 (comment)
@JeffBezanson @StefanKarpinski any thoughts on this observation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have partly understood the issue but I think it is worth opening a new issue dedicated to this problem, especially if this is used more often in base.
From this dedicated issue it would be great if we could get a recommendation how to implement this. Using the for loop with eval or using what Jameson proposed in #8028 (comment)
The motivation to change these lines were actually to remove the eval
because I have read several messages by @vtjnash that eval
should be rarely used.
@quinnj I have rebased and with Dates included it still passes the test. |
superseded by #8420 |
This is my try to make
Int8 + Int8 = Int8
. It would fix #3759 which now has the 0.4-projects tag indicating that there is hope that this gets accepted :-)I had to fix some tests and the unicode test still fails with