Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

itrunc for integer truncation is a bad pun #8646

Closed
StefanKarpinski opened this issue Oct 10, 2014 · 20 comments
Closed

itrunc for integer truncation is a bad pun #8646

StefanKarpinski opened this issue Oct 10, 2014 · 20 comments

Comments

@StefanKarpinski
Copy link
Member

I brought this up immediately, even before the itruncopalypse change was merged, but using the itrunc function like this is a complete abuse and exactly the kind of operator punning that we've tried so hard to avoid in Julia. English happens to use the word "truncate" for both converting large integer types to smaller integer types by lopping off their high bits, and mapping floating-point values to integral values by lopping off the bits after the decimal point. However, these are completely different behaviors and cramming them into the same function is a bad pun. Unfortunately, this wasn't fixed immediately and now everyone's using itrunc for this everywhere. I'm working on trying to change this, but I'm afraid is going to cause almost as much trouble as the original change did since everyone will have to change the usages of itrunc they just added to something else. I've suggested mod(n::Int, Int8) for this.

@johnmyleswhite
Copy link
Member

Maybe time to use coerce?

@StefanKarpinski
Copy link
Member Author

Possible, but it's a little vague. This really is a modulus operation.

@JeffBezanson
Copy link
Member

I agree with this issue, but I think modding something by a type is also a bit strange. itrunc at least already had forms with a type as the first argument. LLVM calls this operation trunc. I'm not sure what other standard names are; the C tradition of "casting" for this certainly obscures things.

@StefanKarpinski
Copy link
Member Author

The other option is that we could say that itrunc truncates both high and low bits, keeping only the bits that correspond to bits that fit in the output type. That covers the current behavior. We'd have to make sure that calling itrunc on large floating-point values behaves correctly, but that's certainly doable.

@jiahao
Copy link
Member

jiahao commented Oct 10, 2014

quotient?

It looks like itrunc(T, x) has the structure of a quotient group: given a number x and a domain T, find y ∈ T such x is in the same equivalence class as y modulo bit truncation.

@JeffBezanson
Copy link
Member

@StefanKarpinski That's not a bad idea; itrunc on large floats sometimes gives typemin currently, which is not very helpful.

@StefanKarpinski
Copy link
Member Author

It is minimally disruptive and makes this no longer a pun.

@JeffBezanson
Copy link
Member

One downside is that itrunc is currently the fastest way to go from float to int since it has no checks at all. Well-defined large float behavior requires extra instructions. However it's still probably the right thing to do.

@StefanKarpinski
Copy link
Member Author

It can probably be implemented with a couple of integer ops: decode the exponent, shift and mask.

@StefanKarpinski
Copy link
Member Author

Hmm. Handling negative floats really makes this hard.

@simonbyrne
Copy link
Contributor

I still think the itrunc([T,] ::FloatingPoint) should match the behaviour of iceil, ifloor and iround in how it handles out-of-range values (i.e. throwing InexactErrors). However this would seem to rule out the itrunc(T, x::Integer) behaviour, as otherwise itrunc(Int32,2.0^32) and itrunc(Int32,2^32) would give very different behaviour (as indeed they already do).

So how about an even crazier idea: deprecate itrunc, iceil, ifloor and iround with (::FloatingPoint) or (::Type, ::FloatingPoint) arguments in favour of plain trunc, ceil, floor, and round called with (::Type, x::FloatingPoint), e.g. trunc(Int, 1.2). The current (fast) behaviour of itrunc could be exported via unsafe_trunc. Or accessed via a general @unsafe macro (similar to @inbounds).

This would then leave itrunc free to be used for the proposed modulo behaviour (though I quite like the mod idea as well).

@StefanKarpinski
Copy link
Member Author

I like the idea of folding itrunc(x) into trunc(Int,x). For a bit of historical perspective, iirc, we introduced itrunc before we had singleton types like Type{Int} or the ability to specialize methods on them, so the only option we had was to give the function a different name from trunc.

@StefanKarpinski
Copy link
Member Author

In particular, I think the argument that we should have itrunc(T, x) == itrunc(T, y) for x == y is crucial. As you note, this currently does completely different things for different types of x and y even when they're equal values.

@stevengj
Copy link
Member

My inclination is that uint8(x) should be coercing and Uint8(x) == convert(Uint8, x) should throw an error if the conversion is inexact. Isn't lowercase==coercing the de-facto convention? That was my suggestion in #1470.

@StefanKarpinski
Copy link
Member Author

That's fine but what's the generic version? I also think it may be better to be explicit about how you are mapping one type to another – rounding, truncation, modulus, etc.

@stevengj
Copy link
Member

I think the generic coercing version should be modulus (for converting a wider integer to a narrower integer) or bitwise copy (for signed to unsigned of the same width). i.e. "do an exact bitwise copy of as many bits as you can, starting from low-order bits", ala C.

I think functions like iround and itrunc are well-defined mathematical operations that should throw an InexactError if the result cannot be exactly stored in the requested type. It seems to me that truncated coercion is the sort of thing that would be rarely needed; it's not clear to me why we need to provide this function in Base.

@StefanKarpinski
Copy link
Member Author

I think the generic coercing version should be modulus (for converting a wider integer to a narrower integer) or bitwise copy (for signed to unsigned of the same width). i.e. "do an exact bitwise copy of as many bits as you can, starting from low-order bits", ala C.

This would seem to indicate that you support the behavior of my modpocalypse branch, i.e. the generic function that's used for "truncating" integer types being mod. Note that the bitwise copy between twos complement signed type and an unsigned type of the same size is the same as a modulus operation – abstractly you're lifting each value to it's equivalent class and then choosing a different set of representatives for those same equivalence classes.

It seems to me that truncated coercion is the sort of thing that would be rarely needed; it's not clear to me why we need to provide this function in Base.

I agree that one generally does not want both behaviors at the same time, but it is a single operation that has both kinds of truncation as limiting cases, thereby retroactively justifying the use of itrunc.

@stevengj
Copy link
Member

What is the use-case for truncated coercion that would not be better served by i > 255 ? foo : bar?

@JeffBezanson
Copy link
Member

The problem with using "lowercase" functions for coercion is that sometimes you have the type; i.e. you know you want to truncate an integer down to type T. In our code, this tends to be used for bit-level kinds of functions like hashing and random number generation.

@vtjnash
Copy link
Member

vtjnash commented Oct 16, 2014

Spitballing slightly here, but what if we rolled this into an Base.Unsafe.Convert function (which defaults to calling Base.convert). In the user-defined bounds check PR I outlined this idea some. I can expand more when I get to a computer if this sounds good to someone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants