Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unable to infer expression type when adding u32 to if-else number literals #137

Open
thejoshwolfe opened this issue Apr 12, 2016 · 17 comments
Labels
enhancement Solving this issue will likely involve adding new logic or components to the codebase. frontend Tokenization, parsing, AstGen, Sema, and Liveness.
Milestone

Comments

@thejoshwolfe
Copy link
Contributor

thejoshwolfe commented Apr 12, 2016

pub fn foo(b: bool) u32 {
    return u32(0) + if (b) 0 else 0;
}
comptime {
    _ = foo;
}
test.zig:2:28: error: cannot store runtime value in type 'comptime_int'
    return u32(0) + if (b) 0 else 0;
                           ^
@thejoshwolfe thejoshwolfe added the enhancement Solving this issue will likely involve adding new logic or components to the codebase. label Apr 12, 2016
thejoshwolfe added a commit that referenced this issue Apr 12, 2016
@andrewrk
Copy link
Member

Also this should work:

const x = u32(switch (foo) {
    3 => 1,
    4 => 2,
    else => 5,
});

@pavildon
Copy link
Contributor

pavildon commented Sep 26, 2016

i had some problems when defining consts and then using them to make values for variables :
something like this would fail:

const TINY_QUANTUM_SHIFT = 4;
const TINY_QUANTUM_SIZE = 1 << TINY_QUANTUM_SHIFT;

var block_aligned_stuff: usize = (@sizeOf(variable) + TINY_QUANTUM_SIZE) & ~(TINY_QUANTUM_SIZE -1 );

i resolved this casting the const to the appropriate type every time i need to use them.

in the code example here, i think the compiler does not know what type to use for TINY_QUANTUM_SIZE, so it cant do the negation and the and operation on 32 nor 64 bits, same with any other bitwise operator, I would be a good idea to define literals as java/c++ does :

const sizevalue = 3123UL; // to be used as a unsigned long
const floatvalue = 1.0f;  // to be used as a float
const doublevalue = 1.0; // to be used as a double

// etc..

@andrewrk
Copy link
Member

The Zig equivalent of 3123UL is usize(3123). You can do either of these:

const size_value = usize(3123); // option 1
const size_value: usize = 3123; // option 2

I just tested your code example @Aindigo and it crashed the compiler, so I'll take a look at this. (See #201)

@andrewrk
Copy link
Member

@thejoshwolfe I don't currently have an elegant way to solve this problem. Do you know how it should work? Currently there is no concept of "expected type". Instead there is implicit casting (passing null to ?i32 parameter) and peer type resolution (a + b where a and b are i16 and i32).

fn foo(b: bool) {
    const a = if (b) 1 else 2;
    bar(a)
}

What should the above code do?

@dkopko
Copy link

dkopko commented Jan 17, 2017

My guess would be that the type of a should be the narrowest signed type able to satisfy storage needs of both clauses of the if/else. If bar(a) then accepts a wider type, a can be extended. So literals can be parsed under the widest type but then narrowed in terms of the type they are then given.

@andrewrk
Copy link
Member

My guess would be that the type of a should be the narrowest signed type able to satisfy storage needs of both clauses of the if/else.

Consider this code:

fn foo(condition: bool) {
    const a = if(condition) 50 else 100; // narrowest signed type is i8
    const b = a + 100; // crashes the program due to integer overflow
}

This seems like it would be really easy to accidentally cause a crash (or undefined behavior in release mode). Also, I'm considering adding integers with arbitrary bit width, so then it would make the choice kind of weird. If the numbers are 0 and 1, do we choose only 1 bit?

@thejoshwolfe
Copy link
Contributor Author

Despite the const in @andrewrk's example, a is actually a runtime variable; it's just restricted to single assignment. I think an explicit type should be required somewhere in const a = if (runtimeThing) 1 else 0; just like it is for var a = 0; if (runtimeThing) { a = 1; }. This would be consistent with zig as it is today.

Unfortunately, the proposal in the above paragraph would mean that the expression if (runtimeThing) 1 else 0 needs an explicit type too, because that expression has a runtime-known value. I don't like this idea, because it seems so silly to explicitly put a type on 0 and 1.

Java has a solution to this dilemma, which is assume number literals are s32. this works great for casual math, but breaks down when you deal with arithmetic that would overflow s32 in subtle ways, and it's also pretty silly when you try to pass a 0 as a byte argument: you get a "int not implicitly castable to byte" compile error unless you explicitly cast (byte)0. (maybe java 7 or 8 fixes this; I haven't been keeping up with the latest java.)

So here's a different proposal. Similar to the integer literal type that zig currently uses at compile time for constants like const a = 300;, we also have a "integer that isn't known at compile time, but its bounds are known at compile time".

  • The type of if (runtimeThing) 1 else 0 would be "integer between 0 and 1".
  • The type of if (runtimeThing) -1 else 0 would be "integer between -1 and 0".
  • The type of (if (a) 50 else 100) + (if (b) -25 else 100) would be "integer between 25 and 200".
  • The type of (if (a) -1000 else -1001) + 1001 would be "integer between 0 and 1".

Note that despite the presence of negative numbers in an expression, the concept of signed or unsigned in the inputs does not infect the output. When it comes time to implicitly cast one of these bounded integers to a runtime type, like u8, then the rule is simply that both the bounds have to fit into the destination type. For peer type resolution with a runtime type (like u32), the bounded integer must implicitly cast to the peer type.

A suggestion for implementing this is that the current concept of number literals can be implemented as "integer between 15 and 15". This would fit nicely into a case where you multiply a bounded integer by 0.

I tried to write down the algorithms for how the (min,max) would be calculated for various operators, but the multiplicative ones are pretty tricky. Worst case for any operator is simply calculate all 4 possible values and pick the min and max of the set. This entire proposal would only work on the assumption that more extreme inputs result in more extreme outputs for all operators, which I believe is true as long as overflow is not allowed. We may want to carefully study every single operator individually to make sure this is true.

The above describes how the compiler would track the bounds of the values, but what about the generated runtime code? All operators need types at runtime. We can prevent overflow by enforcing that the inputs and output of every intermediate operation fit into some runtime type, and then using appropriate operators and type casts to make it all work. I'm not super confident about this, but I think it will work. One implication of this is that bounded integers are not allowed to be arbitrary precision; they all need to work at runtime. This might be an argument for not implementing number literals as bounded integers with equal upper and lower bounds.

This proposal cannot accurately predict when the runtime will do integer division by 0. But as long as that's undefined behavior, we shouldn't have any problem with an inability to detect that. (Of course, we could still detect if you would definitely divide by 0, just like we can now.)

@dkopko
Copy link

dkopko commented Jan 23, 2017

Asked to put this here by Andrew... Examples of possibly troublesome type unifications.

function f(runtime_x: int8_t) -> ??? {
    var a = if (runtime_x > 5) then 0 else 255;  //typeof(a) can unify to a uint8_t
    var b = if (runtime_x == 9) then -2 else 8;  //typeof(b) can unify to an int8_t
    var c = if (runtime_x < 5) then 0 else 10;  //would typeof(c) unify to a uint8_t or an int8_t?
    var d = if (runtime_x < 5) then 0 else 250 + runtime_x;  //would typeof(d) unify to a uint8_t or an int16_t/uint16_t? (i.e. is it smart enough to recognize the constraint met by the if clause?)
    var e = if (runtime_x < 30) then runtime_x else 250 + (runtime_x % 5);  //would typeof(e) unify to a uint8_t or an int16_t/uint16_t?  (i.e. is it smart enough to recognize the constraint met by the modulo expression?)

    return a;  //or b, or c, or d, or e
}

@andrewrk andrewrk added this to the 0.2.0 milestone Apr 23, 2017
andrewrk referenced this issue in tiehuis/zig-bn May 25, 2017
 - bitlen
 - abs
 - neg
 - isZero
 - sign
 - popcount
@andrewrk
Copy link
Member

andrewrk commented Nov 8, 2017

I want to note that rust has not solved this problem, because their number literals are of type i32:

fn thing(x: i32) {
    let y = 2147483646 + match x {
        3 => 1,
        4 => 2,
        _ => 3
    };
    println!("{}", y);
}

fn main() {
    thing(1);
}
andy@xps ~/tmp> rustc test.rs
andy@xps ~/tmp> ./test
thread 'main' panicked at 'attempt to add with overflow', test.rs:2:13
note: Run with `RUST_BACKTRACE=1` for a backtrace.

Zig at least catches this with a compile error.

@markfirmware
Copy link
Contributor

markfirmware commented Apr 20, 2019

if (c) ‘a’ else ‘b’

does not resolve

if (c) @intCast(u8, ‘a’) else ‘b’

does

I think I was stuck on assuming a character literal was u8 since a string literal is [*]const u8.

error: values of type ‘comptime_int’ must be comptime known

Might be better worded that the type must be known. The value is obviously known since it is literal.

FYI the elm programming language is exemplary in compiler error messages that provide guidance.

@hryx
Copy link
Contributor

hryx commented Jul 7, 2019

Related: #2749

@Tetralux
Copy link
Contributor

Related:

var i: isize = 4; // adjust
var x: E = if(i < 0) .A else .B

where E is an enum.

@andrewrk andrewrk modified the milestones: 0.5.0, 0.6.0 Sep 5, 2019
@andrewrk andrewrk added the stage1 The process of building from source via WebAssembly and the C backend. label Nov 27, 2019
@andrewrk andrewrk modified the milestones: 0.6.0, 0.7.0 Dec 9, 2019
@Mouvedia
Copy link

Mouvedia commented Sep 30, 2020

I think I was stuck on assuming a character literal was u8 since a string literal is [*]const u8.

I presumed the same as well; you are not alone. This is unexpected and not consistent.

@andrewrk andrewrk added frontend Tokenization, parsing, AstGen, Sema, and Liveness. and removed stage1 The process of building from source via WebAssembly and the C backend. labels Oct 9, 2020
@andrewrk andrewrk modified the milestones: 0.7.0, 0.8.0 Oct 9, 2020
@andrewrk andrewrk modified the milestones: 0.8.0, 0.9.0 Jun 4, 2021
@acarrico
Copy link
Contributor

acarrico commented Sep 2, 2021

Yikes, a 5+ year old issue! Doesn't every newcomer run into this on the first day?

My recommendations for addressing this issue in priority order:

  1. Please issue a better error message.
  2. Please add a syntax for type annotation.
  3. Please fix the type checker.

The following code "day one" code breaks in Zig:

const oflags : u32 =
                std.os.O_CLOEXEC
                | std.os.O_NOATIME
                | (if (flags.read_only) std.os.O_RDONLY else (std.os.O_RDWR | std.os.O_CREAT));

The current error, cannot store runtime value in type 'comptime__int', makes no sense: the error occurs inside a pure expression, there is no storage. Unfortunately, the most likely naive interpretation is that this error message refers to the initialization of the constant (conflating 'comptime' with 'const', and 'storage' with 'initilization'). This creates a garden path which misinforms beginners that, "constants can't be initialized with runtime expressions" (untrue). In fact Zig does support such expressions, the real issue is that the compiler (currently) requires a type annotation on some literals with unresolved types:

const oflags : u32 =
                std.os.O_CLOEXEC
                | std.os.O_NOATIME
                | (if (flags.read_only) @intCast(u32, std.os.O_RDONLY) else  (std.os.O_RDWR | std.os.O_CREAT));

This error is surprising, since the programmer understands that every leaf in the expression tree is a comptime__int, and every node in the expression legally combines comptime_ints.

Even a slight variations such as annotating the if expression will currently fail. Apparently two comptime_ints just can't pass through a conditional. This just seems like a bug in the type checker (if should remain polymorphic on comptime_int). While this issue remains, the error message should be improved:

  • Don't mention storage in a pure computation--this is very confusing.

  • Do indicate that a type annotation is sometimes necessary on literals--the current compiler even has a little green arrow pointing to exactly where the annotation is needed.

Finally, it should be possible work around this issue without a cast, but I could not find a syntax for type annotation in the Zig docs. Something like this:

const oflags : u32 =
                std.os.O_CLOEXEC
                | std.os.O_NOATIME
                | (if (flags.read_only) (std.os.O_RDONLY : u32) else  (std.os.O_RDWR | std.os.O_CREAT));

Type annotations can be a very useful sanity checks for a programmer, even when they are not explicitly required by the compiler.

Sorry for the long message. Thanks for all the hard work! I'm enjoying exploring Zig.

@leecannon
Copy link
Contributor

Regarding "Finally, it should be possible work around this issue without a cast, but I could not find a syntax for type annotation in the Zig docs.".

@as allows this:

const oflags =
    std.os.O_CLOEXEC |
    std.os.O_NOATIME |
    if (flags.read_only) @as(u32, std.os.O_RDONLY) else (std.os.O_RDWR | std.os.O_CREAT);

@acarrico
Copy link
Contributor

acarrico commented Sep 2, 2021

Good. In the semantics of Zig, I was wrong to say literals are unresolved types (although maybe they should be), so a cast does makes sense, and @as is certainly better than @intCast. Thank you for the helpful comment.

Type annotation would be a useful tool as an assertion to make sure the programmer and the type checker agree on what is going on without casting, but that is a separate issue.

@g-w1
Copy link
Contributor

g-w1 commented Sep 2, 2021

@as is basically a type annotation. It is not a cast, it will only coerce things that can coerce normally.
unresolved literal ints are of type comptime_int

@andrewrk andrewrk modified the milestones: 0.9.0, 0.10.0 Nov 21, 2021
@andrewrk andrewrk modified the milestones: 0.10.0, 0.11.0 Jun 28, 2022
@andrewrk andrewrk modified the milestones: 0.11.0, 0.12.0 Apr 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Solving this issue will likely involve adding new logic or components to the codebase. frontend Tokenization, parsing, AstGen, Sema, and Liveness.
Projects
None yet
Development

No branches or pull requests