Proposal: Allow TypeOf to be used immediately after a name of a parameter is declared #6615

Closed
Rocknest opened this issue Oct 8, 2020 · 29 comments
@Rocknest
Contributor

Rocknest commented Oct 8, 2020

Currently, a function parameter only becomes visible to @TypeOf once the next parameter's scope starts.

fn example1(x: @TypeOf(x)) void {}
// error: use of undeclared identifier 'x'

fn example2(x: anytype, y: @TypeOf(x)) void {}
// works; when used as `example2(arg, arg)` it is equivalent to example1

I propose allowing @TypeOf to be used as soon as a parameter's name has been declared. Is it useful? Yes: it would allow type constraints to be placed in the function signature instead of the function body. This is basically #1669 without the new syntax.

// status quo
fn isWriter(comptime T: type) bool {...}

fn write1(w: anytype, data: []const u8) void {
    if (!comptime isWriter(@TypeOf(w))) @compileError("wrong type");
    w.write(data);
}

// with this proposal
fn Constraint(comptime T: type, predicate: fn (type) bool) type {
    if (!comptime predicate(T)) @compileError("wrong type");
    return T;
}

fn write2(w: Constraint(@TypeOf(w), isWriter), data: []const u8) void {
    w.write(data);
}

What happens when I provide a type that differs from the type @TypeOf returned? The compiler checks whether the argument's value is coercible to the new type; if it is not, that is a compile error.
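The coercion rule described here can be approximated in status quo Zig by binding the argument to a local of the constrained type. The following is only a sketch under assumptions: `isWriter` is given a hypothetical body (it merely checks for a `write` declaration, and `@hasDecl` only accepts container types), `Counter` is an illustrative type that does not appear in the proposal, and the `comptime` qualifier on `predicate` is added so the snippet does not depend on how a particular compiler version treats function-typed parameters.

```zig
const std = @import("std");

// Hypothetical predicate: treat any container type declaring `write` as a writer.
fn isWriter(comptime T: type) bool {
    return @hasDecl(T, "write");
}

// Returns T unchanged if the predicate accepts it, otherwise fails compilation.
fn Constraint(comptime T: type, comptime predicate: fn (type) bool) type {
    if (!predicate(T)) @compileError("wrong type");
    return T;
}

// Emulates `w: Constraint(@TypeOf(w), isWriter)` from the proposal:
// the caller's value must coerce to whatever type Constraint returns.
fn write2(_w: anytype, data: []const u8) void {
    const w: Constraint(@TypeOf(_w), isWriter) = _w;
    w.write(data);
}

// Illustrative writer (an assumption, not from the thread): counts bytes written.
const Counter = struct {
    total: *usize,

    pub fn write(self: Counter, data: []const u8) void {
        self.total.* += data.len;
    }
};

test "write2 accepts a type that satisfies the constraint" {
    var total: usize = 0;
    write2(Counter{ .total = &total }, "abc");
    std.debug.assert(total == 3);
}
```

Passing a struct without a `write` declaration should stop compilation at the `Constraint` call rather than somewhere inside the body of `write2`.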

Optional Part

If this proposal is implemented, then these two declarations will be equivalent:

fn example1(x: @TypeOf(x)) void {}
fn example2(x: anytype) void {}

This means we could remove the keyword 'anytype'. x: @TypeOf(x) looks a bit verbose to me; see #1669 (comment) for an alternative builtin, @Infer(). In short, instead of overloading @TypeOf, create a new builtin that implements the proposed functionality: x: @Infer(), x: Constraint(@Infer(), isWriter).

@marler8997
Contributor

This is a pretty interesting idea. It took me a few minutes to see what you were getting at.

So this would mean that when the compiler is evaluating each parameter type, if the symbol for that parameter appears in that type expression, then it resolves to the value passed in for that parameter by the caller. So if I have

fn foo(x: @TypeOf(x)) void { }

foo("hello");
foo(123);

On each of the calls above, @TypeOf(x) would evaluate to @TypeOf("hello") and @TypeOf(123), respectively.

This also adds a new semantic feature: the ability to normalize parameter types. For example, @TypeOf("hello") would evaluate to something like *const [5:0]u8, but we could normalize it to a string slice:

fn NormalizedString(comptime T: type) type {
   // if a pointer to an array, return the slice type
   ...
}
fn foo(x: NormalizedString(@TypeOf(x))) void { }

Wow this is a really interesting idea!

@Rocknest
Contributor Author

Rocknest commented Oct 8, 2020

On each of the calls above, @TypeOf(x) would evaluate to @TypeOf("hello") and @TypeOf(123), respectively.

Exactly. That would make type constraints (aka concepts), arbitrary type manipulation, and more possible. For example, a function that widens the type of a generic parameter:

fn WidenIntOrFloat(comptime T: type) type {
    if (T == u8 or T == u16 or T == u32 or T == u64) {
        return u64;
    } else if (T == i8 or T == i16 or T == i32 or T == i64) {
        return i64;
    } else if (T == f32 or T == f64) {
        return f64;
    }
    @compileError("unsupported type");
}

fn example(x: WidenIntOrFloat(@TypeOf(x))) void {}

// const a: u8 = 5; example(a); calls example(u64)
// const b: u32 = 55; example(b); also calls example(u64)
// const f: f32 = 1.5; example(f); calls example(f64)

@codehz
Contributor

codehz commented Oct 9, 2020

@Rocknest I disagree with the second idea; it puts the parameter into an undecidable state (for both humans and the compiler).

Consider the following code:

fn Evil(comptime T: type) type {
    return struct { // or anything that will create new type
        const I = (if (@hasDecl(T, "I")) T.I else 0) + 1;
    };
}
fn problem(arg: Evil(@TypeOf(arg))) void {
    @compileLog(@TypeOf(arg).I); // how to solve the equation 
}
comptime {
    problem(.{});
}

Although it is theoretically possible to report errors early by limiting the evaluation depth.

Update: use .{} instead of struct { const I = 0; }

@daurnimator
Contributor

I think this is a great idea that solves #1669 in an intuitive way.

@daurnimator
Contributor

@codehz I'm not sure where the issue is with your snippet; comptime calls are cached if that's what you're worried about?

@Rocknest
Contributor Author

Rocknest commented Oct 9, 2020

@codehz Where is the problem exactly? Given your code:

fn Evil(comptime T: type) type {                               // line 1
    return struct {  const I = T.I + 1;  };                    // 2
}
fn problem(arg: Evil(@TypeOf(arg))) void {                     // 4
    @compileLog(@TypeOf(arg).I);                               // 5
}

problem(struct { const I = 0; }{});                            // 8

Let's see how it is executed at comptime, step by step:

#1 problem(struct { const I = 0; }{})
   | types[1] = struct(line=8, I=0)
   | call("problem", (type=1, value={}))

#2 fn problem(arg: Evil(@TypeOf(arg))) void {
   | args[1] = (type=1, value={})
   | var[1] = TypeOf(args[1]) // (type=type, value=1)
   | var[2] = call("Evil", (var[1]))

#3 return struct { const I = T.I + 1; };
   | args[1] = (type=type, value=1)
   | var[1] = (types[args[1].value]) // struct{line=8, I=0}
   | var[2] = (var[1].I + 1) // 0+1
   | types[2] = struct(line=2, I=(var[2]))
   | return: (type=type, value=2)

#4 fn problem(arg: Evil(@TypeOf(arg))) void {
   // continue #2
   | var[2] = (type=type, value=2)
   | var[3] = (types[result.value]) // struct{line=2, I=1} 
   | var[4] = (types[args[2].type]) // struct{line=8, I=0}
   | var[5] = canCoerce(var[3], var[4]) // false
   | if (var[5] == false) compileError(msg="expected type 'struct:2', found 'struct:8'")

As you can see, control flow will not even reach the @compileLog statement. A compile error would be emitted:

error: expected type 'struct:2', found 'struct:8'
| problem(struct { const I = 0; }{});  
|         ^
note: type defined here
| fn problem(arg: Evil(@TypeOf(arg))) void {
|                 ^

@Rocknest
Contributor Author

Rocknest commented Oct 9, 2020

Simulated in status quo Zig: https://godbolt.org/z/6KKP47
As you can see no problem here, a nice compile error is emitted.

fn Evil(comptime T: type) type {         
    return struct { const I = T.I + 1; }; 
}
fn problem(_arg: anytype, arg: Evil(@TypeOf(_arg))) void {
    @compileLog(@TypeOf(arg).I);                               
}
comptime {
    const a = struct { const I = 0; }{};
    problem(a, a);
}

@codehz
Contributor

codehz commented Oct 9, 2020

fn Evil(comptime T: type) type {         
    return struct { const I = (if (@hasDecl(T, "I")) T.I else 0) + 1; }; 
}
fn problem(_arg: anytype, arg1: Evil(@TypeOf(_arg)), arg2: Evil(@TypeOf(arg1)), arg3: Evil(@TypeOf(arg2))) void {
    @compileLog(@TypeOf(arg1).I); // 1
    @compileLog(@TypeOf(arg2).I); // 2
    @compileLog(@TypeOf(arg3).I); // 3
}
comptime {
    problem(.{}, .{}, .{}, .{});
}

https://godbolt.org/z/P7a5af
@Rocknest fixed, and the original post has been updated.

@kristoff-it
Member

I find this syntax to be a bit confusing. While the concept is legitimate, it's an edge case that requires a leap in understanding compared to other uses of @TypeOf(). With anytype you still have to take a moment, the first time you encounter it, to think about what exactly it does, but at least you don't have to fight any "weird" tautological syntax while doing so.

I'm not sure getting rid of a keyword is reward enough for the price of "hiding" an otherwise very friendly and convenient feature.

@tadeokondrak
Contributor

I think solving the issues this solves is great, but this syntax is pretty hard to understand and explain.
The rule that when the parameter name is mentioned in the type expression the function becomes a generic function is pretty subtle.

@Rocknest
Contributor Author

Rocknest commented Oct 9, 2020

@kristoff-it it is not a goal to remove the keyword, the idea is to give more power to userland so features like type constraints, type manipulation etc can be implemented. I think anytype is not much less weird and confusing, but it's a more familiar concept because it exists in other languages.

@tadeokondrak it does not have to be @TypeOf, it can be a new builtin with a more descriptive name, see the last paragraph of the proposal.

@kristoff-it
Member

it is not a goal to remove the keyword, the idea is to give more power to userland so features like type constraints, type manipulation etc can be implemented.

@Rocknest for type constraints we can already implement everything (simply by putting the type checking at the beginning of the function); what we would get from #1669 (or this proposal) is basically only convenience, I think. What do you mean by "type manipulation"?

@Rocknest
Contributor Author

Rocknest commented Oct 9, 2020

@kristoff-it yes, we can do that in status quo Zig, but it is not part of the function prototype, and most of the time users do not bother to add checks and if something is not used properly you get a compile error somewhere deep in the code. It's a convenience that would encourage writing more expressive, type-safe and fail-fast code. For a basic example of "type manipulation" see #6615 (comment)

@michal-z
Contributor

michal-z commented Oct 9, 2020

Nice idea but I agree with @kristoff-it - this feature just adds a bit more convenience by introducing a new way of doing the same thing. I think that type checking at the beginning of the function is perfectly fine.

Actually, I find the version below easier to understand.

fn WidenIntOrFloat(comptime T: type) type {
    if (T == u8 or T == u16 or T == u32 or T == u64) {
        return u64;
    } else if (T == i8 or T == i16 or T == i32 or T == i64) {
        return i64;
    } else if (T == f32 or T == f64) {
        return f64;
    }
    @compileError("unsupported type");
}

fn example(x: anytype) void {
  _ = WidenIntOrFloat(@TypeOf(x)); // the returned type is not used, so it must be discarded explicitly
}

@ikskuh
Contributor

ikskuh commented Oct 9, 2020

Nice idea but I agree with @kristoff-it - this feature just adds a bit more convenience by introducing a new way of doing the same thing. I think that type checking at the beginning of the function is perfectly fine.

I don't think so. I really like the idea, as it allows reducing code bloat. Consider the std.fmt namespace which currently explodes into myriads of functions, each of which does a pretty similar job.

With this proposal, the explosion of types can be reduced a lot:

fn AnyConstSlice(comptime T: type) type {
    return []const std.meta.Child(T);
}

fn car(slice: AnyConstSlice(@TypeOf(slice))) std.meta.Child(@TypeOf(slice)) {
    return slice[0];
}

fn cdr(slice: AnyConstSlice(@TypeOf(slice))) @TypeOf(slice) {
    return slice[1..];
}

test "explosion"
{
    var first_0 = car("1");
    var first_1 = car("12");
    var first_2 = car("123");
    var first_3 = car("1234");
}

This would only instantiate one car function (which is fn car([]const u8) u8) instead of four functions:

  • fn car(*const [1]u8) u8
  • fn car(*const [2]u8) u8
  • fn car(*const [3]u8) u8
  • fn car(*const [4]u8) u8

But it's obvious in this example also that @TypeOf(x) isn't the best syntax and maybe something like @Inferred() is better here:

fn car(slice: AnyConstSlice(@Inferred(slice))) std.meta.Child(@TypeOf(slice)) {
    return slice[0];
}

or even more restrictive:

fn car(slice: AnyConstSlice(@Inferred())) std.meta.Child(@TypeOf(slice)) {
    return slice[0];
}

To me, it looks like this proposal is better than #1669, as it solves the same problem, but also allows code optimizations as the ones above.

@michal-z
Contributor

michal-z commented Oct 9, 2020

Good point. One more concern with readability.

fn example(x: anytype) void {
  Condition1(@TypeOf(x));
  Condition2(@TypeOf(x));
}

fn example(x: Condition2(Condition1(@TypeOf(x)))) void {
}

Combining Condition1 and Condition2 into one function is not always wanted because we can get many functions to cover all the cases.

@LemonBoy
Contributor

LemonBoy commented Oct 9, 2020

most of the time users do not bother to add checks and if something is not used properly you get a compile error somewhere deep in the code

Preach brother, too many code paths in the stdlib take the yolo approach and crash and burn if the constraints are not satisfied.

From the syntax pov I prefer the anytype(x) syntax as it complements the bare anytype. While x: @TypeOf(x) makes sense (if you squint hard enough) it introduces a weird special-case in the parameter scoping rules that may be hard to explain and justify.

@codehz
Contributor

codehz commented Oct 9, 2020

The recursive TypeOf actually creates an equation, and the compiler needs to solve that equation before it can instantiate the correct type. So I think @Inferred() is a better idea: it can be limited to only one level.

Comparison

  1. recursive TypeOf
    a: Evil(@TypeOf(a)) actually represents the equation A = Evil(A), and the compiler has to find the fixed point of the Evil function...
  2. But if we introduce Inferred as "compiler magic", it can be limited to evaluation at the top level only, so
    a: Evil(@Inferred()) => <to be decided>: Evil(@TypeOf(<initial input type>)). The initial input type and
    the to-be-decided type can be completely different types, so there is no equation anymore.

@ikskuh
Contributor

ikskuh commented Oct 9, 2020

I don't think the proposed application of @TypeOf here is recursive, but uses a once applied action to further refine or reject the type passed to a function. If I understood @Rocknest correctly, the proposal is equivalent to this:

fn proposed(x: Foo(@TypeOf(x))) void { }
fn current(_x: anytype) void {
    const x: Foo(@TypeOf(_x)) = _x;
}

And is thus not even a semantic change, but only a syntactic transformation (in theory), which can further be used to reduce the number of function instantiations.

@codehz
Contributor

codehz commented Oct 9, 2020

@MasterQ32 but @TypeOf(x) may not be equal to Foo(@TypeOf(x)) (see WidenIntOrFloat). If we let @TypeOf(x) = Foo(@TypeOf(x)), then we need to re-evaluate Foo again, or @TypeOf(x) becomes ambiguous: two versions of x (before Foo, after Foo) and two versions of @TypeOf(x), which is really bad for humans.
The only solution is to make @TypeOf behave differently in "parameter" context and "normal" context. I don't think that is a good idea. And what about (a: Evil(@TypeOf(b)), b: Evil(@TypeOf(a))) (Evil will try to generate a new type each time)?

@ikskuh
Contributor

ikskuh commented Oct 9, 2020

The only solution is to make @TypeOf behave differently in "parameter" context and "normal" context. I don't think that is a good idea. And what about (a: Evil(@TypeOf(b)), b: Evil(@TypeOf(a))) (Evil will try to generate a new type each time)?

That's what this proposal proposes. I also find the double naming bad (as you can read above), and proposed that we use some other builtin for this kind of type inference.

@Rocknest
Contributor Author

Rocknest commented Oct 9, 2020

@codehz there is no recursion or any possibility of circular dependency, since parameters are evaluated from left to right. Even if a parameter can reference itself, it cannot reference the next parameter, so a: Evil(@TypeOf(b)), b: Evil(@TypeOf(a)) is a compile error.

Also, I don't see a problem in the example x: Func(@TypeOf(x)), y: @TypeOf(x). Yes, the TypeOf calls can potentially return different types, but the first call to TypeOf is isolated in the scope of x.

@jayschwa jayschwa added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Oct 9, 2020
@jayschwa jayschwa added this to the 0.8.0 milestone Oct 9, 2020
@Rocknest
Contributor Author

Rocknest commented Oct 9, 2020

There is also another way to transform this code into status quo Zig:

fn call(x: Func(@TypeOf(x)), y: @TypeOf(x)) void {}
call("abc123", "xyz");

As you can see there is only one scope, no new compiler magic required, and @TypeOf evaluates to different types! https://godbolt.org/z/43Yznc

const P = struct { val: anytype };

fn p(val: anytype) P {
    return .{ .val = val };
}

fn Func(comptime T: type) type {
    return []const u8;
}

comptime {
    var param_x = p("abc123");
    @compileLog(@TypeOf(param_x.val)); // | *const [6:0]u8
    param_x = p(@as(Func(@TypeOf(param_x.val)), param_x.val));
    @compileLog(@TypeOf(param_x.val)); // | []const u8
    var param_y = p("xyz");
    @compileLog(@TypeOf(param_y.val)); // | *const [3:0]u8
    param_y = p(@as(@TypeOf(param_x.val), param_y.val));
    @compileLog(@TypeOf(param_y.val)); // | []const u8
}

By the way, without the indirection through function p I get a different result. This is probably a bug in the compiler: it wrongly assumes that some expression is constant and caches it.

@marler8997
Contributor

marler8997 commented Oct 9, 2020

@Rocknest looking at your last example:

fn foo(x: Func(@TypeOf(x)), y: @TypeOf(x)) void {}

I think this example demonstrates the "weirdness" of @TypeOf(x) meaning 2 different things, depending on if it appears in the parameter type for x or another parameter type.

This gives us reason to use a different syntax and/or builtin to refer to the "caller's version of x" and the "function's version of x". anytype(x) was one suggestion; @Infer() and @Inferred() were others. In plain English, this value refers to the "caller value", so maybe something like @CallType() would also make sense? That would mean writing it this way:

fn foo(x: Func(@CallType()), y: @TypeOf(x)) void {}

Note that this builtin doesn't take any arguments. If we can think of use cases, we may want to access another parameter's call type, or multiple call types in a single expression. So we could have it take an argument as well, like this:

fn foo(x: @CallType(x), y: CheckAndTransform(@CallType(x), @CallType(y))) void { }

EDIT: since we can use capitalization to indicate it is a type, maybe @Call(x) is all we need

fn foo(x: @Call(x), y: CheckAndTransform(@Call(x), @Call(y))) void { }

@Rocknest
Contributor Author

Rocknest commented Oct 9, 2020

@marler8997 well, it depends on how you look at it. If we get rid of the 'parameter type' scope and transform function prototype evaluation into a comptime block, you will see that the weird behaviour is preserved, and that TypeOf is not the source of it; anytype is. Therefore it is perfectly reasonable not to merge these two concepts, and instead create an Infer/Inferred/Call/Arg/Param builtin (keep in mind that there is another place where it could be used: struct { f: anytype } = struct { f: @TypeOf(f) }).

@Rocknest
Contributor Author

Rocknest commented Oct 9, 2020

@marler8997 let's consider how your example fn foo(x: @Call(x)) void {} would be evaluated.
foo("123") -> fn foo(x: @Call("123")) void {}, so what would prevent me from using TypeOf here? I think it doesn't solve anything.

Edit: The way I framed this proposal is probably wrong. The question here is: what happens when a parameter is referenced inside the scope of its own type evaluation?
foo(x: { @compileLog(x); unreachable; })
In status quo its: error: use of undeclared identifier
With this proposal: | "abc" error: reached unreachable code

Another solution could be to keep the existing "use of undeclared identifier" error and introduce a new builtin that does not reference parameters by name:
foo(x: @TypeOfArg(0)), for example foo("abc") -> foo(x: *const [3:0]u8)
or more generic (but verbose)
foo(x: @TypeOf(@valueOfArg(0))), foo("abc") -> foo(x: @TypeOf("abc")) -> foo(x: *const [3:0]u8)

Edit2: structs also need to be solved:
struct { f: @TypeOf(@valueOfInit(0)) } { "123" } -> struct { f: @TypeOf("123") } { "123" }

@marler8997
Contributor

marler8997 commented Oct 10, 2020

@Rocknest, @Call(x) is supposed to be equivalent to @TypeOf(x) in your original proposal.

@marler8997
Contributor

marler8997 commented Oct 10, 2020

Another solution could be to keep the existing "use of undeclared identifier" error and introduce a new builtin that does not reference parameters by name:
foo(x: @TypeOfArg(0)), for example foo("abc") -> foo(x: *const [3:0]u8)

Yeah I had similar thoughts. You could also reference the parameter by name using a string:

fn foo(x: @Call("x"))

Depending on how builtin calls are analyzed, this might make the implementation simpler; otherwise there might have to be custom argument evaluation for this particular @Call builtin function to elide symbol resolution.

Another alternative to allow us to use the symbols without special casing for the arguments in this case would be to implement this at the syntax level, i.e.

fn foo(x: calltype(x))

In any case, there's a million ways to skin this cat. I think among everyone's comments we've enumerated a lot of them. Hopefully we'll land on one people can mostly agree with, since I think the feature at its core would be quite useful.

@SpexGuy
Contributor

SpexGuy commented Dec 6, 2020

Discussed this issue with @andrewrk and @marler8997.

This proposal grants two new abilities:

  1. Ability to put type validation for a function in the parameter list (similar to Proposal: User definable type constraints on polymorphic parameters #1669)
  2. Ability to modify types received by a function, in order to force coercion

This proposal has clear ergonomic benefits for writing generic code. However, everything that this proposal introduces is already possible by writing two functions - one which accepts anytype and performs coercion and validation, and another that receives the modified parameters. So this is purely an ease of use feature.
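For illustration, the two-function pattern mentioned here might look roughly like the sketch below. It is not code from the maintainers: `exampleImpl` is a hypothetical name, `WidenIntOrFloat` is adapted from the earlier comment in this thread, and returning the widened value is only there so the effect can be observed.

```zig
const std = @import("std");

// Adapted from the earlier comment: maps a numeric type to its widest sibling.
fn WidenIntOrFloat(comptime T: type) type {
    if (T == u8 or T == u16 or T == u32 or T == u64) return u64;
    if (T == i8 or T == i16 or T == i32 or T == i64) return i64;
    if (T == f32 or T == f64) return f64;
    @compileError("unsupported type");
}

// Hypothetical implementation function: only ever sees the widened type,
// so it is instantiated at most once per widened type (u64, i64, f64).
fn exampleImpl(comptime T: type, x: T) T {
    return x;
}

// Thin generic wrapper: validates the caller's type, coerces the value,
// and forwards to the implementation.
pub fn example(x: anytype) WidenIntOrFloat(@TypeOf(x)) {
    const Wide = WidenIntOrFloat(@TypeOf(x));
    return exampleImpl(Wide, x); // x coerces to Wide at the call
}

test "the wrapper widens before calling the implementation" {
    const a: u8 = 5;
    const f: f32 = 1.5;
    std.debug.assert(@TypeOf(example(a)) == u64);
    std.debug.assert(example(a) == 5);
    std.debug.assert(@TypeOf(example(f)) == f64);
}
```

The wrapper is still instantiated once per caller type, but it contains almost no code; the body that matters lives in the implementation function.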

There are a lot of problems with generic code. Generic code is harder to read, reason about, and optimize than code using concrete types. Even if it compiles successfully for one type, you may see bugs only later when a user passes a different type. Generic code with type validation code has an even worse problem - the validation code has to match with the implementation when it changes, and there’s no way to validate that. So the position of Zig is that code using concrete types should be the primary focus and use case of the language, and we shouldn’t introduce extra complexity to make generics easier unless it provides new tools to solve these problems.

Since everything in this proposal is possible with a wrapper function, we don’t think this is worth the complexity it adds. Especially since this feature is only for generic functions, which should be used sparingly.

Because of that, we've decided to reject this proposal, with the aim of keeping the language simple. We know this may be somewhat unexpected, given the popularity of this issue. However, having a simple language means that there will always be places where it would improve ergonomics to have a little bit more language. In order to keep the language small, we will have to reject many proposals which introduce sugar for existing features.
